Audiobook contribution: "Das alte Haus" by Friedrich GERSTÄCKER
User: guenter
Date: 8/23/2014 5:53 am
Views: 12286
Rating: 0

Hi,

just uploaded these new files to voxforge FTP:

Hokuspokus-20140720-qah.tgz
Hokuspokus-20140723-qah.tgz
Hokuspokus-20140724-qah.tgz
Hokuspokus-20140730-qah.tgz
Hokuspokus-20140731-qah.tgz
Hokuspokus-20140802-qah.tgz
Hokuspokus-20140805-qah.tgz
Hokuspokus-20140808-qah.tgz
Hokuspokus-20140810-qah.tgz
Hokuspokus-20140812-qah.tgz
Karlsson-20140718-qah.tgz
Karlsson-20140722-qah.tgz
Karlsson-20140729-qah.tgz
Karlsson-20140731-qah.tgz
Karlsson-20140801-qah.tgz
Karlsson-20140803-qah.tgz
Karlsson-20140805-qah.tgz
Karlsson-20140809-qah.tgz
Karlsson-20140811-ftr.tgz
Karlsson-20140811-qah.tgz

The files were created by segmenting and aligning this audio book from librivox:

https://librivox.org/das-alte-haus-by-friedrich-gerstacker/

read by two people, 'Hokuspokus' (female) and 'Karlsson' (male).

Could someone from voxforge please add them to the German speech corpus?

 

Thanks!

    guenter

Re: Audiobook contribution: "Das alte Haus" by Friedrich GERSTÄCKER
User: nsh
Date: 8/23/2014 11:06 am
Views: 37
Rating: 0

Looks great, what tools did you use for segmentation?

 

Re: Audiobook contribution: "Das alte Haus" by Friedrich GERSTÄCKER
User: guenter
Date: 8/23/2014 11:29 am
Views: 33
Rating: 0

I developed a few python scripts which rely on my review/db infrastructure and sphinx_align to semi-automate some of the tasks.

I have documented my workflow here:

https://github.com/gooofy/voxforge#audiobooks

I am planning to automate this process further, step by step - as the model grows I believe more and more tasks can be automated, at least to some degree. 

So, no sail-align yet but I am working towards it :)

 

Re: Audiobook contribution: "Das alte Haus" by Friedrich GERSTÄCKER
User: nsh
Date: 8/23/2014 11:43 am
Views: 47
Rating: 0

You need to try 

http://cmusphinx.sourceforge.net/2014/07/long-audio-aligner-landed-in-trunk/

It should automate most of your tasks: you can just feed in the chapter audio and the whole book text from Gutenberg, and it will output an alignment.

 

 

Re: Audiobook contribution: "Das alte Haus" by Friedrich GERSTÄCKER
User: guenter
Date: 8/23/2014 2:40 pm
Views: 66
Rating: 0

Hey - always nice to see new tools in cmusphinx! :)

I am not entirely sure it will fit my workflow - from what I see it basically automates one of the tasks, audio-align.py, but I will definitely give it a try.

I am not convinced full automation is a good idea, at least in these early stages when the dictionary is small and the audio model not very robust. I am always worried I'd train too many systematic errors into my model (e.g. misspelled words, mispronounced audio, wrong transcripts, ...) so I tend to semi-automate things: let the computer do the 90% which it is really really sure of (think sphinx_align) and manually check the rest.

For semi-automating audio-align.py I was planning to simply run the sphinx recognizer on the segments of audio I have and match each result against the section text the audio originated from, picking the first of the candidate passages with the smallest edit distance to the recognizer result. Then I would simply run my audio-sphinx-align.py tool, hoping it will auto-accept 90%, and manually check/fix the rest.
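
A minimal sketch of the matching step described above, under stated assumptions: the recognizer output and the section text are plain strings, comparison is done on lowercased whitespace-separated words, and candidate passages are sliding windows whose length may differ from the hypothesis by a few words. The names `edit_distance`, `best_match` and `slack` are made up for illustration and are not part of the scripts in the repository.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        cur = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            cur.append(min(prev[j] + 1,          # drop a word from a
                           cur[j - 1] + 1,       # insert a word from b
                           prev[j - 1] + cost))  # keep or substitute
        prev = cur
    return prev[-1]


def best_match(hypothesis, section_text, slack=2):
    """Return (start, end, distance) for the first window of section_text
    with the smallest edit distance to the recognizer hypothesis."""
    hyp = hypothesis.lower().split()
    words = section_text.lower().split()
    best = None
    for start in range(len(words)):
        shortest = max(1, len(hyp) - slack)
        for length in range(shortest, len(hyp) + slack + 1):
            end = start + length
            if end > len(words):
                break
            d = edit_distance(hyp, words[start:end])
            # Strict "<" keeps the first of several equally good windows.
            if best is None or d < best[2]:
                best = (start, end, d)
    return best
```

A window with distance 0 could be accepted automatically, while anything above a small threshold would go to manual review. The brute-force scan is quadratic in the section length, which is probably acceptable for chapter-sized texts but is an assumption, not a measurement.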

Anyway, from my experience with segmenting and aligning the audio book I have finished now, aligning text to audio segments is not very time-consuming, even using my fully manual abook-align.py tool. A much larger effort is fixing the original text: spellchecking it, cleaning out numbers and abbreviations and, since we had several major spelling reforms in the past decades, which is of particular relevance for old German texts, updating word spellings to the latest standards. Some of it is simple search-and-replace, other cases are more tricky. I was wondering: is there a machine-learning based, trainable spell-fixer tool?
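
The simple search-and-replace part of that cleanup could look roughly like the sketch below. It assumes a small, hand-maintained table of old-to-modern spellings; the four entries and the names `MODERN_SPELLING` and `modernize` are illustrative only. It does nothing for the trickier cases, which still need manual review or a trainable tool.

```python
import re

# Hypothetical example table: old spelling (lowercase) -> modern spelling.
MODERN_SPELLING = {
    "daß": "dass",
    "muß": "muss",
    "thür": "tür",
    "thal": "tal",
}

WORD_RE = re.compile(r"\w+")


def modernize(text, table=MODERN_SPELLING):
    """Replace whole words found in the table, keeping an initial capital."""
    def fix(match):
        word = match.group(0)
        new = table.get(word.lower())
        if new is None:
            return word  # unknown word: leave untouched
        return new.capitalize() if word[0].isupper() else new
    return WORD_RE.sub(fix, text)
```

Matching whole words only avoids rewriting substrings inside unrelated words, at the price of needing one table entry per inflected form.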

Re: Audiobook contribution: "Das alte Haus" by Friedrich GERSTÄCKER
User: nsh
Date: 8/23/2014 2:50 pm
Views: 484
Rating: 0

> I am not entirely sure it will fit my workflow - from what I see it basically automates one of the tasks, audio-align.py, but I will definitely give it a try.

No, it automates the full workflow. You can just give it raw text and audio and it does the job.

> I am not convinced full automation is a good idea, at least in these early stages when the dictionary is small and the audio model not very robust. I am always worried I'd train too many systematic errors into my model (e.g. misspelled words, mispronounced audio, wrong transcripts, ...) so I tend to semi-automate things: let the computer do the 90% which it is really really sure of (think sphinx_align) and manually check the rest.

The aligner handles that automatically.

> then, I would simply run my audio-sphinx-align.py tool hoping it will auto-accept 90% and manually check/fix the rest.

You need to align about 40 books for a reasonably accurate database. It would be nice to do that automatically.

> I was wondering: is there a machine-learning based, trainable spell-fixer tool?

Phonetisaurus. It can do spelling correction and it can learn from examples. You can use it for spelling correction or directly for g2p.

Re: Audiobook contribution: "Das alte Haus" by Friedrich GERSTÄCKER
User: Binh
Date: 10/6/2014 10:15 am
Views: 35
Rating: 0

Hi everyone. I'm back.


I'm trying to use the audio aligner, but I hit the typical problems with the German language.

http://cmusphinx.sourceforge.net/2014/07/long-audio-aligner-landed-in-trunk/


I looked at the call and analyzed what needs to be replaced. Feel free to correct me if I'm wrong somewhere.

java -cp sphinx4-samples/target/sphinx4-samples-1.0-SNAPSHOT-jar-with-dependencies.jar \
edu.cmu.sphinx.demo.aligner.AlignerDemo file.wav file.txt en-us-generic \
cmudict-5prealpha.dict cmudict-5prealpha.fst.ser

sphinx4-samples/target/sphinx4-samples-1.0-SNAPSHOT-jar-with-dependencies.jar
Location of the jar. No problem here.

edu.cmu.sphinx.demo.aligner.AlignerDemo
Seems to be the path inside the jar. No problem either

file.wav
wav that should be analyzed

file.txt
text that should be aligned.

en-us-generic
Folder with the model. I could put my German model here.

cmudict-5prealpha.dict
Seems to be the dictionary. I could replace it with my German dictionary.

cmudict-5prealpha.fst.ser

This file is a really big problem, since it seems to be created by some sort of g2p program. I am using a dictionary and a language model, so I don't really have a German g2p program. Although there is a German model.fst.ser file under downloads, it is based on the voxforge dictionary, not mine, so it will certainly lead to problems.

Is there any way to replace this file for German alignment?
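
The argument-by-argument breakdown above can be summarized as a small wrapper. This is only a sketch: the jar path, the class name and the order of the five arguments are copied from the call quoted above, while the German file names in the usage note are placeholders, and `aligner_command` and `run_aligner` are made-up helper names.

```python
import subprocess

# Taken from the command line quoted in this post.
JAR = "sphinx4-samples/target/sphinx4-samples-1.0-SNAPSHOT-jar-with-dependencies.jar"
MAIN_CLASS = "edu.cmu.sphinx.demo.aligner.AlignerDemo"


def aligner_command(wav, text, acoustic_model, dictionary, g2p_model):
    """Build the argument list in the same order as the quoted call."""
    return ["java", "-cp", JAR, MAIN_CLASS,
            wav, text, acoustic_model, dictionary, g2p_model]


def run_aligner(*args):
    """Run the aligner; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(aligner_command(*args), check=True)
```

With placeholder names, a German run would then be `run_aligner("file.wav", "file.txt", "my-german-model", "my-german.dict", "my-german.fst.ser")`, where the last argument is exactly the file that is still missing.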

Hi guenter. Have you got the aligner running yet?


binh

Re: Audiobook contribution: "Das alte Haus" by Friedrich GERSTÄCKER
User: nsh
Date: 10/6/2014 10:28 am
Views: 42
Rating: 0

> Folder with model. I could put my german model here.

Those things are correct

> This file is a really big problem since it seems to be created by some sort of g2p program. I am using a dictionary / Language Modell. So I don't really have a german g2p program. Although there is a german model.fst.ser file under downloads it is based on the voxforge dictionary not mine so it will certainly lead to problems.

Well, if you share your dictionary, I can quickly generate a g2p model for you to use.

 

 

Re: Audiobook contribution: "Das alte Haus" by Friedrich GERSTÄCKER
User: guenter
Date: 10/6/2014 10:47 am
Views: 554
Rating: 0

Hi binh,

no, I haven't tried the aligner yet. I have my own tools for the audiobooks (see my github repo), and at the moment they do everything I need.

Best regards,

   Guenter

Re: Audiobook contribution: "Das alte Haus" by Friedrich GERSTÄCKER
User: Binh
Date: 10/21/2014 9:02 am
Views: 4321
Rating: 0

Sorry for the delay but I was called to another project which needed to be completed.


>Well, if you share your dictionary I can generate you a g2p model to use quickly.

Thanks, I appreciate that, but this could be really difficult for you. We use a dictionary with a total of around 500k words. Based on Google, our system picks a subset of that and builds a new dictionary for recognition.

Here is the link to the full dictionary:

http://www.messe2media.com/files/messe2media.zip
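
How the subset is chosen is not described here, but the filtering step itself might look like the following sketch. It assumes the usual Sphinx dictionary layout of one `word phone phone ...` entry per line, with alternative pronunciations marked as `word(2)`; the function name `subset_dictionary` and the phone symbols in the usage note are placeholders.

```python
import re

# Strips the "(2)", "(3)", ... marker used for alternative pronunciations.
ALT_SUFFIX = re.compile(r"\(\d+\)$")


def subset_dictionary(dict_lines, vocabulary):
    """Keep only the entries whose base word is in the vocabulary."""
    vocab = {w.lower() for w in vocabulary}
    kept = []
    for line in dict_lines:
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        base = ALT_SUFFIX.sub("", parts[0]).lower()
        if base in vocab:
            kept.append(line.rstrip("\n"))
    return kept
```

For example, filtering the lines `haus h aU s`, `haus(2) h aU z` and `garten g a r t @ n` with the vocabulary `["Haus"]` keeps the two `haus` entries and drops the rest.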

The second problem is that it is an automatic conversion of Scott's German dictionary 2.8, plus words we added on our own. I already know Scott's dictionary has some inconsistent parts.


Maybe the training dictionary is enough? It is of course smaller.

http://www.messe2media.com/files/voxforge_training_FOLKextended.zip


Right now I am reading a little bit about constructing g2p models, but the sites I found so far are not really helpful. I wonder if I can automate the process of g2p model generation? ^_^

I'm not sure if this helps, but most of the phoneme strings in our dictionary are identical to the output of this script:

http://www.messe2media.com/files/espeak2phonesKorr.pl

It's a slightly modified version of a script by Timo Baumann for espeak and German, which we grabbed somewhere in this forum.


binh
