DFA/ sp-phone
User: dano
Date: 11/23/2008 9:55 am
Views: 6167
Rating: 6

A few days ago I started testing Julius, but I have a problem: I don't understand the syntax of the sample.dfa file. My configuration can recognize only two words so far (though that works great, with 100% accuracy ;)). Is there a tutorial or a README about this?

Another question: the dict file from the Ubuntu package contains many words, but some of them have 'sp' phonemes. Julius (the latest nightly build) says they are not supported. After removing that phoneme, the words are recognized well.

--- (Edited on 23-11-2008 4:55 pm [GMT+0100] by dano) ---

Re: DFA/ sp-phone
User: kmaclean
Date: 11/25/2008 11:30 am
Views: 129
Rating: 4

Hi Dano,

>But I have a problem. I don't understand the syntax of the sample.dfa-file.

Sorry, I don't either...

>Is there a tutorial or a readme about this?

You might need to look at the code... why do you need to understand this?

>in the dict file from the ubuntu package are many words, but some of them have 'sp' phonemes.

It's on my todo list: ticket #294.

Ken

--- (Edited on 11/25/2008 12:30 pm [GMT-0500] by kmaclean) ---

Re: DFA/ sp-phone
User: dano
Date: 11/26/2008 8:25 am
Views: 173
Rating: 7

Because when I add more words with the same number, they aren't recognized. I don't know why.

I can write a Python script to convert it to the right syntax.

This file has no sp phones, but it has to follow the sample.dict format, right?

http://spraakherkenning.googlepages.com/dict2

So, like this?

2   [SEARCH] s er ch

--- (Edited on 26-11-2008 3:25 pm [GMT+0100] by dano) ---

Re: DFA/ sp-phone
User: dano
Date: 11/26/2008 8:48 am
Views: 105
Rating: 4

So I created the script

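# Convert the Ubuntu dictionary ('dict') to Julius .dict syntax ('dict2'):
# drop the unsupported 'sp' phone and prefix every entry with category 2.
with open('dict') as src, open('dict2', 'w') as dst:
    for line in src:
        if '[' not in line or ']' not in line:
            continue  # skip lines that have no [WORD] field
        # Keep everything from the bracketed word onwards: "[WORD]  phones".
        entry = line[line.index('['):]
        word, phones = entry.split(']', 1)
        # Remove 'sp' only when it is a whole phone, not part of another one.
        kept = [p for p in phones.split() if p != 'sp']
        dst.write('2 ' + word + '] ' + ' '.join(kept) + '\n')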

Here is the file:

http://spraakherkenning.googlepages.com/dict3

--- (Edited on 26-11-2008 3:48 pm [GMT+0100] by dano) ---

Re: DFA/ sp-phone
User: dano
Date: 11/26/2008 8:51 am
Views: 79
Rating: 6

Let me know if anything else needs to be changed :)

--- (Edited on 26-11-2008 3:51 pm [GMT+0100] by dano) ---

Re: DFA/ sp-phone
User: dano
Date: 11/26/2008 8:54 am
Views: 91
Rating: 6

Never mind, I found the problem: I forgot to add the actual line to the dict file.

--- (Edited on 26-11-2008 3:54 pm [GMT+0100] by dano) ---

Re: DFA/ sp-phone
User: kmaclean
Date: 11/26/2008 10:07 am
Views: 117
Rating: 4

Hi Dano,

>Because when I add now more words with the same number

I'm a little confused here... are you adding words directly to your ".dict" file? If so, I don't think this will work.

Julius assumes that you are compiling from a ".grammar" and a ".voca" file using the mkdfa.pl script, which then generates your ".dfa", ".term" and ".dict" files. If you have new words you want to recognize, you need to add them to your ".voca" file and then recompile with the mkdfa.pl script.

The ".dfa" file (whose internal format I don't really understand...) is generated from the contents of your ".grammar" and ".voca" files at compile time, i.e. it assumes that the only words it needs to recognize are the ones that were in the ".voca" file when it was compiled.

The dfa file is a "deterministic finite automaton"... Google it for some more information.
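
For readers who have not met the term, here is a minimal Python sketch of what a deterministic finite automaton does in general. It is not the layout of a Julius ".dfa" file (that format is not described anywhere in this thread). The category names NS_B, WORD and NS_E are borrowed from the sample.voca posted further down; the state names and the rule "silence, one word, silence" are assumptions made up for the illustration.

```python
# Illustrative DFA only; this is NOT the layout of a Julius .dfa file.
# The state names and the accepted sequence are hypothetical.
START, AFTER_SIL, AFTER_WORD, DONE = range(4)

# (current state, input category) -> next state
TRANSITIONS = {
    (START, "NS_B"): AFTER_SIL,
    (AFTER_SIL, "WORD"): AFTER_WORD,
    (AFTER_WORD, "NS_E"): DONE,
}
ACCEPTING = {DONE}


def accepts(categories):
    """Return True if the category sequence ends in an accepting state."""
    state = START
    for category in categories:
        key = (state, category)
        if key not in TRANSITIONS:
            return False  # no transition defined, so the sequence is rejected
        state = TRANSITIONS[key]
    return state in ACCEPTING


print(accepts(["NS_B", "WORD", "NS_E"]))  # one word between two silences
print(accepts(["NS_B", "NS_E"]))          # the word is missing
```

Each input has exactly one possible next state, which is what makes the automaton deterministic; a sequence that runs into a missing transition is simply not part of the grammar.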

Hope that clears things up.

Ken

--- (Edited on 11/26/2008 11:07 am [GMT-0500] by kmaclean) ---

Re: DFA/ sp-phone
User: dano
Date: 11/26/2008 12:07 pm
Views: 2103
Rating: 6

When I add words to sample.dict and sample.voca, it works.

Sample.voca

% NS_B
<s> sil

% NS_E
</s> sil

% WORD
SEARCH s er ch
OPEN ow p ax n
MUSIC m y uw z ix k
AFRAID ax f r ey d
IMAGE ix m ae jh

 

Sample.dict

0    [<s>]    sil
1    [</s>]    sil
3   [SEARCH] s er ch
3   [OPEN] ow p ax n
3   [MUSIC] m y uw z ix k
3   [AFRAID] ax f r ey d
3   [IMAGE] ix m ae jh

----

EDIT: Ah, I understand now: I compiled it manually ;)
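
The correspondence between the Sample.voca and Sample.dict listings above can be sketched in Python. This is only an illustration of the manual step, not what mkdfa.pl does internally: the function name voca_to_dict_lines is made up, and the category numbers are passed in by hand (copied from the Sample.dict listing) rather than computed, because how the compiler assigns them is not shown in this thread.

```python
def voca_to_dict_lines(voca_text, category_ids):
    """Turn .voca text into .dict-style lines using the given category numbers."""
    lines = []
    current = None
    for raw in voca_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate categories
        if line.startswith("%"):
            current = line[1:].strip()  # "% WORD" -> "WORD"
            continue
        if current is None:
            raise ValueError("word listed before any % category line")
        word, *phones = line.split()
        lines.append(f"{category_ids[current]}\t[{word}]\t{' '.join(phones)}")
    return lines


SAMPLE_VOCA = """\
% NS_B
<s> sil

% NS_E
</s> sil

% WORD
SEARCH s er ch
OPEN ow p ax n
"""

# Category numbers copied from the Sample.dict listing, not computed.
for entry in voca_to_dict_lines(SAMPLE_VOCA, {"NS_B": 0, "NS_E": 1, "WORD": 3}):
    print(entry)
```

Every word in the voca text ends up as exactly one dict line, so a word that is present in one file but missing from the other shows up immediately when the two are compared.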

--- (Edited on 26-11-2008 7:09 pm [GMT+0100] by dano) ---
