Unclear cause of errors when using trigram LM in Julius
User: Wout
Date: 5/23/2007 8:17 am
Views: 4968
Rating: 20
Hello,

I have been trying to use HMM models that I've created with HTK and an
ARPA LM in Julius (version 3.5.3, multipath compile option enabled,
under Linux). Although the LM works without problems in HTK, Julius
generates many warnings when parsing the LM and, finally, an error. All
warnings are of this type:
> Warning: context (z_y:_t_,z_j_@_n_) not exist in LR 2-gram (ignored)
and this is the error:
> Error: 2-gram has no upper 3-gram, but not 0.0 back-off weight
What I don't understand is why HTK doesn't complain and Julius generates
many warnings and an error. Does anyone know what these warnings mean? In
other words: what is wrong with the LM, and why does Julius complain?

I am loading the bigram and trigram LMs with the -nlr and -nrl options.
The trigram LM file also contains the unigram and bigram LM. I found out
that if I remove all unigrams and bigrams from the trigram file, Julius
*does* start up in interactive mode. However, Julius gives an extra warning:
> Reading in RL 3-gram...
> Warning: 1-gram total num differ! may cause read error
> Warning: 2-gram total num differ! may cause read error
> reading 1-gram part...
I am not sure if this warning is important or not. Can the results be
trusted if I start Julius in this way?

I have made one modification to the LM I used for HTK, in order to get
it working in Julius. The LM used for HTK was in so-called "modified
ARPA" (see HTK book) format, in which the back-off weights are optional.
Julius doesn't (seem to) support this, so I filled in '0' everywhere a
back-off weight was required but not filled in. Is this a good thing to do?
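For anyone who wants to script the same edit rather than do it by hand, here is a minimal Python sketch. It assumes a standard ARPA layout (`\data\` header with `ngram N=count` lines, `\N-grams:` sections, `\end\`) and whitespace-separated fields. The function name `fill_backoff` and the default value are illustrative, not part of HTK or Julius. In the log10 convention used by ARPA files, a weight of 0.0 corresponds to a multiplicative factor of 1. Whether Julius accepts the result for every entry is exactly the open question above, so treat this as a sketch only.

```python
import re

def fill_backoff(lines, default="0.0"):
    """Return ARPA lines with an explicit back-off weight appended
    wherever a lower-order n-gram entry lacks one.
    Entries of the highest order never carry a back-off weight,
    so they are left untouched."""
    lines = list(lines)

    # First pass: find the highest n-gram order declared in the header.
    max_order = 0
    for line in lines:
        m = re.match(r"ngram\s+(\d+)\s*=", line.strip())
        if m:
            max_order = max(max_order, int(m.group(1)))

    # Second pass: track the current section and patch short entries.
    out = []
    order = 0  # 0 means "not inside an n-gram section"
    for line in lines:
        stripped = line.strip()
        section = re.match(r"\\(\d+)-grams:", stripped)
        if section:
            order = int(section.group(1))
        elif stripped.startswith("\\"):  # \data\ or \end\
            order = 0
        elif order and order < max_order and stripped:
            fields = stripped.split()
            # logprob + `order` words, and nothing else: weight is missing
            if len(fields) == order + 1:
                line = stripped + "\t" + default
        out.append(line.rstrip("\n"))
    return out
```

To use it on a file, read the lines, pass them through `fill_backoff`, and write the result to a new file so the original LM stays intact.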

Additionally, Julius doesn't load all trigrams: it stops after loading
about 60% of all trigrams. Does anyone know why Julius would do this? Is it
possible to find out if a triphone is responsible for stopping the loading?
> 3-gram read 2500000 (57%)
> <cut>
> 3-gram read 2592555 end
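One thing that can be checked independently of Julius is whether the counts declared in the `\data\` header match the number of entries actually present in each section. The percentage in the progress output above suggests the expected total is well above 2592555, so a mismatch of this kind is one possible explanation, though that is only a guess. The sketch below assumes a standard ARPA layout; the function name `count_arpa_entries` is made up for illustration.

```python
import re

def count_arpa_entries(lines):
    """Return (declared, actual): two dicts mapping n-gram order to
    the count stated in the \\data\\ header and the number of entry
    lines found in the corresponding \\N-grams: section."""
    declared, actual = {}, {}
    order = 0  # 0 means "not inside an n-gram section"
    for line in lines:
        s = line.strip()
        if not s:
            continue
        section = re.match(r"\\(\d+)-grams:", s)
        header = re.match(r"ngram\s+(\d+)\s*=\s*(\d+)", s)
        if section:
            order = int(section.group(1))
            actual.setdefault(order, 0)
        elif s.startswith("\\"):  # \data\ or \end\
            order = 0
        elif order:
            actual[order] += 1  # one entry line in the current section
        elif header:
            declared[int(header.group(1))] = int(header.group(2))
    return declared, actual
```

Calling it on the lines of the LM file and comparing the two dicts order by order shows at a glance whether any section is shorter or longer than the header claims.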
I would really appreciate your help!

Best regards,

Wout

Output from Julius:

include config: conf.jconf
###### check configurations
###### build up system
Reading in HMM definition...(ascii)...limit check passed
defined HMMs: 6849
logical names: 130051 in HMMList
base phones: 51 used in logical
done
Making pseudo bi/mono-phone for IW-triphone...5150 added as logical...done
Reading in dictionary...
4996 words...done
Reading in LR 2-gram...
reading 1-gram part...
1-gram read 4996 end
reading 2-gram part...
2-gram read 0 (0%)
2-gram read 100000 (7%)
2-gram read 200000 (15%)
<some lines removed>
2-gram read 1300000 (99%)
2-gram read 1306905 end
done
Reading in RL 3-gram...
reading 1-gram part...
1-gram read 4996 end
reading 2-gram part...
Warning: (E_@_l_,2:_) not exist in LR 2-gram (ignored)
Warning: (E_@_n_,2:_) not exist in LR 2-gram (ignored)
Warning: (r_o:_,2:_) not exist in LR 2-gram (ignored)
Warning: (t_s_e:_,2:_) not exist in LR 2-gram (ignored)
<removed 100.000 lines>
Warning: (z_a_x_,z_y:_t_) not exist in LR 2-gram (ignored)
Warning: (z_i:_p_,z_y:_t_) not exist in LR 2-gram (ignored)
Warning: (z_u:_,z_y:_t_) not exist in LR 2-gram (ignored)
Warning: (z_y:_,z_y:_t_) not exist in LR 2-gram (ignored)
2-gram read 1306905 end
reading 3-gram part...
3-gram read 0 (0%)
Warning: context (2:_,Q_u:_6:_) not exist in LR 2-gram (ignored)
Warning: context (2:_,Q_u:_6:_) not exist in LR 2-gram (ignored)
Warning: context (2:_,Q_u:_6:_) not exist in LR 2-gram (ignored)
Warning: context (2:_,d_i:_) not exist in LR 2-gram (ignored)
Warning: context (2:_,r_) not exist in LR 2-gram (ignored)
Warning: context (6:_,@_) not exist in LR 2-gram (ignored)
Warning: context (6:_,Q_I_u:_) not exist in LR 2-gram (ignored)
Warning: context (6:_,Q_I_u:_) not exist in LR 2-gram (ignored)
Warning: context (6:_,Q_I_u:_) not exist in LR 2-gram (ignored)
Warning: context (6:_,Q_I_u:_) not exist in LR 2-gram (ignored)
Warning: context (6:_,Q_I_u:_) not exist in LR 2-gram (ignored)
Warning: context (6:_,Q_I_u:_) not exist in LR 2-gram (ignored)
Warning: context (6:_,Q_I_u:_) not exist in LR 2-gram (ignored)
Warning: context (6:_,Q_I_u:_) not exist in LR 2-gram (ignored)
<removed 1.500.000 lines>
Warning: context (z_y:_t_,z_aI_n_) not exist in LR 2-gram (ignored)
Warning: context (z_y:_t_,z_aI_n_) not exist in LR 2-gram (ignored)
Warning: context (z_y:_t_,z_e:_6:_n_) not exist in LR 2-gram (ignored)
Warning: context (z_y:_t_,z_i:_t_) not exist in LR 2-gram (ignored)
Warning: context (z_y:_t_,z_j_@_n_) not exist in LR 2-gram (ignored)
Warning: context (z_y:_t_,z_j_@_n_) not exist in LR 2-gram (ignored)
Warning: context (z_y:_t_,z_j_@_n_) not exist in LR 2-gram (ignored)
Warning: context (z_y:_t_,z_j_@_n_) not exist in LR 2-gram (ignored)
Warning: context (z_y:_t_,z_o:_n_) not exist in LR 2-gram (ignored)
3-gram read 2592555 end
Error: 2-gram has no upper 3-gram, but not 0.0 back-off weight
Terminated

--- (Edited on 5/23/2007 8:17 am [GMT-0500] by Visitor) ---

Re: Unclear cause of errors when using trigram LM in Julius
User: kmaclean
Date: 5/23/2007 10:18 am
Views: 2060
Rating: 14

Hi Wout,

Are you using a *reverse* trigram with Julius?  From the Julius website (my emphasis added):

> Julius adopts acoustic models in HTK ascii format, pronunciation dictionary in HTK-like format, and word 3-gram language models in ARPA standard format (forward 2-gram and reverse 3-gram as trained from corpus with reversed word order).

See this post for more info.
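To illustrate what "corpus with reversed word order" means in practice, here is a minimal Python sketch that reverses the tokens of each training sentence before the 3-gram is trained. It assumes one sentence per line with whitespace-separated words. The function name `reverse_corpus` is made up for this example, and the sketch does not treat sentence start/end markers specially: how those should be handled for Julius is not covered here and should be checked against the Julius documentation.

```python
def reverse_corpus(lines):
    """Reverse the word order of every sentence (one sentence per line).
    Tokens are split on whitespace and re-joined with single spaces."""
    return [" ".join(reversed(line.split())) for line in lines]


# Example: each sentence is reversed independently.
sentences = ["the cat sat", "hello world"]
print(reverse_corpus(sentences))
```

The reversed text would then be fed to the same LM training tool used for the forward model, producing the reverse 3-gram that is loaded alongside the forward 2-gram.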

Hope this helps, 

Ken 

--- (Edited on 5/23/2007 11:18 am [GMT-0400] by kmaclean) ---
