Speech Recognition Engines

my solution for the error [+2226]
User: imene
Date: 4/21/2009 12:21 pm
Views: 7418
Rating: 7

Hello

In this post I try to answer a question of my own, for which I recently found the solution.

 

I am working on integrating auxiliary features (such as pitch and formants) into a standard MFCC-based HMM.

The problem is that when I train my HMM with a long acoustic vector containing MFCC + pitch + 3 formants and their first and second derivatives, I get the error

[+2226] no training data

in the HTK tool HRest.

 

Two days ago I found a solution, and I want to share it with you.

The error happens because a diagonal covariance component falls below 0.00001, and the corresponding mixture weight is then set to zero.

 

The solution is given on page 284 of the HTK 3.3 book:

“If any diagonal covariance component falls below 0.00001, then the corresponding mixture weight is set to zero. A warning is issued if the number of mixtures is greater than one, otherwise an error occurs. Applying a variance floor via the -v option or a variance floor macro can be used to prevent this.”

 

I used the -v option of the HRest tool with the value -v 0.000001 instead of the default value 0.00001, and I no longer get any error.
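To illustrate what a variance floor does, here is a minimal Python sketch. It is not HTK's code: the function name `floored_variances` and the toy frames are made up for illustration, and the floor value is simply the one mentioned in the post. The idea is that any per-dimension variance that collapses towards zero is clamped to a small positive minimum.

```python
VAR_FLOOR = 1e-6  # illustrative floor; the same number as the -v value in the post


def floored_variances(frames, floor=VAR_FLOOR):
    """Return the per-dimension variance of the frames, clamped from below at `floor`."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dim)
    ]
    # A dimension that barely varies would otherwise get a (near) zero variance.
    return [max(v, floor) for v in variances]


# Toy data: the second dimension is constant, so its raw variance is exactly 0.
frames = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
print(floored_variances(frames))
```

In this toy case the second dimension ends up at the floor instead of zero, which is the kind of situation the quoted passage describes.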

I hope this explanation helps someone, and sorry for my bad English.

 

--- (Edited on 4/21/2009 12:21 pm [GMT-0500] by imene) ---

--- (Edited on 4/21/2009 12:23 pm [GMT-0500] by imene) ---

Re: my solution for the error [+2226]
User: tpavelka
Date: 4/22/2009 2:28 am
Views: 117
Rating: 6

Hi,

Interesting. I tried working with formants a long time ago, but without any success. Did you get any improvements in accuracy? How did you deal with the correlation of the formants?

--- (Edited on 22.04.2009 09:28 [GMT+0200] by tpavelka) ---

Re: my solution for the error [+2226]
User: imene
Date: 4/23/2009 3:47 pm
Views: 145
Rating: 6


Hi,

I am happy to find someone who is interested in my work.

My experiments were done on an isolated-word database of Arabic digits. I got significant improvements in recognition in noisy environments, but no improvement in the clean environment.

I am now trying experiments on a continuous-speech database, TIMIT, but I have no improvement yet.

 

I did not apply any special treatment to the formants. I use the log of the first 3 formants, extracted with Praat, and I appended them directly to the acoustic vector together with the MFCC parameters.
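A rough sketch of this kind of concatenation, in Python, might look as follows. Everything here is an assumption for illustration: the function names are hypothetical, the formant values are assumed to be positive frequencies in Hz that are already aligned frame by frame with the MFCCs, and the derivatives are not computed. The second function packs the result in HTK's parameter file layout (12-byte big-endian header, then 32-bit floats) with the USER parameter kind.

```python
import math
import struct

HTK_USER = 9  # HTK parameter kind code for user-defined features


def append_log_formants(mfcc_frames, formant_frames):
    """Concatenate each MFCC frame with the log of its formant frequencies."""
    if len(mfcc_frames) != len(formant_frames):
        raise ValueError("MFCC and formant tracks must have the same frame count")
    # math.log fails on zero or negative values, so undefined formant
    # frames would have to be handled before this step.
    return [
        list(m) + [math.log(f) for f in fm]
        for m, fm in zip(mfcc_frames, formant_frames)
    ]


def htk_user_bytes(frames, frame_period_s=0.010):
    """Serialise frames as HTK parameter data: header plus big-endian float32."""
    dim = len(frames[0])
    header = struct.pack(
        ">iihh",
        len(frames),                   # number of frames
        round(frame_period_s * 1e7),   # frame period in 100 ns units
        4 * dim,                       # bytes per frame
        HTK_USER,                      # parameter kind
    )
    body = b"".join(struct.pack(">%df" % dim, *frame) for frame in frames)
    return header + body


# Made-up values: one frame with 2 "MFCCs" and 3 formants in Hz.
frames = append_log_formants([[1.0, 2.0]], [[500.0, 1500.0, 2500.0]])
data = htk_user_bytes(frames)
print(len(frames[0]), len(data))
```

The returned bytes could then be written to a file opened in binary mode.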

 

Sorry for my English. If you know French, my e-mail is:

[email protected].

If you like, we can share ideas.

--- (Edited on 4/23/2009 3:47 pm [GMT-0500] by Visitor) ---

Re: my solution for the error [+2226]
User: tpavelka
Date: 4/24/2009 3:40 am
Views: 338
Rating: 5

> I connected them directly in the acoustic vector with the MFCC parameter.

I think the formant values may be correlated. Usually you use a diagonal covariance matrix in the GMM definitions (the covariance matrix is then just a variance vector), which assumes that the individual elements of the feature vector are not correlated.

For correlated features you should either use a full covariance matrix or add mixtures. But a covariance matrix for the whole vector would be very large, so my suggestion is to split the feature vector into streams (HTK allows this) and use a diagonal covariance matrix for the MFCCs and a full covariance matrix for the formants. It may improve performance.

But, I have never actually tried streams or full covariance matrices so take this as just a suggestion, it is not guaranteed to work.
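One quick way to check whether the formant features really are correlated is to look at their correlation matrix. The Python sketch below is only an illustration: the function name is hypothetical, the frames are made-up numbers rather than real formant tracks, and it assumes every dimension actually varies (otherwise the division fails). Off-diagonal values close to 1 or -1 would suggest that a diagonal covariance is a poor fit for those dimensions.

```python
import math


def correlation_matrix(frames):
    """Pearson correlation between every pair of feature dimensions."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    cov = [
        [
            sum((f[i] - means[i]) * (f[j] - means[j]) for f in frames) / n
            for j in range(dim)
        ]
        for i in range(dim)
    ]
    std = [math.sqrt(cov[d][d]) for d in range(dim)]
    return [
        [cov[i][j] / (std[i] * std[j]) for j in range(dim)]
        for i in range(dim)
    ]


# Made-up "log formant" frames, three dimensions per frame.
frames = [
    [6.2, 7.3, 7.8],
    [6.4, 7.2, 7.9],
    [6.1, 7.4, 7.7],
    [6.5, 7.1, 7.9],
]
for row in correlation_matrix(frames):
    print(["%.2f" % value for value in row])
```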

> Sorry for my English, if you know French

Unfortunately I do not speak French, I only know English and a bit of German.

--- (Edited on 24.04.2009 10:40 [GMT+0200] by tpavelka) ---

Re: my solution for the error [+2226]
User: Visitor
Date: 6/7/2010 1:10 am
Views: 1235
Rating: 4

Hi,

I had this problem, and you helped me a lot. :)

Best regards,

--- (Edited on 6/7/2010 1:10 am [GMT-0500] by Visitor) ---

Re: my solution for the error [+2226]
User: Visitor
Date: 7/31/2014 2:55 am
Views: 1560
Rating: 3

Hi..

Can you kindly tell me how to integrate pitch etc. with MFCC features?

--- (Edited on 7/31/2014 2:55 am [GMT-0500] by Visitor) ---
