New 27k words 70h German model released
User: guenter
Date: 9/4/2016 4:53 am

The latest release of my audio models built from VoxForge submissions now covers 70 hours of audio and 27k dictionary entries. It is available for download here:

http://goofy.zamia.org/voxforge/de/

This release includes:

  • A CMU Sphinx audio model
  • Several Kaldi models (still very experimental)
  • A Sequitur g2p model
  • Language models created using cmuclmtk and srilm

For the first time, the audio models include small portions of openpento and german-speechdata-package-v2.tar.gz. Reviewing and transcribing those is quite laborious, so it will take some time until they are fully reviewed and integrated into the models.

Also note that this model includes more distant-microphone recordings than older releases, which means the word error rate has increased accordingly.

Some stats:

27625 lexicon entries.
total duration of all good submissions:  71:01:23
cmusphinx model: SENTENCE ERROR: 15.8% (408/2575)   WORD ERROR RATE: 4.3% (891/20534)
kaldi models: 
%WER 10.11 [ 2075 / 20520, 156 ins, 807 del, 1112 sub ] exp/tri3b_mmi/decode_it4/wer_12
%WER 9.97 [ 2046 / 20520, 170 ins, 731 del, 1145 sub ] exp/tri3b_mmi_b0.05/decode_it4/wer_11
%WER 9.86 [ 2023 / 20520, 155 ins, 738 del, 1130 sub ] exp/tri3b_mmi/decode_it3/wer_12
%WER 9.77 [ 2005 / 20520, 151 ins, 736 del, 1118 sub ] exp/tri3b_mmi_b0.05/decode_it3/wer_12
%WER 9.04 [ 1854 / 20520, 160 ins, 509 del, 1185 sub ] exp/tri3b_mpe/decode_it3/wer_16
%WER 8.85 [ 1817 / 20520, 200 ins, 390 del, 1227 sub ] exp/tri3b_mpe/decode_it4/wer_13
%WER 8.62 [ 1769 / 20520, 334 ins, 292 del, 1143 sub ] exp/tri3b/decode.si/wer_15
%WER 6.73 [ 1381 / 20520, 203 ins, 291 del, 887 sub ] exp/tri2b/decode/wer_14
%WER 6.42 [ 1317 / 20520, 141 ins, 311 del, 865 sub ] exp/tri1/decode/wer_12
%WER 6.27 [ 1287 / 20520, 153 ins, 301 del, 833 sub ] exp/tri2a/decode/wer_13
%WER 5.49 [ 1126 / 20520, 185 ins, 222 del, 719 sub ] exp/tri3b/decode/wer_16
%WER 5.47 [ 1123 / 20520, 159 ins, 260 del, 704 sub ] exp/tri2b_mpe/decode_it3/wer_14
%WER 5.41 [ 1110 / 20520, 136 ins, 268 del, 706 sub ] exp/tri2b_mpe/decode_it4/wer_15
%WER 5.19 [ 1065 / 20520, 126 ins, 272 del, 667 sub ] exp/tri2b_mmi/decode_it3/wer_13
%WER 5.16 [ 1058 / 20520, 123 ins, 270 del, 665 sub ] exp/tri2b_mmi_b0.05/decode_it3/wer_13
%WER 5.13 [ 1053 / 20520, 116 ins, 276 del, 661 sub ] exp/tri2b_mmi/decode_it4/wer_14
%WER 5.05 [ 1037 / 20520, 111 ins, 271 del, 655 sub ] exp/tri2b_mmi_b0.05/decode_it4/wer_13
The scripts used to create those models can be found here:
https://github.com/gooofy/nlp
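
For anyone who wants to compare the Kaldi results above programmatically, here is a minimal sketch that parses "%WER" lines of the shape shown in this post and picks the lowest one. It is not part of the scripts in the repository; the names parse_wer_line, recompute_wer, best_result and SAMPLE are made up for this illustration, and the regular expression only assumes the line format as pasted here.

```python
import re

# Matches scoring summary lines of the form pasted in this post, e.g.
# %WER 5.05 [ 1037 / 20520, 111 ins, 271 del, 655 sub ] exp/.../wer_13
WER_LINE = re.compile(
    r"%WER\s+(?P<wer>[\d.]+)\s+\[\s*(?P<errors>\d+)\s*/\s*(?P<words>\d+),\s*"
    r"(?P<insertions>\d+)\s+ins,\s*(?P<deletions>\d+)\s+del,\s*"
    r"(?P<substitutions>\d+)\s+sub\s*\]\s*(?P<path>\S+)"
)

INT_FIELDS = ("errors", "words", "insertions", "deletions", "substitutions")


def parse_wer_line(line):
    """Return a dict for one %WER line, or None if the line does not match."""
    match = WER_LINE.match(line.strip())
    if match is None:
        return None
    record = {name: int(match.group(name)) for name in INT_FIELDS}
    record["wer"] = float(match.group("wer"))
    record["path"] = match.group("path")
    return record


def recompute_wer(record):
    """WER in percent: (insertions + deletions + substitutions) / reference words."""
    errors = record["insertions"] + record["deletions"] + record["substitutions"]
    return 100.0 * errors / record["words"]


def best_result(lines):
    """Return the parsed record with the lowest reported WER."""
    records = [r for r in map(parse_wer_line, lines) if r is not None]
    return min(records, key=lambda r: r["wer"])


# Three lines copied from the results above.
SAMPLE = [
    "%WER 10.11 [ 2075 / 20520, 156 ins, 807 del, 1112 sub ] exp/tri3b_mmi/decode_it4/wer_12",
    "%WER 5.49 [ 1126 / 20520, 185 ins, 222 del, 719 sub ] exp/tri3b/decode/wer_16",
    "%WER 5.05 [ 1037 / 20520, 111 ins, 271 del, 655 sub ] exp/tri2b_mmi_b0.05/decode_it4/wer_13",
]

if __name__ == "__main__":
    best = best_result(SAMPLE)
    print(best["path"], best["wer"])
```

Recomputing the rate from the insertion, deletion and substitution counts is a cheap sanity check when copying numbers between releases.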
Re: New 27k words 70h German model released
User: guenter
Date: 9/22/2016 1:47 pm

I just uploaded the latest release (20160922) of the German VoxForge model to:

http://goofy.zamia.org/voxforge/de/

The review for this release covers all openpento submissions as well as the whole "dev" subset of german-speechdata-package-v2.tar.gz, all in all over 80 hours of material.

stats:

29327 lexicon entries.
total duration of all good submissions:  81:47:37

cmusphinx model: SENTENCE ERROR: 19.9% (579/2910)   WORD ERROR RATE: 7.1% (1732/24264)
kaldi models: 
%WER 9.90 [ 2399 / 24240, 432 ins, 330 del, 1637 sub ] exp/tri3b/decode.si/wer_14
%WER 9.60 [ 2326 / 24240, 223 ins, 548 del, 1555 sub ] exp/tri1/decode/wer_12
%WER 9.29 [ 2251 / 24240, 208 ins, 487 del, 1556 sub ] exp/tri3b_mmi/decode_it4/wer_14
%WER 9.20 [ 2231 / 24240, 182 ins, 530 del, 1519 sub ] exp/tri3b_mmi_b0.05/decode_it4/wer_15
%WER 9.06 [ 2197 / 24240, 195 ins, 493 del, 1509 sub ] exp/tri3b_mmi/decode_it3/wer_14
%WER 9.04 [ 2192 / 24240, 190 ins, 477 del, 1525 sub ] exp/tri3b_mpe/decode_it3/wer_14
%WER 8.93 [ 2165 / 24240, 193 ins, 460 del, 1512 sub ] exp/tri3b_mpe/decode_it4/wer_15
%WER 8.91 [ 2160 / 24240, 187 ins, 490 del, 1483 sub ] exp/tri3b_mmi_b0.05/decode_it3/wer_14
%WER 8.87 [ 2151 / 24240, 262 ins, 453 del, 1436 sub ] exp/tri2a/decode/wer_14
%WER 7.74 [ 1877 / 24240, 220 ins, 399 del, 1258 sub ] exp/tri2b/decode/wer_16
%WER 6.33 [ 1535 / 24240, 228 ins, 282 del, 1025 sub ] exp/tri3b/decode/wer_17
%WER 6.16 [ 1492 / 24240, 153 ins, 350 del, 989 sub ] exp/tri2b_mpe/decode_it3/wer_15
%WER 6.10 [ 1478 / 24240, 144 ins, 350 del, 984 sub ] exp/tri2b_mpe/decode_it4/wer_15
%WER 5.59 [ 1356 / 24240, 135 ins, 333 del, 888 sub ] exp/tri2b_mmi_b0.05/decode_it3/wer_13
%WER 5.59 [ 1355 / 24240, 118 ins, 350 del, 887 sub ] exp/tri2b_mmi/decode_it3/wer_14
%WER 5.49 [ 1330 / 24240, 122 ins, 335 del, 873 sub ] exp/tri2b_mmi/decode_it4/wer_14
%WER 5.42 [ 1313 / 24240, 117 ins, 339 del, 857 sub ] exp/tri2b_mmi_b0.05/decode_it4/wer_14
sequitur g2p model:
    total: 2933 strings, 34151 symbols
    successfully translated: 2932 (99.97%) strings, 34144 (99.98%) symbols
        string errors:       1240 (42.29%)
        symbol errors:       2756 (8.07%)
            insertions:      927 (2.71%)
            deletions:       956 (2.80%)
            substitutions:   873 (2.56%)
    translation failed:      1 (0.03%) strings, 7 (0.02%) symbols
    total string errors:     1241 (42.31%)
    total symbol errors:     2763 (8.09%)
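
The percentages in the Sequitur summary above use two different denominators, which is easy to miss. The following sketch reproduces them from the raw counts in this post. The reading of the denominators is inferred from the arithmetic, not taken from Sequitur's documentation: the per-category error rates are relative to the successfully translated strings and symbols, while the "total" rates are relative to all strings and symbols and count the failed translation as an error. The helper name pct is made up for this illustration.

```python
def pct(part, whole):
    """Percentage rounded to two decimals, as printed in the summary."""
    return round(100.0 * part / whole, 2)


# Raw counts copied from the Sequitur g2p summary above.
total_strings, total_symbols = 2933, 34151
ok_strings, ok_symbols = 2932, 34144
failed_strings, failed_symbols = 1, 7
string_errors = 1240
insertions, deletions, substitutions = 927, 956, 873

# Symbol errors are the sum of the three edit operations.
symbol_errors = insertions + deletions + substitutions

# Rates over the successfully translated material only.
string_error_rate = pct(string_errors, ok_strings)
symbol_error_rate = pct(symbol_errors, ok_symbols)

# Total rates: failed translations count as errors, denominator is everything.
total_string_error_rate = pct(string_errors + failed_strings, total_strings)
total_symbol_error_rate = pct(symbol_errors + failed_symbols, total_symbols)

if __name__ == "__main__":
    print("string errors:", string_error_rate)
    print("symbol errors:", symbol_error_rate)
    print("total string errors:", total_string_error_rate)
    print("total symbol errors:", total_symbol_error_rate)
```

Note that a string counts as an error as soon as a single symbol in it is wrong, which is why the string error rate is so much higher than the symbol error rate.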
