Speech Recognition Engines

Howto use PocketSphinx
User: royerfa
Date: 3/6/2008 4:19 am
Views: 90714
Rating: 38

Hi everybody,

 

I know this topic belongs in the CMU Sphinx forum, but I enjoy this one, so I am asking here.

My problem is this. The PocketSphinx web page

http://www.speech.cs.cmu.edu/pocketsphinx/

explains well how to compile and install the sources.

But my main problem is that I can't find any examples or demos of PocketSphinx, and I know they are out there somewhere!

Can someone point me to the right website?

Is there a Linux executable with a demo and a file explaining how to use the software?

 

Thanks a lot.

regards 

--- (Edited on 3/6/2008 4:20 am [GMT-0600] by royerfa) ---

Re: Howto use PocketSphinx
User: royerfa
Date: 3/6/2008 6:30 am
Views: 261
Rating: 27

OK,

I saw that the directories contain some models, such as turtle or an4.

But how do I test them?

--- (Edited on 3/6/2008 6:30 am [GMT-0600] by royerfa) ---

Re: Howto use PocketSphinx
User: nsh
Date: 3/6/2008 11:02 am
Views: 285
Rating: 25

You have to build it first. Download sphinxbase, unpack it, and run

./configure && make && sudo make install

 

then download pocketsphinx, unpack it, and run

 ./configure && make && sudo make install 

then run pocketsphinx_test or pocketsphinx_tidigits. If you have trouble with the build, google for some documentation on building Linux programs first. Something like

  http://www.tuxfiles.org/linuxhelp/softinstall.html

will help you. Also check the Sphinx wiki; it has tutorials, but they are partially out of date.

 http://sphinx.subwiki.com
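
For reference, both builds above follow the same three steps, so they can be wrapped in one small shell function. This is only a sketch: the function name build_and_install is made up for illustration, and the directory names in the usage line are assumptions (the unpacked directories usually carry a version suffix).

```shell
#!/bin/sh
# build_and_install DIR: run the configure / make / install steps inside
# an unpacked source tree. The function name is hypothetical.
build_and_install() {
    src_dir=$1
    # Refuse to continue if DIR does not look like an unpacked source tree.
    if [ ! -x "$src_dir/configure" ]; then
        echo "no configure script in $src_dir" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$src_dir" && ./configure && make && sudo make install )
}
```

It would be called once per package, sphinxbase first, for example: build_and_install ./sphinxbase && build_and_install ./pocketsphinx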

--- (Edited on 3/6/2008 11:02 am [GMT-0600] by nsh) ---

Re: Howto use PocketSphinx
User: royerfa
Date: 3/6/2008 11:43 am
Views: 254
Rating: 34

Hello,

Thank you for the feedback.
Installing software on Linux is no problem for me. PocketSphinx and sphinxbase are already installed.

But in fact I don't know how to use the software. What command do I have to type in the shell to run the program, and how do I use the examples in the model directory?

I would like to test my voice, but for the examples such as turtle in model/lm, I don't know which sentences to say for which grammar file.

Do you understand my problem?

I can't evaluate the software without running it!

--- (Edited on 3/6/2008 11:43 am [GMT-0600] by royerfa) ---

Re: Howto use PocketSphinx
User: nsh
Date: 3/6/2008 12:04 pm
Views: 409
Rating: 34

Run pocketsphinx_tidigits and say any sequence of numbers. Open this script in a text editor to learn how to use pocketsphinx_continuous.

Also run pocketsphinx_swb and say any simple phrase. Swb is a language model with a limited but fairly large vocabulary (3000 words). You can open this script as well if you like.

Run any pocketsphinx binary, such as batch or continuous, without arguments to get help on its options. If you have any trouble, just ask. You can also ask on the #cmusphinx IRC channel on freenode.net.
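
As a rough illustration of what the tidigits script does, the sketch below prints a pocketsphinx_continuous command line with the same flags that appear in the log output posted later in this thread. The function name tidigits_cmd and the install prefix argument are assumptions; the model layout under share/pocketsphinx/model is taken from that log and may differ on other installs.

```shell
#!/bin/sh
# tidigits_cmd PREFIX: print (not run) the decoder command for the
# tidigits demo. PREFIX is the install prefix, e.g. /usr/local.
# The function name is hypothetical; flags and paths mirror the log
# shown later in this thread.
tidigits_cmd() {
    model=$1/share/pocketsphinx/model
    # printf repeats the format for every argument, so each word is
    # followed by one space; the last flag is printed separately.
    printf '%s ' "$1/bin/pocketsphinx_continuous" \
        -fwdflat no -bestpath no \
        -lm "$model/lm/tidigits/tidigits.lm" \
        -dict "$model/lm/tidigits/tidigits.dic" \
        -hmm "$model/hmm/tidigits" \
        -samprate 8000
    printf '%s\n' "-nfft 256"
}
```

Calling tidigits_cmd /usr/local prints one line that can be pasted into the shell. Printing instead of running makes it easy to see which language model (-lm), dictionary (-dict) and acoustic model (-hmm) belong together.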

--- (Edited on 3/6/2008 12:04 pm [GMT-0600] by nsh) ---

Re: Howto use PocketSphinx
User: Ishan
Date: 5/16/2008 12:05 pm
Views: 302
Rating: 21

Hi,

 I am getting the following error :(

/usr/local/bin/pocketsphinx_continuous: error while loading shared libraries: libpocketsphinx.so.0: cannot open shared object file: No such file or directory 

 

--- (Edited on 5/16/2008 12:05 pm [GMT-0500] by Visitor) ---

Re: Howto use PocketSphinx
User: Eric
Date: 5/17/2008 12:51 am
Views: 378
Rating: 20

 

 Hi all,

I want to install PocketSphinx on a Sharp Zaurus PDA, but PocketSphinx can't be compiled on the PDA itself because of its limited resources, so the source has to be cross-compiled for the PDA. Can anyone help me build a cross compiler on the host that compiles PocketSphinx for the Zaurus target? I hope you can comment soon.

 

 

--- (Edited on 5/17/2008 12:51 am [GMT-0500] by Visitor) ---

Re: Howto use PocketSphinx
User: hb
Date: 8/25/2008 12:09 am
Views: 422
Rating: 13

Set your LD_LIBRARY_PATH environment variable to point to the directory that contains libpocketsphinx.so.0

Example:

 LD_LIBRARY_PATH=/path/to/pocketsphinxlibs /usr/local/bin/pocketsphinx_continuous 
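
If you are not sure which directory holds the library, a small helper can look through a few candidates. This is a sketch only: the function name find_lib_dir is made up, and the candidate directories in the usage line are guesses that depend on the prefix used at install time.

```shell
#!/bin/sh
# find_lib_dir LIB DIR...: print the first DIR that contains LIB,
# or return 1 if none does. The function name is hypothetical.
find_lib_dir() {
    lib=$1
    shift
    for dir in "$@"; do
        if [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# To see which shared libraries a binary fails to resolve, ldd marks
# them as "not found":
#   ldd /usr/local/bin/pocketsphinx_continuous
```

Usage could look like this: dir=$(find_lib_dir libpocketsphinx.so.0 /usr/local/lib /usr/lib) && LD_LIBRARY_PATH=$dir /usr/local/bin/pocketsphinx_continuous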

--- (Edited on 8/25/2008 12:09 am [GMT-0500] by Visitor) ---

Re: Howto use PocketSphinx
User: loic
Date: 4/11/2009 1:46 am
Views: 228
Rating: 9

Hello,


The installation of sphinxbase and pocketsphinx ran fine, and the LD_LIBRARY_PATH environment variable is set.

When I use the pocketsphinx_tidigits script, it seems to run well, but after printing the configuration (pasted at the end of this post) it does not output a single word when I speak (the microphone was tested with Audacity and works).


Thanks for your help.

 

 
pocketsphinx_tidigits:
  Demo CMU PocketSphinx decoder with connected digit recognition.
 
<executing /home/loic/tutorial/pocketsphinx/build/bin/pocketsphinx_continuous, please wait>
INFO: cmd_ln.c(510): Parsing command line:
/home/loic/tutorial/pocketsphinx/build/bin/pocketsphinx_continuous \
        -fwdflat no \
        -bestpath no \
        -lm /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/lm/tidigits/tidigits.lm \
        -dict /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/lm/tidigits/tidigits.dic \
        -hmm /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/hmm/tidigits \
        -samprate 8000 \
        -nfft 256

Current configuration:
[NAME]          [DEFLT]         [VALUE]
-adcdev
-agc            none            none
-agcthresh      2.0             2.000000e+00
-alpha          0.97            9.700000e-01
-argfile
-ascale         20.0            2.000000e+01
-backtrace      no              no
-beam           1e-48           1.000000e-48
-bestpath       yes             no
-bestpathlw     9.5             9.500000e+00
-cep2spec       no              no
-ceplen         13              13
-cmn            current         current
-cmninit        8.0             8.0
-compallsen     no              no
-dict                           /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/lm/tidigits/tidigits.dic
-dictcase       no              no
-dither         no              no
-doublebw    no        no
-ds        1        1
-fdict               
-feat        1s_c_d_dd    1s_c_d_dd
-featparams           
-fillprob    1e-8        1.000000e-08
-frate        100        100
-fsg               
-fsgusealtpron    yes        yes
-fsgusefiller    yes        yes
-fwdflat    yes        no
-fwdflatbeam    1e-64        1.000000e-64
-fwdflatefwid    4        4
-fwdflatlw    8.5        8.500000e+00
-fwdflatsfwin    25        25
-fwdflatwbeam    7e-29        7.000000e-29
-fwdtree    yes        yes
-hmm                /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/hmm/tidigits
-input_endian    little        little
-jsgf               
-kdmaxbbi    -1        -1
-kdmaxdepth    0        0
-kdtree               
-latsize    5000        5000
-lda               
-ldadim        0        0
-lifter        0        0
-lm                /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/lm/tidigits/tidigits.lm
-lmctl               
-lmname        default        default
-logbase    1.0001        1.000100e+00
-logfn               
-logspec    no        no
-lowerf        133.33334    1.333333e+02
-lpbeam        1e-40        1.000000e-40
-lponlybeam    7e-29        7.000000e-29
-lw        6.5        6.500000e+00
-maxhistpf    100        100
-maxhmmpf    -1        -1
-maxnewoov    20        20
-maxwpf        -1        -1
-mdef               
-mean               
-mfclogdir           
-mixw               
-mixwfloor    0.0000001    1.000000e-07
-mllr               
-mmap        yes        yes
-ncep        13        13
-nfft        512        256
-nfilt        40        40
-nwpen        1.0        1.000000e+00
-pbeam        1e-48        1.000000e-48
-pip        1.0        1.000000e+00
-pl_beam    1e-10        1.000000e-10
-pl_pbeam    1e-5        1.000000e-05
-pl_window    0        0
-rawlogdir           
-remove_dc    no        no
-round_filters    yes        yes
-samprate    16000        8.000000e+03
-seed        -1        -1
-sendump           
-silprob    0.005        5.000000e-03
-smoothspec    no        no
-spec2cep    no        no
-svspec               
-tmat               
-tmatfloor    0.0001        1.000000e-04
-topn        4        4
-topn_beam    0        0
-toprule           
-transform    legacy        legacy
-unit_area    yes        yes
-upperf        6855.4976    6.855498e+03
-usewdphones    no        no
-uw        1.0        1.000000e+00
-var               
-varfloor    0.0001        1.000000e-04
-varnorm    no        no
-verbose    no        no
-warp_params           
-warp_type    inverse_linear    inverse_linear
-wbeam        7e-29        7.000000e-29
-wip        0.65        6.500000e-01
-wlen        0.025625    2.562500e-02

INFO: cmd_ln.c(510): Parsing command line:
\
    -dither yes \
    -lowerf 1 \
    -upperf 4000 \
    -nfilt 20 \
    -transform dct \
    -round_filters no \
    -remove_dc yes \
    -wlen 0.025 \
    -feat s2_4x \
    -agc none \
    -cmn current \
    -varnorm no

Current configuration:
[NAME]        [DEFLT]        [VALUE]
-agc        none        none
-agcthresh    2.0        2.000000e+00
-alpha        0.97        9.700000e-01
-cep2spec    no        no
-ceplen        13        13
-cmn        current        current
-cmninit    8.0        8.0
-dither        no        yes
-doublebw    no        no
-feat        1s_c_d_dd    s2_4x
-frate        100        100
-input_endian    little        little
-lda               
-ldadim        0        0
-lifter        0        0
-logspec    no        no
-lowerf        133.33334    1.000000e+00
-ncep        13        13
-nfft        512        256
-nfilt        40        20
-remove_dc    no        yes
-round_filters    yes        no
-samprate    16000        8.000000e+03
-seed        -1        -1
-smoothspec    no        no
-spec2cep    no        no
-svspec               
-transform    legacy        dct
-unit_area    yes        yes
-upperf        6855.4976    4.000000e+03
-varnorm    no        no
-verbose    no        no
-warp_params           
-warp_type    inverse_linear    inverse_linear
-wlen        0.025625    2.500000e-02

INFO: acmod.c(232): Parsed model-specific feature parameters from /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/hmm/tidigits/feat.params
INFO: fe_interface.c(288): You are using the internal mechanism to generate the seed.
INFO: feat.c(848): Initializing feature stream to type: 's2_4x', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: mdef.c(520): Reading model definition: /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/hmm/tidigits/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(316): Reading binary model definition: /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/hmm/tidigits/mdef
INFO: bin_mdef.c(495): 34 CI-phone, 396 CD-phone, 5 emitstate/phone, 170 CI-sen, 670 Sen, 222 Sen-Seq
INFO: tmat.c(205): Reading HMM transition probability matrices: /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/hmm/tidigits/transition_matrices
INFO: acmod.c(109): Attempting to use SCHMM computation module
INFO: s2_semi_mgau.c(888): Reading S3 mixture gaussian file '/home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/hmm/tidigits/means'
INFO: s2_semi_mgau.c(985): 1 mixture Gaussians, 256 components, 4 feature streams, veclen 51
INFO: s2_semi_mgau.c(888): Reading S3 mixture gaussian file '/home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/hmm/tidigits/variances'
INFO: s2_semi_mgau.c(985): 1 mixture Gaussians, 256 components, 4 feature streams, veclen 51
INFO: s2_semi_mgau.c(664): Loading senones from dump file /home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/hmm/tidigits/sendump
INFO: s2_semi_mgau.c(680): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(709): Rows: 256, Columns: 672
INFO: s2_semi_mgau.c(717): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1121): Maximum top-N: 4 Top-N beams: 0 0 0 0
INFO: dict.c(232): Allocating 20 placeholders for new OOVs
INFO: dict.c(491):     11 = words in file [/home/loic/tutorial/pocketsphinx/build/share/pocketsphinx/model/lm/tidigits/tidigits.dic]
INFO: dict.c(349): LEFT CONTEXT TABLES
INFO: dict.c(1010): Entry Context table contains
        12 entries
INFO: dict.c(1011):        408 possible cross word triphones.
INFO: dict.c(1049):        132 triphonesn       242 pseudo diphones
        34 uniphones
INFO: dict.c(1096): Exit Context table contains
        12 entries
INFO: dict.c(1097):        408 possible cross word triphones.
INFO: dict.c(1163):        132 triphones
       242 pseudo diphones
        34 uniphones
INFO: dict.c(1165):         79 right context entries
INFO: dict.c(1166):          6 ave entries per exit context
INFO: dict.c(355): RIGHT CONTEXT TABLES
INFO: dict.c(1096): Exit Context table contains
        12 entries
INFO: dict.c(1097):        408 possible cross word triphones.
INFO: dict.c(1163):        132 triphones
       242 pseudo diphones
        34 uniphones
INFO: dict.c(1165):         76 right context entries
INFO: dict.c(1166):          6 ave entries per exit context
INFO: ngram_model_arpa.c(539): ngrams 1=14, 2=1, 3=0
INFO: ngram_model_arpa.c(204): Reading unigrams
INFO: ngram_model_arpa.c(578):       14 = #unigrams created
INFO: ngram_model_arpa.c(260): Reading bigrams
INFO: ngram_model_arpa.c(594):        1 = #bigrams created
INFO: ngram_model_arpa.c(595):        2 = #prob2 entries
INFO: ngram_search_fwdtree.c(158): 0 root, 0 non-root channels, 24 single-phone words
INFO: ngram_search_fwdtree.c(197): Creating search tree
INFO: ngram_search_fwdtree.c(205): 0 root, 0 non-root channels, 24 single-phone words
INFO: ngram_search_fwdtree.c(327): max nonroot chan increased to 140
INFO: ngram_search_fwdtree.c(336): 10 root, 12 non-root channels, 4 single-phone words
INFO: continuous.c(261): /home/loic/tutorial/pocketsphinx/build/bin/pocketsphinx_continuous COMPILED ON: Apr 11 2009, AT: 08:18:56

--- (Edited on 4/11/2009 1:46 am [GMT-0500] by Visitor) ---

Re: Howto use PocketSphinx
User: nsh
Date: 4/11/2009 6:06 pm
Views: 296
Rating: 8

An Audacity test means nothing. Test your microphone with

arecord -f S16_LE -r 8000 -D default > record.wav

instead.
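
A sketch of how that check could be scripted, under some assumptions: the function names record_test and has_audio are made up, the -d option (stop after a number of seconds) is added here for convenience, and the size check assumes arecord writes a 44-byte WAV header.

```shell
#!/bin/sh
# record_test FILE [SECONDS]: record a short 8 kHz, 16-bit sample from
# the default ALSA device, using the same format as the command above.
# Defaults to 5 seconds. The function name is hypothetical.
record_test() {
    arecord -f S16_LE -r 8000 -D default -d "${2:-5}" "$1"
}

# has_audio FILE: succeed if FILE exists and is larger than a bare WAV
# header (assumed to be 44 bytes), i.e. some samples were captured.
has_audio() {
    [ -f "$1" ] || return 1
    size=$(($(wc -c < "$1")))
    [ "$size" -gt 44 ]
}
```

For example: record_test record.wav && has_audio record.wav && aplay record.wav. The size check only shows that the device delivered data; listening to the playback is still needed to confirm it is your voice and not silence.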

--- (Edited on 4/11/2009 6:06 pm [GMT-0500] by nsh) ---
