Frequently Asked Questions

What is a Dialog Manager?
User: kmaclean
Date: 1/1/2010 11:45 am

A Dialog Manager is one component of a Speech Recognition System.

Telephony and Command & Control Dialog Managers

A Dialog Manager used in Telephony applications (IVR, Interactive Voice Response) and in some desktop Command and Control applications assigns meaning to the words recognized by the Speech Recognition Engine, determines how the utterance fits into the dialog so far, and decides what to do next.  It might need to retrieve information from an external source.  If a response is required, it chooses the words and phrases to use and passes them to the Text-to-Speech System, which speaks them to the user.
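
The steps above (assign meaning, track the dialog, decide what to do next, choose a response) can be sketched in a few lines of Python. This is only an illustration and not the design of any particular product: the class name DialogManager, the intent labels, and the lookup callback standing in for the "external source" are all hypothetical, and the recognizer and Text-to-Speech System are replaced by plain strings.

```python
class DialogManager:
    """Toy dialog manager: maps recognized words to a meaning, tracks the
    dialog state, and returns the text to hand to a text-to-speech system."""

    def __init__(self, lookup):
        self.lookup = lookup      # hypothetical external data source
        self.state = "start"      # where we are in the dialog so far

    def interpret(self, words):
        """Assign a meaning (an intent label) to the recognized words."""
        text = words.lower().strip()
        if "balance" in text:
            return "ask_balance"
        if text in ("yes", "no"):
            return text
        return "unknown"

    def handle(self, words):
        """Decide what to do next and choose the wording of the response."""
        intent = self.interpret(words)
        if self.state == "start" and intent == "ask_balance":
            self.state = "confirm"
            return "Do you want your account balance?"
        if self.state == "confirm" and intent == "yes":
            self.state = "start"
            return "Your balance is %s." % self.lookup("balance")
        if self.state == "confirm" and intent == "no":
            self.state = "start"
            return "OK. What would you like to do?"
        return "Sorry, I did not understand."


if __name__ == "__main__":
    # Made-up data source and utterances, for demonstration only.
    dm = DialogManager(lambda key: {"balance": "42 dollars"}[key])
    for utterance in ["WHAT IS MY BALANCE", "YES"]:
        print(dm.handle(utterance))
```

A real Dialog Manager would get its input from the Speech Recognition Engine and send the returned string to the Text-to-Speech System instead of printing it.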

Dictation Dialog Manager 

A Dictation Dialog Manager will typically take the words recognized by the Speech Recognition Engine and type out the corresponding text on your computer screen.  It may also have some Command and Control elements, but these are usually limited to the types of commands typically used in a word processing program.  It usually responds to the user with text (i.e. it might not use Text-to-Speech at all).
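
As a rough sketch of that behaviour, the Python function below types out most phrases as text but treats two phrases as word-processor style commands. The function name dictate and the two command phrases ("new line", "delete word") are made up for this example and are not taken from any of the programs listed in this FAQ.

```python
def dictate(recognized_phrases):
    """Turn recognized phrases into document text, treating a few
    phrases as editing commands rather than words to type."""
    words = []
    for phrase in recognized_phrases:
        command = phrase.lower().strip()
        if command == "new line":
            words.append("\n")            # command: start a new line
        elif command == "delete word":
            if words:
                words.pop()               # command: remove the last item typed
        else:
            words.extend(phrase.split())  # ordinary dictation: type the words
    # Join words with spaces, but do not pad around line breaks.
    text = " ".join(words)
    return text.replace(" \n ", "\n").replace(" \n", "\n").replace("\n ", "\n")
```

For example, dictating "hello world", "new line", "this is a tset", "delete word", "test" yields two lines of text with the misrecognized word replaced.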

Examples 

Examples of Telephony Dialog Managers include: 

Examples of Command & Control Dialog Managers:

Examples of Dictation Dialog Managers, with Command & Control elements, would be:

  • xVoice (needs IBM's ViaVoice engine for Linux - no longer available)
  • Evaldictator (uses Sphinx4)
  • speechoo (uses Julius): a dictation pad for LibreOffice
  • freespeech-vr: free streaming voice recognition with dynamic language learning

You can also write a domain-specific application to perform Dialog Manager-like tasks using a traditional programming language (C, C++, Java, etc.) or a scripting language (Perl, Python, Ruby, etc.). For example:

Re: What is a Dialog Manager?
User: kmaclean
Date: 3/9/2010 9:14 am

Here is a video that describes an approach (Linux – remote controll with a voice) that uses Voximp as the dialog manager (which uses pocketsphinx), xbindkeys to bind programs to keys, and zenity to display notifications.

From the Voximp home page:

Voximp is an application which allows simple voice commands to be bound to spawn programs or simulate key/mouse presses. It's written in python and uses pocketsphinx for voice-recognition.

From the xbindkeys web page:

xbindkeys is a program that allows you to launch shell commands with your keyboard or your mouse under X Window. It links commands to keys or mouse buttons, using a configuration file. It's independant of the window manager and can capture all keyboard keys (ex: Power, Wake...).

From the zenity web page:

Zenity is a tool that allows you to display Gtk+ dialog boxes from the command line and through shell scripts. It is similar to gdialog, but is intended to be saner. It comes from the same family as dialog, Xdialog, and cdialog, but it surpasses those projects by having a cooler name.
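
To give a feel for how these pieces fit together, here is a minimal Python sketch that maps a recognized phrase to a program and builds a zenity notification for it. It is an assumption-laden illustration: the BINDINGS table, the phrases, and the function names are hypothetical and are NOT Voximp's configuration format, and the sketch launches the program directly instead of simulating a key press for xbindkeys to catch. The only zenity options used are --notification and --text.

```python
import shlex
import subprocess

# Hypothetical phrase-to-command table; this is NOT Voximp's config format.
BINDINGS = {
    "open terminal": "xterm",
    "open browser": "firefox",
}


def notification_command(message):
    """Build a zenity call that shows a desktop notification."""
    return ["zenity", "--notification", "--text=" + message]


def commands_for(phrase, bindings=BINDINGS):
    """Return the argv lists to run for a recognized phrase (empty if unbound)."""
    program = bindings.get(phrase.lower().strip())
    if program is None:
        return []
    return [shlex.split(program), notification_command("Running: " + program)]


def run(phrase):
    """Spawn each command for the phrase without waiting for it to finish."""
    for argv in commands_for(phrase):
        subprocess.Popen(argv)
```

Calling run("open terminal") would start xterm and then ask zenity to show a notification, provided both programs are installed.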

 

Re: What is a Dialog Manager?
User: kmaclean
Date: 10/2/2010 10:47 am

Here is an article (Google translated from Russian) that gives another example of using Julius with Python:

$ vi sample.voca

% NS_B
<s> sil

% NS_E
</s> sil

% ID
DO d uw

% COMMAND
PLAY pl ey
NEXT n eh kst
PREV pr iy v
SILENCE s ay l ax ns

$ vi sample.grammar

S: NS_B ID COMMAND NS_E
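
Taken together, the two files describe a very small language: silence, the word DO, one of four command words, silence. The Python sketch below simply enumerates the word sequences that the rule allows. It illustrates what the grammar accepts and is not how Julius itself reads these files; the names VOCA, RULE and sentences are made up for the example.

```python
from itertools import product

# Word lists copied from sample.voca (pronunciations omitted).
VOCA = {
    "NS_B": ["<s>"],
    "NS_E": ["</s>"],
    "ID": ["DO"],
    "COMMAND": ["PLAY", "NEXT", "PREV", "SILENCE"],
}

# The single rule from sample.grammar: S: NS_B ID COMMAND NS_E
RULE = ["NS_B", "ID", "COMMAND", "NS_E"]


def sentences(rule=RULE, voca=VOCA):
    """List every word sequence the rule can produce."""
    return [" ".join(words) for words in product(*(voca[cat] for cat in rule))]
```

With the vocabulary above, sentences() returns four entries, from "<s> DO PLAY </s>" to "<s> DO SILENCE </s>".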

Compile your grammar:

$ mkdfa sample

Test with Julius:

$ julius -input mic -C julian.jconf

$ vi command.py

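#!/usr/bin/env python
import os

# Note: this excerpt omits the loop that reads Julius output and calls parse().
def parse(line):
   # Lower-case the recognized words, e.g. "DO PLAY" -> ['do', 'play'].
   params = [param.lower() for param in line.split() if param]
   commands = {
   'play': 'audacious2 -p',
   'silence': 'audacious2 -u',
   'next': 'audacious2 -f',
   'prev': 'audacious2 -r',
   }
   # Assumes line holds only the words between <s> and </s>, so
   # params[0] is the ID word ("do") and params[1] is the command word.
   if len(params) > 1 and params[1] in commands:
      os.popen(commands[params[1]])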

Run as follows:

$ julius -quiet -input mic -C julian.jconf 2>/dev/null | ./command.py
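
The excerpt above does not show how the script reads what Julius writes to the pipe. A minimal sketch of that part might look like the following. It assumes that Julius reports each final result on a line of the form "sentence1: <s> DO PLAY </s>"; the START and END strings encode that assumption, and the function names extract_sentence and main are hypothetical.

```python
import sys

START = "sentence1: <s> "   # assumed prefix of a Julius result line
END = " </s>"               # assumed suffix of a Julius result line


def extract_sentence(line):
    """Return the recognized words from a result line, or None."""
    line = line.strip()
    if line.startswith(START) and line.endswith(END):
        return line[len(START):-len(END)]
    return None


def main(handler, stream=sys.stdin):
    """Feed each recognized sentence found on `stream` to `handler`."""
    for line in stream:
        sentence = extract_sentence(line)
        if sentence is not None:
            handler(sentence)
```

In a script such as command.py, ending the file with main(parse) would pass each recognized sentence (for example "DO PLAY") to the parse function; all other Julius output lines are ignored.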
