Added new utilities related to phonetic spellings and our internal dictionary of words. You can now use LV_SRE_CheckWordsInDictionary (C API) / CheckWordsInDictionary (C++) to determine whether a word or string of words is in the dictionary of a given language. LV_SRE_GetPhoneticPronunciationCount (C API) / GetPhoneticPronunciationCount (C++) returns the number of pronunciations the Engine has for a string of words, and LV_SRE_GetPhoneticPronunciation (C API) / GetPhoneticPronunciation (C++ API) returns the actual phonemes for a string of words.
Improved the Engine's performance in a number of ways, particularly relating to memory use on Linux. Fixed some potential memory leaks.
The directory /opt/lumenvox/engine/Lang/Responses/ is now created during Linux installation. This should alleviate problems some users encountered with creating .callsre files. This directory should be modifiable by all users on a system. If you are having problems creating response files on Linux, please ensure that you have permission to write to this directory (and all of its subfolders).
Improved memory performance for the MRCP Server.
The MRCP Server has also had the barge-in-begin-delay and init-time parameters removed, and now has an initial-silence-trimmed setting that is equivalent to our STREAM_PARM_VAD_INIT_MODE API setting. See Recognizer Properties in the MRCP section for more information.
To help clarify equivalent parameters in VXML, MRCP, and the LumenVox API, we have added a help page describing how these properties relate to one another.
A new property has been added to the Engine, allowing developers to optimize for greater speed or greater accuracy. This property is PROP_EX_DECODE_OPTIMIZATION, set by either LV_SRE_SetPropertyEx or LVSpeechPort::SetPropertyEx. By default, the Engine switches between the modes, optimizing for accuracy unless the server becomes too busy, at which point it optimizes for speed. Our tests show decodes to be about 33 percent faster when optimized for speed, at the expense of 0.2 percentage points of accuracy.
This release of the Engine contains an improved pronunciation generator. This means that when a grammar contains words not in the Engine's internal dictionary for a given language, the Engine should be more accurate at determining how those words are pronounced. This is most useful for foreign names, and our accuracy tests have shown improvements as high as 10 percentage points in large names tests.
64-bit versions of the speech client are now available on Linux. We currently offer builds for Fedora Core 5, Fedora Core 7, and Red Hat Enterprise Linux 5 (which includes CentOS 5). We will be creating 64-bit packages for all of our supported Linux distributions.
Our documentation on Writing SRGS Grammars has been updated. We have also added a page on SRGS Best Practices that should help guide developers when designing grammars.
The Speech Engine will now save cached copies of grammars after compiling them. This should save significant time when loading large grammars that have been previously loaded, as it means they do not need to be compiled again. The caching settings may be changed. See Grammar Caching for details.
Grammar compilation has also been generally optimized. Decode times are now up to 10 percent faster for large grammars.
Added a decode property called PROP_EX_LOAD_GRAMMAR_TIMEOUT that specifies how long the client should wait for a grammar to be loaded. See LV_SRE_SetPropertyEx or LVSpeechPort::SetPropertyEx for details.
We have improved confidence scores on the natural numbers, dates, and currency domains. This means correct results should have higher confidence scores than before, and incorrect results should have lower confidence scores.
How the Engine handles concept/phrase grammars has been revamped. Internally, the Engine now converts all concept/phrase grammars to SRGS files. This should not affect your applications. It fixes several bugs developers were encountering when using concept/phrase grammars with languages other than US English. Note that while viewing response files in the LumenVox Speech Tuner, you will now see all grammars as SRGS grammars.
This help file has had its entire Semantic Interpretation for Speech Recognition section rewritten to reflect the current SISR standard and LumenVox's implementation. See Intro to Semantic Interpretation for full details.
The Voice Activity Detection has been completely revamped. The Engine should now be more accurate at filtering out background noise, barging in correctly, etc. These changes, however, have caused us to alter several streaming parameters.
The following stream parameters have been removed: VAD_BARGEIN_LVL, BARGE_IN_DYNAMIC_ADJUST, BARGE_IN_NOISE_COUNT_LOW_THRESHOLD, USE_FREQ_VAD, NOTIFY_OF_BEEPS, VAD_NOISE_FLOOR, and VAD_BURST_THLD.
BARGE_IN_BEGIN_DELAY has been changed from using increments of 1/8 seconds to milliseconds.
Two new parameters have been added: VAD_VOLUME_SENSITIVITY and VAD_SNR_SENSITIVITY. They are similar to the older Barge-In Level and Noise Floor parameters, but have some important differences.
Please be certain to take a look at LV_SRE_StreamSetParameter (C API) or StreamSetParameter (C++ API) for more details. You can also see Sensitivity Settings for more information about volume and SNR sensitivity.
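If you have existing BARGE_IN_BEGIN_DELAY values expressed in 1/8-second increments, they need to be converted to milliseconds. The following is a minimal sketch of that arithmetic only; the helper name is hypothetical and is not part of the LumenVox API.

```c
/* One old BARGE_IN_BEGIN_DELAY step was 1/8 second, i.e. 125 milliseconds. */
#define MS_PER_EIGHTH_SECOND 125

/* Hypothetical helper (not a LumenVox function): converts a delay given in
 * 1/8-second increments to the equivalent number of milliseconds. */
int barge_in_delay_eighths_to_ms(int eighths)
{
    return eighths * MS_PER_EIGHTH_SECOND;
}
```

For example, an old value of 4 (half a second) corresponds to 500 milliseconds under this conversion.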
Fixed a bug that was causing problems if active grammars were unloaded before being deactivated. You may now unload active grammars without deactivating them first.
There is a new Colombian Spanish model that should provide better accuracy. It has replaced the old model. Please note that this model requires a bit more memory to use, but it is more robust.
Fixed an error that was causing .callsre files to get out of sync if logging was suppressed while the call ended.
The Engine now ships with several acoustic models, including new and improved models for Spanish, French, and Australian English. See Working With Languages for information about using the different languages.
Our Voice Activity Detection has been revamped to be more accurate. In particular, the Engine should be better at dealing with leading silence, breaths, and background noise when streaming audio.
The Engine now supports the current SISR specification. To maintain backwards compatibility, the old SISR will still function if you specify lumenvox/1.0 or semantics/1.0 as the tag format. To use the current implementation, specify semantics/1.0.2006 to use script-tag syntax and semantics/1.0.2006-literals for the string-literal syntax. You can read more information about the standard at http://www.w3.org/TR/semantic-interpretation/
Please note that in this release you cannot mix the tag formats. If you have a grammar with a tag format that uses the old lumenvox/1.0, you cannot reference a rule in a different grammar that uses the string-literal syntax.
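The tag format strings listed above can be grouped into three kinds, which may help when checking grammars before cross-referencing rules. The sketch below is illustrative only: the enum and function names are hypothetical, and it assumes that lumenvox/1.0 and semantics/1.0 count as the same (old) kind for the purpose of the mixing restriction, which these notes do not state explicitly.

```c
#include <string.h>

/* Hypothetical classification of the tag formats named in these notes. */
typedef enum {
    TAG_FORMAT_LEGACY,    /* lumenvox/1.0 or semantics/1.0 (old SISR)      */
    TAG_FORMAT_SCRIPT,    /* semantics/1.0.2006 (script-tag syntax)        */
    TAG_FORMAT_LITERALS,  /* semantics/1.0.2006-literals (string literals) */
    TAG_FORMAT_UNKNOWN
} tag_format_kind;

tag_format_kind classify_tag_format(const char *tag_format)
{
    if (tag_format == NULL)
        return TAG_FORMAT_UNKNOWN;
    if (strcmp(tag_format, "lumenvox/1.0") == 0 ||
        strcmp(tag_format, "semantics/1.0") == 0)
        return TAG_FORMAT_LEGACY;
    if (strcmp(tag_format, "semantics/1.0.2006") == 0)
        return TAG_FORMAT_SCRIPT;
    if (strcmp(tag_format, "semantics/1.0.2006-literals") == 0)
        return TAG_FORMAT_LITERALS;
    return TAG_FORMAT_UNKNOWN;
}

/* Returns 1 if a grammar using format a may reference a rule in a grammar
 * using format b. Assumption: only grammars of the same kind may be mixed. */
int tag_formats_can_mix(const char *a, const char *b)
{
    tag_format_kind ka = classify_tag_format(a);
    tag_format_kind kb = classify_tag_format(b);
    return ka != TAG_FORMAT_UNKNOWN && ka == kb;
}
```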
You can now call a function to get the last error that occurred when doing a decode. See LV_SRE_GetLastDecodeError or GetLastDecodeError.
Added lists of Australian English Phonemes and Canadian French Phonemes for those acoustic models.
If you load a grammar using a # reference, the root rule for that grammar will be set to the rule name following the # symbol.
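To illustrate which part of a reference becomes the root rule, here is a small sketch that extracts the rule name following the # symbol. The function name and the example file name are hypothetical; this shows only the string handling, not how the Engine parses references internally.

```c
#include <string.h>

/* Hypothetical helper (not a LumenVox function): returns a pointer to the
 * rule name that follows the first '#' in a grammar reference, or NULL if
 * the reference has no '#' or nothing after it. */
const char *root_rule_from_reference(const char *grammar_reference)
{
    const char *hash;

    if (grammar_reference == NULL)
        return NULL;
    hash = strchr(grammar_reference, '#');
    if (hash == NULL || hash[1] == '\0')
        return NULL;
    return hash + 1;
}
```

Given a reference such as pizza.grxml#toppings, the helper returns "toppings", which is the rule that would be set as the root.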
The built-in currency grammar is now more robust, as it can handle values of hundreds, thousands, millions, fractions of dollars, and positive and negative values.
Fixed a bug that was causing problems when using apostrophes with GrXML grammars.
Added a page documenting the list of Spanish phonemes used in our Mexican Spanish acoustic model.
Precompiled grammars are now named properly.
Grammars used in the unit test are now properly deactivated after use to prevent two grammars from different languages being active at the same time.
Fixed a bug in the Linux version that was causing it to take too long to load grammars.
XML output from the Engine now puts quotation marks around attribute values.
You may now only have one acoustic model active at a time. This means you cannot mix two languages at once, e.g. get recognitions in Spanish and English on the same decode. It also means you cannot use the regular American English language model at the same time you use the American English digits-only model. Because this error is determined on the server side, if you are not using LV_DECODE_BLOCK at decode time, or if you are not setting a logging callback function on the client when opening a port, you will not receive an error message.
The language models included with this release are more accurate in several domains, such as identifying digits. However, they also require more memory than previous releases. If you wish to use the less accurate, but smaller and less memory-intensive models, we have instructions on using those models.
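Because the server-side error may not reach your application, you may want to check on the client that all grammars you are about to activate use the same language. The following is a minimal sketch under stated assumptions: the function name is hypothetical, each language identifier is assumed to map to exactly one acoustic model, and identifiers are compared as exact strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical client-side check (not a LumenVox function): returns 1 if
 * every language identifier in the array is identical, 0 otherwise.
 * An empty or single-element array trivially passes. */
int grammars_share_one_language(const char **languages, size_t count)
{
    size_t i;

    for (i = 1; i < count; ++i) {
        if (strcmp(languages[0], languages[i]) != 0)
            return 0;
    }
    return 1;
}
```

Under these assumptions, a set containing both a regular language identifier and a digits-only identifier such as en-US-di would fail the check, matching the restriction described above.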
Added a grammar compilation console application to make it easier to precompile grammars. Compiling large grammars will allow the Speech Engine to load them significantly faster.
When loading a grammar, you may now specify a URL that points to a compiled grammar.
Added new functions called InterpretText (for the C++ API) and LV_SRE_InterpretText (C API). These functions take a string and return a semantic interpretation using the loaded grammar.
Errors related to grammars will now provide more verbose messages, including the line and column numbers in the grammar file where the error has occurred.
XML output from the engine no longer contains new lines.
Fixed an error where all global grammars were defaulting to the American English language model.
Fixed an error that occurred when using grXML DTMF grammars.
The Engine now supports $recognized.text as an SISR tag. It allows you to get the raw text back from the Engine.
ABNF grammars now support the use of apostrophes in words to be recognized (e.g. you can use "John's" or "We're" in grammars). However, the use of apostrophes in place of quotation marks within rules is no longer valid syntax. Please note that if you were using a pair of apostrophes instead of double quotes in your grammars, they will no longer parse correctly.
When a client sets properties with a server, it can now set the priority of the decode request. This allows for smarter processing in a distributed environment.
The Engine will now correctly implement the _attributes and _values tags as required by the SISR standard.
Fixed a small bug with Spanish digit phonemes.
Empty tags in GrXML grammars now return as $null rather than empty to comply with the standard.
Improvements in built-in date grammars have yielded an approximately 3% gain in accuracy tests.
A beta version of our upcoming Spanish language acoustic model is available. It includes built-in Spanish grammars. This model, however, is not included in this release. If you would be interested in testing it, e-mail support@lumenvox.com for more information.
Using the built-in digit grammars (both English and Spanish) should now provide better results, as they are using special digits-only acoustic models. If you want to build custom digits grammars (e.g. you wanted to apply weights within a digits-only grammar), you can use the digits-only acoustic models by specifying the language as en-US-di for American English or es-di for Spanish.
The phonetic speller now has pronunciation information for Spanish, French, and Italian characters. If a word cannot be found in the normal dictionary, the Engine will attempt to get the phonemes for a word using the phonetic speller.
The phonetic speller now accepts non-English characters.
For changes older than a year, see Older Release Notes.