LV_SRE_StreamSetParameter
Sets a new value for a stream property. Also adds the parameter and value pair to the request BTS.
int LV_SRE_StreamSetParameter(HPORT hport, int StreamParameter, unsigned long StreamParameterValue);
Return Values
LV_SUCCESS
No errors; parameter changed correctly.
A negative result indicates a specific error:
-11: LV_INVALID_PROPERTY_VALUE: The StreamParameterValue is out of range for the specified parameter.
-17: INVALID_PROPERTY: The specified parameter does not exist.
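The return codes above can be checked in a small wrapper. What follows is a minimal sketch, not part of the SDK: the HPORT typedef, set_param_fn, set_and_describe, and fake_setter are hypothetical stand-ins so the example compiles without the Engine headers, LV_SUCCESS is assumed to be 0, and the exact macro name for -17 is assumed. In a real application the SDK header supplies HPORT and the constants, and LV_SRE_StreamSetParameter itself would be passed as the setter.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for a standalone sketch; the SDK header defines the real ones. */
typedef int HPORT;                       /* placeholder handle type */
#define LV_SUCCESS                  0    /* assumed value */
#define LV_INVALID_PROPERTY_VALUE (-11)  /* -11 in the list above */
#define LV_INVALID_PROPERTY       (-17)  /* -17 in the list above; macro name assumed */

/* Same shape as LV_SRE_StreamSetParameter, so the real function can be passed in. */
typedef int (*set_param_fn)(HPORT, int, unsigned long);

/* Hypothetical helper: sets one parameter and returns a short status text. */
const char *set_and_describe(set_param_fn set, HPORT port,
                             int parameter, unsigned long value)
{
    assert(set != NULL);
    int rc = set(port, parameter, value);
    switch (rc) {
    case LV_SUCCESS:                return "ok";
    case LV_INVALID_PROPERTY_VALUE: return "value out of range";
    case LV_INVALID_PROPERTY:       return "no such parameter";
    default:                        return rc < 0 ? "other error" : "unexpected code";
    }
}

/* Test double standing in for the Engine: it accepts only the made-up
   parameter id 1 with values 1..100. */
int fake_setter(HPORT port, int parameter, unsigned long value)
{
    (void)port;
    if (parameter != 1) return LV_INVALID_PROPERTY;
    if (value < 1UL || value > 100UL) return LV_INVALID_PROPERTY_VALUE;
    return LV_SUCCESS;
}
```

Treating any other negative code as a generic error keeps the wrapper safe if the Engine returns a code that is not listed here.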
Remarks
The streaming parameters set by this function are very important for the Engine to correctly determine when speech begins and ends, which in turn is important for getting accurate recognition. A common problem in speech applications is that the Engine may cut off the start or end of an utterance if these settings are incorrect.
See our recommended Engine settings for ideas on how to adjust these settings for different types of applications.
Note About VAD Settings
In the July 2007 release of the Engine (7.5.600), the voice activity detection (VAD) parameters were overhauled as a result of significant changes to the way VAD functions. There are now fewer parameters to adjust, because the Engine is better at separating voice activity from other noises.
The following stream parameters have been removed: VAD_BARGEIN_LVL, BARGE_IN_DYNAMIC_ADJUST, BARGE_IN_NOISE_COUNT_LOW_THRESHOLD, USE_FREQ_VAD, NOTIFY_OF_BEEPS, VAD_NOISE_FLOOR, and VAD_BURST_THLD.
BARGE_IN_BEGIN_DELAY has been changed from using increments of 1/8 seconds to milliseconds.
Two new parameters have been added: VAD_VOLUME_SENSITIVITY and VAD_SNR_SENSITIVITY. They are similar to the older Barge-In Level and Noise Floor parameters, but have some important differences.
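Because BARGE_IN_BEGIN_DELAY changed units, values stored in older configurations need converting: one 1/8-second increment is 125 milliseconds. A minimal sketch of that arithmetic follows; eighths_to_ms is a hypothetical helper name, not an SDK function.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical migration helper: converts a delay expressed in 1/8-second
   increments (the older unit) to milliseconds (the current unit). */
unsigned long eighths_to_ms(unsigned long eighths)
{
    assert(eighths <= ULONG_MAX / 125UL);  /* guard against overflow */
    return eighths * 125UL;                /* 1/8 s = 125 ms */
}
```

For example, an old setting of 4 increments (half a second) becomes 500 milliseconds.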
Parameters
hport
The port's handle.
StreamParameter
Stream parameter to change. See Properties, below.
StreamParameterValue
New stream parameter value.
Properties
Decode Properties:
STREAM_PARM_DECODE_FLAGS
- Description: Sets one or more decode flags. These are the same flags that would be passed to the Decode method. Currently, the only flag that can be set is LV_DECODE_SEMANTIC_INTERPRETATION, which tells the Engine to return a semantic interpretation. Without this flag, the Engine ignores the contents of SISR tag elements.
- Scope: Port
- Possible Values: LV_DECODE_SEMANTIC_INTERPRETATION
- Default Value: null
STREAM_PARM_VAD_WIND_BACK
- Description: The length of audio to be wound back at the beginning of voice activity. This is used primarily to counter instances where barge-in does not accurately capture the very start of speech. The resolution of this parameter is 1/8 of a second.
- Scope: Port
- Possible Values: Time (milliseconds).
- Default Value: 125ms
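Since this parameter is given in milliseconds but has a resolution of 1/8 of a second (125 ms), an application may want to snap a requested value to that grid before setting it. How the Engine treats values that fall between steps is not stated here, so rounding to the nearest step is an assumption of this sketch, and snap_wind_back_ms is a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>

#define WIND_BACK_STEP_MS 125UL  /* 1/8 second, per the description above */

/* Rounds a requested wind-back length to the nearest multiple of 125 ms. */
unsigned long snap_wind_back_ms(unsigned long ms)
{
    assert(ms <= ULONG_MAX - WIND_BACK_STEP_MS / 2UL);  /* avoid overflow */
    return ((ms + WIND_BACK_STEP_MS / 2UL) / WIND_BACK_STEP_MS)
           * WIND_BACK_STEP_MS;
}
```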
STREAM_PARM_VAD_EOS_DELAY
- Description: The amount of silence, in milliseconds, that the Engine must detect after speech before it begins processing the utterance.
- Scope: Port
- Possible Values: Time (in milliseconds)
- Default Value: 500ms
STREAM_PARM_VAD_INIT_MODE
- Description: This parameter tells the Engine's voice activity detection technology whether the audio stream contains leading silence or not. By default, the Engine expects to receive an audio stream that contains silence before speech. If your hardware is trimming the silence and sending audio data that just includes voice, you need to set this to SILENCE_TRIMMED.
- Scope: Port
- Possible Values: SILENCE_TRIMMED or SILENCE_UNTRIMMED.
- Default Value: SILENCE_UNTRIMMED
Streaming Properties:
STREAM_PARM_AUTO_DECODE
- Description: If active, the decode will start immediately on end-of-speech detection or a call to StopStream(). Otherwise, the application needs to call Decode to begin a decode.
- Scope: Port
- Possible Values: 0 or 1
- Default Value: 0 (off)
STREAM_PARM_BARGE_IN_TIMEOUT
- Description: The streaming interface will flag STREAM_STATUS_BARGE_IN_TIMEOUT if no speech is detected within the time frame specified by this property.
- Scope: Port
- Possible Values: Time in milliseconds
- Default Value: -1 (infinite)
STREAM_PARM_DETECT_BARGE_IN
- Description: If active, the speech port will discard stream data until barge-in is detected.
- Scope: Port
- Possible Values: 0 or 1
- Default Value: 0 (off)
STREAM_PARM_VAD_VOLUME_SENSITIVITY
- Description: The volume required to trigger barge-in. The smaller the value, the more sensitive barge-in will be. This is primarily used to deal with poor echo cancellation. By setting this value higher (less sensitive), prompts that are not properly cancelled will be less likely to falsely trigger barge-in.
- Scope: Port
- Possible Values: 1 to 100.
- Default Value: 50.
STREAM_PARM_VAD_SNR_SENSITIVITY
- Description: Determines how much louder the speaker must be than the background noise in order to trigger barge-in. The smaller this value, the easier it will be to trigger barge-in.
- Scope: Port
- Possible Values: 1 to 100.
- Default Value: 50.
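Both sensitivity parameters accept values from 1 to 100, and an out-of-range value is reported as -11 (LV_INVALID_PROPERTY_VALUE). An application that takes these values from user configuration may prefer to validate or clamp them first. The sketch below shows one way to do that; the helper and macro names are hypothetical and not part of the SDK.

```c
#include <assert.h>

#define VAD_SENSITIVITY_MIN 1UL    /* documented lower bound */
#define VAD_SENSITIVITY_MAX 100UL  /* documented upper bound */

/* Returns 1 if value lies within the documented 1..100 range, else 0. */
int vad_sensitivity_in_range(unsigned long value)
{
    return value >= VAD_SENSITIVITY_MIN && value <= VAD_SENSITIVITY_MAX;
}

/* Clamps a requested (possibly negative) setting into the 1..100 range. */
unsigned long clamp_vad_sensitivity(long requested)
{
    unsigned long result;
    if (requested < (long)VAD_SENSITIVITY_MIN)
        result = VAD_SENSITIVITY_MIN;
    else if (requested > (long)VAD_SENSITIVITY_MAX)
        result = VAD_SENSITIVITY_MAX;
    else
        result = (unsigned long)requested;
    assert(vad_sensitivity_in_range(result));  /* post-condition */
    return result;
}
```

Whether to clamp silently or reject the value is an application design choice; clamping is shown only because it keeps the call from failing.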
STREAM_PARM_DETECT_END_OF_SPEECH
- Description: Specifies whether the stream interface will start discarding sound data once silence has been detected.
- Scope: Port
- Possible Values: 0 or 1
- Default Value: 0 (off)
STREAM_PARM_END_OF_SPEECH_DETECTION
- Description: Sets the voice activity detection delay used for end-of-speech detection to one of four modes.
- Scope: Port
- Possible Values: STREAM_END_OF_SPEECH_DETECTION_SINGLE_WORDS (500ms), STREAM_END_OF_SPEECH_DETECTION_PHRASES_NO_PAUSES (800ms), STREAM_END_OF_SPEECH_DETECTION_PHRASES_WITH_PAUSES (1200ms), and STREAM_END_OF_SPEECH_DETECTION_NORMAL (800ms).
- Default Value: STREAM_END_OF_SPEECH_DETECTION_NORMAL
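The four modes and their delays can be summarized as a lookup. In the sketch below the SKETCH_EOS_* enumerators are placeholders for the STREAM_END_OF_SPEECH_DETECTION_* constants, whose real numeric values come from the SDK header, and choose_eos_mode is a hypothetical heuristic that picks the shortest delay still longer than the longest pause expected inside an utterance. It is an illustration, not a recommendation from the Engine documentation.

```c
#include <assert.h>

/* Placeholder enumerators; the real constants are defined by the SDK. */
enum eos_mode_sketch {
    SKETCH_EOS_SINGLE_WORDS,
    SKETCH_EOS_PHRASES_NO_PAUSES,
    SKETCH_EOS_PHRASES_WITH_PAUSES,
    SKETCH_EOS_NORMAL
};

/* Returns the delay in milliseconds listed for each mode, or -1 if unknown. */
long eos_mode_delay_ms(enum eos_mode_sketch mode)
{
    switch (mode) {
    case SKETCH_EOS_SINGLE_WORDS:        return 500;
    case SKETCH_EOS_PHRASES_NO_PAUSES:   return 800;
    case SKETCH_EOS_PHRASES_WITH_PAUSES: return 1200;
    case SKETCH_EOS_NORMAL:              return 800;
    }
    return -1;
}

/* Hypothetical heuristic: choose the mode with the shortest delay that is
   still longer than the longest pause expected mid-utterance. Pauses of
   1200 ms or more fall back to the longest available delay. */
enum eos_mode_sketch choose_eos_mode(long longest_pause_ms)
{
    assert(longest_pause_ms >= 0);
    if (longest_pause_ms < 500) return SKETCH_EOS_SINGLE_WORDS;
    if (longest_pause_ms < 800) return SKETCH_EOS_PHRASES_NO_PAUSES;
    return SKETCH_EOS_PHRASES_WITH_PAUSES;
}
```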
STREAM_PARM_END_OF_SPEECH_TIMEOUT
- Description: After barge-in, the streaming interface will flag STREAM_STATUS_END_SPEECH_TIMEOUT if it did not detect end-of-speech within the time specified by this property. This is different from the end-of-speech delay; STREAM_PARM_END_OF_SPEECH_TIMEOUT represents the total amount of time a caller has to speak after barge-in is detected.
- Scope: Port
- Possible Values: Time in milliseconds
- Default Value: -1 (infinite)
STREAM_PARM_SOUND_FORMAT
- Description: The sound format handled by the stream.
- Scope: Port
- Possible Values: ULAW_8KHZ, PCM_8KHZ, PCM_16KHZ, ALAW_8KHZ, SPX_8KHZ, SPX_16KHZ
- Default Value: ULAW_8KHZ