Streaming

The Engine can accept a stream of sound data instead of the single buffer used by the standard interface. The streaming interface can also handle barge-in detection and end-of-speech detection. Several parameters must be set to configure the type of streaming the application needs.

These parameters are set using LV_SRE_StreamSetParameter (C API) or LVSpeechPort::SetStreamParameter (C++ API). The documentation for those functions has the complete list of available parameters.

The primary parameters that control how the stream works are:

STREAM_PARM_DETECT_BARGE_IN : If active, the stream detects barge-in, and data streamed before the barge-in is discarded. If not active, all streamed data is used for recognition.

STREAM_PARM_DETECT_END_OF_SPEECH : If active, the Speech Engine detects end-of-speech (a period of silence) in the stream and stops accepting further sound data once it is detected.

STREAM_PARM_AUTO_DECODE : If active, STREAM_PARM_GRAMMAR_SET and STREAM_PARM_DECODE_FLAGS need to be set as well. When end-of-speech is detected or LV_SRE_StreamStop() is called, the decode begins immediately.
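To make the two detection parameters more concrete, here is a toy sketch in C of what barge-in and end-of-speech detection do to a stream. It is not the Engine's detector. The amplitude threshold, the ToyDetector type and the toy_init and toy_feed functions are all hypothetical, and the sketch works on 16-bit linear samples rather than u-law.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical illustration only; none of these names come from the Engine. */
typedef struct {
    int    threshold;      /* absolute amplitude treated as speech           */
    size_t silence_needed; /* consecutive quiet samples that end the speech  */
    size_t silence_run;    /* quiet samples seen since the last loud one     */
    size_t kept;           /* samples retained for recognition               */
    int    barged_in;      /* 1 once the first loud sample has been seen     */
    int    ended;          /* 1 once end-of-speech has been detected         */
} ToyDetector;

void toy_init(ToyDetector *d, int threshold, size_t silence_needed)
{
    d->threshold = threshold;
    d->silence_needed = silence_needed;
    d->silence_run = 0;
    d->kept = 0;
    d->barged_in = 0;
    d->ended = 0;
}

/* Feeds samples to the detector and returns how many were consumed.
   Samples before barge-in are consumed but discarded; nothing is
   consumed once end-of-speech has been detected. */
size_t toy_feed(ToyDetector *d, const short *samples, size_t count)
{
    size_t i;
    for (i = 0; i < count && !d->ended; ++i) {
        int loud = abs(samples[i]) >= d->threshold;
        if (!d->barged_in) {
            if (!loud)
                continue;          /* pre-barge-in data is thrown away */
            d->barged_in = 1;
        }
        d->kept++;
        d->silence_run = loud ? 0 : d->silence_run + 1;
        if (d->silence_run >= d->silence_needed)
            d->ended = 1;          /* stop accepting further sound data */
    }
    return i;
}
```

With a threshold of 1000 and a silence length of 3, feeding the eight samples {0, 0, 5000, 6000, 0, 0, 0, 9999} discards the two leading quiet samples, keeps the next five, and leaves the last sample unconsumed.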

You may also want to see our Recommended Engine Settings.

Code Snippet:  Set up a new stream, with barge-in and end-of-speech detection

#include <LVSRE2.h>

...

//sound data will be u-law
LV_SRE_StreamSetParameter(hPort, STREAM_PARM_SOUND_FORMAT, ULAW_8KHZ);

//use voice channel 0
LV_SRE_StreamSetParameter(hPort, STREAM_PARM_VOICE_CHANNEL, 0);

//use grammar set 2
LV_SRE_StreamSetParameter(hPort, STREAM_PARM_GRAMMAR_SET, 2);

//detect barge-in and end-of-speech, and decode automatically
//(auto decode also needs STREAM_PARM_DECODE_FLAGS to be set)
LV_SRE_StreamSetParameter(hPort, STREAM_PARM_DETECT_BARGE_IN, 1);
LV_SRE_StreamSetParameter(hPort, STREAM_PARM_DETECT_END_OF_SPEECH, 1);
LV_SRE_StreamSetParameter(hPort, STREAM_PARM_AUTO_DECODE, 1);

 

After setting up the type of stream (using the settings above and other parameters as necessary), call LV_SRE_StreamStart (or LVSpeechPort::StreamStart). Then, for each buffer of sound data, call LV_SRE_StreamSendData. The application can call LV_SRE_StreamGetStatus() to determine when barge-in and end-of-speech take place, or use LV_SRE_StreamSetStateChangeCallBack() to set up a call back, which is called each time the state changes. Remember that a background thread handles the detection, so there may be a slight delay between sending a buffer and the detection of barge-in or end-of-speech.
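Because detection runs on a background thread, a state-change call back may well run on a different thread from the one sending data. The following is a minimal sketch, under that assumption, of a call back that only records events in C11 atomic flags for the sending thread to poll. The DEMO_STATE_* values, the StreamEvents type and the helper functions are hypothetical; the call back parameters mirror the PortCallBack example in this topic, and the meaning of the two byte counters is not specified here, so the sketch simply stores RecordedBytes.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for the Engine's stream status values. */
enum { DEMO_STATE_BARGE_IN = 1, DEMO_STATE_END_SPEECH = 2 };

/* Events shared between the call back and the thread sending audio. */
typedef struct {
    atomic_int   barge_in;
    atomic_int   end_speech;
    atomic_ulong recorded;
} StreamEvents;

void events_init(StreamEvents *ev)
{
    atomic_init(&ev->barge_in, 0);
    atomic_init(&ev->end_speech, 0);
    atomic_init(&ev->recorded, 0UL);
}

/* Same parameter list as the PortCallBack example; UserData is assumed
   to point at a StreamEvents supplied when the call back was registered. */
void DemoStateCallBack(long NewState, unsigned long TotalBytes,
                       unsigned long RecordedBytes, void *UserData)
{
    StreamEvents *ev = (StreamEvents *)UserData;
    (void)TotalBytes;              /* unused in this sketch */
    switch (NewState) {
    case DEMO_STATE_BARGE_IN:
        atomic_store(&ev->barge_in, 1);
        break;
    case DEMO_STATE_END_SPEECH:
        /* store the counter first so a poller that sees the flag
           also sees the matching byte count */
        atomic_store(&ev->recorded, RecordedBytes);
        atomic_store(&ev->end_speech, 1);
        break;
    default:
        break;
    }
}

/* Polled by the thread that drives the hardware. */
int barge_in_seen(StreamEvents *ev)   { return atomic_load(&ev->barge_in); }
int end_speech_seen(StreamEvents *ev) { return atomic_load(&ev->end_speech); }
unsigned long recorded_bytes(StreamEvents *ev) { return atomic_load(&ev->recorded); }
```

In an application, the address of a StreamEvents would be passed as the user data when registering the call back, and the sending loop would check barge_in_seen() to stop playback and end_speech_seen() to stop recording. Keeping the call back this small avoids doing hardware work on the detection thread.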

 

Code Snippet:  Streaming audio data and waiting for barge-in and end-of-speech

#include <LVSRE2.h>

...

//pass the handle to the hardware interface as the user data so the
//call back knows which hardware device to call functions on
LV_SRE_StreamSetStateChangeCallBack(hPort, PortCallBack, (void*) hSoundHardware);
 

 

//hardware starts playing the prompt and recording audio
//wait for barge-in
//wait for end-of-speech detection

...
 

 

//a function called by the hardware with more sound data
void NewDataFromHardware(unsigned long Length, void* SoundData)
{
    LV_SRE_StreamSendData(hPort, SoundData, Length);
}
 

//the function called by the speech port with status changes
void PortCallBack(long NewState, unsigned long TotalBytes, unsigned long RecordedBytes, void* UserData)
{
    switch (NewState)
    {
    case STREAM_STATUS_BARGE_IN:
        //tell hardware to stop playback (could use UserData to find hardware)
        break;
    case STREAM_STATUS_END_SPEECH:
        //tell hardware to stop recording (decode has started)
        break;
    }
}

 

 

After end-of-speech detection, if STREAM_PARM_AUTO_DECODE is active, the code can call LV_SRE_GetNumberOfConceptsReturned() to begin getting decode results.

If STREAM_PARM_AUTO_DECODE is not active, call LV_SRE_Decode() to begin recognition of the audio data.

 

