Using SRGS Grammars with LVSpeechPort

#include <LVSpeechPort2.h>

You may now use SRGS ABNF-style grammars with LVSpeechPort by calling the functions provided in LVSpeechPortABNF.h.

Grammars may be specified dynamically via API calls, all at once by providing an SRGS-ABNF file, or through a mixture of both. In addition, the result of an ABNF decode can be semantically interpreted to produce a structured data object represented in XML.

Code Example in C++:

#include <LVSpeechPort2.h>
#include <fcntl.h>
#include <io.h>
#include <stdio.h>

#define MAX_AUDIOLENGTH 371184

void SPLogging(const char* String, void* p)
{
    /* if p had some definition, then we could do some stuff with it */
    printf("%s\n", String);
}

int main()
{
    /*  Messages about the LumenVox Speech Grammars are routed to the
     *  application-level callback function.
     *  Messages specific to the speech port are routed to the port's callback function.
     */
    LVSpeechPort::RegisterAppLogMsg(SPLogging,NULL,4000);
    LVSpeechPort port;
    port.OpenPort(SPLogging,NULL,4000);

    int voice_channel = 1;
    const char* audio_filename = "c:\\public\\cfg\\8587070707.ulaw";
    const char* topnav_gram = "topnav";
    const char* phone_number_gram = "PhoneNumber";

    /*  Loading a grammar pre-compiles it for use in the Speech Engine. */
    LVSpeechPort::LoadGlobalGrammar(topnav_gram,"c:\\public\\cfg\\top_level_navigation.gram");
    port.LoadGrammar(phone_number_gram,"c:\\public\\cfg\\phone_number.gram");

    /*  When a grammar is activated, it becomes one of the grammars that the speechport uses to decode.
     *  You may place as many grammars in the active grammar set as you'd like.  If a grammar was not
     *  pre-loaded, it will be fetched and compiled during this call.
     */
    port.ActivateGrammar(phone_number_gram);
    port.ActivateGlobalGrammar(topnav_gram);

    /*  In this simple example, we load all our audio from a file into the voice channel.
     *  A more realistic example might get the audio to the speechport using the LumenVox Streaming API.
     */
    int audio_handle = _open(audio_filename, _O_BINARY | _O_RDONLY);
    if (audio_handle < 0) {
        printf("Could not open %s\n", audio_filename);
        port.ClosePort();
        return 1;
    }
    static char buffer[MAX_AUDIOLENGTH];   /* static keeps this large buffer off the stack */
    int length = _read(audio_handle, buffer, MAX_AUDIOLENGTH);
    _close(audio_handle);
    if (length <= 0) {
        printf("Could not read %s\n", audio_filename);
        port.ClosePort();
        return 1;
    }
    port.LoadVoiceChannel(voice_channel,buffer,length,ULAW_8KHZ);

    /*  Decode the audio.  The LV_ACTIVE_GRAMMAR_SET tells the speechport to use the grammars we activated.
     *  The flag LV_DECODE_BLOCK tells the decode function to not return until decoding is complete.
     *  The flag LV_DECODE_SEMANTIC_INTERPRETATION tells the speechport to perform semantic interpretation on
     *  the return result.
     */
    port.Decode(voice_channel, LV_ACTIVE_GRAMMAR_SET,
                LV_DECODE_BLOCK | LV_DECODE_SEMANTIC_INTERPRETATION);

    int num_parses = port.GetNumberOfParses(voice_channel);
    int utt_score = port.GetUtteranceScore(voice_channel);
    printf("Number of Parses: %i\n",num_parses);
    printf("Utterance Score: %i\n", utt_score);

    int i;
    for (i = 0; i < num_parses; ++i) {
        printf("Parse %i:\n", i+1);

        /*  The following code prints out the Speech Parse Tree.  The Speech Parse Tree is the raw
         *  decode result returned by the Speech Engine.  It contains the name of the grammar that it matched,
         *  and a sentence diagram of the match, including all tag data that was encountered.
         */
        printf("Speech Parse Tree:\n");
        LVParseTree Tree = port.GetParseTree(voice_channel,i);
        printf("This Tree matched the grammar: %s\n\n",port.GetMatchedGrammarLabel(voice_channel,i));
        LVParseTree::PreOrderIterator Itr = Tree.PreOrderItr();

        for (; !Itr.IsPastEnd(); Itr.Advance()) {
            /* Indent each node by four spaces per level of depth in the tree. */
            int level = Itr.Level();
            for (int x = 0; x < level; ++x) printf("    ");
            if (Itr.IsRule())
                printf("$%s:",Itr.Label());
            else if (Itr.IsTag())
                printf("{!{ %s }!}",Itr.Label());
            else
                printf("%s begin frame:%i end frame:%i score:%i phonemes:%s",
                       Itr.Label(),
                       Itr.BeginFrame(),
                       Itr.EndFrame(),
                       Itr.Score(),
                       Itr.Phonemes());
            printf("\n");
        }
        printf("\n");

        /*  This code prints out the Interpretation result as XML */
        printf("The Interpretation Result:\n");
        printf("<%s>",port.GetMatchedGrammarLabel(voice_channel,i));
        printf("%s",port.GetInterpretationString(voice_channel,i));
        printf("</%s>",port.GetMatchedGrammarLabel(voice_channel,i));
    }

    port.DeactivateGrammars();
    port.ClosePort();
    return 0;
}

Using this small application, if the Speech Engine recognizes the utterance "eight five eight seven o seven o seven o seven", the program might print:

Number of Parses: 1
Utterance Score: 818
Parse 1:
Speech Parse Tree:
This Tree matched the grammar: PhoneNumber

$PhoneNumber:
    $AreaCode:
        {!{  $ = ""  }!}
        $Digit:
            EIGHT begin frame:46 end frame:64 score:642 phonemes:EY T
            {!{ $="8" }!}
        {!{  $ += $Digit  }!}
        $Digit:
            FIVE begin frame:65 end frame:88 score:744 phonemes:F AY V
            {!{ $="5" }!}
        {!{  $ += $Digit  }!}
        $Digit:
            EIGHT begin frame:89 end frame:108 score:682 phonemes:EY T
            {!{ $="8" }!}
        {!{  $ += $Digit  }!}
    {!{  $.areacode = $$  }!}
    $Number:
        {!{  $ = ""  }!}
        $Digit:
            SEVEN begin frame:109 end frame:145 score:997 phonemes:S EH V AX N
            {!{ $="7" }!}
        {!{  $ += $$  }!}
        $Digit:
            O begin frame:146 end frame:156 score:998 phonemes:OW
            {!{ $="0" }!}
        {!{  $ += $$  }!}
        $Digit:
            SEVEN begin frame:157 end frame:200 score:974 phonemes:S EH V AX N
            {!{ $="7" }!}
        {!{  $ += $$  }!}
        $Digit:
            O begin frame:201 end frame:219 score:1000 phonemes:OW
            {!{ $="0" }!}
        {!{  $ += $$  }!}
        $Digit:
            SEVEN begin frame:220 end frame:263 score:999 phonemes:S EH V AX N
            {!{ $="7" }!}
        {!{  $ += $$  }!}
        $Digit:
            O begin frame:264 end frame:277 score:1000 phonemes:OW
            {!{ $="0" }!}
        {!{  $ += $$  }!}
        $Digit:
            SEVEN begin frame:278 end frame:328 score:939 phonemes:S EH V AX N
            {!{ $="7" }!}
        {!{  $ += $$  }!}
    {!{  $.number = $$  }!}

The Interpretation Result:
<PhoneNumber>
<areacode>
             858
</areacode>
<number>
             7070707
</number>
</PhoneNumber>

 

Pre Decode Functions

 

LV_SRE_LoadGrammar

Loads an SRGS-ABNF grammar file.

LV_SRE_LoadGlobalGrammar

Loads a grammar into the global space, so it is accessible to all speechports.

LV_SRE_UnloadGlobalGrammar

Unloads a global grammar.

LV_SRE_ActivateGrammar

Adds one of the speechport's grammars to the active grammar set.

LV_SRE_ActivateGlobalGrammar

Adds a global grammar to the speechport's active grammar set.

LV_SRE_DeactivateGrammar

Removes a grammar from the speechport's active grammar set.

LV_SRE_DeactivateGlobalGrammar

Removes a global grammar from the speechport's active grammar set.

LV_SRE_DeactivateGrammars

Clears the speechport's active grammar set.

Decode Flags

 

LV_DECODE_USE_ABNF_GRAMMAR

Add this flag to a call to LVSpeechPort::Decode to activate the ABNF grammar for decoding.

LV_DECODE_SEMANTIC_INTERPRETATION

Add this flag to a call to LVSpeechPort::Decode to turn on semantic interpretation.

LV_ACTIVE_GRAMMAR_SET

Pass this value as the grammarset argument of LVSpeechPort::Decode to use the active grammar set for decoding.

Post Decode Functions

 

LV_SRE_GetNumberOfParses

Returns the number of valid parses for the Speech Engine decode output.

LV_SRE_GetParseString

Returns an SRGS style parse string for the Speech Engine decode output.

LV_SRE_GetParseTreeHandle

Returns a handle to a parse tree for the Speech Engine decode output.

LV_SRE_GetInterpretationString

Returns an XML representation of the semantic interpretation of the parse tree.

LV_SRE_GetInterpretationData

Returns a data structure representation of the semantic interpretation of the parse tree.

LV_SRE_GetGrammarLabel

Returns the label of the grammar that was matched.

LV_SRE_GetUtteranceScore

Returns a confidence score for the decoded utterance.

