Windows Phone 8 : Phone Hardware - Voice Commands

When a user holds the Start button for more than a couple of seconds, the speech recognition subsystem launches, enabling the user to perform searches and issue other commands on the phone. The “Listening” screen (shown in Figure 1) waits for the user to say something.

FIGURE 1 The “Listening” Screen

Although users can say anything they want to search for, they can also use voice commands such as “Call Mom” or “Open Twitter.” By default, voice commands are based on the app name, but your app might want its own voice commands for specific functionality.

Before your application can add its own voice commands, it must first request certain capabilities, including ID_CAP_MICROPHONE, ID_CAP_NETWORKING, and ID_CAP_SPEECH_RECOGNITION. You can see these capabilities added to the WMAppManifest.xml file in Figure 2.

FIGURE 2 Adding the voice command capabilities
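
If you prefer to edit WMAppManifest.xml directly rather than using the manifest designer, the entries look roughly like this (a minimal sketch of the Capabilities section; your manifest will already list other capabilities alongside these):

<Capabilities>
  <!-- Required for voice commands and speech recognition -->
  <Capability Name="ID_CAP_SPEECH_RECOGNITION" />
  <Capability Name="ID_CAP_MICROPHONE" />
  <Capability Name="ID_CAP_NETWORKING" />
</Capabilities>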

Next, you need to add a new file called a Voice Command Definition (VCD) file. This is just an XML file, but Visual Studio for Windows Phone includes an item template for it (as shown in Figure 3).

FIGURE 3 The Voice Command Definition item template

The XML file has a simple structure as shown here:

<?xml version="1.0" encoding="utf-8"?>
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.0">
  <CommandSet xml:lang="en-US">

    <CommandPrefix>Facey</CommandPrefix>
    <Example> show me a smile </Example>

    <Command Name="ShowFace">
      <Example> show me a smile </Example>
      <ListenFor> show [me] [a] {facetype} </ListenFor>
      <Feedback> Your Face is Coming </Feedback>
      <Navigate Target="pages/facepage.xaml" />
    </Command>

    <PhraseList Label="facetype">
      <Item> smile </Item>
      <Item> frown </Item>
      <Item> grimace </Item>
    </PhraseList>

  </CommandSet>
</VoiceCommands>

The CommandSet element is used to define a set of commands. The different sections are used to define the command metadata. For example, the CommandPrefix and Example are used for all the commands. Next, you will have one or more Command elements that define a type of command you support. Each command must have a unique name. Finally, the PhraseList element is used to define replaceable elements. You should think of the PhraseList as an enumerated list of possible values. Let’s see how these work in a simple example.

All your commands need to start with a common prefix; this is how the phone knows it should match your application’s commands. The prefix defined in the CommandPrefix element determines the speech recognition phrase that starts your commands. In this example, the app uses the prefix “Facey” so that users can say “Facey show me a smile,” and the app will show them a smiley face. The Example element after the CommandPrefix defines a string that is shown to users to show them how to use the app’s voice commands.

Next we need to set up the command itself. The Command element in the example shows that we need a name to identify the Command. Then inside the element, you can specify the following:

Example: Like the main example, this is a command-specific example.

ListenFor: One or more patterns for the speech recognition engine to search for. The words in brackets (for example, [a]) are optional. The words in curly braces are words in a PhraseList element (for example, {facetype}).

Navigate: This is an optional URI for the part of your app to navigate to. If this is omitted, your first page (for example, MainPage.xaml) will be used.

The last element is the PhraseList. As the example shows, the PhraseList has a Label that identifies it and is used in the ListenFor elements to match the two. This list does not have to be hard-coded; you can also specify the contents of a PhraseList programmatically, as the sketch that follows shows.
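
For instance, if the set of face types came from a web service, you could refresh the list at runtime with VoiceCommandSet.UpdatePhraseListAsync. The following is a minimal sketch; the helper name UpdateFaceTypesAsync is hypothetical, and it assumes your app has installed exactly one command set:

using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Windows.Phone.Speech.VoiceCommands;

// Hypothetical helper: replaces the contents of the "facetype"
// PhraseList at runtime (for example, with values from a service).
private async Task UpdateFaceTypesAsync(IEnumerable<string> faceTypes)
{
  // InstalledCommandSets is keyed by the CommandSet's Name attribute;
  // with a single installed command set, just take the first entry.
  VoiceCommandSet commandSet =
    VoiceCommandService.InstalledCommandSets.Values.First();

  // Replaces the entire PhraseList labeled "facetype"
  await commandSet.UpdatePhraseListAsync("facetype", faceTypes);
}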

After you have the VCD file created, you must register it with the VoiceCommandService class. This is done as an asynchronous call that can be handled during the first launch of your application. For example, to register the VCD file during the navigation to your main page:

protected async override void OnNavigatedTo(NavigationEventArgs e)
{
  base.OnNavigatedTo(e);

  // Register the command sets defined in the VCD file
  await VoiceCommandService.InstallCommandSetsFromFileAsync(
    new Uri("ms-appx:///MyVoiceCommands.xml"));
}

The VoiceCommandService class has a static method called InstallCommandSetsFromFileAsync. This method uses the new async support; therefore, you need to add the async keyword to the containing method’s signature and use the await keyword so that the page waits for this method to complete before continuing.

You specify the path to your VCD file using a URI. You might notice that it uses a new URI moniker called “ms-appx,” which indicates that the file path is in the installation directory for your phone application. The ms-appx:// portion is the moniker, and /MyVoiceCommands.xml is the path to the VCD file. If you have placed the file in a subdirectory of your project, make sure the path includes that subdirectory.
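
Because OnNavigatedTo can run more than once, you might guard the call so the command sets are installed only on the first run; a minimal sketch:

// Install only if no command sets are registered yet. Note that
// re-installing from the file would also discard any phrase lists
// updated at runtime with UpdatePhraseListAsync.
if (VoiceCommandService.InstalledCommandSets.Count == 0)
{
  await VoiceCommandService.InstallCommandSetsFromFileAsync(
    new Uri("ms-appx:///MyVoiceCommands.xml"));
}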

Now that you have created commands and registered them, the user will be able to launch your app using the commands you’ve defined. When your commands are executed, the Voice Command system launches your application and notifies the user that the voice command was accepted by showing your app name, icon, and voice command description, as shown in Figure 4.

FIGURE 4 Voice Command launching your app

The next part of the puzzle is to react to the navigation to the page that handles the voice command. In the VCD, you can specify a path to a page in the optional Navigate element (inside the Command element). When the voice command is executed, your page is shown and the navigation includes information about the command in the query string. So, on your page, just override the OnNavigatedTo method and first check to make sure the NavigationMode is New:

protected override void OnNavigatedTo(NavigationEventArgs e)
{
  base.OnNavigatedTo(e);

  // Only check for voice command on fresh navigation,
  // not tombstoning
  if (e.NavigationMode == NavigationMode.New)
  {
    // ...
  }
}

This will ensure that this is checked only when the page is launched from some external source (for example, the Voice Command). Next, you should check the NavigationContext.QueryString for the actual command name sent:

// Is there a voice command in the query string?
if (NavigationContext.QueryString.ContainsKey("voiceCommandName"))
{
  // If so, get the name of the Voice Command.
  var cmdName = NavigationContext.QueryString["voiceCommandName"];

  // If it's the command we expect,
  // then find the type of face to show
  if (cmdName == "ShowFace")
  {
    // ...
  }
}

The voiceCommandName query string parameter is set to the Name of the matched Command element in the VCD, so you can retrieve it and test it against the commands you expect. This is useful if you need a single page to handle more than one type of command. After you’ve determined that it is the right command, you can retrieve the query string parameter containing the matched PhraseList item. In this case the facetype PhraseList supports three face types, as shown here:

var faceType = NavigationContext.QueryString["facetype"].ToLower();

// Show supported face types
switch (faceType)
{
  case "smile":
    VisualStateManager.GoToState(this, "Smile", false);
    break;
  case "grimace":
    VisualStateManager.GoToState(this, "Grimace", false);
    break;
  case "frown":
    VisualStateManager.GoToState(this, "Frown", false);
    break;
}

What you do with the data provided is completely up to you, but in this example we use the VisualStateManager to show the different faces.

Using voice commands is simple, but sometimes you need to handle voice-based control within your app—and that’s where speech recognition comes in.
