Windows Phone 8 : Phone Hardware - Speech Recognition

10/10/2013 8:53:02 PM

Letting users control your application with their voice can be really useful, and that is where speech recognition comes in. Before you can get started, your application must declare the networking, speech recognition, and microphone capabilities (ID_CAP_NETWORKING, ID_CAP_SPEECH_RECOGNITION, and ID_CAP_MICROPHONE) in the WMAppManifest.xml file.
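As a rough guide, the three capabilities would appear in the Capabilities section of WMAppManifest.xml along these lines. This is a fragment only: the rest of the manifest is omitted, and your project will likely already list other capabilities that should be left in place.

```xml
<!-- Fragment of WMAppManifest.xml; keep any capabilities already declared -->
<Capabilities>
  <Capability Name="ID_CAP_NETWORKING" />
  <Capability Name="ID_CAP_SPEECH_RECOGNITION" />
  <Capability Name="ID_CAP_MICROPHONE" />
</Capabilities>
```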

Now that you have the right capabilities, you can start to use the speech recognition system. There are two main ways to use speech recognition: with and without the system user interface. Using the system’s user interface is the easiest method, but it might conflict with your branding or vision for your application.

The class responsible for speech recognition using the system UI is the SpeechRecognizerUI class. This class supports the IDisposable interface, which means you should take care to clean up its resources when you’re done using it. Typically, you would create the object on the Loaded event of a page/control and dispose of it on the Unloaded event as shown:

public partial class MainPage : PhoneApplicationPage
{
  SpeechRecognizerUI _ui;

  // Constructor
  public MainPage()
  {
    InitializeComponent();

    Loaded += MainPage_Loaded;
    Unloaded += MainPage_Unloaded;
  }

  void MainPage_Loaded(object sender, RoutedEventArgs e)
  {
    _ui = new SpeechRecognizerUI();
  }

  void MainPage_Unloaded(object sender, RoutedEventArgs e)
  {
    _ui.Dispose();
  }
  ...
}

With creation and teardown in place, you can use the class. The method that listens for speech is called RecognizeWithUIAsync, and it follows the async pattern. To use it, mark your method with async and use await so that the UI can be shown and the speech captured for recognition:

private async void speechUIButton_Click(object sender, EventArgs e)
{
  var result = await _ui.RecognizeWithUIAsync();

  if (result.ResultStatus == SpeechRecognitionUIStatus.Succeeded)
  {
    var recognized = result.RecognitionResult.Text;

    AddItem(recognized);
  }
}

When you call the RecognizeWithUIAsync method, it shows the UI and listens for any speech. You can see this in Figure 1.

FIGURE 1 Using SpeechRecognizerUI

The result of the speech recognition is an object that exposes two properties. The first is ResultStatus, an enumeration value that indicates whether the operation succeeded; you can test it for success (as shown previously). The second is RecognitionResult (a SpeechRecognitionResult object). If the operation did not succeed, RecognitionResult will not be valid. If it did succeed, you can use RecognitionResult to see the text that was recognized.
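One way to make that rule explicit is a guard clause that returns early whenever the status is anything other than Succeeded, so RecognitionResult is never read in the failure case. The following is a minimal sketch, not the only way to structure the handler. It assumes it lives in the MainPage class shown earlier, so _ui is the field created in the Loaded handler and AddItem is the page's own helper (not shown in this article).

```csharp
// Sketch only: assumes the MainPage class above (_ui field, AddItem helper).
private async void speechUIButton_Click(object sender, EventArgs e)
{
  var result = await _ui.RecognizeWithUIAsync();

  // Any status other than Succeeded means RecognitionResult is not valid,
  // so leave the handler without touching it.
  if (result.ResultStatus != SpeechRecognitionUIStatus.Succeeded)
  {
    return;
  }

  // Safe to read the recognized text from here on.
  AddItem(result.RecognitionResult.Text);
}
```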

The speech recognition engine also calculates its confidence in the result. This is helpful to figure out whether the text it recognized could possibly be correct. You can use the RecognitionResult’s TextConfidence property to determine this:

private async void speechUIButton_Click(object sender, EventArgs e)
{
  var result = await _ui.RecognizeWithUIAsync();

  if (result.ResultStatus == SpeechRecognitionUIStatus.Succeeded &&
      result.RecognitionResult.TextConfidence >=
        SpeechRecognitionConfidence.Medium)
  {
    var confidence = result.RecognitionResult.TextConfidence;
    var recognized = result.RecognitionResult.Text;

    AddItem(string.Concat(confidence, " - ", recognized));
  }
}

In this case we are using the result only if the confidence is at least medium. This will reduce the number of false positives.

The other method for using the speech recognition engine is to use it without the user interface. Unsurprisingly, the class involved is called SpeechRecognizer (note no “UI” suffix). The pattern for using it is very much the same as the UI class:

public partial class MainPage : PhoneApplicationPage
{
  SpeechRecognizer _rec;

  // Constructor
  public MainPage()
  {
    InitializeComponent();

    Loaded += MainPage_Loaded;
    Unloaded += MainPage_Unloaded;
  }

  void MainPage_Loaded(object sender, RoutedEventArgs e)
  {
    _rec = new SpeechRecognizer();
  }

  void MainPage_Unloaded(object sender, RoutedEventArgs e)
  {
    _rec.Dispose();
  }
  ...
}

When you perform the recognition, you again use the async pattern, but it is up to you to make your users aware that you are listening:

private async void speechButton_Click(object sender, EventArgs e)
{
  // Show the user that you are listening
  VisualStateManager.GoToState(this, "Listening", true);

  // Listen for speech
  var result = await _rec.RecognizeAsync();

  if (result.TextConfidence >= SpeechRecognitionConfidence.Medium)
  {
    var confidence = result.TextConfidence;
    var recognized = result.Text;

    AddItem(string.Concat(confidence, " - ", recognized));
  }
}

The only real difference in using the RecognizeAsync method is that it returns a SpeechRecognitionResult object directly, with no status to check first. You can then test the confidence as you did before. If the operation failed, the confidence will be SpeechRecognitionConfidence.Rejected, so you can test for that value to detect a failure.
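A handler that checks for the rejected case explicitly might look like the following sketch. It assumes the same MainPage class as above (_rec field, "Listening" visual state, AddItem helper). The message passed to AddItem is purely illustrative.

```csharp
// Sketch only: assumes the MainPage class above (_rec field, AddItem helper).
private async void speechButton_Click(object sender, EventArgs e)
{
  // Show the user that you are listening
  VisualStateManager.GoToState(this, "Listening", true);

  var result = await _rec.RecognizeAsync();

  // Rejected signals that recognition failed, so do not use result.Text.
  if (result.TextConfidence == SpeechRecognitionConfidence.Rejected)
  {
    AddItem("Sorry, that was not recognized."); // illustrative message
    return;
  }

  AddItem(string.Concat(result.TextConfidence, " - ", result.Text));
}
```

Neither this sketch nor the listing above leaves the "Listening" state afterward. In a real page you would probably switch to another visual state once the await completes.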
