Using SSML for advanced text-to-speech on Windows Phone 8
This article demonstrates how the Speech Synthesis Markup Language (SSML) can be used to provide advanced text-to-speech (TTS) functionality in Windows Phone 8 applications.
Introduction
With Windows Phone 8, Microsoft introduced several speech-related APIs and features, including voice commands, speech recognition and text-to-speech (TTS).
This article explains how to add basic TTS functionality to an application, but it mainly focuses on the Speech Synthesis Markup Language (SSML) supported by Windows Phone 8, which allows more advanced use of the text-to-speech engine.
Prerequisites
This article assumes a basic knowledge of the Visual Studio IDE (how to create, compile and run applications, how to add subdirectories, and how to create new items or add existing ones to them), as well as a basic knowledge of the C# language and of object-oriented programming in general.
TTS on Windows Phone 8
The first thing that needs to be done is to include a using statement for the speech synthesis namespace. This is necessary in any application with TTS functionality:
using Windows.Phone.Speech.Synthesis;
Then a new object of the SpeechSynthesizer class must be instantiated:
SpeechSynthesizer synth = new SpeechSynthesizer();
Finally, the SpeakTextAsync method must be called to speak the text. Because the call is awaited, it must be made from a method marked async:
await synth.SpeakTextAsync("Testing WP8 TTS");
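The three steps above can be combined into a single page class. The following is a minimal sketch, not a definitive implementation: the namespace TtsSample, the page class MainPage and the handler SpeakButton_Click are hypothetical names, and the handler is assumed to be wired to a button declared in the page's XAML. It also assumes that the speech capability (ID_CAP_SPEECH_RECOGNITION) has been enabled in WMAppManifest.xml; if it has not, the call is expected to fail at run time.

```csharp
using System;
using System.Windows;
using Microsoft.Phone.Controls;
using Windows.Phone.Speech.Synthesis;

namespace TtsSample // hypothetical namespace
{
    public partial class MainPage : PhoneApplicationPage
    {
        // One synthesizer instance is created and reused for the lifetime of the page.
        private readonly SpeechSynthesizer synth = new SpeechSynthesizer();

        public MainPage()
        {
            InitializeComponent();
        }

        // Hypothetical Click handler for a button declared in MainPage.xaml.
        // It is marked async so that the speech call can be awaited.
        private async void SpeakButton_Click(object sender, RoutedEventArgs e)
        {
            try
            {
                await synth.SpeakTextAsync("Testing WP8 TTS");
            }
            catch (Exception ex)
            {
                // Assumption: showing the message is enough for a sample;
                // a real application would handle the failure more carefully.
                MessageBox.Show(ex.Message);
            }
        }
    }
}
```

Keeping the synthesizer in a field rather than creating it inside the handler is a design choice made for this sketch; creating a new instance per call also works for simple cases.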
This basic use of the TTS engine might be enough for some applications, but Windows Phone 8 provides additional mechanisms and methods to add more advanced TTS functionality, by utilizing SSML.
Speech Synthesis Markup Language (SSML)
SSML is an XML-based markup language for speech synthesis applications, and is a recommendation of the W3C's voice browser working group. It allows application developers to control various characteristics of synthesized speech, such as voice, language, pronunciation, etc.
The Microsoft implementation of SSML is based on the World Wide Web Consortium (W3C) Speech Synthesis Markup Language (SSML) Version 1.0 recommendation.
The SpeechSynthesizer class provides two methods to speak text that includes SSML markup. The first one, SpeakSsmlAsync, receives the SSML text to speak as a string parameter; the second one, SpeakSsmlFromUriAsync, speaks the contents of a standalone SSML document identified by a URI. This article uses SpeakSsmlAsync to show the different speech characteristics that can be controlled with SSML, and then briefly explains how to speak the contents of a standalone document.
Each SSML document or string requires a speak element. It is the root element of the document and can be used without any other elements. The speak element also specifies the language to be used, through its xml:lang attribute.
This is the syntax, where "string" stands for a language tag such as en-US:
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="string"> </speak>
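As an illustration of the speak element on its own, the sketch below builds a minimal SSML string and passes it to SpeakSsmlAsync. The class SsmlSample and the method SpeakMinimalSsmlAsync are hypothetical names introduced for this example, and the language tag en-US is an assumption; any language with an installed voice on the device could be used instead.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Phone.Speech.Synthesis;

public static class SsmlSample // hypothetical helper class
{
    // Speaks a string whose only SSML element is the required <speak> root.
    public static async Task SpeakMinimalSsmlAsync(SpeechSynthesizer synth)
    {
        // The root element carries the SSML version, the W3C namespace
        // and the language (en-US is assumed here).
        string ssml =
            "<speak version=\"1.0\" " +
            "xmlns=\"http://www.w3.org/2001/10/synthesis\" " +
            "xml:lang=\"en-US\">" +
            "Testing WP8 TTS with SSML" +
            "</speak>";

        await synth.SpeakSsmlAsync(ssml);
    }
}
```

From an async method, this could be called as `await SsmlSample.SpeakMinimalSsmlAsync(synth);`, where synth is an existing SpeechSynthesizer instance.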

