The World Wide Web Consortium on Tuesday announced new work on components of its Speech Interface Framework: extending the Speech Synthesis Markup Language (SSML) to support Asian and other languages, and adding speaker verification features to the next version of VoiceXML, version 3.0.
The work draws in part on the W3C's first technical workshop in Beijing and on input from the VoiceXML Forum, the organization said.
“On Nov. 2-3, we held a two-day workshop in Beijing discussing extensions to SSML to better support Chinese and other languages,” said Jim Larson, co-chair of the W3C Voice Browser Working Group.
SSML is an XML-based markup language for creating synthetic speech in Web and other applications, W3C officials said. SSML's role is to give authors of synthesizable content a standard way to control aspects of speech such as pronunciation, volume, pitch and rate across different synthesis-capable platforms.
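As a rough illustration of that kind of control, the following Python sketch builds a small SSML 1.0 document that wraps text in a `prosody` element. The `speak` and `prosody` elements and their `rate`, `pitch` and `volume` attributes come from the SSML 1.0 specification; the helper name `build_ssml` and its defaults are assumptions made for this example, not part of any W3C tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the SSML 1.0 Recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(text, rate="medium", pitch="medium", volume="medium", lang="en-US"):
    """Return an SSML string (hypothetical helper, for illustration only)."""
    # Serialize SSML elements without a prefix (default namespace).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    # prosody carries the speaking rate, pitch and volume for its content.
    prosody = ET.SubElement(
        speak,
        f"{{{SSML_NS}}}prosody",
        {"rate": rate, "pitch": pitch, "volume": volume},
    )
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Your balance is forty dollars.", rate="slow", volume="loud"))
```

The same markup could then be handed to any synthesizer that accepts SSML, which is the portability the W3C describes.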
With that foundation, researchers from IBM's China Research Lab presented a paper on using Chinese Romanization to annotate Chinese pronunciation. “We also propose SSML to use diverse predefined and widely used pronunciation annotation standards for different languages, at least as a complement to the created general standard,” the IBM paper said. “Thus SSML can be more easily accepted and used around the world.”
W3C officials added that Mandarin Chinese is the most widely spoken language in the world and that it is tonal: the same written character can have multiple pronunciations and meanings depending on the tone used. Given the profusion of cell phones in China (some estimates put the number at more than one billion), the market case for extending SSML to Mandarin is strong, the W3C said.
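To make the Romanization idea concrete, here is a minimal Python sketch of how a character with more than one reading might be annotated with tone-numbered pinyin. SSML 1.0 does define a `phoneme` element with `alphabet` and `ph` attributes and allows vendor-specific alphabet names beginning with `x-`, but the alphabet name `x-pinyin`, the tone-number notation and the helper functions below are assumptions for illustration; they are not the syntax proposed in the IBM paper or adopted by the W3C.

```python
import xml.etree.ElementTree as ET

# Hypothetical vendor-specific alphabet name; SSML 1.0 only standardizes "ipa".
PINYIN_ALPHABET = "x-pinyin"


def split_tone(syllable):
    """Split 'hang2' into ('hang', 2); a missing digit means neutral tone (5)."""
    if syllable and syllable[-1].isdigit():
        return syllable[:-1], int(syllable[-1])
    return syllable, 5


def phoneme(text, pinyin):
    """Wrap text in a phoneme element annotated with tone-numbered pinyin."""
    for syllable in pinyin.split():
        base, tone = split_tone(syllable)
        if not base.isalpha() or tone not in range(1, 6):
            raise ValueError(f"bad pinyin syllable: {syllable!r}")
    element = ET.Element("phoneme", {"alphabet": PINYIN_ALPHABET, "ph": pinyin})
    element.text = text
    return element


if __name__ == "__main__":
    # The character 行 is read "hang2" in 银行 (bank) and "xing2" in 行走 (to walk).
    for word, reading in [("银行", "yin2 hang2"), ("行走", "xing2 zou3")]:
        print(ET.tostring(phoneme(word, reading), encoding="unicode"))
```

The annotation removes the ambiguity for the synthesizer: the written character is unchanged, while the `ph` attribute states which reading and tone to speak.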
Meanwhile, the W3C group also called for the Speaker Verification Extension to be included in VoiceXML 3.0.
“Identity theft, fraud, phishing, terrorism, and even the high cost of resetting passwords have heightened interest in deploying biometric security for all communication channels, including the telephone,” said Ken Rehor, of Vocalocity, newly elected Chairman of the VoiceXML Forum and participant in the W3C Voice Browser Working Group, in a statement. “Speaker verification and identification is not only the best biometric for securing telephone transactions and communications, it can work seamlessly with speech recognition and speech synthesis in VoiceXML deployments.”