Officials at the Cambridge, Mass.-based W3C said VoiceXML 2.0 is aimed at combining Web-based development and content delivery with interactive voice response capabilities, while the Speech Recognition Grammar Specification (SRGS) is needed to ensure VoiceXML's support for speech recognition.
The W3C's Speech Interface Framework will let users interact with Web services from ordinary telephones by keying in or speaking commands, W3C officials said.
In addition to VoiceXML and SRGS, which supports speech recognizer elements, the W3C's Speech Interface Framework includes the Speech Synthesis Markup Language (SSML) for handling spoken prompts and Voice Browser Call Control (also known as Call Control XML or CCXML) for delivering telephony call control support for XML, W3C officials said.
SRGS lets applications define which words users will be prompted to say in voice systems, and it has also been applied to other areas such as handwriting recognition, the W3C said.
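As a rough illustration of what such a grammar looks like, the sketch below builds a minimal SRGS document in its XML form using Python's standard library. The function name `build_grammar`, the rule name `command` and the sample phrases are hypothetical and chosen only for this example; the element names and namespace follow the SRGS 1.0 XML form.

```python
import xml.etree.ElementTree as ET

# Namespace used by the XML form of SRGS 1.0.
SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_grammar(rule_id, phrases):
    """Return an SRGS grammar (as a string) whose root rule accepts
    exactly one of the given phrases. Hypothetical helper for illustration."""
    ET.register_namespace("", SRGS_NS)
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {
            "version": "1.0",
            "root": rule_id,                 # rule the recognizer starts from
            f"{{{XML_NS}}}lang": "en-US",    # serialized as xml:lang
        },
    )
    rule = ET.SubElement(grammar, f"{{{SRGS_NS}}}rule", {"id": rule_id})
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    for phrase in phrases:
        item = ET.SubElement(one_of, f"{{{SRGS_NS}}}item")
        item.text = phrase                   # one alternative the caller may say
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    # Sample phrases are assumptions, not taken from any W3C example.
    print(build_grammar("command", ["check balance", "transfer funds"]))
```

A VoiceXML application would typically reference a grammar like this from a form field, so the recognizer only listens for the listed alternatives.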
Meanwhile, although VoiceXML 2.0 is a significant milestone, the leaders of the effort do not plan to let up.
"VoiceXML work doesn't stop with Version 2.0," said Janet Daly, a spokeswoman for the W3C. "The Voice Browser Working Group already has a 2.1 spec in the pipeline and they'll be working on [Version] 3.0. They're looking toward including new work that takes into account contributions from W3C members, like XHTML plus Voice [Extensible Hypertext Markup Language plus Voice or XHTML+Voice] and SALT [Speech Application Language Tags]."
Microsoft Corp. supports SALT and IBM Corp. supports XHTML+Voice, although both are members of the W3C Voice Browser Working Group and IBM is a founder of the work on VoiceXML.
Meanwhile, at next week's SpeechTEK show in San Francisco, which will run in conjunction with VSLive! and the Microsoft Mobile Developers Conference, Microsoft chairman Bill Gates is slated to announce Microsoft Speech Server 2004 in his keynote speech.
"W3C Speech Interface Framework is crucial to Microsoft's vision of making speech mainstream," said Xuedong Huang, general manager of Microsoft's Speech Technologies Group, in a statement. "We have implemented SRGS, SSML and SI into Microsoft Speech Server 2004 that integrates speech into HTML through SALT. Such a seamless integration with HTML has enabled our customers to extend their existing investments from desktop to multimodal and telephony voice access in a single, cost-effective step. Microsoft Speech Server also provides development tools in the popular Visual Studio .NET environment, paving the way for Speech Interface Framework to be adopted by the mainstream Web developers," Huang continued.
Igor Jablokov, program director for pervasive computing at IBM, said in a separate statement: "VoiceXML 2.0, which has been key in the growth of speech applications by providing a standards-based framework, allows businesses to deploy applications today that leverage existing development skills and resources. Because it allows speech deployments to be built over a standard Web-application infrastructure, VoiceXML also provides a clear upgrade path as applications grow, unlike closed, proprietary languages."