In that vein, leading vendors plan to highlight their continued push toward more accurate speech recognition and fuller support for languages and speaking styles.
ScanSoft, of Peabody, Mass., which recently acquired leading speech recognition vendor SpeechWorks, will be launching Version 3.0 of the SpeechWorks Speechify text-to-speech engine. The update will include a new voice for reading back data in a database and new dictionary management capabilities for looking up information. It also improves pronunciation, with a focus on distinguishing how a set of numbers should be read, such as a phone number versus a Social Security number or ZIP code, said spokeswoman Marie Ruzzo.
Intel on Monday opened the show by announcing its key software piece for Microsoft Corp.'s upcoming Speech Server release, expected in the first quarter of 2004. Intel announced that beta testing has begun on the Intel NetMerge Call Manager, one of two choices Microsoft will offer for integrating into an existing telephony network through its Speech Server. The software allows application developers to construct telephone services without focusing on the telephone network's complexities.
The Santa Clara, Calif., company also announced its NetStructure Host Media Processing 1.1 software for voice-enabling enterprise applications. Developers can use it to build IP media servers for interactive voice response services, voice mail, conferencing, fax servers and other telephony applications, and the software will support as many as 120 ports per server. Available now, it costs between $20 and $112 per port, depending on the functionality needed.
IBM, of Somers, N.Y., will be demonstrating at the show the newest versions of its WebSphere speech products, all with support for VoiceXML 2.0, one of two proposed standards in the speech market. They are WebSphere Voice Server 4.2, a speech recognition and text-to-speech engine; WebSphere Voice Application Access 4.1, for integrating enterprise applications for voice access; and WebSphere Voice Response 3.1.5, for interactive voice response. All were announced last week.
The release builds natural language extensions into Voice Server and Voice Application Access to provide context in conversation so that a customer asking for a stock quote, for example, could also buy that stock without having to restate the stock's name. Voice Server also has added Korean and Dutch to its stable of 18 supported languages.
Cepstral LLC, of Pittsburgh, is expanding its base of voices for text-to-speech this week by introducing two French Canadian voices, the male Jean-Pierre and the female Isabelle, and two playful U.S. English voices, Damien and Duchess, said Chief Technology Officer and Co-Founder Kevin Lenzo. Cepstral specializes in creating voices for text-to-speech systems, particularly voices with lower memory and processing requirements that can run on smaller devices such as handhelds and smart phones. The company expects to launch at least one new voice per quarter and to focus on North American voices, Lenzo said.
Cepstral also will be announcing the latest version of its voice engine, Theta 2.4. It will include improved voices, better support for markup languages for more natural-sounding pronunciation of addresses, updated lexicons and new APIs, including support for Microsoft's Speech Application Programming Interface (SAPI).
Speech recognition vendor LumenVox LLC, of San Diego, this week is launching Version 4.0 of its Speech Driven Information System and Speech Recognition Engine. LumenVox is adding VoiceXML export capabilities to its Speech Driven Information System so an application can be ported to a VoiceXML file for Web accessibility. The latest version of the software also has improved client/server functionality and new support for Spanish, among other features, the company said. In addition, the Speech Recognition Engine adds support for Spanish as well as improved performance and support for the a-law 8-kilohertz audio format.