Executives from Nuance Communications Inc., of Menlo Park, Calif., and ScanSoft Inc.'s SpeechWorks division, which together account for about 75 percent of the speech market, trumpeted their leading positions in the market and the need for continued improvement in speech-recognition technology and the way speech applications interact with humans.
Nuance CEO and President Chuck Berger said that the speech market can't be won simply by competing on price or offering a new standard, a reference to Microsoft's focus on aggressive, per-processor pricing and its support for the Speech Application Language Tags (SALT) specification rather than VoiceXML.
In an interview with eWEEK.com following the keynote, Berger likened the difference between Nuance's and Microsoft's speech technology to the difference between an Oracle Corp. database and one from FileMaker Inc.
"Enterprise speech is still very, very foreign to them," Berger said in the interview. "It's just a whole different league of industrial-strength speech versus what Microsoft is doing."
For enterprises, speech has become less of a passing interest and more of a necessity as they look to reduce costs and improve customer experiences in call centers, Berger said. Two years ago, however, the speech industry was still questioning whether speech had entered the mainstream.
"The last year has proven that the debate is over," Berger said during his keynote. "What we hear from customers is no longer if they're going to do speech but when and how they're going to do speech."
Speech applications require high levels of accuracy in the core speech-recognition and text-to-speech engines to be successful, Berger said, and such improvements require vendors that have focused on speech for years.
"Every 1 percent improvement in core recognition saves companies millions of dollars," he said.
Steve Chambers, president of ScanSoft's SpeechWorks division, stressed that being a leading speech vendor requires not only accurate core technology but also the ability to put speech applications in production. Peabody, Mass.-based ScanSoft sells almost 90 percent of its recognition and text-to-speech engines through partnerships rather than directly to enterprises.
Microsoft is one such partner. ScanSoft's Speechify text-to-speech engine is part of Microsoft Speech Server 2004 and its speech-recognition engine is an option for Speech Server's enterprise edition.
"The reality is this industry works if we deliver better applications, deliver caller success and enterprise benefits," Chambers said. "It's not enough to talk about technology and services but you need to focus on the applications and the caller."
Both Nuance and ScanSoft brought out customers, who agreed that speech is becoming more central to their call-center and customer service operations. Nuance customers included Expedia Inc. and Bell Canada, while ScanSoft's included Verizon Communications, United Air Lines Inc. and eBay Inc.'s PayPal. But challenges remain in meeting consumers' expectations for speech recognition and interaction.
Expedia has seen improvements since launching a speech application for call routing about a year ago, said Sachin Jhunjhunwala, Expedia director of voice services. While misrouted calls have dropped 66 percent, callers overall still get anxious when faced with voice-automated systems because of bad past experiences. Faced with a machine on the other end, they often want to hit "0" to try to immediately get an agent, Jhunjhunwala said.
"It's going to take some time in the industry to build good user interfaces," he said. "It's not just about the developer or the engine; it's about building interfaces that are really compelling for the user to use."