Speech Recognition Finally Finding Its Voice in Mobile Technology - eWeek.com

Aug 21, 2012

PALO ALTO, Calif. – If speech-recognition technology were a human, it would be like a 5- or 6-year-old child. You can speak to a 1-year-old, but you have to speak slowly and simply, using small words. By 5 or 6, the child starts to better understand your words and, more importantly, your meaning.

The comparison of computer speech development to human speech development came up during a panel discussion Aug. 20 at a forum hosted by the Churchill Club of Silicon Valley in Palo Alto, Calif. Apple co-founder Steve Wozniak and representatives of a speech-recognition software company and an automaker discussed where speech recognition has been and where it’s going.

Speech is becoming the new computer user interface, continuing a long line of UI evolution from the punch card and the command-line interface to the mouse and the touch-screen, said Quentin Hardy, deputy technology editor of The New York Times and moderator of the panel.

With each advance, the interaction has become less machine-like and more human. When we want to get someone’s attention, we tap them on the shoulder, much as we tap on a screen, said Wozniak, and when we want to talk to someone, we speak.

“We love our computers; we love our phones. We are getting that feeling we get from another person,” he said.

Speech-recognition technology has evolved from the machine understanding voice commands to understanding meaning and context, said Ron Kaplan, senior director and distinguished scientist at Nuance. The company’s voice-recognition technology has been licensed to Apple for use in its Siri personal assistant feature on the iPhone 4S and to Ford Motor Co. for its MyFordTouch system, which is also based on Microsoft Sync.

“One of the enabling technological advances that makes more accurate speech recognition possible and makes more accurate understanding of intent possible is the ability to accumulate large amounts of data from lots of user experiences and to sift and organize and build models from it,” Kaplan said.

In other words, like a child’s, the technology’s vocabulary and understanding grow the more it hears what people say to it.

Ford opened a lab in Silicon Valley at the beginning of the year, and the unit is organized as a startup, far away from the bureaucracy at Ford headquarters in Dearborn, Mich. The lab is continually working to improve the accuracy of MyFordTouch, which was introduced in 2007. Drivers use voice commands to get directions, adjust the heating and air conditioning or change radio stations. Ford parked a 2012 Focus Electric sedan equipped with MyFordTouch in the hotel ballroom where the event was held.

Although it is constantly improving, speech recognition is complicated, said Sheryl Connelly, a futurist at Ford. Because of that complexity, drivers don’t yet talk to their cars the way they talk to a person.

“We talk to it as if we’re talking to a foreigner. We talk very slowly and stilted and we have unnatural pauses,” Connelly said. “That’s why we still see hiccups.”

About 4 million Fords and Lincolns equipped with MyFordTouch are expected to be on the road by the end of this year, a number expected to reach 13 million by 2015, she said. But as such vehicles roll out across Europe and elsewhere, the system has to improve to understand different languages and dialects, which adds to the complexity.

Still, Silicon Valley has emerged as a center for development of speech recognition, as it is for other technology, noted Dan Miller, senior analyst and founder of Opus Research. He knows of several startups in the last eight or nine years that have pursued speech recognition as a business. With every iteration, the goal is for the computer to understand “natural language,” as in a Siri TV ad in which the actress Zooey Deschanel simply says, “Let’s get tomato soup delivered,” and the application understands what she means.

“We’ve moved along a maturity path where … people’s expectations about what they can do with today’s technologies … create its own demand” for improved speech-recognition technology, said Miller. “And then these energetic and imaginative people can come and try to fulfill that.”
