INSIDE MOBILE: Voice Recognition in Mobile Phones
Companies such as Nuance, Vlingo, Yap and others have developed sophisticated technology to process spoken commands and convert speech to text on your mobile phone. The following seven statements are the kinds of commands that new mobile voice recognition applications can handle, or will soon be able to handle, on your smartphone:
1. "Call Alicia's mobile number."
2. "Text message to Bryan and Jason. Congrats on the Giants winning the World Series."
3. "Find the nearest Starbucks to meet with Kristi."
4. "Get directions and map from home to Houston's restaurant near Lenox Square."
5. "Make a reservation at BJ's to have dinner with Jennifer Thursday at 7:30pm."
6. "E-mail message to Jill. Love to the grandkids. Hug each of them for me."
7. "Look up Bruce Grant in Contacts."
Processing spoken commands and converting speech into text on a mobile phone is a difficult problem to solve. It wasn't too long ago that companies such as IBM were using large mainframe computers to process speech. Then, fairly good processing came to the PC. Over the past few years, very good speech recognition systems have been created for mobile devices. To be sure, many of these systems use a "client/cloud" model in which the speech is recorded and preprocessed on the phone, then sent off to the supplier's more powerful system that does the "heavy lifting." The results are sent back to the phone for display and use by the subscriber.
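To make the "client/cloud" split concrete, here is a minimal Python sketch. It is not any vendor's actual implementation. The function names, the silence threshold and the stand-in recognizer are all assumptions made for illustration. The phone side only trims near-silent audio, and the recognition itself is delegated to whatever "cloud" function is passed in.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RecognitionResult:
    text: str
    confidence: float


def preprocess_on_phone(samples: List[int], silence_threshold: int = 500) -> List[int]:
    """Trim leading and trailing near-silent samples so less audio is uploaded.

    The threshold is an illustrative assumption, not a tuned value.
    """
    start, end = 0, len(samples)
    while start < end and abs(samples[start]) < silence_threshold:
        start += 1
    while end > start and abs(samples[end - 1]) < silence_threshold:
        end -= 1
    return samples[start:end]


def recognize(
    samples: List[int],
    cloud_recognizer: Callable[[List[int]], RecognitionResult],
) -> RecognitionResult:
    """Preprocess locally, then hand the "heavy lifting" to the cloud side."""
    trimmed = preprocess_on_phone(samples)
    if not trimmed:
        # Nothing but silence was recorded, so skip the network round trip.
        return RecognitionResult(text="", confidence=0.0)
    return cloud_recognizer(trimmed)


def fake_cloud_recognizer(samples: List[int]) -> RecognitionResult:
    """Hypothetical stand-in for the supplier's server; it does no real recognition."""
    return RecognitionResult(text=f"<{len(samples)} samples recognized>", confidence=0.9)


if __name__ == "__main__":
    recording = [0, 10, 900, -1200, 30, 0]
    print(recognize(recording, fake_cloud_recognizer))
```

In a real product the cloud call would be a network request, and the on-phone step would likely include compression or feature extraction rather than simple trimming.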
Voice recognition and processing applications
Today, most smartphone users can download a voice recognition and processing application that either handles a broad range of tasks (Nuance, for example) or focuses on something more specific such as search or navigation. Here are eight typical capabilities of today's voice recognition and processing in mobile phones:
1. Speak a reply to a text message and have the speech converted to text automatically.
2. Find someone in the Contact list and dial that person's cell phone number.
3. Search for something (uses speech analysis plus a search engine).
4. Find out the temperature in New York.
5. See the last closing price for Exxon.
6. Look up who won the World Series in 1925.
7. Request navigation information (turn-by-turn directions and maps).
8. Compose an e-mail message and have it sent to a number of people.
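Each capability in the list above implies a step after the speech has been converted to text: deciding which action the words call for. The Python sketch below shows one simple way that step could work, using a few regular expressions tried in order. The intent names and patterns are hypothetical and far cruder than what commercial products use.

```python
import re
from typing import Dict, Tuple

# Hypothetical intent patterns, tried in order; the first match wins.
INTENT_PATTERNS = [
    ("call", re.compile(r"^call (?P<target>.+)$", re.IGNORECASE)),
    ("text_message", re.compile(
        r"^text message to (?P<recipients>[^.]+)\.\s*(?P<body>.+)$", re.IGNORECASE)),
    ("email", re.compile(
        r"^e-?mail message to (?P<recipients>[^.]+)\.\s*(?P<body>.+)$", re.IGNORECASE)),
    ("directions", re.compile(
        r"^get directions .*?from (?P<origin>.+?) to (?P<destination>.+)$", re.IGNORECASE)),
    ("contact_lookup", re.compile(r"^look up (?P<name>.+) in contacts$", re.IGNORECASE)),
    # Fallback: anything else that starts like a query goes to a search engine.
    ("search", re.compile(r"^(?:find|search for|look up) (?P<query>.+)$", re.IGNORECASE)),
]


def parse_command(transcript: str) -> Tuple[str, Dict[str, str]]:
    """Map already-transcribed text to an (intent, details) pair."""
    text = transcript.strip().rstrip(".")
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.match(text)
        if match:
            return intent, match.groupdict()
    return "unknown", {}


if __name__ == "__main__":
    for spoken in (
        "Call Alicia's mobile number.",
        "Look up Bruce Grant in Contacts.",
        "Look up who won the World Series in 1925.",
    ):
        print(parse_command(spoken))
```

The ordering matters: the contact lookup pattern is listed before the general search pattern so that "Look up Bruce Grant in Contacts" is not sent to a search engine.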
While I think that voice processing is very useful and beneficial in mobile devices (as well as adding to safety while driving), it's important to remember that there are some situations in which voice processing isn't appropriate. Three examples: 1) with a date at a restaurant, 2) with a group of people at a party, and 3) in a meeting at the office. It simply wouldn't be appropriate to start a voice request using your phone in these situations.
Voice-activated applications in smartphones
Most voice recognition services require you to press a button to start the voice-activated application. This allows the application to "pay attention" only when necessary. I recently had a briefing update with Todd Mozer of Sensory, which has developed low-power logic that can sit in the background and listen for key activation phrases. This makes voice recognition always available and eliminates the time needed to launch the voice processing application.
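The control flow behind an always-listening activation phrase can be sketched in a few lines of Python. This is only an illustration of the idea, not Sensory's technology, which runs as low-power logic rather than application code. The function name and the phrase "hello phone" are assumptions, and the sketch works on short pieces of already-recognized text rather than raw audio.

```python
from typing import Iterable, List


def commands_after_wake_phrase(
    utterances: Iterable[str], wake_phrase: str = "hello phone"
) -> List[str]:
    """Return only the utterances that directly follow the activation phrase.

    Everything else is ignored, which is what lets the full recognizer
    stay idle most of the time.
    """
    commands: List[str] = []
    awake = False
    for utterance in utterances:
        if awake:
            # The listener was activated, so treat this utterance as a command.
            commands.append(utterance.strip())
            awake = False
        elif utterance.strip().lower() == wake_phrase:
            awake = True
    return commands


if __name__ == "__main__":
    heard = ["background chatter", "Hello phone", "Call Alicia", "more chatter"]
    print(commands_after_wake_phrase(heard))
```

The button press described above plays the same role as the `awake` flag here: it tells the application when the next thing it hears should be treated as a command.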
Voice processing is now accepted by most smartphone users. While it might have been thought "odd" to give commands via voice to your phone a few years ago, most people today realize that there are real convenience benefits in asking the phone for information or to do something. In this way, smartphones are going to become virtual assistants that may get very intelligent over the coming years.
If you haven't tried out a voice-activated service on a smartphone, I recommend that you do so. You'll find that it works well most of the time and will definitely save time. Before long, you'll be telling all your friends how easy to use and productive this capability really is.
J. Gerry Purdy, Ph.D. is Principal Analyst of Mobile & Wireless at MobileTrax LLC. As a nationally recognized industry authority, Dr. Purdy focuses on monitoring and analyzing emerging trends, technologies and market behavior in the mobile computing and wireless data communications industry in North America. Dr. Purdy is an "edge of network" analyst looking at devices, applications and services, as well as wireless connectivity to those devices. Dr. Purdy provides critical insights regarding mobile and wireless devices, wireless data communications and connection to the infrastructure that powers the data in the wireless handheld. He is author of the column Inside Mobile & Wireless that provides industry insights and is read by over 100,000 people a month.
Dr. Purdy continues to be affiliated with the venture capital industry as well. He currently is Managing Director at Yosemite Ventures. And he spent five years as a Venture Advisor for Diamondhead Ventures in Menlo Park where he identified, attracted and recommended investments in emerging companies in mobile and wireless. He has had a prior affiliation with East Peak Advisors and, subsequently, following their acquisition, with FBR Capital Markets. For more than 16 years, Dr. Purdy has been consulting, speaking, researching, networking, writing and developing state-of-the-art concepts that challenge people's mind-sets, as well as developing new ways of thinking and forecasting in the mobile computing and wireless data arenas. Often quoted, Dr. Purdy's ideas and opinions are followed closely by thought leaders in the mobile and wireless industry. He is author of three books as well.
Dr. Purdy currently is a member of the Program Advisory Board of the Consumer Electronics Association (CEA), which produces CES, one of the largest trade shows in the world. He is a frequent moderator at CTIA conferences and GSM Mobile World Congress. He also is a member of the Board of the Atlanta Wireless Technology Forum. Dr. Purdy has a B.S. degree in Engineering Physics from the University of Tennessee, an M.S. degree in Computer Science from UCLA, and a Ph.D. in Computer Science and Exercise Physiology from Stanford University. He can be reached at firstname.lastname@example.org.
Disclosure Statement: From time to time, I may have a direct or indirect equity position in a company that is mentioned in this column. If that situation happens, then I'll disclose it at that time.