Rob High believes that for too long, the relationship between humans and computers has been one-sided, with users constantly adjusting to accommodate the systems.
People have had to adapt to the computer's interface, figuring out which keys to press to get the result they're looking for and manipulating the software to get the information they need. Even in these relatively early days of artificial intelligence (AI), users still have to meet their systems on the machines' terms.
High wants that to change. He is an IBM Fellow and CTO of Big Blue's Watson Group, the unit charged with developing and improving the company's highest-profile effort in what it calls cognitive computing. Speaking at this week's GPU Technology Conference (GTC) in San Jose, Calif., High said researchers and engineers with the Watson Group will spend much of their time this year working to make the computer more humanlike.
Watson and other systems, as they become more intelligent, “will have to communicate with us on our terms,” he said. “They will have to adapt to our needs, rather than us needing to interpret and adapt to them.”
They will have to understand not only the questions humans ask and the statements they make, but also the visual and other nonverbal cues, such as facial expressions, the emphasis placed on words in a sentence and the tone of voice, that people read in the normal course of interacting with one another. High wants to "change the role between humans and computers."
A growing number of top tech companies, including Google, Facebook, IBM and Microsoft, are driving hard into artificial intelligence, with broad applications in everything from connected cars and search to research, security, and speech and facial recognition. AI was a central theme at Nvidia's GTC conference, where company officials pushed their GPUs as a foundational technology for driving the development and performance of the massive computing capabilities needed for AI.
IBM has adopted Nvidia's Tesla K80 GPU accelerators for Watson, enabling the company to develop deep-learning capabilities that can further enhance the system's cognitive computing features.
Watson burst onto the scene five years ago when it beat longtime champion Ken Jennings on the game show "Jeopardy." The event showcased a computer that had ingested vast amounts of data and could understand and respond to the host's natural-language questions, navigating the subtle nuances, puns and other wordplay in the clues.
Since that time, IBM researchers have worked to vastly expand what Watson and its cognitive computing capabilities can do for businesses. High outlined the features an AI system needs, such as the ability to learn and to understand and express itself in natural language. There is also a level of expertise these systems need to have, he said.
“We want them to be right, as right as a human being,” High said. “But we also want them to be trustworthy.”
Another key capability is handling the huge amounts of data being generated these days from a vast number of sources. High noted that 2.5 exabytes of data, spanning text, images and videos, are created every day; some industry officials expect that by 2020, 44 zettabytes of data will be generated.
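For a sense of scale, the two figures can be put in the same units with a quick back-of-the-envelope calculation, using the article's numbers and taking 1 zettabyte as 1,000 exabytes:

```python
# Put the article's two data-volume figures on a common scale.
EB_PER_DAY = 2.5     # exabytes of data created daily, per High
EB_PER_ZB = 1000.0   # 1 zettabyte = 1,000 exabytes

zb_per_year = EB_PER_DAY * 365 / EB_PER_ZB
print(f"{zb_per_year:.2f} ZB generated per year")   # ~0.91 ZB/year

# At that constant rate it would take roughly 48 years to produce 44 ZB,
# so the projection implies creation rates climbing steeply through 2020.
print(f"{44 / zb_per_year:.0f} years at today's rate")
```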
Since "Jeopardy," IBM has added capabilities and features to Watson. In 2011, the company created what High called a "factoid pipeline" with general-domain information covering a broad range of areas. The following year, engineers rolled out Watson Discovery Advisor for particular verticals, which not only offered a factoid pipeline but also was designed to help users discover questions they might not have thought to ask.
One focus was health care, where the engineers created the Oncology Treatment Advisor. Through it, Watson could help doctors sift through vast amounts of information to create much more personalized treatment plans, ones that addressed not only the particular cancer a person had but also the patient's unique characteristics. During a question-and-answer session after his talk, High said it is in such ways that Watson will offer its greatest value: doctors now have a tool that can sort through vast amounts of data and help them determine more effective treatments for the individual, rather than applying a broad plan based only on the disease.
That same capability to rapidly run through, collect and return massive amounts of information can help in a broad array of verticals beyond health care, High said. Doctors would need 160 hours every week just to keep up with the new information being generated in their fields; Watson can go through all of it for them and return what they need.
Over the past several years, IBM has added other capabilities to Watson, from expanding Discovery Advisor to offering Watson services that customers can leverage in their own businesses. The company now offers 32 services on its Watson Developer Cloud, which runs on the Bluemix platform as a service (PaaS).
Much of the work this year will center on making Watson more humanlike, which will change its relationship with users. If Watson, and eventually other systems, can better understand the nuances of human communication, they can handle more dynamic interactions. Humans will no longer have to adapt to the system's interface; instead, the systems will have to adapt to humans.
With this in mind, IBM researchers are developing a series of programming interfaces that can analyze text for everything from emotion to tone. A beta emotion analysis capability in Watson can take a piece of text and score it for emotions ranging from anger to fear to happiness; the idea is to build more "sympathetic systems," High said. Similarly, Tone Analyzer, an API designed to better understand and measure emotion in written text, can detect emotions such as anger in a passage and suggest new wording that might send a better emotional message.
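To make that concrete, here is a minimal sketch of what calling a Tone Analyzer-style service could look like from Python. The endpoint URL, credentials and response fields are illustrative assumptions, not IBM's documented interface; services on the Watson Developer Cloud of this era were exposed as authenticated REST APIs returning JSON.

```python
import requests

# Illustrative values only; a real Watson Developer Cloud service requires
# Bluemix service credentials and the documented endpoint for that API.
TONE_URL = "https://gateway.example.com/tone-analyzer/api/v3/tone"  # hypothetical
USERNAME = "service-username"   # placeholder credential
PASSWORD = "service-password"   # placeholder credential

def analyze_tone(text):
    """POST a passage of text and return the service's JSON tone scores."""
    response = requests.post(
        TONE_URL,
        auth=(USERNAME, PASSWORD),  # basic auth, as Bluemix services used
        headers={"Content-Type": "application/json"},
        json={"text": text},
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    scores = analyze_tone("I am absolutely furious about this delay.")
    # The exact response shape varies by API version; a tone service would
    # typically return per-emotion scores that an application can act on,
    # such as flagging high-anger passages and suggesting gentler wording.
    print(scores)
```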
Through the Watson Developer Cloud, IBM also offers a Personality Insights service that can analyze text and offer insights into the writer's personality, determining, for example, whether the person is introverted or extroverted.
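As a companion sketch, the snippet below shows how an application might pull an introversion/extroversion signal out of such a service's response. The response layout used here, a flat list of Big Five trait names with percentile scores, is an assumption for illustration; the real service's schema differed across versions.

```python
# Hypothetical response from a personality-insights-style service: each
# Big Five trait scored as a percentile against a reference population.
profile = {
    "personality": [
        {"name": "Openness", "percentile": 0.81},
        {"name": "Conscientiousness", "percentile": 0.55},
        {"name": "Extraversion", "percentile": 0.23},
        {"name": "Agreeableness", "percentile": 0.67},
        {"name": "Neuroticism", "percentile": 0.40},
    ]
}

def extraversion_label(profile, threshold=0.5):
    """Classify the writer as introverted or extroverted from the profile."""
    for trait in profile["personality"]:
        if trait["name"] == "Extraversion":
            return "extroverted" if trait["percentile"] >= threshold else "introverted"
    return "unknown"

print(extraversion_label(profile))  # -> "introverted"
```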
Such capabilities not only can help businesses (a restaurant owner can mine online customer comments to determine what the business can do better, and recruiters can better match a candidate with a job) but also will lead to robots that can interact more naturally with humans, High said. As an example, he pointed to the work IBM has done with SoftBank's Aldebaran NAO robots. Through such APIs, the robots take on more humanlike characteristics, not only responding to people's questions in natural language but also gesturing as they speak.
He showed a NAO robot not only singing a Taylor Swift song, but also performing a Gangnam Style dance.