LAS VEGAS—As it gears up for what it calls the cognitive era of computing, IBM announced new and expanded cognitive APIs for developers that enhance Watson’s emotional and visual senses.
The three new APIs further extend the capabilities of the Watson cognitive computing system and enable developers to add new emotional functionality to their cognitive applications. IBM announced the new APIs at its InterConnect 2016 conference here.
Three APIs, Tone Analyzer, Emotion Analysis and Visual Recognition, are now available in beta. Additionally, Text to Speech (TTS) has been updated with new emotional capabilities and is being re-released as Expressive TTS for general availability. These APIs are pushing the sensory boundaries of how humans and machines interact, and they are designed to improve how developers embed these technologies to create solutions that can think, perceive and empathize.
“We continue to advance the capabilities we offer developers on IBM’s Watson platform to help this community create dynamic AI infused apps and services,” said David Kenny, general manager of IBM Watson, in a statement. “We are also simplifying the platform, making it easier to build, teach and deploy the technology. Together, these efforts will enable Watson to be applied in many more ways to address societal challenges.”
IBM is also adding tooling capabilities and enhancing its SDKs—Node, Java, Python, and newly introduced iOS Swift and Unity—across the Watson portfolio and adding Application Starter Kits to make it easy and fast for developers to customize and build with Watson. All APIs are available through the IBM Watson Developer Cloud on Bluemix.
The Tone Analyzer gives users better insights about their own tone in a piece of text. Adding to its previous experimental understanding of nine traits across three tones—emotion (negative, cheerful, angry), social propensities (open, agreeable, conscientious) and writing style (analytical, confident, tentative)—Tone Analyzer now analyzes new emotions, including joy, disgust, fear and sadness, as well as new social propensities, including extraversion and emotional range.
Also new to the beta version, Tone Analyzer is moving from analyzing single words to analyzing entire sentences. This analysis is helpful in situations that require nuanced understanding. For example, in speech writing it can indicate how different remarks might come across to the audience, from exhibiting confidence and agreeableness to showing fear. In customer service, it can help analyze a variety of social, emotional and writing tones that influence the effectiveness of an exchange.
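As a rough illustration of what sentence-level analysis means in practice, the sketch below splits a draft into sentences and flags those whose score for a given tone crosses a threshold. It does not call Tone Analyzer: the function names, the threshold and the stub scorer are hypothetical stand-ins, and a real tone service would replace `stub_scorer`.

```python
import re


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def flag_sentences(text, score_sentence, tone, threshold=0.5):
    """Return (sentence, score) pairs whose score for `tone` meets the threshold."""
    flagged = []
    for sentence in split_sentences(text):
        score = score_sentence(sentence).get(tone, 0.0)
        if score >= threshold:
            flagged.append((sentence, score))
    return flagged


def stub_scorer(sentence):
    """Placeholder with a made-up rule; a real tone service would go here."""
    return {"fear": 0.8 if "worried" in sentence.lower() else 0.1}


draft = "We had a strong quarter. I am worried about next year. Thank you all."
print(flag_sentences(draft, stub_scorer, "fear"))
```

Scoring each sentence separately is what would let a speechwriter see which specific remark reads as fearful, rather than getting a single score for the whole draft.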
IBM partner and Watson user Alpha Modus uses the Tone Analyzer as part of its investment management solution to help get a more precise indication of what’s going on in the stock market, said Prashant Bhuyan, co-founder and CTO of the company.
Also, Watson Ecosystem Partner Connectidy has developed an innovative relationship science platform that uses the Tone Analyzer beta to intuitively help users understand how messages to potential matches may come across.
“Through the analysis of authentic language in real time, Tone Analyzer provides people with an unprecedented level of perspective into how their emotions and social propensities play out in their written word,” said Dineen Tallering, president of Connectidy, in a statement. “This is a critical piece of emotional intelligence because it enables us to continually educate users on how they appear to others. We are able to advance past static algorithms to achieve a level of cognitive insight that continuously learns and helps guide our users towards greater self awareness and better choices.”
IBM Adds New Watson Emotional, Visual APIs to Bluemix
Meanwhile, IBM has added Emotion Analysis as a new beta function within the AlchemyLanguage suite of APIs. Emotion Analysis uses natural language processing techniques to analyze external content and help users better understand the emotions of others. Developers can now go beyond identifying positive and negative sentiments and distinguish a broader range of emotions, including joy, fear, sadness, disgust and anger. With this deeper understanding, developers can surface new insights from sources such as customer reviews, surveys and social media posts. For example, in addition to knowing whether product reviews are negative or positive, businesses can now identify whether a change in a product feature prompted reactions of joy, anger or sadness among customers.
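The product-review scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-review score dictionaries, the function names and the numbers are hypothetical and are not output from Emotion Analysis. Only the five emotion labels come from the announcement.

```python
from collections import defaultdict

# The five emotions named in the announcement; the score layout is an assumption.
EMOTIONS = ("joy", "fear", "sadness", "disgust", "anger")


def dominant_emotion(scores):
    """Return the emotion with the highest score in one review."""
    return max(EMOTIONS, key=lambda name: scores.get(name, 0.0))


def mean_scores(reviews):
    """Average each emotion score over a list of per-review score dicts."""
    totals = defaultdict(float)
    for scores in reviews:
        for name in EMOTIONS:
            totals[name] += scores.get(name, 0.0)
    count = max(len(reviews), 1)
    return {name: totals[name] / count for name in EMOTIONS}


def emotion_shift(before, after):
    """Change in mean score per emotion between two review sets."""
    old, new = mean_scores(before), mean_scores(after)
    return {name: new[name] - old[name] for name in EMOTIONS}


# Made-up scores for illustration only, not output from any service.
before_change = [
    {"joy": 0.7, "anger": 0.1, "sadness": 0.1},
    {"joy": 0.6, "anger": 0.2, "sadness": 0.1},
]
after_change = [
    {"joy": 0.2, "anger": 0.7, "sadness": 0.3},
    {"joy": 0.3, "anger": 0.6, "sadness": 0.2},
]

shift = emotion_shift(before_change, after_change)
strongest = max(shift, key=lambda name: abs(shift[name]))
print(strongest, round(shift[strongest], 2))
```

Comparing mean scores before and after a feature change shows which emotion moved most, which is more specific than a single positive or negative sentiment label.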
Also, moving beyond visual capabilities that enable systems to understand and tag an image, Visual Recognition is available now in beta and can be trained to recognize and classify images based on training material.
While other visual search engines can tag images with a fixed set of classifiers or generic terms, Visual Recognition allows developers to train Watson around custom classifiers for images—the same way users can teach Watson natural language classification—and build apps that visually identify unique concepts and ideas. This means that Visual Recognition is now customizable with results tailored to each user’s specific needs. For example, a retailer might create a tag specific to a style of its pants in the new spring line so it can identify when an image appears in social media of someone wearing those pants.
Braxton Jarrett, general manager of IBM’s Cloud Video Services unit, said Watson’s visual recognition API has been in high demand among developers building video applications. As video has become a first-class data type in business as well as for consumers, IBM’s new Cloud Video Services unit is going after the $105 billion opportunity in cloud-based video services and software.
Moreover, to further advance emotional capabilities for cognitive systems, IBM has incorporated emotional IQ into its existing Text to Speech API and is releasing the result, Expressive TTS, for general availability. Expressive TTS helps cognitive systems generate and deliver adaptive emotion in vocal interactions, meaning computers can not only understand natural language, tone and context, but also respond with the appropriate inflection.
Previously, automated systems relied on a predetermined, rules-based corpus of words. That approach was characterized by limited emotional cues, such as “good news equals a raised tone” or “bad news equals a slowed tone.” In creating Expressive TTS, IBM studied and decided on a specific set of expressive styles to frame this speech capability. To support those styles, the research team made significant enhancements to IBM’s existing synthesis engine, incorporating ideas from machine learning to allow for seamless switching across expressive styles. Developers now have more flexibility in building cognitive systems that can demonstrate sensitivity in human interactions.
These new and expanded services are part of IBM’s open Watson platform that now includes more than 30 Watson services.