Project Oxford, a collection of machine-learning application programming interfaces (APIs) from Microsoft, has learned a new trick.
The technology made a splash early this year after Microsoft’s age- and gender-guessing website, How-Old.net, went viral. While not always accurate (in fact, it was sometimes comically off base), the site hinted at the potential of computer systems that can make sense of the world around them.
Today, the company announced that it was adding emotion recognition to the mix.
“The emotion tool released today can be used to create systems that recognize eight core emotional states—anger, contempt, fear, disgust, happiness, neutral, sadness or surprise—based on universal facial expressions that reflect those feelings,” blogged Allison Linn, a senior writer at Microsoft Research, today. Those capabilities may usher in applications that respond to a user’s mood.
Ryan Galgon, a Microsoft Technology and Research senior manager, envisions creating “systems that marketers can use to gauge people’s reaction to a store display, movie or food,” Linn reported. “Or, they might find them valuable for creating a consumer tool, such as a messaging app, that offers up different options based on what emotion it recognizes in a photo.”
According to Microsoft’s online documentation, the Emotion API uses the company’s cloud-based emotion recognition algorithm to determine how people in a picture were feeling at the moment the picture was taken.
“The Emotion API beta takes an image as an input, and returns the confidence across a set of emotions for each face in the image, as well as bounding box for the face, from the Face API,” Microsoft explained. “These emotions are communicated cross-culturally and universally via the same basic facial expressions, [which] are identified by Emotion API. In interpreting results from the Emotion API, the emotion detected should be interpreted as the emotion with the highest score, as scores are normalized to sum to one.” Developers can set higher confidence thresholds to suit their needs.
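To illustrate the interpretation Microsoft describes, here is a minimal Python sketch that picks the highest-scoring emotion for each face and applies an optional confidence threshold. It makes no network call. The response shape, the field names (`faceRectangle`, `scores`), the sample values and the helper names `top_emotion` and `label_faces` are assumptions made for illustration, not details confirmed by Microsoft's documentation as quoted here.

```python
from typing import Dict, List, Optional

# Hypothetical response: one entry per detected face, each carrying a
# bounding box and a score per emotion. Field names and values are
# illustrative assumptions, not the documented wire format.
sample_response = [
    {
        "faceRectangle": {"left": 68, "top": 97, "width": 64, "height": 64},
        "scores": {
            "anger": 0.01, "contempt": 0.01, "disgust": 0.01, "fear": 0.01,
            "happiness": 0.90, "neutral": 0.04, "sadness": 0.01, "surprise": 0.01,
        },
    },
]


def top_emotion(scores: Dict[str, float],
                min_confidence: float = 0.0) -> Optional[str]:
    """Return the highest-scoring emotion, or None if it falls below
    the caller's confidence threshold (or if there are no scores)."""
    if not scores:
        return None
    emotion, confidence = max(scores.items(), key=lambda item: item[1])
    return emotion if confidence >= min_confidence else None


def label_faces(faces: List[dict],
                min_confidence: float = 0.0) -> List[Optional[str]]:
    """Label every face in a response with its most likely emotion."""
    return [top_emotion(face["scores"], min_confidence) for face in faces]


if __name__ == "__main__":
    print(label_faces(sample_response))        # default: no threshold
    print(label_faces(sample_response, 0.95))  # stricter threshold
```

Because the scores are normalized to sum to one, a threshold such as 0.95 accepts only faces where a single emotion clearly dominates; anything less decisive comes back as `None`, leaving the application to decide how to handle it.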
The Project Oxford Emotion API is available now as a free public beta. Also available today is a new spell check tool.
When added to a mobile or cloud app, the tool can be used to recognize “slang words such as ‘gonna,’ as well as brand names, common name errors and difficult-to-spot errors such as ‘four’ and ‘for.’ It also adds new brand names and expressions as they are coined and become popular,” explained Linn.
The technology behind How-Old.net is also being upgraded. Linn revealed that “Project Oxford’s existing face detection tool will be updated to include facial hair and smile prediction tools, and the tool also has improved visual age estimation and gender identification.”
Finally, Linn teased three upcoming Project Oxford betas scheduled for release later this year: Video, Speaker Recognition and Custom Recognition Intelligent Services.
Project Oxford Video, based in part on the work the company has done on Microsoft Hyperlapse, can be used to analyze and automatically edit videos. Speaker Recognition identifies people by their unique voice characteristics. Custom Recognition Intelligent Services (CRIS) enables customizable speech recognition for noisy environments and challenging conditions.