Microsoft Machine Learning Tech Adds Captions to Images

Nov 20, 2014

Microsoft is applying its machine learning research to images and the unspoken information they may contain.

Microsoft Research Distinguished Scientist and Deputy Director John Platt and his team have been working on software that automatically captions images. The project began this summer, when a multi-disciplinary team of experts decided to tackle the challenge of distilling photos into a sentence that makes sense instead of a jumble of words.

The system devised by Platt and his group has outscored human-written captions in some tests, although not in all of them, he explained in a blog post. “I’m happy to report that, in terms of BLEU score, we actually beat humans,” he wrote. “Our system achieved 21.05 percent BLEU score while the human ‘system’ scored 19.32 percent.”

BLEU, or Bilingual Evaluation Understudy, is an algorithm used to determine the quality of a machine-translated text. “BLEU breaks the captions into chunks of length (one to four words), and then measures the amount of overlap between the system and human translations. It also penalizes short system captions,” he explained.
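
To make Platt's description concrete, here is a minimal Python sketch of a BLEU-style score for a single caption. It is an illustration only, not the evaluation code Microsoft used: the function names `ngram_counts` and `bleu` are hypothetical, tokenization is a plain whitespace split, and no smoothing is applied, so a caption with no matching four-word chunk scores zero. Standard BLEU is also normally computed over a whole test set rather than one sentence at a time.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every chunk of n consecutive words in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Simplified sentence-level BLEU (hypothetical helper, no smoothing)."""
    cand = candidate.lower().split()
    refs = [ref.lower().split() for ref in references]
    if not cand or not refs:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0  # caption is shorter than n words

        # Clip each chunk's count by the most times it appears in any
        # single reference, so repeated words cannot inflate the score.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngram_counts(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram])
                      for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0  # no overlap at this chunk length
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: compare against the reference closest in length.
    c = len(cand)
    r = min((len(ref) for ref in refs), key=lambda length: (abs(length - c), length))
    penalty = 1.0 if c > r else math.exp(1 - r / c)

    # Geometric mean of the one- to four-word precisions, times the penalty.
    return penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    human = ["a man riding a horse on the beach"]
    print(bleu("a man riding a horse on the beach", human))
    print(bleu("a man riding a", human))
```

In this sketch the second caption matches the reference on every chunk it contains, yet it scores lower than the first because the brevity penalty punishes it for being only half as long.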

Despite the achievement, the technology is far from perfect.

“BLEU has many limitations that are well-known in the machine translation community,” he said. “We also tried testing with the METEOR [Metric for Evaluation of Translation with Explicit Ordering] metric, and got somewhat below human performance (20.71 percent vs. 24.07 percent).”

A fair number of people preferred the technology when asked to evaluate Microsoft’s machine-generated captions. In a test run on Amazon’s Mechanical Turk service, which pays “workers” to complete Human Intelligence Tasks online, “people thought that the system caption was the same or better than a human caption,” reported Platt.

Microsoft has been aggressively leveraging its machine learning research to build a smart software and services portfolio.

In March, Platt told attendees of the GigaOm Structure Data conference that “machine learning is pretty much pervasive throughout all Microsoft products. So, whenever you use a Microsoft product, you’re using a system that’s been generated from machine learning.”

Examples include the company’s Bing search engine and Kinect motion sensor. “The only way you can answer the billions of questions Bing answers is to have something that operates autonomously. In Xbox, the Kinect was also trained with machine learning,” said Platt. “The fact that it can see you in the room even though it’s poor lighting and you can wave your arms and it can track you—that’s all done with a piece of software that was trained with machine learning.”

Microsoft has since introduced a new cloud-based machine learning service (Azure ML) for businesses venturing into predictive analytics. Office Graph, the underlying technology that powers the Office Delve app, leverages machine learning to determine the connections between workers and bring to the surface conversations and content in an effort to foster collaboration and improve productivity.
