Software that automatically recognizes the content of your vacation photos and tags them for easier searching, or that detects inappropriate content in images hosted on your crowd-sourced site, is not entirely new.
But Google wants such capabilities to become more broadly available through its Cloud Vision API, technology it previewed in limited fashion last December and that it moved into open beta this week.
Cloud Vision API is designed for developers looking to add image recognition capabilities to their software. According to Google, it gives them a way to automatically categorize images, detect faces and sentiment, and detect text inside images.
Under the open beta program announced Feb. 18, software developers and Google’s Cloud Platform customers can submit images to the Cloud Vision API and have it perform tasks like detecting faces and geographic landmarks in them, or detecting company logos, text, dominant colors and inappropriate content.
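To make the submission step concrete, here is a minimal sketch of how a developer might assemble a request asking for several of those detection tasks at once. It assumes the JSON request shape and feature type names of the Cloud Vision REST interface as later documented (`images:annotate`, `FACE_DETECTION`, and so on); the endpoint version in use during the beta may have differed. The helper name `build_annotate_request` and the short task keys are hypothetical, chosen only for illustration, and the sketch builds the payload without sending it.

```python
import base64
import json

# Assumed endpoint, based on the later v1 REST interface; the beta URL may differ.
ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"

# Hypothetical short names mapped to the API's feature type strings.
FEATURE_TYPES = {
    "faces": "FACE_DETECTION",
    "landmarks": "LANDMARK_DETECTION",
    "logos": "LOGO_DETECTION",
    "labels": "LABEL_DETECTION",
    "text": "TEXT_DETECTION",
    "colors": "IMAGE_PROPERTIES",
    "inappropriate": "SAFE_SEARCH_DETECTION",
}


def build_annotate_request(image_bytes, tasks, max_results=5):
    """Return a request body asking the API to run the given tasks on one image."""
    features = [
        {"type": FEATURE_TYPES[task], "maxResults": max_results}
        for task in tasks
    ]
    return {
        "requests": [
            {
                # Image bytes are sent base64-encoded inside the JSON body.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": features,
            }
        ]
    }


# Placeholder bytes stand in for a real image file.
payload = build_annotate_request(b"abc", ["faces", "logos", "text"])
print(json.dumps(payload, indent=2))
```

A real client would POST this body to the endpoint with valid credentials and read the annotations from the JSON response.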
Ram Ramanathan, product manager at Google’s Cloud Platform group, described the API as Google’s first step toward enabling applications to more efficiently “see” and “hear” things.
“Now anyone can submit their images to the Cloud Vision API to understand the contents of those images from detecting everyday objects, for example, ‘sports car,’ ‘sushi,’ or ‘eagle,’ to reading text within the image or identifying product logos,” he said in a blog post.
In addition to announcing the open beta availability of the API, Google this week also released pricing information for those planning to use it in this phase. The pricing is based on the specific tasks that developers or Google’s Cloud Platform customers want the API to perform.
For instance, customers who want to apply the API’s optical character recognition or face detection capabilities to images will pay nothing for the first 1,000 images and then $2.50 per 1,000 images for up to 1 million images. Customers who use the API to execute full content analysis on images will similarly pay nothing for the first 1,000 images and then $5 per 1,000 images for up to 1 million images.
During the beta phase, Google will cap the number of photos that users can submit to the API at 20 million per month.
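As a rough illustration of that arithmetic, the sketch below estimates a bill from the two rates quoted above. It is not Google's billing logic: the function name and task keys are hypothetical, the charge is assumed to be prorated rather than rounded up to whole blocks of 1,000, and volumes above 1 million images are rejected because no rate for them is given here.

```python
FREE_IMAGES = 1_000        # first 1,000 images cost nothing
TIER_LIMIT = 1_000_000     # quoted rates apply only up to 1 million images

# Dollars per 1,000 images, as quoted; the keys are hypothetical labels.
RATES_PER_1000 = {
    "ocr_or_face": 2.50,
    "full_analysis": 5.00,
}


def estimate_cost(images, task):
    """Estimate the charge in dollars for a number of images and a task type."""
    if images < 0:
        raise ValueError("image count cannot be negative")
    if images > TIER_LIMIT:
        raise ValueError("no rate is quoted beyond 1 million images")
    billable = max(0, images - FREE_IMAGES)
    # Assumption: usage is prorated, not rounded up to whole blocks of 1,000.
    return billable / 1000 * RATES_PER_1000[task]


print(estimate_cost(1_000, "ocr_or_face"))      # 0.0
print(estimate_cost(101_000, "ocr_or_face"))    # 250.0
print(estimate_cost(101_000, "full_analysis"))  # 500.0
```

Under these assumptions, a site processing 101,000 images would pay $250 for text or face analysis, or $500 for full content analysis.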
Cloud Vision API incorporates technologies from Google’s work around machine learning, especially TensorFlow, a numerical computing library that Google recently released to the open-source community. TensorFlow powers several of Google’s services, including Smart Reply in Inbox, Google Translate and image search in Google Photos.
Google has said that it expects Cloud Vision API to enable innovative new applications that take advantage of advanced image recognition capabilities. According to Ramanathan, thousands of companies have been attempting to do just that since Google announced a limited preview of the API in December.
As one example, he pointed to Photofy, a photo editing Website that moderates some 150,000 user-submitted photos a day. The site is using the API to flag violent and other inappropriate content in the photos. Yik Yak, another site, is using the API to extract text in multiple languages from image content, Ramanathan said.