Google Working on 3D Classifiers to Solve AR Challenge

By Clint Boulton  |  Posted 2011-03-20

"The challenge is to figure what is the most relevant thing for the user," Nalawadi said. "You could throw a lot of info on there, but it would confuse the user. You need to make sure you are sending right users the set of things with AR. These are the user experiences challenges that we haven't cracked."

Even if Google can solve the challenge of figuring out user intent via the AR lens, Goggles needs a lot of work. Superfish CTO Joe Dew, whose company makes computer vision software that competes with Goggles, told eWEEK that Goggles has yet to solve the problem of recognizing most 3D objects.

For example, he said if a user "takes a pair of scissors, put them on a white piece of paper and Goggles probably won't find it." This is because Goggles becomes confused by the two objects, the scissors and the paper.

Superfish is working on this problem, which Google has addressed for recognizing landmarks by applying a classic, if crude, two-dimensional approach.

Specifically, Google accounts for all of the fixed, finite camera angles picture takers employ when they snap pictures of, for example, the Eiffel Tower. Still, solving the 3D challenge with a 2D-based methodology is an approach Nalawadi acknowledged is hardly ideal.

Google is working on hierarchical classifiers -- essentially programming tools that help computer vision software distinguish between objects -- that can tell that what a user is looking at is a car, and that can also handle product verticals such as shoes, handbags and jewelry.

In time, a user will be able to use Goggles on a mobile phone to snap a picture of a handbag on a rack of handbags. Goggles will recognize that the user wants to learn more about the bag in the foreground, ignore all of the other bags and peripheral distractions in the image, and give the user a match.

"We have a lot of Ph.D.s looking at it to solve the problem in generic way so that we can train engines to recognize a large class of objects, and then train instances within the classes," Nalawadi said.

"Can we train a trainer with a set of images to understand what is a boot, a car, or earrings? It's not easy, but we feel that is the more generic approach."
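Nalawadi did not describe how Google implements this, but the general idea of recognizing a broad class first and a specific instance within that class second can be sketched in a few lines of Python. The example below is a toy illustration only, not Google's method: the class name HierarchicalClassifier, the two-number feature vectors and the nearest-centroid matching are all assumptions made for the sake of the sketch. A real system would work from features extracted from images.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def centroid(vectors):
    """Return the component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(column) / n for column in zip(*vectors))


class HierarchicalClassifier:
    """Hypothetical two-level classifier: pick a class, then an instance in it."""

    def __init__(self):
        self.class_centroids = {}      # class label -> centroid
        self.instance_centroids = {}   # class label -> {instance label -> centroid}

    def fit(self, samples):
        """samples: iterable of (feature_vector, class_label, instance_label)."""
        by_class = defaultdict(list)
        by_instance = defaultdict(lambda: defaultdict(list))
        for features, cls, inst in samples:
            by_class[cls].append(features)
            by_instance[cls][inst].append(features)
        self.class_centroids = {c: centroid(v) for c, v in by_class.items()}
        self.instance_centroids = {
            c: {i: centroid(v) for i, v in insts.items()}
            for c, insts in by_instance.items()
        }
        return self

    def predict(self, features):
        """Return (class_label, instance_label) for one feature vector."""
        # Stage 1: choose the broad class with the nearest centroid.
        cls = min(self.class_centroids,
                  key=lambda c: dist(features, self.class_centroids[c]))
        # Stage 2: compare only against instances inside that class.
        insts = self.instance_centroids[cls]
        inst = min(insts, key=lambda i: dist(features, insts[i]))
        return cls, inst


# Made-up training data: (features, class, instance). The numbers are invented.
TRAINING = [
    ((0.9, 0.1), "boot", "hiking boot"),
    ((1.0, 0.2), "boot", "hiking boot"),
    ((0.8, 0.4), "boot", "rain boot"),
    ((0.7, 0.5), "boot", "rain boot"),
    ((0.1, 0.9), "earrings", "hoop"),
    ((0.2, 1.0), "earrings", "hoop"),
    ((0.3, 0.7), "earrings", "stud"),
    ((0.4, 0.8), "earrings", "stud"),
]

if __name__ == "__main__":
    clf = HierarchicalClassifier().fit(TRAINING)
    print(clf.predict((0.78, 0.42)))
    print(clf.predict((0.12, 0.97)))
```

Because the second stage compares a query only against instances of the class chosen in the first stage, adding more instances to one class does not slow down or confuse matching in the others, which is one plausible reason such a generic approach is attractive.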

Above and beyond that audacious goal, Google needs to fill out the long tail of search. For example, Google needs to be able to recognize an entire vineyard's product lineup instead of just 100 of the most popular wine bottles. 


