Microsoft is turning vacations, walks around the neighborhood and nature hikes into an opportunity for users of its Bing mobile apps to learn more about their surroundings and help improve the company’s artificial intelligence technologies along the way.
The software giant has added new intelligent visual search technology to the Bing mobile apps for Android and iOS, as well as to the Microsoft Launcher app for Android (formerly Arrow Launcher). The feature enables users to point their cameras at the landmarks and objects they encounter during their wanderings and get more information about them, and it is one of the many ways the company is showcasing how its AI and machine learning technologies can be baked into applications, starting with its own.
“The visual search feature uses Microsoft’s computer vision algorithms, which are trained with datasets containing vast amounts of labeled images, as well as images from around the web. From the training images, the algorithms learn to recognize dogs from cats, for example, and roses from daisies,” explained Microsoft AI writer John Roach in a June 21 blog post.
“What’s more, the learning process is never done; the performance of the algorithms improves as they get more data,” he added.
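Roach's description—classifiers learn to separate categories from labeled examples, and their performance improves with more data—can be illustrated with a toy model. The sketch below is not Microsoft's algorithm; it is a minimal nearest-centroid classifier on synthetic two-dimensional "image features," with made-up class centers, showing test accuracy rising as the number of labeled training samples grows:

```python
import random

random.seed(0)

def make_samples(n, center, label):
    # Synthetic 2-D "image features" clustered around a per-class center.
    return [((random.gauss(center[0], 1.0), random.gauss(center[1], 1.0)), label)
            for _ in range(n)]

def train_centroids(samples):
    # "Training" here is just computing the mean feature vector per label.
    sums, counts = {}, {}
    for (x, y), label in samples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {label: (sx / counts[label], sy / counts[label])
            for label, (sx, sy) in sums.items()}

def classify(centroids, point):
    # Predict the label of the nearest class centroid.
    return min(centroids,
               key=lambda label: (point[0] - centroids[label][0]) ** 2 +
                                 (point[1] - centroids[label][1]) ** 2)

def accuracy(centroids, samples):
    return sum(classify(centroids, p) == label for p, label in samples) / len(samples)

# Held-out test set: "dog" features near (0, 0), "cat" features near (4, 4).
test = make_samples(200, (0, 0), "dog") + make_samples(200, (4, 4), "cat")

# More labeled training data generally yields better-placed centroids.
for n in (5, 50, 500):
    train = make_samples(n, (0, 0), "dog") + make_samples(n, (4, 4), "cat")
    print(n, round(accuracy(train_centroids(train), test), 3))
```

Real computer-vision models replace the hand-made 2-D features with representations learned from millions of pixels, but the principle is the same: the decision boundary is estimated from labeled data and sharpens as more data arrives.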
Microsoft envisions developers taking its computer vision technology and using it in their own apps, and the company has been steadily releasing tools that simplify embedding AI capabilities into third-party software. For example, mobile developers can use Custom Vision Service's code-free AI model export feature to add image recognition functionality to iOS and Android apps.
Custom Vision, part of the Azure Cognitive Services suite of cloud services, allows users to train, deploy and optimize image classifiers. In May, Microsoft revealed that the offering was the first Azure Cognitive Service to make the leap from the cloud to the edge, setting the stage for drones and internet of things (IoT) devices that can process visual information without connecting to the cloud service.
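Once a classifier has been trained and published in Custom Vision, an app can query its prediction endpoint over REST. The sketch below builds (but does not send) such a request; the v3.0 URL shape is an assumption to verify against the Prediction URL shown in your own Custom Vision portal, and the endpoint, project ID, iteration name, key, and image bytes are all hypothetical placeholders:

```python
import urllib.request

def build_prediction_request(endpoint, project_id, iteration, prediction_key, image_bytes):
    """Build (without sending) a classification request against a published
    Custom Vision iteration. The URL shape below is an assumption; confirm
    it against the Prediction URL for your own project."""
    url = (f"{endpoint}/customvision/v3.0/Prediction/"
           f"{project_id}/classify/iterations/{iteration}/image")
    return urllib.request.Request(
        url,
        data=image_bytes,                       # raw image bytes in the body
        headers={"Prediction-Key": prediction_key,
                 "Content-Type": "application/octet-stream"},
        method="POST")

# Hypothetical values -- substitute your own resource details.
req = build_prediction_request(
    "https://westus2.api.cognitive.microsoft.com",
    "00000000-0000-0000-0000-000000000000",
    "Iteration1",
    "<your-prediction-key>",
    b"\x89PNG...")                              # placeholder image bytes
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` would return a JSON body listing predicted tags and probabilities; the edge-export path mentioned above skips this round trip entirely by running the exported model on the device.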
Google and IBM Advance Computer Vision
Of course, Microsoft’s rivals aren’t sitting still on the computer vision front. DeepMind, a subsidiary of Google’s parent company Alphabet, recently unveiled a system that can re-create 3D scenes from 2D images.
Using a new framework called the Generative Query Network (GQN), machines are trained only on data they gather themselves from their surroundings, explained DeepMind research scientists S. M. Ali Eslami and Danilo Jimenez Rezende in a June 14 announcement. Using the technology, DeepMind's researchers were able to devise a renderer that can re-create a 3D scene from information the system observed in a flat 2D image.
Meanwhile, IBM is working on a stereo vision system that can capture scenes in 3D, similar to the way the human brain can perceive depth from the information produced by a pair of eyes. Powered by the company’s TrueNorth chips for AI neural networks, the system can detect an object’s location and distance using less power than conventional systems, paving the way for high-performance vision systems for autonomous vehicles.