Robots have long followed instructions. Now Google wants them to understand what they’re actually doing.
The company has introduced Gemini Robotics-ER 1.6, a new AI model designed to help machines interpret real-world environments, plan tasks, and determine when a job is done.
The system focuses on “embodied reasoning,” which lets robots connect what they see with what they need to do. It improves how machines understand space, combining inputs from multiple cameras and real-world signals. Google has made the model available through its Gemini API and AI Studio for developers building robotics and automation systems.
A step toward real-world autonomy
Google said Gemini Robotics-ER 1.6 improves how robots process and understand their surroundings. The model strengthens spatial reasoning, enabling machines to identify objects, count them, and understand relationships between items in a scene.
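For developers, identifying and counting objects usually comes down to parsing structured output. The sketch below assumes the model returns a JSON list of labeled points, with each point given as [y, x] normalized to a 0-1000 range, which is the format the Gemini API documentation describes for Robotics-ER pointing. The sample response, labels, image size, and helper names are invented for illustration and are not taken from Google's materials.

```python
import json
from collections import Counter

# Hypothetical model output (assumption): each point is [y, x], normalized to 0-1000.
SAMPLE_RESPONSE = """
[
  {"point": [500, 250], "label": "wrench"},
  {"point": [100, 900], "label": "bolt"},
  {"point": [750, 600], "label": "bolt"}
]
"""


def to_pixels(point, width, height):
    """Convert a normalized [y, x] point (0-1000) to (x, y) pixel coordinates."""
    y_norm, x_norm = point
    return (round(x_norm / 1000 * width), round(y_norm / 1000 * height))


def summarize(response_text, width, height):
    """Return per-label counts and pixel locations from a JSON point list."""
    items = json.loads(response_text)
    counts = Counter(item["label"] for item in items)
    locations = [
        (item["label"], to_pixels(item["point"], width, height)) for item in items
    ]
    return counts, locations


# Illustrative 640x480 camera frame.
counts, locations = summarize(SAMPLE_RESPONSE, width=640, height=480)
print(dict(counts))
print(locations)
```

A real integration would send the camera image and a prompt to the model first; that call is left out here because the exact model identifier and request options should be taken from the current API documentation.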
Interesting Engineering noted that it also introduces multi-view reasoning, enabling robots to combine inputs from different cameras, such as overhead and wrist-mounted feeds. This capability is essential in dynamic real-world settings where visibility may be limited or constantly changing.
The model also supports natural language interaction: users can assign complex tasks in plain language, and the model breaks them into smaller steps, as stated in the Gemini API documentation.
According to Google DeepMind’s official blog, the model was designed as a “reasoning-first” system that helps robots bridge the gap between digital intelligence and physical action.
New capabilities expand industrial use cases
The update introduces several practical features aimed at enterprise environments. Interesting Engineering stated that one of the most notable capabilities is instrument reading, which allows robots to interpret gauges, sight glasses, and digital displays commonly used in industrial facilities.
Google DeepMind’s blog also cited Marco da Silva, vice president and general manager of Spot at Boston Dynamics. “Capabilities like instrument reading and more reliable task reasoning will enable Spot to see, understand, and react to real-world challenges completely autonomously,” he said.
Interesting Engineering explained that Google reported significant gains in this area, with instrument reading accuracy improving from 23% in earlier models to as high as 93% with advanced vision capabilities.
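To put those figures in perspective, the short calculation below compares the two reported accuracy numbers. It assumes both percentages were measured on the same benchmark, which the coverage does not spell out, and it treats 93% as the best reported case. The function name is illustrative.

```python
def accuracy_gain(before_pct, after_pct):
    """Compare two accuracy percentages (assumed to come from the same benchmark)."""
    error_before = 100 - before_pct
    error_after = 100 - after_pct
    return {
        "point_gain": after_pct - before_pct,             # percentage points gained
        "relative_accuracy": after_pct / before_pct,      # ratio of new to old accuracy
        "error_before": error_before,                     # misreads per 100 readings, before
        "error_after": error_after,                       # misreads per 100 readings, after
        "error_reduction": error_before / error_after,    # ratio of old to new error rate
    }


result = accuracy_gain(23, 93)
print(result)
```

Under that assumption, the change is a gain of 70 percentage points, roughly four times the earlier accuracy, and the share of misread instruments falls from 77 in 100 to 7 in 100.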
The model also improves task planning, safety awareness, and how robots interact with objects, helping them operate more reliably in real-world environments.
What it means for enterprise robotics
Gemini Robotics-ER 1.6 reflects a broader shift in robotics: systems are moving away from rigid, pre-programmed instructions and toward adapting to real-world conditions.
This shift could affect industries ranging from manufacturing and logistics to energy and facility management. Robots equipped with advanced reasoning could handle inspections, navigate complex environments, and adjust as conditions change.
Google has made the model available through its Gemini API and AI Studio so developers can test and build faster. The company is also encouraging collaboration with partners to refine the model for specialized use cases.


