Tesla has shifted its training strategy for its humanoid robot Optimus. Instead of relying on motion capture suits and teleoperation, Tesla is moving toward a vision-only approach.
Workers now wear camera rigs made up of helmets and backpacks with five in-house cameras that record mundane tasks like folding a t-shirt or picking up an object. Those videos are then used to train Optimus to mimic the actions.
According to insiders, the change is meant to help Tesla scale data collection faster. This is a familiar blueprint for Elon Musk, as it’s the same camera-first strategy the company has used to train its self-driving software.
The shift comes after Milan Kovac stepped down as director of the Optimus program in June 2025. Since then, Tesla AI director Ashok Elluswamy has taken charge. Until Kovac's departure, the Optimus project had leaned heavily on motion capture suits and VR headsets, tools widely used across the robotics industry. In July, at a Tesla Owners event, Musk said Optimus could generate $30 trillion in revenue and called humanoids “probably the world’s biggest product.”
Benefits and risks from Tesla’s strategy change
Christian Hubicki, director of the robotics lab at FAMU-FSU, told Business Insider that the multi-angle camera setup likely captures “minute details, like the location of joints and fingers,” making the data more precise.
Others view the change with caution. Robert Griffin, a senior research scientist at the Institute for Human and Machine Cognition, told BI that teleoperation data gives robots something video alone can’t: the ability to learn by physically interacting with their environment. Without that hands-on element, he said, it’s much harder for the AI to take what it sees on video and reliably apply it to the real world.
The scale problem
One major challenge will be teaching Optimus skills that can be applied to many tasks, rather than memorizing each action. Business Insider quoted Jonathan Aitken, a robotics expert at the University of Sheffield, who put it plainly: “Working at this scale, they must have a generalized set of actions or else it would take forever to do all of them.”
Musk has acknowledged how much training this will require. During a January earnings call, he said that “the training needs for Optimus humanoid robot are probably at least ultimately 10x of what is needed for the car.”
A very Tesla approach
Even with these hurdles, the strategy is consistent with Tesla’s broader philosophy, which is to lean into vision data and scale it massively. While most autonomous vehicle developers use Light Detection and Ranging (LiDAR) or radar alongside cameras, Tesla relies almost solely on visual inputs, pulling in data from millions of cars on the road. Optimus appears to be following the same formula.
As Aitken summed up in the BI article: “It’s a very Tesla way of doing robotics. No one else is trying to do this at the same scale.”
Whether video-driven learning will be enough to get Optimus from folding t-shirts to real-world factory work remains an open question.


