Tesla drops motion capture for camera-only data collection: Optimus's latest training progress revealed
```
Tesla is now training Optimus on camera footage alone, aiming for the robot to understand the world through its "eyes."
According to the latest report from Business Insider, Tesla has shifted its training method for the humanoid robot Optimus from motion capture to pure camera data collection. Dozens of data collection employees repeatedly perform daily actions in the lab to provide video training material for the robot to learn human behavior.

The report states that in June of this year, Tesla abandoned the motion capture suits and teleoperation it had previously used in favor of a camera-only data collection method. Workers wear helmets fitted with five cameras and carry equipment backpacks weighing 30-40 pounds while repeatedly performing basic movements such as wiping tables, lifting cups, and pulling curtains.
During the third-quarter earnings call, Elon Musk described Optimus as "potentially the biggest product of all time," and predicted that the company will eventually produce one million robots annually. He also noted that Optimus could one day account for about 80% of the automaker's value.
Training methodology shifts entirely to camera data collection
In a glass laboratory at Tesla's engineering headquarters, data collection workers perform seemingly simple but extremely precise repetitive actions. Each movement must be repeated hundreds of times during an 8-hour shift, with all behavior fully recorded by the five helmet cameras and the backpack equipment.
In June of this year, after project director Milan Kovac left, the company informed employees that it would shift from motion capture suits and teleoperation to exclusively camera-based data collection. Workers said the team was told this method could scale up data collection more quickly.
In addition to the cameras worn by workers, Tesla has also installed fixed cameras around the work area. Jonathan Aitken, a robotics expert from the University of Sheffield, said these fixed camera towers can provide a broader environmental perspective to supplement the body-worn camera data.
Sometimes, workers are equipped with haptic gloves to track subtle hand movements. Musk once said that Tesla has devoted significant effort to developing human-like hands for Optimus, calling it an "extremely difficult engineering challenge."
AI-generated task instructions cover complex movement scenarios
Tesla has begun using AI-generated prompts to assist in training the robot. In some training exercises, workers receive a series of AI-generated instructions through the helmet device connected to the backpack and must complete each task within 3-5 seconds.
According to workers, these exercises include squatting, doing the "chicken dance," imitating a gorilla, pretending to vacuum, sprinting short distances, and pretending to play golf. Some tasks even involve toys designed for infants, such as stacking rings by size and color or placing shapes into matching slots.
Two data collectors mentioned that some AI-generated tasks made them uncomfortable, including crawling on all fours or being asked to remove clothing. However, experts believe that these seemingly random tasks may help Tesla identify areas for improvement.
At the Fremont factory, data collectors also sort car parts and work on conveyor belts while wearing the helmet and backpack. Experts say collecting different data points for the same task is helpful for training.
Robot performance still faces technical challenges
Although Optimus can walk, fold clothes, perform kung fu moves, and hand out candy in Times Square in company videos, its actual performance in training lags noticeably behind.

According to the report, two workers said the robot falls about half the time when performing tasks that require bending or leaning, sometimes damaging expensive equipment. Unless a task requires moving more than a few feet, the robot is usually tethered to a support frame to keep it upright.
Aitken said that in a controlled environment like Tesla's office, the robot should be able to stand upright easily. "Getting it to stand up and maintain balance should be one of the first problems you solve."
Alan Fern, an AI and robotics expert at Oregon State University, pointed out that robot demonstrations "are always the best demonstrations they can show you." When the robot performs kung fu, it may appear to be doing something intelligent, he said, but "it's just reacting to the environment; there's no cognitive thinking behind it."

To date, more than 100 people have taken part in data collection, but after the biannual performance review in September, the company laid off dozens of data collectors. Workers are rated on task performance and are required to produce at least 4 hours of usable video footage per shift.
```