Tesla, Meta, and Figure—a "photon war" is underway

The AI robotics field is in the middle of an unprecedented "photon scramble," with major tech companies racing to collect real-world visual data for training AI robots.

According to Hard AI, Morgan Stanley stated in its latest research report that, as AI robots and embodied artificial intelligence develop, companies such as Tesla, Meta, and Figure AI are collecting visual data at large scale to train vision-language-action (VLA) models.

Specifically, Tesla is transitioning to a "vision-only" training approach, Meta is collecting daily activity data through smart glasses, while Brookfield has partnered with Figure AI to deploy data collection across its massive real estate portfolio.

For investors, the trend means that visual data has become a new "goldmine" for AI training, and that companies able to collect it will hold a dominant position in the AI robotics race.

Morgan Stanley uses a "fat tuna" metaphor to explain the value of visual data. In 2019, a 612-pound bluefin tuna sold for $3.1 million at auction in Tokyo, but without the tools to catch it, the fish would be worth nothing. Likewise, the world's visual data is worthless without the means to collect it and the computing power to process it (yottaflop-scale compute, where 1 yottaflop = 1 trillion teraflops). Once collection and processing capabilities are in place, the data becomes extremely valuable.

Tesla’s Strategic Transformation: From Teleoperation to Pure Vision Training

Morgan Stanley states that Tesla is undergoing a major strategic shift in the training of its Optimus robot.

According to Business Insider, Tesla insiders say the company has shifted Optimus training to a "pure vision" approach, moving away from teleoperation, motion-capture suits, and VR technology, and instead recording videos of workers performing tasks to use as training data.

In May 2025, the former head of Tesla's Optimus program posted a series of video clips on X showing Optimus autonomously performing tasks it had reportedly learned from videos of humans. The training videos were initially shot from a first-person perspective (with cameras mounted on the human demonstrators), but the ultimate goal is to extend to third-person footage from “random cameras” and to content from the internet.

This strategic shift highlights the core value of visual data in AI robot training. As stated in the Morgan Stanley report: "When you drive a Tesla, you’re not just moving through physical space, you’re also playing a video game… feeding data into a simulated world to train Tesla’s latest FSD model."

Meta’s Smart Glasses: Turning Everyday Life into Training Data

Morgan Stanley’s internet team believes that although Meta's wearable devices are “long-term call options” unlikely to affect the company's financials in the coming years, their strategic significance should not be underestimated. Meta is advancing its long-term vision of integrating leading large models and agent capabilities into the next generation of wearables.

The Morgan Stanley report points out:

When you wear Meta glasses, you are teaching models how to play the piano, knit sweaters, pour coffee, or take out the trash.

Imagine 20 million such devices in operation within two years, almost twice the number of Teslas on the road. Every Meta glasses user could be training a humanoid avatar to iterate through billions of scenarios in the metaverse.

Brookfield and Figure AI: A Real Estate Empire’s Data Collection Network

Morgan Stanley’s alternative investment team sees Brookfield as a leader in executing large-scale AI infrastructure solutions. The collaboration between Brookfield and Figure AI is viewed as an important step in building expertise in the rapidly developing humanoid robot sector.

Brookfield's extensive global footprint makes it uniquely placed to help Figure AI build the largest pre-training dataset. Brookfield is one of the largest real estate owners, with over 100,000 residential units, more than 500 million square feet of commercial office space, and 160 million square feet of logistics space.

This collaboration will allow Figure AI to accumulate critical AI training data, teaching humanoid robots how to navigate, perceive, and act in various human-centered spaces. Data collection efforts have already begun in Brookfield environments, with the project expected to scale up in the coming months.

This article is from the WeChat official account “Hard AI”.

Risk Warning and Disclaimer

The market carries risks, and investment requires caution. This article does not constitute personal investment advice and has not taken into account the specific investment objectives, financial situation, or needs of any individual user. Users should consider whether any opinions, views, or conclusions in this article are appropriate to their particular circumstances. Investing accordingly is at your own risk.