The next gateway to superintelligence: Google, Meta, Nvidia... Tech giants are doubling down on "world models"

```

As progress in large language models slows, a new AI race centered on "world models" is quietly unfolding among tech giants. The trend suggests that the focus of AI competition may be shifting from language to understanding and simulating the physical world.

According to a September 29 report by the UK's Financial Times, companies such as Google DeepMind, Meta, and Nvidia are striving to gain an edge by developing a new type of system. These systems no longer rely solely on language and text but instead learn to understand and navigate the physical world through video and robotics data.

The potential market for "world models" is considered enormous. Rev Lebaredian, VP of Omniverse and Simulation Technology at Nvidia, said that "world models" bring the technology into physical sectors such as manufacturing and healthcare, and that the potential market may "reach as high as $100 trillion."

"World models" are seen as a key step toward advances in autonomous driving, robotics, and so-called "AI agents," but training them poses enormous challenges in both data and computing power.

Simulating the Physical World: Latest Technological Breakthroughs

In recent months, several AI companies have announced advances in "world models," underscoring how quickly the field is heating up.

Google DeepMind released Genie 3 last month, a model that generates video frame by frame and takes past interactions into account, a departure from earlier models that generate an entire video at once. Shlomi Fruchter, co-lead on the Genie 3 project, said that by building simulated environments of the real world, AI can be trained in a more scalable way and "without having to bear the consequences of making mistakes in the real world."

Meta is attempting to emulate the way children passively learn by observing the world, using raw video content to train its V-JEPA model. Facebook AI Research (FAIR), led by Meta’s Chief AI Scientist Yann LeCun, released the second version of this model in June and has begun testing it on robots.

Meanwhile, Nvidia CEO Jensen Huang has asserted that the company's next major growth phase will come from "physical AI," and that these new models will completely transform robotics. Nvidia is using its Omniverse platform to create and run such simulations to support its expansion into robotics.

One recent application of "world models" is in the entertainment sector. World Labs, a startup founded by AI pioneer Fei-Fei Li, is developing a model that can generate video-game-like 3D environments from a single image.

Video generation startup Runway also launched a product last month that uses "world models" to create game scenarios. CEO Cristóbal Valenzuela noted that compared to previous models, "world model" systems can better understand and reason about physical rules within scenarios.

Why Are the Giants Betting on This New Field?

A core reason tech giants are turning to "world models" is the widespread industry belief that large language models are approaching the limits of their capabilities.

Despite heavy investment, the performance gains from each new generation of LLMs released by OpenAI, Google, and Elon Musk's xAI have begun to slow.

Meta's Chief AI Scientist Yann LeCun, regarded as one of the "godfathers" of modern AI, has consistently warned that LLMs will never achieve human-like reasoning and planning abilities.

Building these new models, however, requires massive amounts of physical-world data and computing power, and obtaining both remains a major unsolved technical challenge. Companies such as Nvidia and Niantic are attempting to fill the data gap by using models to generate or predict environments.

Although the prospects are broad, the road to mature "world models" remains long. Meta's LeCun and others believe that machines with human-level intelligence, driven by the next generation of AI systems, may still be a decade away.

```