Goldman Sachs: The China-US AI gap has narrowed to "3-6 months," and the Doubao phone signals a "change in the landscape of application traffic."

```

Author: Bao Yilong

Source: Hard AI

As 2025 draws to a close, on December 19, the Goldman Sachs research team published a report summarizing China's key advances and achievements in the field of artificial intelligence, and looking ahead at possible development trends in the coming year.

Goldman Sachs believes the gap between Chinese AI models and top US models has narrowed to 3-6 months. The report also suggests that the launch of AI assistants such as ByteDance’s Doubao mobile assistant may signal a fundamental change in the mobile app traffic landscape.

Goldman Sachs is optimistic about the cloud computing and data center sectors as well as beneficiaries across the AI stack. It forecasts that capital expenditures by China’s top cloud service providers will rise about 20% to roughly 500 billion yuan in 2026, from an estimated 400 billion-plus yuan in 2025.

Moreover, the share of that spending going to domestic chips and computing power is expected to rise sharply, from an estimated 20-30% in 2025 to 40% in 2026.

At the same time, the global market for AI video generation models is expected to grow from about $1 billion in 2025 to $39 billion in 2033, a compound annual growth rate of 56% over eight years.

Cutting-edge AI Models and Agent Capabilities Continue to Break Through

Goldman Sachs believes the technical gap between China and the US in AI is closing rapidly.

Although US models such as Google’s Gemini 3 and OpenAI’s GPT-5.2 continue to take the lead with each update, Chinese AI models typically narrow the gap within the following 3-6 months.

After Google released Gemini 3, OpenAI launched GPT-5.2 on December 11, further improving on GPT-5.1 in general intelligence, coding, and long-context understanding. This model performs better in creating spreadsheets, building presentations, image perception, tool usage, and handling complex multi-step projects.

Chinese companies are close behind. On December 16, Xiaomi released the open-source MiMo-V2-Flash model, which ranked among the top two open-source models globally in several agent benchmark tests. On December 18, ByteDance released the latest Doubao-Seed-1.8 large language model, further enhancing agent and multimodal capabilities.

Goldman Sachs observes that US foundational text and multimodal models (such as GPT-5, Sora 2, and Gemini 3 Pro) continue to lead with each update, while Chinese AI models generally catch up and narrow the gap within 3-6 months, after which another step-change leap follows.

The New Era of AI Assistants—Doubao Mobile Assistant

On December 1, the ByteDance Doubao team released a technical preview of the Doubao mobile assistant, an AI assistant integrated at the OS level through cooperation with smartphone manufacturers. The first device to carry it is a ZTE Nubia phone.

Zhipu AI also open-sourced AutoGLM, an AI agent model capable of interpreting screen content and simulating input to perform multi-step tasks such as food delivery orders and flight bookings in more than 50 high-frequency Chinese apps.

Xiaomi revealed that its smartphone AI agent, Super Xiao Ai, has 120 million monthly active users and handles 65 million sessions a day, and that it supports more than 3,000 agent skills across 100 commonly used internet apps via various protocols.

The report notes that as OS-level AI assistants become smoother to use, users may move closer to an era of voice-commanded, all-in-one assistants, in which they stay focused on entertainment content on their phones while a voice-activated assistant replies to messages in the background.

Goldman Sachs believes that OS-level intelligent assistants do have the potential to disrupt the existing application market landscape, and they might pose a threat to existing app traffic and advertising business. However, at the same time, these new technologies may also bring new business opportunities.

Nevertheless, the report stresses that “walled ecosystems” and security issues remain challenges.

Goldman Sachs continues to monitor the challenges ByteDance faces in applying AI technology to fiercely competitive areas (such as social, music, transactions, and instant messaging), and the potential impact of these challenges on its business development.

In fact, ByteDance’s apps in these areas have recently been topping the chart of free app downloads on iOS in China.

AI Inference Demand Surges: Daily Token Processing Exceeds 50 Trillion

On December 18, ByteDance announced that Doubao large model’s daily token usage had exceeded 50 trillion (up from 30 trillion in October), ranking first in China and third globally, highlighting the sustained strong momentum in AI adoption and token usage.

ByteDance’s Volcano Engine previously announced that its MaaS now serves 80% of leading FMCG brands, 90% of major auto OEMs, 80% of top brokerage firms, 70% of China’s leading 985 universities, and 9 of the world’s top 10 smartphone manufacturers by shipments.

The company expects revenue to exceed 20 billion yuan in FY2025, doubling from last year.

Goldman Sachs expects Chinese cloud service providers’ capital expenditure to further grow 20% to about 500 billion yuan in 2026, with the proportion spent on domestic chips/computing power to significantly increase (from an estimated 20-30% in FY2025 to 40% in FY2026).

Chinese Multimodal Models Accelerate Global Penetration

Chinese AI models are accelerating global market penetration through cost, open source, and speed advantages.

In mid-December, BBAT (Baidu, ByteDance, Alibaba, Tencent) released new multimodal models:

Alibaba (December 16): Released a new version of the Wan2.6 video generation model, supporting multi-shot narrative and stable multi-character dialogue, capable of generating 15-second 1080p HD video. The Tongyi app officially integrated with Gaode Maps, marking another milestone in its evolution into a life/work partner.

Tencent (December 17): Released Hunyuan WorldPlay 1.5, a streaming video diffusion model capable of interactive world modeling with long-term geometric consistency in real time.

ByteDance (December 18): Released Doubao-Seed-1.8 and the video generation model Seedance 1.5 Pro, a multimodal model for jointly generating audio-video from text or images.

As the gap in multimodal capabilities with global companies narrows, Goldman Sachs believes Chinese AI models differentiate themselves via open source and competitive pricing and speed.

The report cites comparison data showing that Kuaishou’s Kling 2.5 Turbo is far cheaper than Google’s Veo 3 and OpenAI’s Sora 2 while delivering competitive performance.

Goldman Sachs forecasts the total addressable market (TAM) for global AI video generation models to grow from $1 billion in 2025 to $39 billion in 2033, an 8-year CAGR of 56%. Specifically:

Professional user market: Expected to reach about $700 million in 2025 (64% of TAM), expanding to $17 billion by 2033 with an 8-year CAGR of 49%, mainly driven by a rising share of paying users and ARPU growth.

Enterprise user market: Expected to expand from about $400 million in 2025 to $22 billion in 2033, an 8-year CAGR of 66%, by which point it would represent 57% of total TAM, primarily driven by greater AI penetration in digital video advertising production and film/entertainment video creation.

In the global foundational model market, Goldman Sachs believes Chinese vendors’ revenue share will be about 4% in 2025. As model capabilities and pricing improve (with token pricing discounts narrowing), that share is projected to rise steadily to 7% by 2029.

Evolution of the Chip Supply Landscape and Capex Outlook

Goldman Sachs believes that China’s large cloud computing companies’ multi-chip combination strategy undoubtedly brings new opportunities for the development of China’s AI cloud industry—these companies no longer fully rely on overseas chip supplies.

Goldman Sachs estimates that BBAT (Baidu, ByteDance, Alibaba, Tencent) total capital expenditures in 2025 will exceed 400 billion yuan (up 62% year-over-year).

For 2026, the report forecasts that large Chinese internet companies’ capital expenditures will grow a further 20%, with a significant portion allocated to domestic chip procurement or the construction of relevant computing facilities.

Goldman Sachs believes Alibaba’s capex this year far exceeds Tencent’s, largely thanks to Alibaba’s strong capabilities in AI infrastructure and full-stack technology, akin to Google’s full-stack advantage from proprietary TPUs.

Analysts remain optimistic about the capex outlook for large Chinese internet companies. According to their forecasts, Alibaba’s capital expenditures over 2026-2028 are expected to reach 460 billion yuan, staying among the highest in the industry.

The report notes that higher computing efficiency may boost the conversion rate of AI capital investments into revenue, accelerating cloud revenue growth supported by strong training or inference demand.

Based on its assessment of full-stack AI capabilities, surging cloud computing demand, and data center computing power expansion, Goldman Sachs gives a clear sector preference ranking: cloud computing/data centers first, gaming second, mobility third, and e-commerce last.

This article is from the WeChat public account "Hard AI".

Risk Warning and Disclaimer

The market has risks; investment requires caution. This article does not constitute personal investment advice, nor does it take into account individual users’ unique investment objectives, financial situations, or needs. Users should consider whether any opinions, viewpoints, or conclusions in this article suit their specific circumstances. Invest accordingly and at your own risk.
```
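The growth rates quoted for the AI video generation market can be checked with a few lines of arithmetic. The sketch below is only an illustration and is not taken from the Goldman Sachs report: the `cagr` helper and the variable names are hypothetical, and the inputs are the rounded 2025 and 2033 figures quoted in the article, so small differences from the published percentages are to be expected.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.56 means 56%)."""
    return (end / start) ** (1 / years) - 1


# Rounded figures quoted in the article, in billions of US dollars.
# Each tuple is (2025 estimate, 2033 forecast).
YEARS = 8
segments = {
    "professional": (0.7, 17.0),  # article quotes a 49% CAGR
    "enterprise": (0.4, 22.0),    # article quotes a 66% CAGR
}

# Total market taken as the sum of the two segments (an assumption).
total_start = sum(start for start, _ in segments.values())
total_end = sum(end for _, end in segments.values())

rates = {name: cagr(s, e, YEARS) for name, (s, e) in segments.items()}
rates["total"] = cagr(total_start, total_end, YEARS)
# Same calculation using the headline "$1 billion" base instead.
rates["total_rounded_base"] = cagr(1.0, total_end, YEARS)

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

With these inputs the total comes out at roughly 56% when the 2025 base is the sum of the two segments (about $1.1 billion), and at roughly 58% when the headline "$1 billion" is used. That suggests the headline base is a rounded figure. The enterprise segment works out to about 65%, slightly under the quoted 66%, presumably because the inputs are rounded.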