Lei Jun officially confirms the "mysterious model" that sparked heated discussion among developers worldwide; Xiaomi's "AI roadmap" is becoming clearer.

The “mysterious model” that much of the internet had speculated was DeepSeek V4 has turned out to be Xiaomi’s. The confirmation not only settles the developer community’s questions about the model’s origins, but also gives the capital market a clearer reference point for how Xiaomi’s AI investments are translating into products.

An AI model called Hunter Alpha recently appeared anonymously on the developer platform OpenRouter and drew attention from developers worldwide. It went live on March 11 as an “invisible model,” with over a trillion parameters and a one-million-token context window. Because these specifications closely matched those rumored for DeepSeek V4, the model sparked widespread speculation.

Xiaomi ended the speculation with an official announcement. In the early morning of March 19, it released three updates to its MiMo large model series: the flagship base model MiMo-V2-Pro, the omni-modal agent model MiMo-V2-Omni, and the speech synthesis model MiMo-V2-TTS. Xiaomi founder Lei Jun then stated on Weibo that the company had just released its trillion-parameter large model MiMo-V2-Pro, adding that Xiaomi’s actual progress in AI “might be much faster than what everyone sees” and that this year’s R&D and capital investment in AI is expected to exceed RMB 16 billion.

In a research report released on March 19, Goldman Sachs said that the concentrated release of these three flagship models marks Xiaomi’s transition from investing in AI R&D to delivering concrete results, and that its market positioning as the “physical AI leader” is gaining substantive support. The report maintains a “Buy” rating on Xiaomi with a 12-month target price of HK$41, implying about 14% upside from the current share price.
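As a rough check on the figures above, the share price implied by a HK$41 target and roughly 14% upside can be backed out with simple arithmetic. The sketch below is only illustrative: the function name is hypothetical, and because the report's upside figure is rounded ("about 14%"), the result is an approximation rather than a quoted market price.

```python
def implied_current_price(target_price: float, upside: float) -> float:
    """Back out the current price from a target price and a fractional upside.

    upside is expressed as a fraction, e.g. 0.14 for 14%.
    """
    return target_price / (1.0 + upside)


# Figures from the article: HK$41 target, about 14% upside (rounded).
price = implied_current_price(41.0, 0.14)
print(f"Implied current price: about HK${price:.2f}")
```

On these inputs the implied price is just under HK$36 per share.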

Xiaomi’s Three Models Make a Grand Debut

Xiaomi MiMo-V2-Pro is designed for high-intensity, real-world agent workloads. It has over 1T total parameters (42B active) and supports a 1M-token ultra-long context window. Its underlying architecture retains a hybrid attention mechanism and raises the hybrid ratio from 5:1 to 7:1, balancing ultra-large scale with high inference efficiency.
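To put these figures in proportion, the short sketch below computes the share of parameters that is active per token and what a 5:1 versus 7:1 hybrid ratio means as a share of layers. Two assumptions are made here that the article does not state: total parameters are taken as exactly 1T (the article says "over 1T", so the active share is an upper bound), and the hybrid ratio is read as "N layers of one attention type for every 1 layer of the other". The function names are hypothetical.

```python
from fractions import Fraction

TOTAL_PARAMS = 1.0e12   # assumption: "over 1T" treated as exactly 1T (lower bound)
ACTIVE_PARAMS = 42e9    # 42B active parameters, as reported


def active_fraction(active: float, total: float) -> float:
    """Fraction of all parameters that is active for a given token."""
    return active / total


def minority_layer_share(major: int, minor: int = 1) -> Fraction:
    """Share of layers using the less frequent attention type in a major:minor mix."""
    return Fraction(minor, major + minor)


print(f"Active share: at most {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")
print(f"Minority-type layers at 5:1: {minority_layer_share(5)}")
print(f"Minority-type layers at 7:1: {minority_layer_share(7)}")
```

Under this reading, moving from 5:1 to 7:1 reduces the less frequent layer type from one layer in six to one layer in eight.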

MiMo-V2-Omni is positioned as Xiaomi’s omni-modal foundation model, combining multimodal understanding of images, video, and audio with strong agent abilities. According to the Goldman Sachs report, the model matches or surpasses Gemini 3 Pro, Claude Opus 4.6, Gemini 3, and GPT-5.2 on key metrics for audio, image, and video understanding and for agent capabilities.

MiMo-V2-TTS targets the era of voice agents, offering fine-grained, multi-level style control, natural prosody, and singing capabilities. Goldman Sachs notes that its next goals are to extend language coverage beyond Chinese and English and to integrate deeply with MiMo-V2-Omni’s multimodal understanding, so that agents can describe the real world in speech with near-human expressiveness.

All three models have been integrated into WPS Office, into Xiaomi phones and computers via the miclaw agent system, and into the Xiaomi browser.

Flagship Model Performance: Global No. 8, Significant Cost Advantage, Deep Optimization for Agent Scenarios

MiMo-V2-Pro is the core of this release. The model has over one trillion total parameters, 42 billion active parameters, and a 1-million-token context window. On the Artificial Analysis Intelligence Index, a global composite model ranking, it places eighth worldwide and second among Chinese models, ahead of xAI’s Grok and behind Gemini 3.1 Pro Preview, GPT-5.4, GPT-5.3 Codex, Claude Opus 4.6, Claude Sonnet 4.6, and GLM-5.

Goldman Sachs points out that cost efficiency is another core competitive advantage of MiMo-V2-Pro. According to Artificial Analysis data, running the Intelligence Index test on the model costs $348, which is 36% lower than for GLM-5 (also highly ranked) and 90% lower than for Claude Sonnet 4.6. Compared with Claude Opus 4.6 and Claude Sonnet 4.6, MiMo-V2-Pro’s per-token usage cost is as much as 80% lower.
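The percentages above can be turned back into approximate dollar figures for the comparison models. The sketch below is a back-of-the-envelope calculation, not data from Artificial Analysis: the function name is hypothetical, and because the quoted percentages are rounded, the implied costs are only approximate.

```python
def implied_baseline_cost(cost: float, pct_lower: float) -> float:
    """Given a cost that is pct_lower (as a fraction) below a baseline,
    return the implied baseline cost."""
    return cost / (1.0 - pct_lower)


MIMO_COST = 348.0  # USD to run the Intelligence Index test, as reported

# Implied (not reported) costs for the comparison models.
glm5_cost = implied_baseline_cost(MIMO_COST, 0.36)
sonnet_cost = implied_baseline_cost(MIMO_COST, 0.90)

print(f"Implied GLM-5 cost: about ${glm5_cost:,.0f}")
print(f"Implied Claude Sonnet 4.6 cost: about ${sonnet_cost:,.0f}")
```

On these rounded inputs, the implied test cost is roughly $544 for GLM-5 and roughly $3,480 for Claude Sonnet 4.6.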

In terms of training efficiency, Xiaomi’s recently launched ARL-Tangram system is now deployed to support MiMo model training, achieving a 4.3x improvement in average action completion time, up to 1.5x acceleration for reinforcement learning training, and external resource savings of up to 71%.

Notably, MiMo-V2-Pro is deeply optimized for agent scenarios. It underwent supervised fine-tuning (SFT) and reinforcement learning (RL) across complex and diverse agent scaffolds, giving it stronger tool-calling and multi-step reasoning abilities. On PinchBench and ClawEval, the standard OpenClaw benchmarks, MiMo-V2-Pro ranks among the world’s best. With its 1M-token context window, it can also support demanding, complex real-world Claw application workflows.

Clear AI Roadmap: Systematic Layout from Models to Ecosystem

The Goldman Sachs report emphasizes that this release is not an isolated event, but part of Xiaomi’s systematic push to turn AI R&D investment into concrete results more quickly.

In February this year, Xiaomi released Xiaomi-Robotics-0, a vision-language-action model for robotic reasoning and real-time execution. In March, it released the AI agent system miclaw, and on March 19 it also launched HAD, an upgraded assisted driving system powered by the XLA cognitive model, which integrates Xiaomi’s self-developed cross-embodiment foundation model MiMo-Embodied.

Goldman Sachs has mapped out clear next steps for the three models: MiMo-V2-Pro will target high-complexity reasoning and long-horizon task planning; MiMo-V2-Omni aims for continuous intent planning across hours or even days, real-time stream perception, and the execution of robot and hand actions; MiMo-V2-TTS will expand to more languages and deepen its integration with Omni.

Goldman Sachs believes that, with leading multimodal AI capabilities and rich agent application scenarios across its “people-car-home” ecosystem, Xiaomi has the potential to tap the vast global market for AI models and is well placed to create differentiated, premium consumer AI hardware.

Increased Investment and Valuation Rationale: Short-Term Profit Pressure, Long-Term Value Reassessment

Goldman Sachs expects Xiaomi’s R&D expenditure to reach RMB 40 billion in 2026, up from an estimated RMB 32.2 billion in 2025. The continued increase in investment will weigh on near-term profits: Goldman Sachs forecasts Xiaomi’s net profit (excluding special items) at RMB 27.9 billion in 2026, below the estimated RMB 39.5 billion in 2025, which corresponds to a 2026 P/E ratio of about 27.4x.
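The year-over-year changes implied by these forecasts follow directly from the quoted figures. The sketch below works them out, along with the market value implied by the P/E ratio. It assumes, beyond what the article states, that the 27.4x multiple is simply market capitalization divided by the 2026 net profit forecast on the same basis; the helper name is hypothetical.

```python
def pct_change(new: float, old: float) -> float:
    """Fractional change from old to new."""
    return (new - old) / old


# All figures in RMB billions, as quoted in the article.
rd_2025, rd_2026 = 32.2, 40.0
profit_2025, profit_2026 = 39.5, 27.9
pe_2026 = 27.4

rd_growth = pct_change(rd_2026, rd_2025)
profit_change = pct_change(profit_2026, profit_2025)

# Assumption: P/E = market capitalization / forecast net profit.
implied_market_cap = pe_2026 * profit_2026

print(f"R&D growth 2025 to 2026: {rd_growth:+.1%}")
print(f"Net profit change 2025 to 2026: {profit_change:+.1%}")
print(f"Implied market cap: about RMB {implied_market_cap:,.0f} billion")
```

On these inputs, R&D spending rises by about 24% while forecast net profit falls by about 29%, and the implied market value is roughly RMB 764 billion.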

Nevertheless, Goldman Sachs believes that a steady stream of delivered results should lead the market to revalue Xiaomi as a physical AI leader with in-house AI, operating system, and chip capabilities, rather than judging it only by its near-term P/E ratio. Goldman Sachs maintains its Buy rating and HK$41 target price, based on a sum-of-the-parts valuation that includes a 16x 12-month forward EV/NOPAT multiple for Xiaomi’s core business, a DCF valuation (US$45 billion) for its electric vehicle business, and a 10% holding discount.

Risk Warning and Disclaimer

The market involves risk, and investors should be cautious. This article does not constitute personal investment advice, nor does it take into account individual users’ specific investment objectives, financial situations, or needs. Users should consider whether any opinions, viewpoints, or conclusions in this article are suitable for their particular situation. Any investment made on this basis is at your own risk.