Facing TPU and Trainium? NVIDIA publishes another article to "prove itself": GB200 NVL72 can boost open-source AI model performance by up to 10 times.

```

NVIDIA faces challenges from competitors such as Google’s TPU and Amazon’s Trainium. To consolidate its dominance in the AI chip market, the company has recently mounted an intensive series of technical “self-endorsements” and public responses. After rebutting bearish views in private letters and publicly claiming that its GPU technology is “a generation ahead of the industry,” NVIDIA has now released another technical blog post, emphasizing that its GB200 NVL72 system can boost the performance of top open-source AI models by up to 10 times.

According to media reports on December 4, NVIDIA said in a blog post published on Wednesday that the GB200 NVL72 system can improve the performance of top open-source AI models by up to 10 times. The post highlighted the server system’s optimizations for Mixture of Experts (MoE) models, including Kimi K2 Thinking from Chinese startup Moonshot and DeepSeek’s R1 model.

NVIDIA’s series of technical “self-endorsements” is seen as a direct response to market concerns. Media previously reported that Meta, a key NVIDIA client, is considering sourcing Google’s self-developed AI chip, the Tensor Processing Unit (TPU), on a large scale for its data centers. According to Wallstreetcn, Google’s TPU directly challenges NVIDIA’s share of more than 90% of the AI chip market. The market is concerned that if hyperscale clients like Meta turn to Google, NVIDIA’s once-solid moat may be breached.

NVIDIA’s intensive public statements have not eased market worries; the company’s stock price has fallen nearly 10% over the past month.

GB200 NVL72 Technology Advantages Stand Out

On its official blog, NVIDIA detailed the technical advantages of the GB200 NVL72 system, which it says can significantly increase the performance of leading open-source AI models. The system integrates 72 NVIDIA Blackwell GPUs into a single unit, providing 1.4 exaflops of AI performance and 30 TB of fast shared memory. Through NVLink Switch connections, GPU-to-GPU communication bandwidth within the system reaches 130 TB/s.

In performance tests, Kimi K2 Thinking, rated as the most intelligent open-source model by the Artificial Analysis leaderboard, achieved a 10-fold performance boost on the GB200 NVL72 system. Other top MoE models such as DeepSeek-R1 and Mistral Large 3 also demonstrated significant performance improvements.

Mixture of Experts (MoE) has become the mainstream architecture for cutting-edge AI models. NVIDIA pointed out that all top 10 models on the Artificial Analysis leaderboard use an MoE architecture, including DeepSeek-R1, Kimi K2 Thinking, and Mistral Large 3. This architecture mimics the way the human brain works, activating only the specific “expert” modules needed for a particular task rather than all model parameters. As a result, MoE models can generate tokens faster and more efficiently without a disproportionate increase in computing costs.

NVIDIA emphasized that its system, through co-design of hardware and software, addresses MoE models’ scaling challenges in production environments and effectively removes the performance bottlenecks found in traditional deployments.

Cloud Service Provider Deployments Accelerate

NVIDIA revealed that the GB200 NVL72 system is being deployed by major cloud service providers and NVIDIA cloud partners, including Amazon Web Services, Core42, CoreWeave, Crusoe, Google Cloud, Lambda, Microsoft Azure, Oracle Cloud Infrastructure, and Together AI.

Peter Salanki, co-founder and CTO of CoreWeave, said: "At CoreWeave, our customers are leveraging our platform to put Mixture of Experts models into production. Through close collaboration with NVIDIA, we are able to offer a tightly integrated platform."

Lin Qiao, co-founder and CEO of Fireworks AI, said: "NVIDIA GB200 NVL72’s rack-scale design significantly boosts MoE model service efficiency, setting new benchmarks for performance and efficiency in large-scale MoE model serving." Reportedly, the company is currently deploying the Kimi K2 model on the NVIDIA B200 platform, achieving top performance rankings on the Artificial Analysis leaderboard.

Risk Warnings and Disclaimer

The market involves risks, and investment decisions should be made with caution. This article does not constitute individual investment advice, nor does it take into account specific users’ investment objectives, financial situations, or needs. Users should consider whether any opinions, views, or conclusions in this article are appropriate to their own circumstances. Investment decisions made based on this article are at your own risk.
```
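
To make the expert-selection idea from the MoE discussion above more concrete, here is a minimal sketch of top-k routing in plain Python. It is an illustration only, not NVIDIA’s software or the router of any model named in this article: the function names (`route_top_k`, `moe_forward`), the toy scalar “experts,” and the router scores are all hypothetical. The point it shows is that only k of the available experts run for a given input, so compute per token stays roughly constant even as the total number of experts grows.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    top = ranked[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and mix their outputs by weight."""
    selected = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in selected)
    used = [i for i, _ in selected]
    return output, used


# Hypothetical toy setup: 8 "experts", each a simple scalar function.
experts = [lambda x, scale=scale: scale * x for scale in range(1, 9)]
# Hypothetical router scores for one input token.
router_scores = [0.1, 2.0, -1.0, 0.3, 1.5, -0.5, 0.0, 0.2]

output, used = moe_forward(3.0, experts, router_scores, k=2)
print("experts used:", used)
print("experts skipped:", len(experts) - len(used))
```

In this toy run only two of the eight experts are evaluated; the other six are never called. In a real deployment the experts are large neural network layers spread across many GPUs, which is why the article’s emphasis on fast GPU-to-GPU communication matters for MoE serving.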