Second Wave of DeepSeek Impact: V3.2 Rewrites China’s Cloud and Chip Ecosystems

The release of DeepSeek V3.2 marks the start of a “second wave of impact” on China’s AI market.

On December 6, according to Hard AI, JPMorgan said in a research report that the release is not just a model iteration but a structural shift aimed at inference costs and the hardware ecosystem. Through architectural innovation, DeepSeek has cut API prices by a further 30-70%, and long-context inference costs have fallen by a factor of 6 to 10.

The report stressed that, more importantly, V3.2-Exp had “Day-0” (release-day) support for non-CUDA ecosystems (Huawei Ascend, Cambricon, Hygon), breaking the dependence of frontier models on Nvidia hardware.

According to JPMorgan’s analysis, the beneficiaries include cloud operators Alibaba, Tencent, and Baidu, as well as chip and server supply-chain companies AMEC (Zhongwei Company), NAURA (North Huachuang), Huaqin Technology, and Inspur Information. The V3.2 model is expected to further boost adoption of generative AI in China over the next few quarters.

According to a WallstreetCN article, DeepSeek released and open-sourced two models in the V3.2 series on December 1. V3.2 is aimed at everyday applications, with reasoning ability on par with GPT-5, and for the first time integrates its thinking mode with tool calling. V3.2-Speciale focuses on maximum reasoning performance and achieved gold-medal results in four international competitions: IMO, CMO, ICPC, and IOI.

Performance and Architecture: Extreme Efficiency and “Agent” Evolution

DeepSeek V3.2 does not simply add parameters; it achieves a step change in efficiency through algorithmic innovation. The model keeps the mixture-of-experts (MoE) architecture of V3.1 and adds the DeepSeek Sparse Attention (DSA) mechanism.

JPMorgan noted that V3.2 follows the experimental V3.2-Exp model first released on September 29. DSA, introduced through continued training, is the only architectural change; it reduces long-context computation while maintaining performance on public benchmarks.

Specifically, the report highlights four aspects:

Architectural breakthrough: The DSA mechanism uses a lightning indexer to select key-value entries, reducing the computational complexity of long-context attention from quadratic, O(L^2), to roughly linear, O(L·k), where L is the context length and k is the number of key-value entries selected per query.

Performance data: In long-context settings of 128k tokens, V3.2 inference is 2-3 times faster than the previous generation and GPU memory usage is 30-40% lower, with no loss in model performance.

Agent positioning: V3.2 is explicitly positioned as a “reasoning-first model for agents.” It interleaves thinking and tool calling: the model can combine chain-of-thought with tool use (APIs, search, code execution) in a single trajectory.

Premium version: The Speciale version excels in olympiad-level math competitions and competitive programming, with reasoning benchmarks comparable to Gemini 3.0 Pro and GPT-5-class systems.
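
To make the O(L^2) versus O(L·k) point concrete, here is a minimal sketch of top-k sparse attention for a single query in plain Python. It is not DeepSeek’s implementation: the function names, the low-dimensional dot-product indexer, and the value k = 2048 used in the demo are illustrative assumptions, and causal masking is ignored for simplicity.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sparse_attention(query, keys, values, index_query, index_keys, k):
    """Attend one query to only the k positions ranked highest by a cheap indexer."""
    # Indexer stand-in (assumption): a low-dimensional dot product per position.
    index_scores = [dot(index_query, ik) for ik in index_keys]
    # Keep the k best-scoring positions (all of them if the context is shorter).
    top = sorted(range(len(keys)), key=lambda i: index_scores[i], reverse=True)[:k]
    # Full-width softmax attention runs only over the selected entries.
    scale = 1.0 / math.sqrt(len(query))
    logits = [dot(query, keys[i]) * scale for i in top]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    dim = len(values[0])
    out = [sum(w * values[i][d] for w, i in zip(weights, top)) / total
           for d in range(dim)]
    return out, top


def attention_score_count(L, k=None):
    """Query-key scores computed by the main attention for L queries (no masking)."""
    per_query = L if k is None else min(L, k)
    return L * per_query


if __name__ == "__main__":
    random.seed(0)
    L, d, d_idx, k = 32, 8, 2, 4
    keys = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
    values = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
    index_keys = [[random.gauss(0, 1) for _ in range(d_idx)] for _ in range(L)]
    query = [random.gauss(0, 1) for _ in range(d)]
    index_query = [random.gauss(0, 1) for _ in range(d_idx)]
    out, top = sparse_attention(query, keys, values, index_query, index_keys, k)
    print("attended positions:", sorted(top))
    dense = attention_score_count(131072)          # 128k-token context
    sparse = attention_score_count(131072, 2048)   # k = 2048 is illustrative
    print("score computations, dense / sparse:", dense // sparse)  # 64
```

Note that the indexer in this sketch still touches every position, so the saving comes from the indexer being much cheaper than full attention. The ratio printed at the end covers attention scores only, not end-to-end inference cost.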

Pricing Revolution: Deflationary Inference Economics

JPMorgan noted that the release of DeepSeek V3.2 has again cemented DeepSeek’s reputation as a “price butcher,” with a striking cost-performance advantage, especially against top US models.

The report argues that the efficiency gains from the DSA architecture have translated directly into structural API price cuts:

Specific pricing: For V3.2 Reasoning, the input price is cut to $0.28 per million tokens and the output price to $0.42 per million tokens.

Reduction comparison: Compared with V3.1 Reasoning, released in September 2025 (input $0.42, output $1.34), the output price is down 69% and the input price is down 33%. Compared with the R1 model from January 2025, the price gap is wider still.
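
The percentages above follow directly from the quoted prices. A short check, using only the per-million-token figures given in this article (the function and variable names are illustrative):

```python
def percent_reduction(old, new):
    """Percentage drop from old to new, e.g. 100 -> 25 is a 75% reduction."""
    return (old - new) / old * 100


# USD per million tokens, as quoted in the article.
v31_reasoning = {"input": 0.42, "output": 1.34}
v32_reasoning = {"input": 0.28, "output": 0.42}

for kind in ("input", "output"):
    drop = percent_reduction(v31_reasoning[kind], v32_reasoning[kind])
    print(f"{kind}: {drop:.0f}% cheaper")
# input: 33% cheaper
# output: 69% cheaper
```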

According to JPMorgan’s analysis and third-party benchmarks, the actual cost of some long-context inference workloads has fallen by a factor of 6 to 10. This pricing forces the market to reset the cost benchmark for “frontier-level” capability and puts heavy downward pricing pressure on all competitors.

In Artificial Analysis’s comparison of intelligence index against price, DeepSeek V3.2 sits in the “high intelligence, very low price” quadrant.

Ecosystem Restructuring: “Day-0” Moment for Domestic Chips

According to the report, DeepSeek V3.2 marks a shift for China’s AI models from relying solely on Nvidia’s CUDA ecosystem to actively adapting to domestic hardware.

JPMorgan said V3.2-Exp was among the first frontier models to be optimized for non-CUDA ecosystems on release day (“Day-0”), with support for Huawei’s CANN stack and Ascend hardware, Cambricon’s vLLM-MLU, and Hygon’s DTK.

This sends a strong signal to the market: GPT-5-class open-source models can run efficiently on domestic accelerators. That lowers execution risk for Chinese AI buyers and directly boosts incremental demand for domestic AI chips and servers.

This article is from the WeChat official account "Hard AI".
