A Spring Festival debut? DeepSeek's next-generation model: a "cost-efficient" innovative architecture that could help China ease its "compute chip and memory" bottleneck

```

Author: Bao Yilong

Source: Hard AI

Nomura Securities points out that DeepSeek's upcoming next-generation large model, V4, is not expected to trigger the kind of global panic over AI compute demand that V3 did last year. However, it may accelerate the commercialization of large language models and AI applications worldwide through two underlying architectural innovations.

WallstreetCN previously noted that, according to reports, DeepSeek's new flagship model V4 is expected to be released in mid-February 2026. Preliminary internal tests indicate that V4's programming ability surpasses that of other top models currently on the market, such as Anthropic's Claude and OpenAI's GPT series.

The core question arises again: will V4 once more disrupt the global AI value chain? Nomura Securities gives a clear answer in its "Global AI Trend Tracker" report released on February 10: No.

The report points out that the significance of this release lies in V4's potential to further reduce training and inference costs through innovative architectures (mHC and Engram technology), accelerating the innovation cycle of China's AI value chain.

At the same time, it is expected to help large language model and AI application companies worldwide commercialize faster, easing the mounting pressure of capital expenditures.

Innovative Technical Architectures Bring Performance and Cost Optimization

The report points out that compute chips and memory have always been the bottleneck for Chinese large models. The two key technologies that V4 is expected to introduce, mHC and Engram, ease these hard constraints from both the algorithmic and the engineering side.

mHC: Short for "Manifold-Constrained Hyper-Connection." It targets two problems that appear as Transformer models grow deeper: bottlenecks in information flow and unstable training. Put simply, it makes the "conversation" between neural network layers richer and more flexible, while strict mathematical "guardrails" keep information from being amplified or corrupted along the way. Experiments show that models using mHC perform better on mathematical reasoning tasks.

(Hyper-connection vs. Manifold-Constrained Hyper-Connection)

Engram: A "conditional memory" module designed to decouple "memory" from "computation." Static knowledge in the model (such as entities and fixed expressions) is stored in a dedicated sparse memory table that can sit in inexpensive DRAM and be looked up quickly when needed during inference. This frees up expensive GPU memory (HBM) to focus on dynamic computation.

(Engram architecture)

The report points out that combining the two technologies matters greatly for China's AI development: a more stable training process (mHC) compensates for potential shortcomings of domestic chips, while smarter memory scheduling (Engram) works around HBM capacity and bandwidth limits.

Nomura emphasizes that the most direct commercial impact of V4 is the further reduction of training and inference costs for large models. This enhancement in cost-efficiency will stimulate demand, and Chinese AI hardware companies will benefit from an accelerated investment cycle.

Hardware Benefits from "Acceleration Cycle"

Nomura believes that the world's major cloud service providers are all still pursuing artificial general intelligence, and that the capital expenditure race is far from over. V4 is therefore not expected to send the kind of shock wave through the global AI infrastructure market that was seen last year.

However, global large model and application developers are bearing increasingly heavy capital expenditure burdens. If V4 can, as expected, significantly reduce training and inference costs while maintaining high performance, it will serve as a strong boost.

It could help these players more quickly convert technology into revenue and relieve profit pressure.

The report reviews the market landscape one year after the release of DeepSeek-V3/R1.

Previously, DeepSeek's two models, V3 and R1, combined "compute management efficiency" with "performance enhancement." They accelerated the development of Chinese LLMs and applications, changed the competitive landscape between global and Chinese large language models, and drew more attention to open-source models.

(Weekly Token Consumption of Top 15 Open-Source Models on OpenRouter)

At the end of 2024, DeepSeek's two models accounted for more than half of open-source model token usage on OpenRouter. By the second half of 2025, as more players entered, their share had declined significantly.

The market shifted from "one dominant player" to "multipolar rivalry." This shows that a single model's efficiency alone is not enough to dominate a rapidly evolving open-source ecosystem; the competitive environment V4 faces is far more complex than the one a year ago.

Software May Welcome "Value-Added Rather Than Replacement"

On the application side, a stronger, more efficient V4 will give rise to more powerful AI agents.

The report observes that apps like Alibaba's Tongyi Qianwen are already able to execute multi-step tasks in a more automated way. This means AI agents are shifting from "dialogue tools" to "AI assistants" capable of handling complex tasks.

These multitasking agents require more frequent interaction with underlying large models, which will consume more tokens, driving up computing demand.

Therefore, improvement in model efficiency will not "kill software," but will actually create value for leading software companies.

Nomura emphasizes the need to watch software companies that can take the lead in leveraging new-generation large model capabilities to build disruptive AI-native applications or agents. Their growth ceiling may be raised once again due to leaps in model capabilities.

This article is from the WeChat public account "Hard AI".

Risk Warning and Disclaimer

The market has risks; investment needs caution. This article does not constitute personal investment advice, nor does it take into account the particular investment objectives, financial situations, or needs of individual users. Users should consider whether any opinions, viewpoints, or conclusions in this article suit their specific circumstances. Any investment made accordingly is at your own risk.
```
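Illustrative sketch: the Engram idea described above

The article describes Engram as a sparse memory table that holds static knowledge in ordinary DRAM and is looked up on demand, leaving GPU memory for dynamic computation. The toy Python sketch below shows only that lookup-and-add pattern. It is not DeepSeek's implementation: the names `HostMemoryTable` and `add_memory`, the n-gram keys, and the fixed `gate` value are all assumptions made for illustration, and a real system would presumably learn the table and the gating rather than hard-code them.

```python
from array import array
from typing import Dict, List, Optional, Sequence, Tuple


class HostMemoryTable:
    """Hypothetical sparse table of n-gram vectors kept in host memory (DRAM)."""

    def __init__(self, ngram_size: int, dim: int) -> None:
        self.ngram_size = ngram_size
        self.dim = dim
        # Only n-grams that were explicitly stored occupy memory (sparse).
        self._rows: Dict[Tuple[int, ...], array] = {}

    def store(self, ngram: Sequence[int], vector: Sequence[float]) -> None:
        if len(ngram) != self.ngram_size or len(vector) != self.dim:
            raise ValueError("n-gram or vector has the wrong length")
        self._rows[tuple(ngram)] = array("f", vector)

    def lookup(self, tokens: Sequence[int], position: int) -> Optional[array]:
        """Return the vector for the n-gram ending at `position`, or None."""
        start = position - self.ngram_size + 1
        if start < 0:
            return None
        return self._rows.get(tuple(tokens[start:position + 1]))


def add_memory(
    hidden: List[List[float]],
    tokens: Sequence[int],
    table: HostMemoryTable,
    gate: float = 0.5,  # assumption: a fixed gate; a real model would learn it
) -> List[List[float]]:
    """Add the gated memory vector to each hidden state whose n-gram hits."""
    mixed = []
    for pos, state in enumerate(hidden):
        hit = table.lookup(tokens, pos)
        if hit is None:
            mixed.append(list(state))  # miss: computation path unchanged
        else:
            mixed.append([h + gate * m for h, m in zip(state, hit)])
    return mixed


# Usage: token IDs 7, 8 stand in for a fixed expression such as an entity name.
table = HostMemoryTable(ngram_size=2, dim=3)
table.store([7, 8], [1.0, 2.0, 3.0])
tokens = [5, 7, 8, 9]
hidden = [[0.0, 0.0, 0.0] for _ in tokens]
mixed = add_memory(hidden, tokens, table)
print(mixed[2])  # [0.5, 1.0, 1.5]
```

Only the position where the stored bigram ends picks up a memory vector; every other position passes through untouched. That is the sense in which the lookup is "conditional," and in which the table can live outside accelerator memory.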