The Second Wave of the “Compute Tax”: CPU Prices Are Rising

The global semiconductor market is experiencing structural changes; the CPU sector, traditionally regarded as a mature category, is becoming a focus of capital market attention.

As of January 21, Intel's stock price had hit a nearly four-year high, with an annual gain of over 44%, and AMD continued its upward momentum. In China’s A-share market, Loongson Technology hit its 20% daily limit and Hygon Information rose more than 13% in a single day. Behind this rally, the market is re-pricing the transmission logic of the “compute tax”: after AI training drove a surge in GPU demand, CPUs are becoming the second carrier of rising compute costs.

Industry consensus is forming rapidly. Both Guolian Minsheng Securities and Western Securities have pointed out in recent reports that the current change in CPU supply and demand is not a cyclical fluctuation but a structural shift driven by the large-scale deployment of AI agents.

Unlike AI training centered on GPUs, in agent workloads CPUs take on significant non-AI-native computing tasks including tool invocation, task scheduling, and real-time decision-making, with related processing time accounting for 80%-90% of total task latency. This means that at the system level, CPUs may become the performance bottleneck even earlier than GPUs.

Demand prospects are already being supported by data. According to IDC forecasts, the number of active global agents will leap from around 28.6 million in 2025 to 2.216 billion in 2030, with a compound annual growth rate of 139%. In a neutral scenario, long-term CPU demand may exceed 11.73 million units, creating a significant incremental market.
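The growth rate quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a five-year span (2025 to 2030) and uses only the two agent counts cited from IDC; the function name `cagr` is an illustrative helper, not part of any report.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate as a fraction (1.0 means 100% per year)."""
    return (end / start) ** (1 / years) - 1


# Agent counts quoted in the article; the 5-year span is an assumption.
agents_2025 = 28.6e6
agents_2030 = 2.216e9

rate = cagr(agents_2025, agents_2030, years=5)
print(f"Implied CAGR: {rate:.1%}")
```

Under these assumptions the implied rate comes out just under 139%, consistent with the quoted figure, and corresponds to the agent population growing roughly 77-fold over the period.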

The supply side of CPUs is also under extreme pressure. JPMorgan data shows Intel’s advanced process capacity utilization has reached an overloaded state of 120%-130%, while TSMC’s advanced packaging bottlenecks have lengthened CPU delivery cycles from the normal 8-10 weeks to over 24 weeks.

Against this backdrop, domestic CPU manufacturers stand to benefit from both industry and policy opportunities. CPUs, long regarded as “traditional” compute components, are regaining their system-level value amid the wave of AI agents.

AI Agents Catalyze Reshaping of External CPU Demand

Traditional AI computing concentrated its compute on GPUs, primarily for model training and inference acceleration. However, as AI evolves into agents with autonomous planning and execution abilities, the structure of the computing workload is fundamentally changing.

To complete a real-world task such as “analyze a batch of resume data,” an agent’s workflow is far more complex than a simple API call. It must autonomously create an independent sandbox environment, access the specified cloud drive to download files, extract archives, run data analysis scripts, generate visual reports, and finally clean up the environment. In this complete task chain, only task decomposition and result generation depend on GPU inference. The intermediate steps, including file operations, code execution, data processing, and system scheduling, account for 80%-90% of the task time and are handled entirely by CPUs.
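The CPU-side part of that task chain can be sketched in a few lines. This is an illustration only: the archive contents, file names, and the functions `make_fake_archive` and `run_agent_task` are hypothetical, the “download” is simulated with an in-memory zip, and a temporary directory stands in for the sandbox (it provides no real security isolation).

```python
import csv
import io
import shutil
import tempfile
import zipfile
from pathlib import Path


def make_fake_archive() -> bytes:
    """Stand-in for the cloud-drive download: a zip holding one CSV of resumes."""
    rows = [("name", "years_experience"), ("A", "3"), ("B", "7"), ("C", "5")]
    text = io.StringIO()
    csv.writer(text).writerows(rows)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("resumes.csv", text.getvalue())
    return buf.getvalue()


def run_agent_task(archive: bytes) -> dict:
    """Run the CPU-side steps of the task inside a throwaway directory."""
    sandbox = Path(tempfile.mkdtemp(prefix="agent_sandbox_"))  # 1. create sandbox
    try:
        zip_path = sandbox / "input.zip"
        zip_path.write_bytes(archive)                          # 2. "download" files
        with zipfile.ZipFile(zip_path) as zf:                  # 3. extract archive
            zf.extractall(sandbox)
        with open(sandbox / "resumes.csv", newline="") as f:   # 4. run the analysis
            years = [int(r["years_experience"]) for r in csv.DictReader(f)]
        mean_years = sum(years) / len(years)
        report_path = sandbox / "report.txt"                   # 5. generate a report
        report_path.write_text(f"mean years of experience: {mean_years:.1f}\n")
        report = report_path.read_text()
    finally:
        shutil.rmtree(sandbox, ignore_errors=True)             # 6. clean up
    return {"report": report, "sandbox": sandbox}


result = run_agent_task(make_fake_archive())
print(result["report"], end="")
```

None of the numbered steps touches a GPU. In a real agent, model inference would sit before step 1 (decomposing the task) and after step 5 (writing up the result).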

Intel’s white paper, “Agent AI from a CPU-Centric Perspective,” clearly states that agent workload latency mainly comes from CPU-side tool processing tasks.
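One way to see why a CPU-dominated latency profile matters is Amdahl's law, which bounds the overall speedup when only part of a task gets faster. The sketch below is not taken from the white paper; it applies the standard formula to the 80%-90% CPU share quoted in this article, and the 2x CPU speedup is a hypothetical comparison point.

```python
def max_speedup(accelerated_fraction: float, factor: float = float("inf")) -> float:
    """Amdahl's law: overall speedup when one fraction of latency gets `factor` times faster."""
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / factor)


for cpu_share in (0.80, 0.90):
    gpu_share = 1.0 - cpu_share
    gpu_bound = max_speedup(gpu_share)        # GPU part made infinitely fast
    cpu_gain = max_speedup(cpu_share, 2.0)    # CPU part made 2x faster (hypothetical)
    print(f"CPU share {cpu_share:.0%}: GPU-only upper bound {gpu_bound:.2f}x, "
          f"2x faster CPU side {cpu_gain:.2f}x")
```

With these shares, even an infinitely fast GPU caps the end-to-end gain at about 1.25x (80% CPU share) or 1.11x (90% CPU share), while doubling CPU-side throughput yields roughly 1.67x and 1.82x respectively.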

Agent Architecture Paradigm Unification: Mainstream Platforms Fully Transition to “Sandbox Execution” Mode

As AI agents move from concept to large-scale application, the industry’s technical architecture is undergoing a fundamental transformation. According to industry research by Guotai Haitong Electronics, since the second half of 2025, mainstream AI platforms, including Doubao and Zhipu, have fully transitioned to a “sandbox execution” architecture. The core of this model is to create an independent, isolated virtual execution environment for each agent task, ensuring that file operations, code execution, network access, and other external calls run safely. This architectural shift has directly given rise to a new pattern of computing demand: CPU resource consumption correlates strongly with user scale and task concurrency, but only weakly with the number of GPUs.

Breakthroughs in engineering practice provide key technical support for this architectural evolution. The DeepSeek research team published a milestone “storage-compute separation” solution that stores a 100-billion-parameter embedding table entirely in the CPU’s host memory instead of in GPU memory, as is traditionally done. Using an asynchronous PCIe data transfer mechanism, the solution adds less than 3% extra inference latency, marking a key breakthrough in engineering feasibility.
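The general idea of hiding host-memory lookups behind computation can be sketched as a simple prefetch pipeline. This is not DeepSeek's implementation: the table, the functions `fetch_rows`, `compute`, and `run_pipelined`, and the use of a Python worker thread in place of an asynchronous PCIe copy are all illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a large embedding table kept in host (CPU) memory.
HOST_TABLE = {i: [float(i), float(i) * 2.0] for i in range(1000)}


def fetch_rows(ids):
    """CPU-side lookup; a real system would follow this with a host-to-device copy."""
    return [HOST_TABLE[i] for i in ids]


def compute(rows):
    """Placeholder for the accelerator-side work on one batch."""
    return sum(sum(r) for r in rows)


def run_pipelined(batches):
    """Fetch batch k+1 on a worker thread while batch k is being computed."""
    results = []
    if not batches:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_rows, batches[0])
        for nxt in batches[1:] + [None]:
            rows = pending.result()                     # rows for the current batch
            if nxt is not None:
                pending = pool.submit(fetch_rows, nxt)  # start the next fetch early
            results.append(compute(rows))               # overlaps with that fetch
    return results


print(run_pipelined([[1, 2], [3], [10, 20, 30]]))
```

The design point is that the next batch's rows are requested before the current batch's computation begins, so lookup and transfer time is mostly hidden rather than added to the critical path. The pipelined results are identical to a plain sequential loop; only the timing differs.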

This technical breakthrough reveals two major industry trends. On the technical side, the dependence of model parameter scale on GPU memory capacity has been effectively broken, making cost-effective host memory a feasible choice for large-scale parameter storage. On the system architecture side, the role of the CPU is changing in kind, from an auxiliary computing unit to the core hub of data scheduling and system management, responsible for real-time retrieval, screening, and forwarding of massive numbers of parameters.

Imbalance of Supply and Demand Accelerates Price Increase Expectations

Dramatic changes in demand structure coincide with dual pressures of supply-side capacity bottlenecks.

According to TrendForce’s January 2026 supply chain monitoring report, 2027 capacity for TSMC’s advanced processes such as N2 and N3 has already been pre-allocated to giants such as Apple, Nvidia, and Broadcom. Because the per-wafer output value of high-end GPUs and custom ASICs is significantly higher than that of traditional CPUs, foundries allocate production capacity with a clear bias toward them. At the same time, bottlenecks in advanced packaging technologies such as CoWoS have further strained the supply chain. IDC analysis points out that CoWoS capacity utilization surpassed 100% in Q4 2025, stretching CPU delivery cycles from the usual 8-10 weeks to over 24 weeks.

Intel’s internal ecosystem is also under extreme stress. As its 18A process enters peak mass production, the company not only needs to ensure supply of its own Core and Xeon series but also fulfill commitments to external foundry customers like Microsoft and Amazon. JPMorgan research reports indicate that capacity utilization at Intel’s core nodes has climbed to overloaded states of 120%-130%, forcing some non-core components to be shifted to secondary foundries such as UMC.

Western Securities’ latest industry commentary states that to cope with the imbalance between supply and demand and ensure stable supply, Intel and AMD are planning to raise server CPU prices by 10%-15%, and both vendors’ server CPU capacity for 2026 has “basically already been pre-sold.”

In summary, as AI moves from “content generation” toward “task execution,” the core of compute demand is undergoing a structural migration—from GPU-centric parallel computing to CPU-centric system scheduling and resource coordination. With supply-side capacity at physical limits and demand driven by exponentially growing agent applications, CPUs are not only facing continued upward price pressure, but their strategic value in the entire compute ecosystem is also undergoing a systemic revaluation.

Risk Warning and Disclaimer

The market comes with risks; investment requires caution. This article does not constitute personal investment advice and does not take into account the specific investment objectives, financial circumstances, or needs of individual users. Users should consider whether any opinions, perspectives, or conclusions in this article are suitable for their particular situation. Investing based on this article is at your own risk.