The ratio has changed from 1:8 to 1:1, and the underestimated CPU is becoming the new bottleneck for AI.

The focus of the AI computing power competition is quietly shifting from GPUs to a long-neglected role—the CPU.

With the explosive growth of AI agents and reinforcement learning (RL) workloads, the strategic importance of CPUs in data centers is undergoing a structural reassessment. Dylan Patel, Chief Analyst at the renowned semiconductor research firm SemiAnalysis, said in an in-depth interview on April 8 that the paradigm of AI workloads is evolving from simple text generation to complex agents and reinforcement learning, with CPUs facing "extremely severe capacity shortages."

The latest report from market research firm TrendForce supports this view: the ratio of CPUs to GPUs in AI data centers is currently about 1:4 to 1:8, but in the era of agentic AI it is expected to narrow sharply to between 1:1 and 1:2.
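To make the ratio shift concrete, the sketch below converts each CPU:GPU ratio into a CPU count for a fixed GPU fleet. The fleet size of 8,000 GPUs and the helper name cpus_needed are illustrative assumptions, not figures from TrendForce.

```python
from fractions import Fraction


def cpus_needed(gpus: int, cpus_per_gpu: Fraction) -> int:
    """CPUs required for `gpus` GPUs at a given CPU:GPU ratio, rounded up."""
    need = gpus * cpus_per_gpu
    return -(-need.numerator // need.denominator)  # ceiling division


GPUS = 8_000  # hypothetical fleet size, chosen only for illustration

# Ratios quoted in the article, expressed as CPUs per GPU.
ratios = {
    "1:8": Fraction(1, 8),
    "1:4": Fraction(1, 4),
    "1:2": Fraction(1, 2),
    "1:1": Fraction(1, 1),
}

cpu_counts = {label: cpus_needed(GPUS, r) for label, r in ratios.items()}

for label, count in cpu_counts.items():
    print(f"CPU:GPU = {label}: {count} CPUs for {GPUS} GPUs")
```

Under these assumptions, moving from 1:8 to 1:1 means eight times as many CPUs for the same number of GPUs.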

This structural shift has triggered chain reactions on both the supply and demand sides. By the end of Q1 2026, Intel and AMD had raised prices for some CPU product lines. Meanwhile, Nvidia and Arm both announced their entry into the server CPU market in March 2026. A GPU giant and an IP licensor making the same decision in the same month is no coincidence; it is a concentrated market signal.

The rise of agents: CPUs shift from supporting role to bottleneck

In the early stages of AI development, the CPU played a marginal role. Dylan Patel described the workload this way: "Very light workload. You send a string, it returns a string, simple inference, not much demand for CPU." At that time, GPUs dominated the demand for AI computing power with their massively parallel matrix computing capabilities, and CPUs only helped compress memory data and route it to GPUs.

However, the emergence of a new generation of reasoning models, represented by OpenAI o1, together with the rise of agent architectures, fundamentally changed this pattern. Unlike static large language models, agentic AI needs to interact dynamically with its environment: planning tasks, calling tools, passing data between sub-agents, and evaluating whether tasks are complete. All of this "orchestration layer" coordination falls on the CPU, making it a typical CPU-intensive workload.

The academic paper "A CPU-Centric Perspective on Agentic AI," released in November 2025, further quantified this pressure: in agentic AI scenarios, CPU-side tool processing (including Python interpretation, web crawling, lexical summarization, database retrieval, etc.) can account for up to 90.6% of total latency; in large-batch processing scenarios, CPU dynamic power consumption can account for up to 44% of the system's total dynamic power consumption.
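One way to see why such a latency share turns the CPU into the bottleneck is Amdahl's law, which bounds the overall speedup when only part of a task gets faster. The sketch below is an illustration, not an analysis from the paper: it assumes the worst-case 90.6% CPU share quoted above, treats the CPU and GPU phases as strictly serial, and uses hypothetical speedup factors of 10x and 2x.

```python
def overall_speedup(fraction_accelerated: float, factor: float) -> float:
    """Amdahl's law: speedup of the whole task when only
    `fraction_accelerated` of its latency becomes `factor` times faster."""
    return 1.0 / ((1.0 - fraction_accelerated) + fraction_accelerated / factor)


CPU_SHARE = 0.906            # worst-case CPU share of latency quoted above
GPU_SHARE = 1.0 - CPU_SHARE  # assumption: everything else is GPU time

# Hypothetical upgrades: a 10x faster GPU versus a 2x faster CPU.
speedup_gpu_10x = overall_speedup(GPU_SHARE, 10.0)
speedup_cpu_2x = overall_speedup(CPU_SHARE, 2.0)

# Upper bound on what any GPU upgrade can deliver in this scenario.
gpu_ceiling = 1.0 / CPU_SHARE

print(f"10x faster GPU: {speedup_gpu_10x:.2f}x overall")
print(f"2x faster CPU:  {speedup_cpu_2x:.2f}x overall")
print(f"GPU-only ceiling: {gpu_ceiling:.2f}x overall")
```

Under these assumptions, even an infinitely fast GPU improves end-to-end latency by only about 10%, while a modest CPU improvement nearly doubles throughput per request. Real systems overlap CPU and GPU work, so the numbers are indicative only.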

Arm's calculations show the size of the demand gap from a capacity perspective: traditional AI data centers require about 30 million CPU cores per gigawatt (GW), but in the era of agentic AI, Arm expects this demand to soar to 120 million cores, a fourfold increase.
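The arithmetic behind Arm's figures can be checked in a few lines. The per-gigawatt core counts come from the text; the 0.5 GW site size and the function name core_demand are hypothetical and used only for illustration.

```python
CORES_PER_GW_TRADITIONAL = 30_000_000   # figure attributed to Arm
CORES_PER_GW_AGENTIC = 120_000_000      # figure attributed to Arm


def core_demand(site_gw: float, cores_per_gw: int) -> int:
    """CPU cores needed for a site of `site_gw` gigawatts."""
    return round(site_gw * cores_per_gw)


growth = CORES_PER_GW_AGENTIC / CORES_PER_GW_TRADITIONAL
extra_cores_per_gw = CORES_PER_GW_AGENTIC - CORES_PER_GW_TRADITIONAL

SITE_GW = 0.5  # hypothetical site size
before = core_demand(SITE_GW, CORES_PER_GW_TRADITIONAL)
after = core_demand(SITE_GW, CORES_PER_GW_AGENTIC)

print(f"Growth factor: {growth:.0f}x")
print(f"Additional cores per GW: {extra_cores_per_gw:,}")
print(f"{SITE_GW} GW site: {before:,} cores -> {after:,} cores")
```

Expressed per kilowatt, the same figures correspond to roughly 30 cores per kW today versus 120 cores per kW in the agentic scenario.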

Intel under pressure, AMD expands with momentum

The structural rise in CPU demand has first triggered a reshaping of the landscape in the traditional x86 market.

Intel's Xeon processors long held more than 95% of the data center CPU market. This dominance began to loosen in 2021, when yield issues with the Intel 7 process delayed the launch of Xeon Sapphire Rapids by nearly two years, opening a market gap for AMD's EPYC Milan.

In 2026, Intel plans to launch two flagship products: Xeon 6+ (Clearwater Forest), built on the Darkmont architecture with 288 cores/288 threads and a TDP of about 450W; and Xeon 7 (Diamond Rapids), built on the Panther Cove-X architecture with up to 256 cores/256 threads and a TDP of up to 650W. Both products are based on Intel's most advanced 18A process and introduce Foveros Direct hybrid bonding technology for the first time. However, TrendForce notes that, because of ongoing 18A yield issues, mass production of both products may be delayed until 2027.

In contrast, AMD's pace is steadier. Its 2026 flagship, EPYC Venice, uses TSMC's N2 process and the Zen 6 architecture, is equipped with CoWoS-L and SoIC advanced packaging, and employs simultaneous multi-threading (SMT) to reach 256 cores/512 threads, the highest thread count on the market. TrendForce predicts AMD will continue to eat into Intel's market share in 2026.

Nvidia and Arm enter in force, rewriting the competitive landscape

Beyond the traditional x86 giants, non-traditional players are entering the server CPU market at unprecedented speed, fundamentally reshaping competition.

In March 2026, Nvidia announced that its Vera CPU would be sold as a standalone product to meet customers' needs for more flexible CPU:GPU ratios. Vera uses Nvidia's self-developed Olympus architecture, is built on TSMC's N3 process with CoWoS-R packaging, offers 88 cores/176 threads, and is equipped with a 1.8 TB/s NVLink-C2C interconnect for memory sharing with Nvidia GPUs. Initial partners include Alibaba, ByteDance, Cloudflare, CoreWeave, Oracle, etc. Nvidia also launched Vera CPU racks: each rack integrates 256 CPUs, totaling 22,528 cores/45,056 threads, with 400 TB of memory.

In the same month, Arm announced its first self-developed CPU product, the Arm AGI CPU, ending its 35-year history as a pure licensor. The product is based on TSMC's N3 process and the Neoverse V3 architecture, offering 136 cores/136 threads and a TDP of 300W, with support for DDR5-8800 memory and PCIe Gen6. Initial partners include Meta, OpenAI, Cerebras, Cloudflare, SK Telecom, etc. Arm also launched two rack configurations: an air-cooled version integrating 60 AGI CPUs (8,160 cores, about 180 TB of memory), and a liquid-cooled version supporting 336 CPUs (45,696 cores, 1 PB of memory).
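The rack-level totals quoted for Nvidia and Arm follow directly from the per-CPU specifications. The sketch below recomputes them as a sanity check; the Rack class and its field names are illustrative, and the threads-per-core values (2 for Vera, 1 for the Arm AGI CPU) are inferred from the core/thread counts given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rack:
    name: str
    cpus: int              # CPUs per rack, as quoted in the article
    cores_per_cpu: int     # cores per CPU, as quoted in the article
    threads_per_core: int  # inferred from the quoted cores/threads

    @property
    def cores(self) -> int:
        return self.cpus * self.cores_per_cpu

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core


racks = [
    Rack("Nvidia Vera rack", cpus=256, cores_per_cpu=88, threads_per_core=2),
    Rack("Arm AGI rack (air-cooled)", cpus=60, cores_per_cpu=136, threads_per_core=1),
    Rack("Arm AGI rack (liquid-cooled)", cpus=336, cores_per_cpu=136, threads_per_core=1),
]

for rack in racks:
    print(f"{rack.name}: {rack.cores:,} cores / {rack.threads:,} threads")
```

The computed totals match the figures in the text: 22,528 cores and 45,056 threads for the Vera rack, and 8,160 and 45,696 cores for the two Arm configurations.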

Major cloud service providers (CSPs) are also accelerating development of their own CPUs. AWS released Graviton5 (192 cores/192 threads, TSMC N3 process) in December 2025, deployed together with its self-developed Trainium 3 AI ASIC to reduce AI computing costs; Microsoft launched Cobalt 200 (N3 process, 132 cores/132 threads) in November 2025; Google plans to launch the Axion C4A.metal bare-metal version and the next-generation Axion N4A in 2026, with a focus on price-performance.

IC backend design service providers embrace incremental opportunities

The large-scale entry of non-traditional players is creating considerable incremental business for IC backend design service providers.

TrendForce points out that AWS still insists on completing CPU backend design in-house, while Google and Microsoft have outsourced CPU backend design to Global Unichip Corp. (GUC). As more CSPs and emerging CPU vendors enter the market, this outsourcing demand is expected to keep expanding.

TrendForce predicts that between 2026 and 2028, Broadcom, Marvell, GUC, Alchip, MediaTek and other ASIC design service providers will successively take on new projects from these clients. For market participants seeking new investment entry points in AI infrastructure, this segment may well represent a structural opportunity beyond the GPU boom that has not yet been fully priced in.

Risk Warning and Disclaimer

The market has risks; invest with caution. This article does not constitute personal investment advice and does not take into account individual users' particular investment goals, financial situation or needs. Users should consider whether any opinions, views, or conclusions in this article are suitable for their specific situation. Any investment made on this basis is at the user's own risk.