CPU's Major New Opportunity? When Intel and AMD Meet the New Agent Trend

Inference demand is exploding, the era of Agents has arrived, and the structure of the AI industry faces transformation.

Will the logic of hardware procurement undergo a major reversal?

What key advantages do Intel and AMD respectively possess?

1. What happened? CPU Price Increases

According to our discussions with industry insiders, TSMC’s N3B process conversion will reduce PC CPU shipments, leaving Chinese customers to face CPU price hikes, with an expected increase of 20–30% in the first quarter. Intel and AMD, both leading US-listed stocks, have recently continued to outperform both the broader market and their sector indices.

Beyond price hike expectations, CPUs also face a major opportunity as the industry's logic is restructured. As mentioned in our earlier VIP article "DeepSeek Continues to Reduce Costs! Engram Technology Revolutionizes Storage??", the explosion of inference demand will bolster CPU demand on the AI side, positioning CPUs as cost-effective engines for small language models and data preprocessing pipelines.

As foundational infrastructure in computing, the importance of CPUs is likely to become even more prominent.

① High-end AI servers generally follow a configuration of "every 8 GPUs paired with 2 high-end CPUs"; the CPU is the core for coordinating hardware and guaranteeing system stability. Without a CPU, tasks such as server boot-up, monitoring, and fault diagnostics cannot be performed.

② As general-purpose processors, CPUs excel at serial tasks and complex logical operations, covering all stages of the AI workflow (data preprocessing, supporting model training, inference, post-processing), especially tasks not well suited to GPU parallelization.

③ Decades of accumulated software ecosystems (operating systems, databases, development tools) are all designed for CPUs and require no additional adaptation; for tasks that do not specifically need a GPU, CPUs are more cost-effective, balancing performance and cost.

At the key turning point where AI compute shifts from “large model training” to “full-scenario inference” and “autonomous Agents,” computing architecture is undergoing a disruptive revolution. Over the past two years, the market’s feverish pursuit of GPUs and HBM (High Bandwidth Memory) has overshadowed the role of CPUs in system-level efficiency. However, with DeepSeek’s groundbreaking paper "Conditional Memory via Scalable Lookup" and the launch of its core Engram module, the underlying logic of hardware demand has been entirely rewritten.

2. Why is this important? The Age of Agents Has Arrived

The arrival of the AI Agent era means the CPU is no longer just the “commander” of the system but becomes a core source of productivity in the inference pipeline. In tasks such as reinforcement learning (RL) environment building, tool invocation, and data preprocessing, CPUs are overtaking GPUs as the new system bottleneck. Currently, the global server CPU market has entered a “seller’s market”: mainstream Intel and AMD models have lead times of over 20 weeks, and 2026 production is almost sold out. Expert meeting minutes indicate that server CPU average selling prices (ASPs) face a systemic increase of 10–15%. Early mass production on Intel’s 18A process node and AMD EPYC’s nearly 50% share of the data center market signal that the rivalry between the two has entered a new, efficiency-driven phase.

Source: Huaxing Securities

In the trend towards AI Agents, models no longer just output text, but must perform closed-loop operations via external tools such as Python interpreters, web search, and database retrieval.
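The closed loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `scripted_model` is a hypothetical stand-in for the language model, and the two tools (`tool_add`, `tool_lookup`) are toy placeholders for a Python interpreter or database retrieval. The point is that every tool call between model steps is ordinary CPU work.

```python
from typing import Callable, Dict, List, Tuple


def tool_add(args: str) -> str:
    """Toy 'interpreter' tool: adds two comma-separated numbers."""
    a, b = (float(x) for x in args.split(","))
    return str(a + b)


def tool_lookup(args: str) -> str:
    """Toy 'database' tool: looks up a vendor in an in-memory table."""
    table = {"EPYC": "AMD", "Xeon": "Intel"}
    return table.get(args.strip(), "unknown")


# Hypothetical tool registry; a real Agent would register sandboxes,
# search clients, and database connectors here.
TOOLS: Dict[str, Callable[[str], str]] = {"add": tool_add, "lookup": tool_lookup}


def scripted_model(history: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM: picks the next (action, argument) from a fixed plan."""
    plan = [("lookup", "EPYC"), ("add", "2,3"), ("final", "done")]
    return plan[min(len(history), len(plan) - 1)]


def run_agent(max_steps: int = 10) -> List[str]:
    """Alternate model decisions with CPU-side tool execution until 'final'."""
    history: List[str] = []
    for _ in range(max_steps):
        action, arg = scripted_model(history)
        if action == "final":
            break
        observation = TOOLS[action](arg)  # tool execution runs on the CPU
        history.append(f"{action}({arg}) -> {observation}")
    return history


if __name__ == "__main__":
    for step in run_agent():
        print(step)
```

In a real deployment the model step would run on an accelerator, while each pass through `TOOLS[action](arg)` would occupy CPU cores, which is why heavier tool use shifts load toward the CPU.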

① Environment construction pressure: In Agent-related reinforcement learning training and inference, the CPU needs to construct massive simulation tools and environments in real time.

② Concurrency bottlenecks: The CPU determines how quickly data can be concurrently generated, evaluated, and fed to the GPU. If CPU performance falls short, GPU utilization plummets, policy lag sets in (training data comes from an outdated version of the policy), and training convergence slows.
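The bottleneck argument can be made concrete with a rough throughput model. The sketch below assumes a simple producer and consumer pipeline in which the GPU can only process samples the CPU has already prepared; the function names and all rates are hypothetical illustration values, not measurements.

```python
import math


def gpu_utilization(cpu_samples_per_s: float, gpu_samples_per_s: float) -> float:
    """Fraction of time the GPU stays busy when the CPU is its only data source."""
    if cpu_samples_per_s < 0 or gpu_samples_per_s <= 0:
        raise ValueError("rates must be non-negative (CPU) and positive (GPU)")
    # The GPU cannot consume faster than the CPU produces, so utilization
    # is capped by the ratio of the two rates.
    return min(1.0, cpu_samples_per_s / gpu_samples_per_s)


def cores_needed(gpu_samples_per_s: float, samples_per_core_per_s: float) -> int:
    """CPU cores required to keep the GPU fully fed, assuming linear scaling."""
    if samples_per_core_per_s <= 0:
        raise ValueError("per-core rate must be positive")
    return math.ceil(gpu_samples_per_s / samples_per_core_per_s)


if __name__ == "__main__":
    # Hypothetical numbers: the GPU consumes 1000 samples/s,
    # the CPU side produces only 500 samples/s.
    print(gpu_utilization(500, 1000))   # GPU idles half the time
    print(cores_needed(1000, 8))        # cores needed at 8 samples/s per core
```

Under this simplified model, halving CPU-side throughput halves GPU utilization once the CPU is the limiting stage, which is the mechanism behind the utilization drop described above.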

For the huge volume of long-tail inference tasks that do not require top-tier GPU compute (such as vectorized preprocessing and small language model (SLM) inference), CPUs, with their flexibility and minimal deployment barriers, become the preferred cost-performance engine. We observe that inference is taking up a rising proportion of AI compute, bringing the CPU back into the narrative as the core of inference.

According to our latest industry research and supply chain feedback, server CPUs are facing the most severe shortage since 2021:

① Intel: 4th and 5th Gen Xeon Scalable processors are in severe shortage, mainly because the AI trend is driving cloud service providers (including top North American internet companies) to make bulk purchases of older models for inference optimization. Current lead times have stretched beyond 20 weeks, and the shortage is expected to continue through the first half of 2026.

② AMD: Since Q4 2025, six to seven key models of the 4th and 5th generation EPYC processors have also been in short supply.

Amid CPU shortages and rising upstream memory prices, server OEMs (Lenovo, Inspur, Dell, etc.) have raised their profit requirements on system shipments for Q1 2026 by 30–40% compared with previous levels. Mainstream institutions (such as KeyBanc) indicate that both Intel and AMD are considering a 10–15% increase in server CPU ASPs.

3. What to watch next? Intel and AMD

If the upcoming DeepSeek V4 puts Engram technology in the spotlight, both server CPU giants, Intel and AMD, stand to benefit.

For AMD: Major beneficiary of industry trends

① Direct benefit: AMD EPYC processors have established a strong reputation in the AI inference market due to their advantages in core count, memory bandwidth, and efficiency. Engram will drive up CPU compute demand, allowing AMD to leverage its product strengths to directly take market share. Call notes indicate AMD’s global server CPU share has exceeded 40%, with rapid growth outside China.

② Full-stack advantage: AMD offers a complete solution from CPU (EPYC), GPU (Instinct), to interconnect (Infinity Fabric), allowing customers to choose a “one-stop” closed-loop Engram architecture, attractive for those wishing to diversify their supply chain and reduce reliance on NVIDIA.

For Intel: Potential disruptor through strategic synergy

Intel’s opportunities are even deeper, with potential to go from “recovery” to “leadership,” anchored in its unique strategic synergies:

① Perfect fit with the Saimemory project: The Intel–SoftBank Saimemory joint venture aims to develop low-cost, low-power stacked DRAM as a replacement for HBM. Its goals (double the capacity, 40-50% lower power consumption) highly match Engram’s needs for “high-capacity, low-cost memory.” This exclusive strategic layout is something neither AMD nor other competitors possess.

② Advanced packaging applications: Intel’s advances in EMIB, Foveros, and other advanced packaging technologies can optimize high-speed interconnects between CPUs and large-capacity memory (ordinary DRAM or future Saimemory), further reducing latency and boosting Engram architecture’s overall performance.

③ Supply chain and ecosystem influence: If the “CPU+Saimemory” path succeeds, Intel could break the three-supplier oligopoly that currently dominates the HBM market, gain autonomy in key AI storage nodes, and elevate its role in the AI industry chain from component supplier to system-level solutions provider.

What key investment dimensions should users focus on?

① Rebound in CPU and generic memory (DDR5/DDR6) configuration

If “CPU+DRAM” can do the jobs of “GPU+HBM”, then CPU ratios and memory capacity in inference servers will soar. Firms with high-performance server CPUs and deep involvement in stacked DRAM technology (like Intel and AMD) will benefit.

② Advanced packaging and stacked DRAM (HBM-like DRAM)

To close the performance gap between DRAM and HBM, non-HBM but vertically stacked high-density memory modules will become the cost-effective choice. Projects like SoftBank and Intel’s Saimemory, and domestic leaders in advanced DRAM packaging, deserve attention.

③ Tangible earnings delivery across the CXL industry chain

Pay attention to CXL controller chips and CXL memory expansion modules. These are the "arteries" for implementing the DeepSeek Engram architecture. Without efficient interconnects, this "budget solution" won’t run.

④ “Long-tail market” on the inference side: edge computing and personal servers

If trillion-parameter models can run without HBM, then even edge AI servers and high-end workstations can run the full version of DeepSeek V4. This will greatly boost market penetration of consumer-grade high-performance DRAM (like 32GB/64GB modules).
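A back-of-the-envelope calculation shows the scale of DRAM such a deployment would imply. The sketch below counts model weights only and ignores KV cache, activations, and runtime overhead; the bytes-per-parameter values and helper names are illustrative assumptions, and module sizes are treated as GiB.

```python
import math

GIB = 2 ** 30  # bytes per GiB; DRAM module capacities are treated as GiB


def weight_bytes(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in bytes."""
    return num_params * bytes_per_param


def modules_needed(total_bytes: float, module_gib: int) -> int:
    """Smallest number of DRAM modules of a given size that fits total_bytes."""
    return math.ceil(total_bytes / (module_gib * GIB))


if __name__ == "__main__":
    params = 1e12  # a trillion-parameter model, as in the text
    for label, bpp in (("8-bit", 1.0), ("16-bit", 2.0)):
        total = weight_bytes(params, bpp)
        print(label,
              f"{total / GIB:.0f} GiB,",
              modules_needed(total, 64), "x 64GB or",
              modules_needed(total, 32), "x 32GB modules")
```

Under these assumptions, a trillion-parameter model stored at one byte per parameter needs roughly 1 TB of memory for weights, or on the order of fifteen 64GB modules, which is why removing the HBM requirement would translate directly into demand for high-capacity commodity DRAM.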

The next chapter of AI will be one of inference and agents—where computing moves from centralized training to widespread deployment and autonomous action. The CPU is no longer a silent background player but comes center stage as the core pivot balancing performance, cost, and scale. The current CPU shortage and price increases are the first shadows cast by this deep transformation. After memory, CPUs are the next core compute element to see revaluation and surging demand—investment opportunities cannot be ignored.

Risk Disclaimer and Liability Clause

The market has risks; investment requires caution. This article does not constitute personal investment advice, nor does it consider individual users’ special investment goals, financial situation, or needs. Users should carefully consider whether any opinions, viewpoints, or conclusions in this piece are suitable for their own circumstances. Invest accordingly at your own risk.