Two $20 Billion Deals: OpenAI and Nvidia in an "Inference Battle"
In December 2025, Nvidia quietly spent $20 billion to acquire an AI chip company called Groq.
On April 17, 2026, OpenAI announced it would purchase more than $20 billion worth of chips from another AI chip company, Cerebras. On the same day, Cerebras officially submitted its IPO filings to NASDAQ, targeting a valuation of $35 billion.
Two sums of money, almost exactly the same amount. One is an acquisition, one is procurement. One comes from the world’s largest AI chip seller, one from the world’s largest AI chip buyer.
These are not two separate events, but two symmetric moves in the same war. The battlefield is AI inference.
Most people haven't noticed this war, because there are no explosions, only lines of financial announcements and technical discussions circulating among Silicon Valley engineers. But its impact may be more far-reaching than any AI launch event in the past two years, because it is redistributing control over what is almost certain to become the largest technology market in history.
What is inference, and why is "training" no longer the keyword for 2026?
Before discussing the two $20 billion deals, we need to understand the background: the AI chip battlefield is undergoing a shift in focus.
Training and inference are the two phases where AI computational power is consumed. Training is building the model—feeding massive amounts of data to a neural network so it learns a certain capability. This process generally happens only once, or is updated periodically. Inference is using the model—every time a user asks a question and ChatGPT gives an answer, that’s an inference request.
In 2023, the bulk of global AI computational spending was on training, while inference played a supporting role.
But that ratio is rapidly flipping.
According to market research from Deloitte and CES 2026, inference already accounted for 50% of total AI compute spending in 2025; in 2026, that share will jump to two-thirds. Lenovo CEO Yang Yuanqing put it more bluntly at CES: the structure of AI spending will flip completely from "80% training + 20% inference" to "20% training + 80% inference."
The logic isn't complicated. Training is a one-time cost; inference is an ongoing cost. GPT-4 was trained once, but it has to answer questions from hundreds of millions of users every day, and every single conversation is an inference request. At scale, the cumulative cost of inference far exceeds the cost of training.
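The one-time versus ongoing argument can be sketched as a simple break-even calculation. Every figure below is a hypothetical placeholder chosen for round numbers, not a reported cost for GPT-4 or any other model; the function name and parameters are likewise illustrative.

```python
import math


def breakeven_days(training_cost, cost_per_million_requests, requests_per_day):
    """Days of serving after which cumulative inference cost reaches
    the one-time training cost."""
    daily_inference_cost = cost_per_million_requests * requests_per_day / 1_000_000
    return math.ceil(training_cost / daily_inference_cost)


# Hypothetical inputs, for illustration only.
TRAINING_COST = 100_000_000          # one-time training bill, in dollars (assumed)
COST_PER_MILLION_REQUESTS = 2_000    # dollars per million requests (assumed)
REQUESTS_PER_DAY = 500_000_000       # daily request volume (assumed)

days = breakeven_days(TRAINING_COST, COST_PER_MILLION_REQUESTS, REQUESTS_PER_DAY)
print(f"Cumulative inference cost matches training cost after {days} days")
```

Under these assumed inputs the crossover comes within months, and every day of service after that widens the gap. Different inputs move the date, not the direction.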
What does this mean? It means the most profitable piece of the AI industry is shifting from "training chips" to "inference chips." And these two types of chips require fundamentally different architectures.
Nvidia's problem: Chips designed for training are inherently bad at inference
Nvidia's H100 and H200 are beasts designed for training. Their core advantage is extremely high computation throughput—training requires massive matrix multiplications, and GPUs excel at "multi-core parallel computing."
But the bottleneck for inference is not computation, but memory bandwidth.
When a user asks a question, the chip needs to "move" the entire model's weights from memory to the computation unit before it can generate the answer. This "move" process is the true source of inference latency. Nvidia's GPUs use external high-bandwidth memory (HBM), and this move inevitably introduces delay—for ChatGPT, which processes tens of millions of requests per second, this delay, when multiplied by scale, becomes a real performance bottleneck.
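A rough back-of-envelope sketch shows why bandwidth, not compute, sets the ceiling. It assumes the simplest case: one request at a time, with every generated token requiring one full read of the weights from memory. The model size, weight precision, and bandwidth figure are illustrative assumptions, and the sketch ignores batching, caching, and splitting a model across chips, all of which change the real numbers.

```python
def max_tokens_per_second(num_params, bytes_per_param, bandwidth_bytes_per_s):
    """Upper bound on single-stream generation speed when each token
    requires reading all model weights from memory once."""
    model_bytes = num_params * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes


# Illustrative assumptions, not measurements.
NUM_PARAMS = 70e9          # a 70-billion-parameter model (assumed)
BYTES_PER_PARAM = 2        # 16-bit weights (assumed)
HBM_BANDWIDTH = 3.35e12    # bytes/s, roughly H100-class HBM (approximate)

ceiling = max_tokens_per_second(NUM_PARAMS, BYTES_PER_PARAM, HBM_BANDWIDTH)
print(f"Bandwidth-bound ceiling: {ceiling:.1f} tokens/s per request")

# The bound scales linearly with bandwidth: ten times the bandwidth
# gives ten times the ceiling, with no change in compute.
faster = max_tokens_per_second(NUM_PARAMS, BYTES_PER_PARAM, 10 * HBM_BANDWIDTH)
print(f"With 10x bandwidth: {faster:.1f} tokens/s per request")
```

The point of the sketch is the shape of the formula: compute throughput does not appear in it at all, which is why adding more arithmetic units does not by itself lift this ceiling.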
OpenAI engineers noticed this issue when optimizing Codex (the code generation tool), discovering that no matter how they tuned parameters, response speed was constrained by the architectural limits of Nvidia GPUs.
In other words, Nvidia's disadvantage in inference isn't a matter of effort—it's about architecture.
Cerebras's WSE-3 chip takes a completely different approach. The chip is built at wafer scale: at 46,225 square millimeters it is bigger than a human palm, and it packs 900,000 AI cores and 44GB of ultra-fast SRAM onto the same silicon. Memory sits directly next to the compute cores, cutting the "move" distance from centimeters to microns. The result: inference reportedly runs 15 to 20 times faster than on Nvidia's H100.
It should be added that Nvidia hasn't been sitting still. Its newest Blackwell (B200) architecture offers a 4x inference performance improvement over H100 and is being widely deployed. But Blackwell is chasing a moving target: Cerebras is iterating as well, and the chip market now contains competitors beyond just Cerebras.
Nvidia's $20 billion: The acquisition is a giant admission
On December 24, 2025, Nvidia announced its largest-ever acquisition.
The target: Groq.
Groq is a direct competitor to Cerebras and also focuses on inference-optimized SRAM chips. It calls its chip the LPU (Language Processing Unit), which at the time was the world's fastest inference chip in public benchmarks. Nvidia spent $20 billion to buy Groq's core technology and founding team, including founder Jonathan Ross, along with several top chip engineers from Google's TPU team.
This is nearly three times the size of Nvidia's $7 billion acquisition of Mellanox in 2019.
To many analysts, the message behind this money is far more important than the amount: Nvidia believes it has a structural gap in inference that is so large it’s worth $20 billion to fill.
If Nvidia truly believed its GPUs were unbeatable for inference, it wouldn't need to acquire Groq. In essence, this money is a $20 billion technology procurement order. It is an admission that embedded SRAM architecture offers genuine inference advantages and that Nvidia's existing product line cannot deliver them on its own, so Nvidia paid top dollar to close a technical gap it couldn't fill itself.
Of course, Nvidia’s post-acquisition official narrative is different—"Deeply integrating Groq, delivering more complete inference solutions." Translated: We realized our own stuff isn't enough, so we bought someone else's.
OpenAI's $20 billion: Buying chips is the surface, equity is the key
Now let's return to OpenAI.
In January 2026, OpenAI and Cerebras signed a $10 billion, three-year compute procurement agreement—the media at the time played it down as "OpenAI diversifying its chip suppliers."
But details revealed on April 17 changed the situation fundamentally:
First, the procurement amount doubled from $10 billion to $20 billion.
Second, OpenAI will receive Cerebras stock warrants, with its stake reaching up to 10% of Cerebras’s total shares based on order size.
Third, OpenAI will provide $1 billion in datacenter construction funding. In other words, OpenAI is helping Cerebras build out its capacity.
Taken together, these three details paint a completely different picture: OpenAI isn’t just buying chips, it is incubating a supplier.
This logic has clear precedent in tech history. In 2006, Apple began working with Samsung to customize A-series chips, initially via bulk procurement agreements. But as Apple deepened its involvement, eventually developing its own M-series chips, control over the supply chain shifted completely from Intel and Samsung to Apple itself. What OpenAI is doing is somewhat similar—but with one key difference: Apple had control over chip design from the start, while OpenAI is still a buyer. Cerebras will continue to develop independently and serve more customers after its IPO. The endgame may not be OpenAI fully controlling Cerebras, but rather both forming a deeply interdependent ecosystem.
On one hand, OpenAI is binding Cerebras via $20 billion and equity, ensuring sustained supply of non-Nvidia inference compute; on the other, OpenAI is working with Broadcom to develop its own ASIC chips, expected to enter mass production by the end of 2026. Both approaches lead to compute autonomy.
Cerebras IPO today: What are you actually buying?
On April 17, Cerebras formally filed for a NASDAQ IPO, targeting a $35 billion valuation, aiming to raise $3 billion.
This valuation is up more than fourfold from the $8.1 billion Cerebras was valued at in September 2025. The company completed a new funding round in February this year at a valuation of $23 billion, so the IPO target represents a further 52% premium.
People familiar with Cerebras's history know this is its second attempt to go public. The first, in 2024, was withdrawn after CFIUS intervened on national security grounds. At the time, its core customer, G42 (an Abu Dhabi sovereign tech investment fund), accounted for 83% to 97% of revenue.
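The valuation arithmetic can be checked directly from the three figures quoted in this article. The helper names are illustrative, and the February round is dated 2026 on the assumption that "this year" refers to the year of the IPO filing.

```python
def multiple(new_value, old_value):
    """How many times larger new_value is than old_value."""
    return new_value / old_value


def premium_pct(new_value, old_value):
    """Percentage increase of new_value over old_value."""
    return (new_value / old_value - 1) * 100


# Valuations in billions of dollars, as quoted in the article.
SEPT_2025_VALUATION = 8.1
FEB_2026_VALUATION = 23.0
IPO_TARGET_VALUATION = 35.0

since_sept = multiple(IPO_TARGET_VALUATION, SEPT_2025_VALUATION)
over_feb = premium_pct(IPO_TARGET_VALUATION, FEB_2026_VALUATION)

print(f"IPO target vs. September 2025: {since_sept:.2f}x")
print(f"IPO target vs. February round: +{over_feb:.1f}%")
```

The roughly 4.3x multiple and 52% premium both follow from the quoted figures. Most of the step-up had already happened by the February round, and the IPO target adds about half again on top of it.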
This time, G42 has disappeared from shareholder lists, replaced by OpenAI.
In other words, the structural issue of customer concentration hasn't fundamentally been solved: the big client changed, but dependence on a big client remains. Investors must decide whether this client is better or worse. From a credit standpoint, OpenAI is clearly superior to G42; from a strategic standpoint, however, OpenAI is also incubating a competing supplier, and its own ASIC, once mature, is a real replacement threat to Cerebras.
To be fair, Cerebras is actively diversifying customers, and its prospectus will list more varied revenue sources, so the concentration will improve. But before OpenAI's ASIC is in mass production, the answer to this question is not yet clear.
If you buy Cerebras stock, you are essentially betting on two things: that OpenAI will keep choosing Cerebras, and that OpenAI's own ASIC won't arrive early. Neither is certain.
Of course, the bull arguments are real: If the inference market grows as predicted, even Cerebras capturing a small share means huge absolute numbers. The issue isn’t whether Cerebras has an opportunity—it’s whether a $35 billion valuation already prices in the most optimistic scenario.
Two $20 billion moves, appearing symmetrically between late 2025 and April 2026.
One comes from the world’s largest AI chip seller, buying the technology of a rival in the inference market.
One comes from the world's largest AI chip buyer, incubating a challenger to Nvidia in inference.
Nvidia’s $20 billion is defensive—it paid top dollar to plug a gap it couldn’t fill.
OpenAI’s $20 billion is offensive—it is burning money to build an inference expressway not dependent on Nvidia, while also acquiring warrants to a tollbooth on that road.
This war has no gunfire, but the flow of capital never lies. These two sums tell you more clearly than any AI launch event: The control of AI inference infrastructure is now up for grabs. And this market will account for two-thirds of industry compute spending in 2026.
Cerebras’s IPO is the bugle call for this war.
Risk Warning and Disclaimer
The market has risks, please invest prudently. This article does not constitute individual investment advice, nor does it take into account the special investment goals, financial status, or needs of any individual user. Users should consider whether any opinion, view, or conclusion in this article fits their specific situation. Invest accordingly, at your own risk.