AI Enters the Era of "Pay-Per-Use and Compute Allocation"—What Anthropic and DeepSeek Teach Us
```
According to forecasts Anthropic presented to investors during a financing round, the company reported revenue of $4.8 billion in the first quarter and expects $10.9 billion in the second, with operating profit projected to reach $559 million in the June quarter. In the first quarter, Anthropic spent 71 cents on computing power for every dollar of revenue; in the second quarter, that ratio is expected to fall to 56 cents. Last year, the company told investors it might not post a full year of profitability until at least 2028, but the profit outlook has now turned. The shift is driven by enterprise demand for Claude’s programming tools and agent capabilities.
As demand surges, the way AI is priced has already changed.
Well-known tech investor Gavin Baker summed this up with a vivid phrase: AI is moving from “all you can eat” to “pay by the drink.” He points to real enterprise usage: one person launches dozens or hundreds of agents at once to run code, search, analyze, and carry out multi-step reasoning. A flat monthly plan can hardly cover the actual consumption behind that activity.
For enterprises, model selection has shifted from an engineering preference to a financial decision, with the price gap between the cheapest and most expensive frontier models reaching tens or even hundreds of times.
The supply side isn’t giving the industry much of a buffer. Bridgewater co-chief investment officer Greg Jensen said in an interview that the only thing growing faster than the supply of computing power is the demand for it. Rationing has already begun, through price hikes, restricted access, and even the suspension of some internal research projects at labs. Computing power, electricity, storage, and rack systems all remain under strain, so it is no surprise that usage-based pricing and access limits are appearing at the same time.
The key question for the market is this: after the “shovel sellers” have taken the profits, where does the AI trade go next?
On this, Goldman Sachs and SemiAnalysis differ sharply. Goldman Sachs analyst James Covello focuses on the first stage of profit allocation: over the past two years, the most certain profits stayed upstream, in chips, HBM, advanced packaging, data centers, and electricity. Cloud vendors bear the capital expenditure themselves, while enterprise-side ROI has largely not shown up on income statements. SemiAnalysis looks at the other end: tokens are no longer just the cost of a question and an answer. Code, research, modeling, charting, and financial analysis tasks now treat tokens as a production input. If high-value tasks are willing to pay for stronger models, and if cache hit rates, inference optimization, and hardware iteration keep driving down unit costs, the model-lab layer can move from burning money to generating profit.
Anthropic’s profitability shows that top-tier model makers aren’t just telling growth stories; high-value tasks such as code, long-chain agents, and complex reasoning are moving both revenue and cost structure in the right direction. Goldman’s main concern was that downstream income statements were not solid enough, but Anthropic provides a clear signal: in cutting-edge tasks, high-value tokens can already yield profit.
But Anthropic isn't the whole story for the model layer.
On May 22, DeepSeek announced a permanent 75% discount for its V4-Pro API. After the discount, V4-Pro is officially priced at a quarter of its original rate: uncached input costs $0.435 per million tokens and output costs $0.87; V4-Flash is lower still, at $0.14 and $0.28 respectively. Cached-input prices have also been cut. Reports highlight that this pricing matters most for agents, coding assistants, customer service, and document workflows, which inherently rely on repeated context and caching. DeepSeek isn’t offering a single low-price model but a price gradient well suited to routing: Flash handles low-cost everyday loads, while Pro handles more complex agent coding, long-document reasoning, and high-value automation tasks.
This path is very different from Anthropic’s.
Anthropic represents the route of top American model makers: first secure the hardest, most valuable, and most sensitive tasks, then let high-end workflows justify the model’s premium. It competes on capability density, that is, on whether enterprises are willing to pay higher prices for stronger reasoning, longer task chains, and more stable results.
DeepSeek represents a different route. It doesn’t replicate the high-premium, closed-source logic; instead, it competes on low per-call cost, product tiering, and cache friendliness for large-scale deployment. For many applications, price isn’t just about saving a bit of budget; it determines whether a feature can exist at all, or whether a workflow can enter production.
In the context of the Goldman/SemiAnalysis debate, these two companies each answer half of the question.
Anthropic proves that high-value tokens can make money. DeepSeek proves that low-cost tokens can scale. The former raises the economic value of each invocation; the latter widens the range of invocations that are economically feasible. Goldman’s concern is that downstream isn’t earning enough to support the profits and capital expenditure already absorbed upstream. With Anthropic and DeepSeek both in the picture, that assessment changes. The model layer is showing two distinct business paths, and both are growing the pie.
This is a more positive signal for upstream. The reason is simple: Anthropic proves that high-value workloads are willing to pay for stronger models; DeepSeek proves that when prices drop far enough, a large number of previously infeasible invocations are unlocked. As long as computing power and infrastructure remain in tight supply, upstream gains not only a “scarcity premium” but also larger total demand. Greg Jensen’s “rationing” isn’t over, and it clarifies the picture: both high-value and low-cost tokens are growing, supply constraints persist, and the upstream cycle is unlikely to end soon.
The bigger change may be in the middle layer.
When the model layer no longer offers just one approach, enterprises are buying not a single model but the ability to schedule across models: which tasks go to Anthropic, which go to DeepSeek Flash, which go to DeepSeek Pro, which require strict compliance, which can lean heavily on caching, and which should switch dynamically between high-end and low-end configurations. The more pronounced the model tiering, the more valuable routing, orchestration, workflow entry points, and budget controls become.
Goldman’s worries have not disappeared. Anthropic proves that top-tier model makers can turn a profit earlier in high-value scenarios; DeepSeek proves that low prices can enable more products. Neither means that ordinary businesses have reliably converted AI spending into profit on their own income statements. DeepSeek’s low-price strategy also faces margin pressure. It is an approach that trades profit for reach, and scale for platform stickiness. Whether it succeeds depends on architectural efficiency, funding endurance, whether developer adoption turns into stable platform usage, and whether geopolitical and compliance constraints slow its penetration of global enterprises.
Nevertheless, the main market concern has shifted. It’s no longer merely “who sells shovels,” nor just “when will model makers make money.” Anthropic and DeepSeek split the model layer into two paths: one makes individual tokens more valuable, the other grows token volume. As long as these two paths aren’t outliers, SemiAnalysis’s view that the profit pool is still expanding will be closer to reality than Goldman’s framework of upstream profits having peaked while downstream cannot sustain them. Upstream still has room, not just because of scarcity, but because downstream is, for the first time, growing the business in two different ways.
Over the next few quarters, the most interesting numbers to watch in financial reports will be: whether high-end model makers like Anthropic continue to improve revenue quality and gross margin, whether the low-price route represented by DeepSeek can turn developer adoption into stable invocation volume, whether cloud vendors’ capital expenditure translates more smoothly into revenue and contracts, and whether upstream shortages in memory, computing power, and racks continue to push demand forward.
The next round of market divergence will likely unfold along these numbers.
Risk Warning and Disclaimer: The market involves risks and investment should be done prudently. This article does not constitute personal investment advice and does not consider individual users’ specific investment objectives, financial status, or needs. Users should consider whether any opinions, views, or conclusions in this article suit their particular situation. Investments made based on this article are at your own risk.
```
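To make the DeepSeek pricing discussed in the article concrete, the following is a minimal Python sketch of per-request cost arithmetic and a two-tier routing rule. The per-million-token prices are the ones quoted in the article (uncached input and output); the `Tier` class, the function names, the `is_complex` flag, and the example workload are hypothetical and only for illustration. Cached-input prices are left out because the article does not state them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    input_usd_per_m: float   # uncached input, USD per million tokens
    output_usd_per_m: float  # output, USD per million tokens


# Post-discount prices as quoted in the article.
V4_PRO = Tier("V4-Pro", 0.435, 0.87)
V4_FLASH = Tier("V4-Flash", 0.14, 0.28)


def request_cost(tier: Tier, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a workload, ignoring any cache discount."""
    total = (input_tokens * tier.input_usd_per_m
             + output_tokens * tier.output_usd_per_m)
    return total / 1_000_000


def pre_discount_price(price: float, discount: float = 0.75) -> float:
    """Price implied before a fractional discount (0.75 means 75% off)."""
    return price / (1 - discount)


def route(is_complex: bool) -> Tier:
    """Hypothetical routing rule: complex tasks go to Pro, the rest to Flash."""
    return V4_PRO if is_complex else V4_FLASH


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 0.5M output tokens.
    for complex_task in (True, False):
        tier = route(complex_task)
        cost = request_cost(tier, 2_000_000, 500_000)
        print(f"{tier.name}: ${cost:.3f}")
    # Implied V4-Pro list price before the 75% discount.
    print(f"V4-Pro input before discount: ${pre_discount_price(0.435):.2f}")
```

A real router would weigh more than one boolean, for example compliance requirements and expected cache hit rate, which the article names as routing criteria but does not quantify.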