AI pricing reform and surging compute costs! Coinbase CEO: 80% of AI workloads will be handled by models that are 99% cheaper within 12-18 months.
```
GitHub Copilot’s pricing transformation is triggering a chain reaction in the AI industry, bringing a deep debate about the sustainability of AI business models to the forefront. As usage-based billing replaces fixed subscriptions, user bills are soaring, and tech giants like Coinbase and Hugging Face are offering radically different responses. The rise of low-cost models may fundamentally reshape the cost structure of AI computing power.
On June 1, Microsoft’s GitHub Copilot officially switched its billing model from charging per request to charging per token, with some heavy users expected to see their monthly bills jump from tens of dollars to hundreds. The change quickly sparked a strong backlash on social media, where users posted screenshots of internal cost estimates showing monthly costs jumping from $44.68 to $754.29; another user estimated that their bill could reach $847.
Behind this pricing storm, the tensions in the AI industry’s longstanding growth-through-subsidy model are coming to a head. Coinbase CEO Brian Armstrong responded by predicting that 80% of AI workloads will migrate to models that are 99% cheaper within 12 to 18 months, leaving energy and computing power as the real bottlenecks.
Hugging Face CEO Clement Delangue cited Stanford University research as empirical support for the large-scale replacement of frontier APIs by local, open-source small models.
GitHub Copilot Pricing Transformation: The End of the Subsidy Era
GitHub Copilot’s pricing adjustment was not unexpected. In April, GitHub’s Chief Product Officer Mario Rodriguez publicly stated that with the rise of AI agents, the current pricing model is “no longer sustainable”: a brief conversational Q&A and a multi-hour autonomous programming task were previously charged the same, while GitHub absorbed the ever-increasing inference costs.
The new policy went into effect June 1. Under the new billing structure, usage costs are converted into AI credits based on the AI model used and the number of tokens consumed, with each credit valued at $0.01. Subscription users receive a fixed base quota of credits and additional flexible credits depending on their subscription tier. Because cutting-edge AI models typically consume more tokens, actual costs vary widely between models.
User reactions were swift and fierce. In GitHub’s community on Reddit, one user claiming to have subscribed to Copilot Pro+ from day one wrote: “$39 a month already felt expensive, but still worth it. Now with this AI credits system, I calculated next month’s expected bill: $847.” Multiple users compared the change to Uber’s playbook: cultivating user dependence with ultra-low prices, then hiking prices once habits are formed.

Gartner analyst Arun Chandrasekaran told Business Insider that Copilot’s case “might just be an early sample,” predicting that, as advanced reasoning models and agent workflows drive up inference compute consumption, more companies will switch to token- or usage-based billing.
Systemic Risks of Subsidy Models
This pricing storm reflects deeper structural contradictions in the AI industry. Investor Tommy Shaughnessy posted on social media, systematically outlining what he believes to be the “most obvious collapse path for AI.”
He points out that fixed per-seat subscription fees have long been heavily subsidized and sit far below the actual cost of serving heavy users. Once companies switch to API calls for reasons such as data protection or compliance review, they face the true price of usage-based billing, and actual consumption often far exceeds prior expectations. He cites several cases, including Uber, which in 2026 exhausted its entire annual AI budget in four months.

Shaughnessy further notes that current AI giants’ profit margins are deeply negative (OpenAI’s margin is reportedly as low as negative 122%), meaning they rely entirely on external capital to buy GPUs, train models, and continually subsidize usage. He believes that once investors lose confidence in returns, capital flows could reverse.
However, he also acknowledges the limits of this argument: if AI truly catalyzes new drug development or entirely new business models, users’ willingness to pay high prices for AI services will increase significantly, which could ease the pressures above.
Coinbase CEO: Low-Cost Models Will Dominate the Future
Faced with rising computing costs, Coinbase CEO Brian Armstrong lays out his framework. He believes the demand for intelligence is almost infinite, but the market will rapidly diverge: 80% of workloads will migrate to models that are 99% cheaper within 12–18 months, while the remaining 20%, requiring the utmost intelligence—such as scientific breakthroughs and high-level agent orchestration—will still run on the latest frontier models.

Armstrong likens this trend to the consumer electronics market: users who buy top-end MacBooks or gaming PCs are always the minority, and AI prices are falling even faster than the pace associated with Moore’s Law. He concludes that the real limiting factor in the future will be energy and computing power, not the capabilities of the models themselves.
Armstrong also described Coinbase’s internal practice: the company is actively implementing prompt routing strategies that direct requests to lower-cost models. In some scenarios, total costs have remained essentially stable even as token usage has continued to grow exponentially.
Open-Source Small Models: Empirical Support for a Multi-Model Future
Hugging Face CEO Clement Delangue cites Stanford University research as quantified evidence for the replacement potential of cheap models: local models’ accuracy on real-world conversational and reasoning queries has jumped from 23.2% in 2023 to 71.3%, with cost and energy usage only a fraction of those of frontier APIs.
Based on this, Delangue proposes a “multi-model future”: for most workloads, local, open-source, small, and cheap models will be the mainstream choice, and frontier APIs will be called only when there is no other option.

Shaughnessy’s analysis echoes this view. He notes that DeepSeek V4’s performance on the SWE-bench programming benchmark is close to that of Anthropic’s Claude Opus, but at about one-thirtieth of the cost; the cheapest open-source models cost as little as one percent as much. He argues that China’s labs are continuously open-sourcing frontier-level models, allowing inference service providers to obtain the core models at no cost and fundamentally undercutting closed-source AI giants’ pricing power and profit margins.
```
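
For readers who want to see what Armstrong’s 80%/99% projection would mean for a bill, the arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not anything Coinbase or GitHub has published: the function name `blended_cost` and its parameters are hypothetical, and it assumes the 80% share is measured against today’s spend and that total workload volume stays constant.

```python
def blended_cost(baseline_cost: float,
                 cheap_share: float = 0.80,
                 cheap_discount: float = 0.99) -> float:
    """Estimate total cost after part of a workload moves to cheaper models.

    baseline_cost:  current spend with everything on frontier models.
    cheap_share:    fraction of that spend's workload moved to cheap models
                    (assumption: share is measured by current spend).
    cheap_discount: how much cheaper the cheap models are (0.99 = 99% cheaper).
    """
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    if not 0.0 <= cheap_discount <= 1.0:
        raise ValueError("cheap_discount must be between 0 and 1")

    # Portion that stays on frontier models at full price.
    frontier_part = baseline_cost * (1.0 - cheap_share)
    # Portion that moves to cheap models, paying only the undiscounted remainder.
    cheap_part = baseline_cost * cheap_share * (1.0 - cheap_discount)
    return frontier_part + cheap_part


if __name__ == "__main__":
    # Hypothetical $100 monthly baseline under the 80% / 99% split.
    print(f"{blended_cost(100.0):.2f}")  # prints 20.80
```

Under these assumptions the bill falls by roughly 79%, and almost all of what remains comes from the 20% of work still running on frontier models. That is consistent with Armstrong’s point that routing can hold total cost roughly steady while token usage grows.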