Citadel Securities In-Depth Analysis: The High Cost of Cutting-Edge AI Is Pushing the Industry Toward a Structural Inflection Point

```

The token economy is under severe pressure, a price war has already begun, and the deployment of cutting-edge AI may concentrate ever faster among a few capital-rich enterprises.

Frank Flight, Head of Macro Strategy at Citadel Securities, argues in his latest report that the AI market is becoming "polarized": cutting-edge models continue to make breakthroughs, but their usage will be concentrated among a handful of leading enterprises. Only a few scenarios justify invoking the most expensive AI; most use cases will deliberately downgrade to cheaper models. High costs and physical bottlenecks are pushing users toward more affordable options and forcing more fine-grained deployment decisions.

This view is quickly being borne out across the industry. From Goldman Sachs to Apollo, and from Citadel Securities to OpenAI itself, top institutions and industry leaders have issued similar warnings in quick succession: the compute cost of cutting-edge models has reached the limit of what enterprises can bear, and the earlier "token frenzy" is quickly turning into "token panic."

According to the Wall Street Journal, OpenAI is seriously considering a major reduction in token prices, and CEO Sam Altman has openly admitted that cost has become "a huge problem." That admission marks a fundamental turning point in the AI industry's pricing logic.

For markets, these signals have far-reaching implications: AI infrastructure investment is being repriced, the commercial path for cutting-edge models is proving far more difficult than expected, and asset prices will have to find a new balance between ambition and reality, and between technology and physical limits.

Market Consensus: Collective Shift Driven by Cost

This week, several top institutions independently reached similar conclusions under the same framework, and a new market consensus is forming in real time.

Rich Privorotsky, Partner at Goldman Sachs, pointed out that token spending has peaked and that enterprise customers will actively optimize "cost per task": simple tasks will be routed to local models, complex tasks will be handled in the cloud, and cutting-edge models will be invoked only when necessary. He also cautioned that the market "may be allocating too much spending to data-centric models."

John Zito, Co-President of Apollo, was more direct: many companies use cutting-edge models for tasks that simply do not justify that much computing power, and measured by cost per unit of knowledge, "prices are collapsing."

Frank Flight concluded in his report that even the most powerful technologies must pass the test of cost curves, capacity constraints, and marginal returns. "Adoption is determined less and less by what cutting-edge models can do in principle, and more by the price and scarcity involved in implementing AI at scale."

Token Bills Ignite Industry Crisis

A series of incidents has pushed cost pressure from theory into reality. Amazon has taken down its token leaderboard, Microsoft has cancelled Claude Code subscriptions, and several companies have reported unexpectedly high token bills.

According to the Wall Street Journal, OpenAI is considering a "major reduction" in token prices to compete for Anthropic's customer base. Altman has recently said, "We will have many ways to help users get more value for less spending."

Frank Flight characterizes this situation as a "classic deflationary price competition," precisely the opposite of what an industry with already thin margins needs. Weighed down by huge balance sheets and special purpose vehicles (SPVs), the AI industry faces both declining revenue and narrowing profit margins, with cash burn accelerating further.

The recent decline in the Silicon Data LLM Spending Index is evidence of this trend. Flight notes that the decline reflects users migrating to cheaper models: as users become more sensitive to the total cost of AI deployment (token price × usage), non-essential uses of cutting-edge models will shift faster toward efficient, low-cost models.

Divergence Between Cutting-edge AI and Everyday AI

Frank Flight makes a key judgment in his report: the deployment of cutting-edge AI will not disappear, but it will be highly concentrated in a handful of qualified enterprises, namely those with balance sheets large enough to absorb computing costs, the research depth to deploy effectively, and the ability to realize scalable returns from solving complex problems.

For the broader economy, until physical constraints ease, simpler models may be a more cost-effective path to productivity gains. Flight summarizes this trend as "the divergence of cutting-edge AI and everyday AI usage."

He points out that the most sustainable productivity gains right now come from AI as a supplement to human labor: developers speed up coding and debugging with programming assistants; customer service reps resolve tickets faster with Copilot; knowledge workers use models to compress the preliminary work of searching, drafting, and translating. These applications are "more focused, more disciplined, and more token-efficient than the vision of ubiquitous autonomous agents."

```