When even Microsoft can't afford to burn tokens, "model routing" becomes a "core requirement" for enterprise AI.

AI Token costs are reshaping the foundational logic of enterprise AI, and a Microsoft internal decision has brought this upheaval into the spotlight.

According to a previous Wallstreetcn article, Microsoft is considering introducing a fine-tuned version of the open-source model DeepSeek V4 into its enterprise AI tool Copilot Cowork, as a low-cost alternative to OpenAI and Anthropic models, and is expected to announce its final choice in the coming weeks.

At the same time, Microsoft has announced that Copilot Cowork will switch from unlimited usage to a pay-per-compute model. This series of actions sends a clear signal: even Microsoft can no longer afford unchecked model invocation costs.

The news has resonated widely across the enterprise AI market. The Silicon Data Token Index, which tracks AI Token prices, has fallen on 12 of the past 13 trading days and is approaching a recent low. Cost pressure is spreading from individual companies to the industry as a whole, and the question of "which model to use" is giving way to "how to afford the model".

When "affordability" takes precedence over "capability" for enterprises, "model routing"—the ability to dynamically match the most economical model based on task complexity—ceases to be just a technical choice and becomes a core requirement determining whether an AI project is financially viable.

Microsoft's Cost Dilemma: The End of Unlimited Usage

Microsoft Copilot Cowork previously offered unlimited use to enterprise customers, but this approach is no longer sustainable.

Charles Lamanna, Microsoft's Executive Vice President responsible for Copilot, frankly stated: "Some users accomplish hundreds of tasks per week, achieving very high efficiency—but the cost can skyrocket as a result."

Therefore, Microsoft announced Copilot Cowork will switch to a pay-per-compute usage model, and is simultaneously exploring the introduction of a fine-tuned version of DeepSeek V4 or other open-source models to significantly reduce model invocation costs. The underlying logic is straightforward: there is a significant pricing gap for input/output Tokens between Chinese and American models, and the cost advantage of open-source models can no longer be ignored.

This decision reflects a common dilemma across the enterprise AI market. Frontier model capabilities are improving rapidly, but invocation costs are rising with them. Take Fable 5 as an example: its output Token cost for similar tasks is about 180% higher than that of Opus 4.8. Greater intelligence is producing bills that are ever harder to absorb.
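A figure like "about 180% higher" is easy to misread as 1.8 times the cost, when it actually means 2.8 times. The short Python sketch below makes the conversion explicit. The baseline spend is a hypothetical number chosen for illustration, not a figure from any vendor.

```python
def cost_multiplier(percent_higher: float) -> float:
    """Convert 'X% higher' into a multiplier on the baseline cost."""
    return 1.0 + percent_higher / 100.0


# "About 180% higher" output Token cost, as cited in the article.
multiplier = cost_multiplier(180.0)

# Hypothetical monthly output-Token bill on the cheaper model (assumption).
baseline_monthly_spend = 10_000.0
new_monthly_spend = baseline_monthly_spend * multiplier

print(f"multiplier: {multiplier:.1f}x")
print(f"monthly spend: ${baseline_monthly_spend:,.0f} -> ${new_monthly_spend:,.0f}")
```

In other words, under these assumptions the same workload costs nearly three times as much, not nearly twice.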

Token Economics: The Dominant Issue for the Next Six to Twelve Months

Cost pressures have infiltrated every aspect of enterprise AI procurement.

Mason Daugherty stated on social media that in every conversation with customers over the past two months, organization-wide Token spending was raised as a major concern. He predicts that "Token economics" will dominate discussions on AI procurement and usage over the next six to twelve months.

He pointed out that, as annual enterprise contracts with major vendors come up for renewal, management has begun to doubt whether they can renew at the same or even higher prices. Meanwhile, the cost gap between frontier APIs and self-hosted open-source models continues to widen—this is the direct driver behind accelerating open-source model procurement.

The continued decline in the Silicon Data Token Index underscores the market impact of this trend—competitive pressure on Token pricing is now evident in the data.

Architecture as Moat: Model Routing Becomes Core Capability of Enterprise AI

Under cost pressure, the competitive focus for enterprise AI is undergoing a fundamental shift.

Arvind Jain from enterprise AI platform Glean noted that the biggest bottleneck in enterprise AI is no longer model intelligence itself, but "Token output efficiency"—that is, how much productive work a system can produce per Token spent. He stressed that most enterprise AI costs are not in the prompts themselves, but in the systems surrounding the model: retrieval, tool invocation, memory management, and multi-step reasoning. An eleven-word request, once the system starts gathering context and processing tasks, can expand to thousands or even tens of thousands of Tokens.
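To make the expansion concrete, here is a rough Python sketch that adds up Token usage across the stages Jain lists. Every stage name and Token count is a hypothetical placeholder chosen for illustration. Real pipelines will differ widely.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    input_tokens: int   # Tokens sent to the model at this stage
    output_tokens: int  # Tokens the model generates at this stage


# All counts are hypothetical and only illustrate the order of magnitude.
PIPELINE = [
    Stage("user prompt (eleven words)", 15, 0),
    Stage("retrieved documents", 6_000, 0),
    Stage("tool calls and results", 2_500, 400),
    Stage("conversation memory", 1_200, 0),
    Stage("multi-step reasoning", 0, 3_000),
    Stage("final answer", 0, 350),
]


def total_tokens(stages):
    """Sum input and output Tokens over every stage."""
    return sum(s.input_tokens + s.output_tokens for s in stages)


def prompt_share(stages):
    """Fraction of all Tokens that come from the user's prompt (first stage)."""
    first = stages[0]
    return (first.input_tokens + first.output_tokens) / total_tokens(stages)


total = total_tokens(PIPELINE)
share = prompt_share(PIPELINE)
print(f"total Tokens: {total}")
print(f"prompt share of total: {share:.2%}")
```

With these assumed numbers the prompt accounts for well under 1% of the Tokens consumed, which is the point Jain makes: the bill is driven by the surrounding system, not by what the user typed.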

Jain believes that true competitive advantage does not come from aggressively using the most powerful models, but from AI architectures that can match the right model and reasoning layer to the corresponding task—that is, systems with strong routing capabilities, spend management, and governance mechanisms. "Frontier intelligence is becoming abundant; efficient execution is not."
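The idea of matching each task to the cheapest adequate model, under a spending cap, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how Glean or Microsoft implement routing. The model names, prices, complexity tiers, and the `Router` class are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    max_complexity: int        # hardest task tier (1-5) this model is trusted with
    cost_per_1k_tokens: float  # blended price in dollars; hypothetical


# Hypothetical catalog; names and prices are placeholders, not real quotes.
CATALOG = [
    Model("small-open-model", 2, 0.0005),
    Model("mid-tier-model", 3, 0.003),
    Model("frontier-model", 5, 0.03),
]


class BudgetExceeded(Exception):
    """Raised when a request would push spend past the configured cap."""


class Router:
    def __init__(self, catalog, budget):
        # Cheapest first, so the first capable model is also the most economical.
        self.catalog = sorted(catalog, key=lambda m: m.cost_per_1k_tokens)
        self.budget = budget
        self.spent = 0.0

    def route(self, complexity, est_tokens):
        """Pick the cheapest model able to handle the task, within budget."""
        for model in self.catalog:
            if model.max_complexity >= complexity:
                cost = est_tokens / 1000 * model.cost_per_1k_tokens
                if self.spent + cost > self.budget:
                    raise BudgetExceeded(f"{model.name} would cost ${cost:.4f}")
                self.spent += cost
                return model
        raise ValueError(f"no model handles complexity {complexity}")


router = Router(CATALOG, budget=1.0)
easy = router.route(complexity=1, est_tokens=2_000)
hard = router.route(complexity=5, est_tokens=20_000)
print(easy.name, hard.name, round(router.spent, 4))
```

Here the simple task goes to the cheapest model and only the hard task reaches the frontier model. A production system would also need a way to estimate task complexity and Token usage up front, which this sketch simply takes as inputs.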

This assessment closely matches Microsoft's actual actions: introducing low-cost models as alternatives is essentially building a model routing mechanism, rather than simply "switching to a cheaper model".

Nadella's Warning: Whoever Owns the Learning Loop Owns Sovereignty

Microsoft CEO Satya Nadella recently offered a broader framework that puts these trends in strategic context.

Nadella stated that every company must build what he calls "Token capital" and "human capital"—the former refers to the enterprise’s own AI capabilities and systems, the latter to employees' knowledge, relationships, and judgment. He defines both as core assets in the AI economy, and emphasizes that the value of human capital does not fall as Token capital grows: "Without human direction, you’re just making compute spin in place."

He made clear that the real opportunity lies not in choosing the strongest model, but in building a continuous learning loop on top of models, so that human and AI capabilities grow exponentially together. The critical test is whether an enterprise can swap out its underlying base model without losing its accumulated proprietary knowledge and capabilities. "This is the core test for control and sovereignty in the future era."

Nadella also warned that if all value ultimately concentrates in a few dominant models, it will repeat the history of globalization hollowing out industrial economies. He said: "There is no societal license to support an AI future that hollows out the entire industry." He made this statement just as Microsoft is considering open-source alternative models and actively reducing its dependence on a few leading vendors. The tension between the two is worth pondering.

Risk Disclosure and Disclaimer

The market carries risks, and investment should be cautious. This article does not constitute personal investment advice, nor does it consider the specific investment goals, financial status, or needs of individual users. Users should consider whether any opinions, views, or conclusions in this article suit their particular situation. Invest accordingly at your own risk.