From the "Token Race" to "Token Throttling": With a Monthly Per Capita Cost of $7,500, Sky-High Bills Force Tech Giants to Hit the Brakes

```

Enterprise AI spending is reversing direction. Tech giants that once used rankings to push employees to consume large amounts of tokens are now capping AI usage. "Tokenmaxxing" (maximum consumption) is quickly giving way to "tokenminimizing" (maximum saving), and a wave of AI budget controls is sweeping Fortune 500 companies such as AT&T, Meta, Uber, Walmart, and Amazon.

According to The Information, AT&T has begun restricting some employees' access to Microsoft's GitHub Copilot, and Meta is tightening employee spending on Anthropic and other AI service providers, in stark contrast to the scene a few months ago, when employees competed to consume tokens. Bloomberg previously reported that Uber and Walmart have set usage caps on AI programming tools, and according to the Financial Times, Amazon has abolished internal rankings based on employees' AI usage.

The driving force behind this change is rapidly escalating cost pressure. At the enterprises that use AI most intensively, AI spending has reached $7,500 per employee per month. Even though the unit price of tokens continues to fall, agentic tools that call models repeatedly have tripled AI bills, pushing costs beyond what many companies' budgets can absorb.

This shift is redefining who benefits from the AI market. Demand is surging for "gateway" tools and model routers that help companies monitor, limit, and optimize AI spending. Companies like Microsoft, Databricks, and Factory (backed by Nvidia) are seeing new growth opportunities, and software suppliers like Palantir and Snowflake are also viewed as potential beneficiaries of this structural shift.

Shocking bills: Cost overruns reshape budgeting logic

The buildup of cost pressure can be traced through individual cases. Uber is the most extreme so far: the company exhausted its entire annual AI programming budget in April 2026 and has since set a monthly usage cap of $1,500 per employee for each tool. Walmart has set caps on the use of its internal AI agents. Amazon scrapped its usage rankings outright after discovering that employees were consuming large amounts of compute to climb them, driving up costs.

Even at the individual level, the cost is eye-opening. Microsoft found that some engineers spent $500 to $2,000 per month on token fees for Claude Code alone.

The root of the problem lies in the structural change in token consumption brought about by the widespread adoption of agentic tools. Unlike a user manually sending a single instruction, these tools invoke models automatically and repeatedly while completing a task, greatly increasing actual usage. So even as token prices fall, overall enterprise bills remain high.

Diverging responses: Braking or accelerating?

Not all companies are tightening. Box CEO Aaron Levie is pleased that his company never encouraged the behavior in the first place. "We never celebrated tokenmaxxing," he said. "We don't have rankings, so we've never gone astray, never incentivized the wrong behavior."

Databricks takes the opposite stance. Nikita Shamgunov, Databricks' head of engineering, said last week at a Nebius event that the company places no upper limit on engineers' AI budgets, "so tokenmaxxing still exists." This position reflects the view that, for firms confident their employees can use AI efficiently, limiting usage may not be cost-effective.

This divergence reveals the tension inherent in token-saving policies: controlling usage can cut costs, but it may also reduce the productivity gains AI was supposed to deliver, which were the main justification for the spending in the first place.

Infrastructure beneficiaries: Cost control tools see structural demand

The other side of the "token saving" wave is a surge in structural demand for cost control infrastructure.

More and more companies are migrating simple tasks from expensive frontier models to cheaper or open-source alternatives in order to control costs without cutting actual usage. Executives from Palantir and Box say such requests from enterprise customers are rapidly growing.

Infrastructure providers are quickly filling the gap. Microsoft and Databricks have each launched "gateway" tools to help companies monitor employees' AI usage and enforce spending caps. Factory, an AI software developer backed by Nvidia and valued at $1.5 billion, released a new model router earlier this month that automatically assigns low-complexity tasks to lower-cost models.

Microsoft CEO Satya Nadella echoed this trend in a post on X last weekend, arguing that AI models should operate like interchangeable commodities. He wrote: "None of us want to see a world where every company in every industry hands over all value to a handful of 'winner-take-all' models." Notably, this statement comes from the leader of a tech giant whose productivity software is facing competitive pressure from Anthropic and OpenAI, which makes the strategic intent behind it all the more notable.

Microsoft fights on two fronts: New pricing, yet promoting "cost control"

Even as it responds to customers' calls for lower costs, Microsoft this week unveiled the pricing structure of its new flagship AI product, Copilot Cowork, whose billing logic closely resembles the model Anthropic launched earlier.

Copilot Cowork mainly uses Anthropic's models and is designed to automatically complete complex multi-step tasks within Microsoft Office 365 software. For example, a user can send the tool a batch of receipt screenshots and have it automatically generate a spreadsheet itemizing the corresponding expenses. This goes far beyond what the existing 365 Copilot can do (such as summarizing emails or building financial models in Excel).

In terms of pricing, users must first have a 365 Copilot license starting at $30 per month and then pay extra based on actual usage of Copilot Cowork. This "seat fee + consumption fee" hybrid pricing model is identical to the one Anthropic rolled out for enterprise customers earlier this year.

Faced with widespread enterprise concern about soaring AI costs, Microsoft Executive Vice President Charles Lamanna noted in a blog post on Tuesday that customers "can choose to control costs," including by setting employee usage caps for Copilot Cowork. At the same time, Microsoft previewed a feature that would let customers swap the Anthropic models in Copilot Cowork for alternatives from OpenAI or Microsoft's own models, claiming comparable results at lower cost. According to a person familiar with the matter, Microsoft is also testing open-source models to replace Anthropic's model in some scenarios. These moves show that, in the era of "token saving," maintaining product competitiveness while easing customers' cost anxiety has become the central challenge in the enterprise software market.

```