Fuli Luo: The cost black hole of "lobster" (OpenClaw) has surfaced, and Agent frameworks need higher token efficiency

Anthropic’s move to block third-party tools from abusing subscription privileges is uncovering a long-ignored cost crisis in the era of AI Agents.

Two days ago, Anthropic announced it would cut off the channels that let third-party frameworks call Claude through subscription access. Fuli Luo, who leads Xiaomi's MiMo large model, promptly published an in-depth analysis of the move, tying it to the MiMo Token Plan launched three days earlier.

She believes Anthropic’s action is not simply a defensive business move, but a key turning point for the ecosystem at a time when global computing power cannot keep pace with the growth in Agent demand.

Directly affected this time are users of third-party frameworks such as OpenClaw and OpenCode that rely on Claude subscription privileges. Their costs are set to surge, possibly reaching dozens of times their previous rates in the short term.

Yet Luo thinks this pressure will act as a catalyst for better engineering: only when the cost of inefficiency becomes visible will developers truly focus on context management and cache optimization.

The Cost Black Hole Behind Subscription Systems

Luo points out that Claude Code’s subscription system is a clever design for allocating computing power, but she concedes that the scheme is probably not profitable and may even be running at a loss.

The root of the problem lies in how third-party frameworks invoke the service. Take OpenClaw as an example. Its context management is clearly flawed: a single user request is split into multiple low-value tool calls, each sent as a separate API request, and each often carrying a context window of more than 100,000 tokens.

Even with some cache hits, this pattern is extremely wasteful. In extreme cases, it also raises the cache miss rate for other requests.

Luo estimates that, for each query, such frameworks generate several times as many requests as Claude Code’s native framework. At API billing rates, the actual cost may be dozens of times the subscription price. She describes this as "not a gap, but a giant pit."
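To make the arithmetic behind this estimate concrete, here is a minimal sketch of how request count, context size, and cache hits combine into a per-query bill. Everything numeric in it is an assumption for illustration: the per-million-token prices, the call counts, and the cached-token figures are hypothetical and are not taken from Luo's analysis or from any provider's price list. The sketch also ignores any surcharge for writing to the cache.

```python
def request_cost(context_tokens, cached_tokens, output_tokens, prices):
    """Dollar cost of one API request; prices are per million tokens."""
    uncached_tokens = context_tokens - cached_tokens
    total = (
        uncached_tokens * prices["input"]
        + cached_tokens * prices["cached_input"]
        + output_tokens * prices["output"]
    )
    return total / 1_000_000


def query_cost(num_calls, context_tokens, cached_tokens, output_tokens, prices):
    """Cost of one user query that fans out into num_calls similar requests."""
    return num_calls * request_cost(
        context_tokens, cached_tokens, output_tokens, prices
    )


# Hypothetical prices (dollars per million tokens), chosen only for illustration.
HYPOTHETICAL_PRICES = {"input": 3.0, "cached_input": 0.3, "output": 15.0}

# Assumed scenario A: 10 tool calls, 100,000-token context, half of it cached.
wasteful = query_cost(10, 100_000, 50_000, 500, HYPOTHETICAL_PRICES)

# Assumed scenario B: 3 calls, same context size, 90% of it cached.
lean = query_cost(3, 100_000, 90_000, 500, HYPOTHETICAL_PRICES)

print(f"wasteful: ${wasteful:.2f}  lean: ${lean:.2f}  ratio: {wasteful / lean:.1f}x")
```

Under these assumed numbers, the fan-out pattern costs roughly nine times as much per query as the leaner one, even though both handle the same context size. The multiplier comes from two factors compounding: more requests, and a smaller cached share in each.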

AI workshop organizer @newlinedotco commented that the “all you can eat” subscription was a ticking time bomb from the start: third-party harnesses like OpenClaw call the API non-stop and can run up as much as $5,000 in costs, while the subscription is only $200. Official tools like Claude Code are sustainable because they optimize prompt caching.

After the Block: Short-Term Pain and Long-Term Order

Anthropic’s adjustment hasn’t completely closed third-party access. Tools like OpenClaw and OpenCode can still call Claude via API—they’ve just lost the subscription channel.

This difference is crucial. For users accustomed to subscription pricing for these tools, the cost impact is immediate and considerable.

But Luo argues this pain has a corrective effect. It will force framework developers to improve context management, maximize prompt cache hit rates so that already-processed context is reused, and cut wasted token consumption. She describes this process as "pain ultimately turning into engineering order."
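One way to see why context management drives the cache hit rate is a toy model in which a request can reuse only the leading run of tokens it shares with the previous request. The sketch below is a simplification under that assumption: it is not any provider's actual caching logic, and the function names and token lists are hypothetical.

```python
def shared_prefix_len(previous, current):
    """Number of leading tokens that two requests have in common."""
    count = 0
    for a, b in zip(previous, current):
        if a != b:
            break
        count += 1
    return count


def cache_hit_rate(requests):
    """Fraction of all input tokens that repeat the previous request's prefix."""
    total = hits = 0
    previous = []
    for request in requests:
        hits += shared_prefix_len(previous, request)
        total += len(request)
        previous = request
    return hits / total if total else 0.0


system_prompt = ["sys"] * 4

# Append-only history: each request extends the previous one at the end.
append_only = [
    system_prompt + ["user"],
    system_prompt + ["user", "tool_1"],
    system_prompt + ["user", "tool_1", "tool_2"],
]

# Same history, but a changing value (for example a timestamp) sits at the front.
volatile_prefix = [
    ["ts_1"] + system_prompt + ["user"],
    ["ts_2"] + system_prompt + ["user", "tool_1"],
    ["ts_3"] + system_prompt + ["user", "tool_1", "tool_2"],
]

print(f"append-only:     {cache_hit_rate(append_only):.2f}")
print(f"volatile prefix: {cache_hit_rate(volatile_prefix):.2f}")
```

In this toy model the append-only history reuses most of its tokens, while a single changing token at the front of the context leaves nothing to reuse. Keeping the early part of the context stable is therefore one plausible reading of what "maximizing prompt cache hit rates" asks of framework developers.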

She also cautions large language model companies not to compete blindly on price before they understand the cost structure of their coding plans. Selling tokens cheaply while keeping the door open to third-party frameworks looks user-friendly but is actually a trap, one that Anthropic has just climbed out of.

Furthermore, she points out that if users pour effort into low-quality frameworks, unstable inference services, or downgraded models and get nothing in return, user experience and retention will suffer real harm.

On this, AI engineer @karpathy noted:

“Great software often emerges under restriction. If tokens are free, nobody would bother writing concise prompts or studying context compression; as cost becomes the bottleneck, developers are forced to think seriously about building 'smart' Agents.”

MiMo Token Plan: A Different Path

While analyzing Anthropic’s actions, Luo also explained the design logic behind the MiMo Token Plan.

The plan supports calls from third-party frameworks and charges against a token quota, which follows the same logic as Claude’s newly introduced overage packs.

Luo emphasizes MiMo’s goal is "to deliver high-quality models and services steadily over time—not to lose users after impulsive payment."

This expresses a different philosophy of computing power allocation from subscription systems: constraining user and framework behavior with real usage costs, rather than managing abuse risks through closed channels.

Efficiency Competition, Not Compute Consumption

At the end of her article, Fuli Luo offers a broader view: global computing supply can no longer keep up with surging token demand from Agents.

In her opinion, the solution is not about lowering token prices further, but the co-evolution of “Agent frameworks with higher token efficiency” and “more powerful, efficient models.”

Anthropic’s move, whether by design or not, is pushing the entire ecosystem, open and closed alike, in this direction.

"The era of Agents doesn’t belong to those who burn the most computing power, but to those who use it smartest," writes Luo.
