Shrimp Farmers Start Keeping Close Tabs on Costs

04/09 2026

Over the past two months, OpenClaw, or the "lobster," has become wildly popular.

Even ordinary people who don't know how to code are willing to spend 699 RMB to have someone install OpenClaw for them, just to experience the thrill of having AI do their work.

But the hype faded almost as quickly as it arose.

In April, when you search for "OpenClaw" again, the top results are no longer tutorials and praise but laments like "The lobster craze is over" and "It burns too many tokens."

Recent heavy blows from major domestic and international companies have brought all shrimp farmers back to reality.

The first blow came from Anthropic.

On April 4, Anthropic suddenly cut off access to third-party agents like OpenClaw through Claude subscriptions. Want to keep using it? Sure, switch to API keys and pay per token.

The second blow came from Luo Fuli, head of Xiaomi's MiMo large model.

On April 6, she posted on X, calling third-party agents like OpenClaw a "false token orgy" and criticizing their wasteful consumption of computing power.

Both pieces of news point to the same fact: The lobster is creating a token black hole that no one can plug.

After the nationwide craze, how much does it really cost to raise a lobster? Can the cost black hole be solved?

01

The Pricey Lobster

"With a monthly salary of 20,000 RMB, I can't afford to raise a single lobster."

"Raising shrimp is fun until the bill arrives."

These widely circulated jokes reflect the sentiments of early shrimp farmers.

A small business owner described his experience: His team of five shared one OpenClaw instance, setting it to automatically execute test cases and conduct code reviews.

At the end of the first month, they expected a cost of $100 but received a bill close to $800.

"The scariest part is that you have no idea where the money went," he shared on social media. "It's like having an invisible faucet dripping water at home."

This is not an isolated case because OpenClaw's billing is highly deceptive.

When you send a simple instruction to OpenClaw, like "Help me optimize this code," you might think you're making a single API call. But in reality, it could trigger multiple independent model requests in the background:

The first parses the intent, the second generates task steps, the third calls tools to analyze the code, the fourth generates a response, and the fifth creates a title and tags for the conversation and may even suggest a few follow-up questions.

Users see only one response, but the bill silently piles up in the background.
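The fan-out described above can be made concrete with a small tally. This is only an illustrative sketch: the five request purposes come from the description above, but every token count below is a made-up placeholder, not a measurement of OpenClaw.

```python
from dataclasses import dataclass


@dataclass
class ModelRequest:
    purpose: str
    input_tokens: int   # hypothetical figure, for illustration only
    output_tokens: int  # hypothetical figure, for illustration only


def total_tokens(requests):
    """Sum input and output tokens over every background request."""
    return sum(r.input_tokens + r.output_tokens for r in requests)


# One user instruction, five background requests (token counts are assumptions).
fan_out = [
    ModelRequest("parse intent", 2_000, 100),
    ModelRequest("generate task steps", 2_100, 300),
    ModelRequest("call tools to analyze code", 6_000, 500),
    ModelRequest("generate response", 6_500, 800),
    ModelRequest("title, tags, follow-up questions", 7_300, 150),
]

# What a user might expect to pay for: one prompt in, one answer out.
single_call_guess = 2_000 + 800
actual = total_tokens(fan_out)

print(f"{len(fan_out)} requests, {actual} tokens billed")
print(f"about {actual / single_call_guess:.1f}x the single-call guess")
```

The multiplier grows because each later request typically resends the accumulated context as input, which is why the bill can diverge so far from what the visible reply suggests.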

Even more insidious is the Heartbeat mechanism.

To maintain contextual coherence, OpenClaw sends a "check for new instructions" request to the model every 30 minutes by default. If left running in the background all day, it will automatically generate dozens of API calls even without any user instructions.
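As a rough check on the "dozens of API calls" figure, the arithmetic can be sketched as follows. The 30-minute interval is the default cited above; the tokens-per-heartbeat value is a purely hypothetical placeholder, since the real figure depends on how much context each heartbeat resends.

```python
MINUTES_PER_DAY = 24 * 60


def heartbeat_calls_per_day(interval_minutes: int = 30) -> int:
    """Number of heartbeat requests sent in 24 hours of idle running."""
    return MINUTES_PER_DAY // interval_minutes


def idle_tokens_per_day(interval_minutes: int, tokens_per_heartbeat: int) -> int:
    """Tokens consumed per day by heartbeats alone, with no user instructions."""
    return heartbeat_calls_per_day(interval_minutes) * tokens_per_heartbeat


print(heartbeat_calls_per_day(30))      # 48 calls per day
print(idle_tokens_per_day(30, 5_000))   # 5_000 tokens per heartbeat is an assumption
```

At the default interval, an idle instance makes 48 calls a day, or roughly 1,440 over a 30-day month, before the user has asked for anything.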

Luo Fuli calls this phenomenon "wasteful token consumption": poor context management, low cache hit rates, and a large amount of low-value redundant computation in multi-turn dialogues.

In her view, tools like OpenClaw are like prototype vehicles without engineering optimization, consuming ten times more fuel per mile than normal vehicles.

Such consumption is barely acceptable for casual users: Occasionally assigning a few tasks, organizing documents, researching, or writing reports might cost a few hundred thousand tokens per month, amounting to a few to a few dozen RMB—still within the experimental range.

But once usage becomes moderate to heavy, or the lobster is kept on standby 24/7, bills spiral out of control.

02

Vendor Quota Games

Users finding shrimp farming expensive is just the tip of the iceberg.

Beneath the surface, vendors are facing an even harsher reality—they're bleeding money faster than expected.

Anthropic's subscription model, originally a perk for casual users to "browse the web and chat," was completely overwhelmed by high-intensity agents like OpenClaw.

For example, Claude Max's $200 monthly subscription was used to run lobsters 24/7, burning through $5,000 worth of computing power.

Industry analysts estimate that the API usage OpenClaw users generate through subscription arbitrage is worth more than five times what they actually pay.

With developers worldwide using lobsters to fleece Anthropic, structural losses became inevitable.

Thus, Anthropic's ban came swiftly and harshly: All Claude Pro, Max, and Free subscriptions were cut off from OpenClaw overnight. Users now have to use APIs and pay per token.

Google is similarly cracking down on abusive accounts, refusing to cover the lobster's high consumption.

In contrast, domestic vendors recognized the lobster's true nature earlier and introduced tiered quota packages with clear pricing, replacing "unlimited" token consumption with "monthly maximums."

For example, Alibaba Cloud's Coding Plan initially offered a Lite package for 7.9 RMB in the first month (regularly 40 RMB/month) with 18,000 API calls per month; the Pro package cost 39.9 RMB in the first month (regularly 200 RMB/month) with 90,000 calls per month.

Tencent Cloud, Baidu Intelligent Cloud, and Volcano Engine adopted aggressive customer acquisition strategies, offering new users Lite packages at 7.9 to 9.9 RMB for the first month.

In designing these packages, vendors further imposed limits on computing power consumption.

For instance, Alibaba Cloud's documentation states that the Coding Plan is suitable for interactive programming tools like Claude Code and OpenClaw.

However, using these keys for automation scripts, custom backends, or batch calls constitutes abuse and may result in subscription suspension or termination.

In other words, Alibaba Cloud prevents users from turning low-cost subscriptions into sustained agent consumption through usage restrictions.

Tencent Cloud's Lite package is capped at 1,200 calls per 5 hours, 9,000 per week, and 18,000 per month; the Pro package is capped at 6,000 calls per 5 hours, 45,000 per week, and 90,000 per month.

Users face not only monthly call caps but also "5-hour" and weekly quotas.
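To see how the shortest window can bite long before the monthly cap does, here is a minimal sketch of a multi-window quota check using the Lite figures quoted above. The `QuotaTracker` class is hypothetical, and treating the limits as rolling windows (and a month as 30 days) is an assumption; the vendors' actual reset rules may differ.

```python
from collections import deque

HOUR = 3600  # seconds

# Window length in seconds -> maximum calls, using the quoted Lite limits.
# Rolling windows and a 30-day month are assumptions made for this sketch.
LITE_LIMITS = {5 * HOUR: 1_200, 7 * 24 * HOUR: 9_000, 30 * 24 * HOUR: 18_000}


class QuotaTracker:
    def __init__(self, limits):
        self.limits = limits
        self.calls = deque()  # timestamps of accepted calls, oldest first

    def _count_since(self, cutoff):
        return sum(1 for t in self.calls if t > cutoff)

    def try_call(self, now):
        """Accept the call only if every window still has room."""
        longest = max(self.limits)
        # Forget calls that have aged out of even the longest window.
        while self.calls and self.calls[0] <= now - longest:
            self.calls.popleft()
        for window, limit in self.limits.items():
            if self._count_since(now - window) >= limit:
                return False
        self.calls.append(now)
        return True


# Simulate an agent that makes one model call every 10 seconds.
tracker = QuotaTracker(LITE_LIMITS)
accepted = 0
t = 0
while tracker.try_call(t):
    accepted += 1
    t += 10

print(f"blocked after {accepted} calls, {t / HOUR:.2f} hours in")
print(f"monthly calls still unused: {18_000 - accepted}")
```

In this simulation the agent is cut off a little over three hours in, with most of the monthly allowance untouched, which is the pattern an always-on agent would run into.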

Interestingly, domestic vendors continue adjusting their offerings to better suit shrimp farming.

Once April began, Alibaba Cloud's "best value" shrimp farming package sold out.

The discounted 7.9 RMB/first-month Lite package was fully withdrawn, and the 200 RMB/month Pro package showed as out of stock.

Some users complained in developer communities: "Just as I got serious about shrimp farming, the food ran out."

In Tencent Cloud's community, a migration plan from Coding Plan to Token Plan emerged in late March:

Lite package: 39 RMB/month for 35 million tokens; Standard package: 99 RMB/month for 100 million tokens;

Pro package: 299 RMB/month for 320 million tokens; Max package: 599 RMB/month for 650 million tokens.

The marketing copy explicitly states these are "new exclusive subscription packages upgraded for lobster scenarios."
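The Tencent Cloud Token Plan tiers listed above are easier to compare on unit price. The short calculation below uses only the prices and token volumes quoted in this article; the dictionary and function names are illustrative.

```python
# Tier name -> (price in RMB per month, included tokens in millions),
# as quoted for the Tencent Cloud Token Plan.
TENCENT_TOKEN_PLAN = {
    "Lite": (39, 35),
    "Standard": (99, 100),
    "Pro": (299, 320),
    "Max": (599, 650),
}


def rmb_per_million(price_rmb: float, million_tokens: float) -> float:
    """Unit price: RMB paid per one million tokens included in the tier."""
    return price_rmb / million_tokens


for name, (price, volume) in TENCENT_TOKEN_PLAN.items():
    print(f"{name}: {rmb_per_million(price, volume):.2f} RMB per million tokens")
```

The unit price falls from about 1.11 RMB per million tokens on Lite to about 0.92 on Max, so the volume discount across tiers is fairly modest.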

Xiaomi MiMo launched four Token Plan tiers on April 3:

Lite: 39 RMB/month, Standard: 99 RMB/month, Pro: 329 RMB/month, Max: 659 RMB/month, corresponding to 60 million, 200 million, 700 million, and 1.6 billion Credits, respectively, with no "5-hour token usage limits."

A notable change is the shift from selling tokens to selling Credits, indicating Xiaomi's intent to sell not just model calls but comprehensive computing budgets for different capability tiers, context lengths, and usage intensities.

These changes show that domestic vendors are meticulously calculating and treating "lobster farming" as a standalone business.

03

When Quotas Clash with Demand

Despite domestic vendors using subsidies to capture the market with various computing packages, user feedback remains mixed when these clash with real needs.

A product manager at a tech company recorded his actual costs:

In the first week, he used Tencent Cloud's Lite package, which quickly ran out of quota. Upgrading to Pro didn't help either, as quotas depleted rapidly. Subsequent API overage fees and cloud service costs exceeded 3,000 RMB in less than a week.

"It's more expensive than hiring an intern," he complained in an interview.

A front-end developer posted in a developer community that he used OpenClaw with MiMo-V2-Pro for code reviews, automatic changelog generation, and minor bug fixes but quickly burned through his 200 million Credit quota.

"I bought the Standard package thinking it would last half a month, but I got a quota warning in days," he wrote. "Now I'm manually counting remaining tasks, feeling the same anxiety as when my phone data is almost used up."

What worries users even more than insufficient quotas is that they cannot calculate their costs.

Domestic vendors widely adopt Token-Credit conversion systems, but the conversion rates are labyrinthine; additionally, consumption rates vary by model.

A full-stack engineer wrote in a blog: "I need an Excel spreadsheet to track real-time consumption—it's more exhausting than coding. Using OpenClaw for efficiency turned me into a cost accountant."

This uncertainty directly suppresses usage depth.

When users hesitate before clicking "send," wondering if it will burn half their task quota, OpenClaw degrades from a productivity tool into a cost experiment.

Another common complaint is that quotas exist, but tasks remain unfinished.

Posts in Volcano Engine's developer community show that its Coding Plan imposes not just monthly but also 5-hour and weekly quotas. Once one is triggered, the system alerts, "5-hour usage quota exceeded."

For long tasks, multi-round calls, and expanding contexts—common for lobster-like agents—this restriction is crippling. Packages may show remaining balance, but tasks get interrupted first.

This frustration is particularly prevalent among SMEs.

When OpenClaw evolves from a personal toy into a team tool, the problem with quota packages shifts from controlling costs to whether tasks get completed at all.

Of course, not all users are dissatisfied with vendor packages.

For lighter tasks with flexible completion requirements, domestic low-cost packages seem ridiculously cheap.

Some users mention that Alibaba Cloud's former 7.9 RMB/month Lite package gave them 700 million tokens for OpenClaw, yet they used only 40% of the quota.

Others say Tencent Cloud's Coding Plan "can be pushed hard for 5 hours without exceeding 50%."

Behind these mixed emotions lies a vendor dilemma.

The consumption of lobster-like agents is inherently unpredictable. When vendors try to solve this cost black hole with quota packages, users perceive not cost control but capability castration.

A developer's Zhihu post summing up this sentiment gained many upvotes: "Foreign giants choose bans, admitting they can't do this business; domestic vendors choose quotas, pretending they can but handcuffing you. Neither is a good answer."

04

Solving the Cost Black Hole

The real answer may lie in tech communities.

When vendor packages fall short, users are forced to become their own "cost accountants."

Community users have explored several practical cost-saving strategies.

The first is model tiering + intelligent routing.

A developer behind the ClawEasy optimization tool proposed an intelligent model routing scheme: Configure OpenClaw with three-tier routing for routine, complex, and uncertain tasks.

For example, daily planning, tool calls, and simple code go to cheap, fast models; only cross-module refactoring and architectural decisions use pricier models. The key is letting cheap models try first.
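The routing idea can be sketched in a few lines. This is not the ClawEasy implementation, which the article does not describe in detail; the task labels, the `route` function, and the stub models are all hypothetical, and a "model" here is simply any callable that returns an answer or `None` when it cannot produce one.

```python
from typing import Callable, Optional, Tuple

# A model is any callable returning an answer, or None when it gives up.
# Real code would wrap actual model clients; these names are assumptions.
Model = Callable[[str], Optional[str]]

ROUTINE = {"daily_planning", "tool_call", "simple_code"}
COMPLEX = {"cross_module_refactor", "architecture_decision"}


def route(task_type: str, prompt: str, cheap: Model, premium: Model) -> Tuple[str, str]:
    """Return (tier_used, answer) for one task."""
    if task_type in COMPLEX:
        # Known-hard work goes straight to the pricier model.
        return "premium", premium(prompt) or ""
    # Routine and uncertain tasks: let the cheap model try first.
    answer = cheap(prompt)
    if answer is not None:
        return "cheap", answer
    # Escalate only when the cheap model could not handle it.
    return "premium", premium(prompt) or ""


# Stub models for demonstration only.
def cheap_stub(prompt: str) -> Optional[str]:
    return None if "unclear" in prompt else f"cheap:{prompt}"


def premium_stub(prompt: str) -> Optional[str]:
    return f"premium:{prompt}"


print(route("simple_code", "rename variable", cheap_stub, premium_stub))
print(route("architecture_decision", "split service", cheap_stub, premium_stub))
print(route("unknown", "unclear request", cheap_stub, premium_stub))
```

The design choice worth noting is that uncertain tasks default to the cheap tier and pay for the premium model only on failure, so a misclassified easy task costs little.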

More aggressive users deploy local models as the first line of defense.

Running DeepSeek-R1 locally via Ollama lets simple tasks run at "zero token cost," with cloud APIs called only when the local model fails.

A developer who implemented this hybrid architecture calculated: Replacing a pure cloud solution costing $800/month with 70% local + 30% cloud reduced actual spending to $240/month, with response latency dropping from 800ms to 60ms. The initial hardware investment was $2,000, but it paid off in three months.
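The developer's figures can be rechecked with simple arithmetic. The sketch below assumes local inference has no marginal cost (electricity and maintenance are ignored), which is a simplification; the function names are illustrative.

```python
def hybrid_monthly_cost(cloud_only_cost: float, cloud_share: float) -> float:
    """Monthly cloud spend when only `cloud_share` of the work stays in the cloud.

    Assumes the locally handled share costs nothing per request.
    """
    return cloud_only_cost * cloud_share


def payback_months(hardware_cost: float, old_monthly: float, new_monthly: float) -> float:
    """Months of savings needed to recover the upfront hardware cost."""
    return hardware_cost / (old_monthly - new_monthly)


new_cost = hybrid_monthly_cost(800, 0.30)
print(f"hybrid cost: ${new_cost:.0f}/month")
print(f"payback: {payback_months(2000, 800, new_cost):.1f} months")
```

With the quoted numbers, a 30% cloud share of $800 does give $240 per month, and the $560 monthly saving recovers $2,000 in about 3.6 months, slightly longer than the three months quoted.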

The second strategy is context engineering, which targets the Achilles' heel Luo Fuli highlighted.

In GitHub communities, Work-Fisher's context compression project gained traction.

This front-end developer found that OpenClaw's default dialogue format was filled with semantic redundancies: polite phrases, repetitive confirmations, and over-explanations—human-friendly but costly for models.

His compression scheme converts dialogues into pure structured formats and enables prompt caching: Reused system instructions and history records raise cache hit rates, reducing call costs by 70%.

Reddit users report that cache hit rates stabilize at 91%-95% for Claude-series models; without caching, overall spending increases about fivefold.
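A back-of-the-envelope model shows why cache hit rate dominates the bill. The sketch below assumes cached input tokens are billed at 10% of the normal input price and ignores output tokens and any cache-write surcharge; that ratio is an assumption for illustration, and actual pricing varies by vendor and model.

```python
def relative_input_cost(hit_rate: float, cached_price_ratio: float = 0.1) -> float:
    """Input cost with caching, as a fraction of the uncached input cost.

    Uncached tokens pay full price; cached tokens pay `cached_price_ratio`.
    The 0.1 default is an assumed discount, not a quoted price.
    """
    return (1 - hit_rate) + hit_rate * cached_price_ratio


def hit_rate_for_saving(saving: float, cached_price_ratio: float = 0.1) -> float:
    """Cache hit rate needed to cut input cost by the fraction `saving`."""
    return saving / (1 - cached_price_ratio)


for h in (0.91, 0.95):
    cost = relative_input_cost(h)
    print(f"{h:.0%} hits: {cost:.3f} of uncached cost, uncached is {1 / cost:.1f}x")

print(f"hit rate needed for a 70% saving: {hit_rate_for_saving(0.70):.0%}")
```

Under that assumed discount, hit rates of 91% to 95% make uncached input spending roughly 5.5 to 6.9 times higher, the same order of magnitude as the fivefold figure users report, and a 70% cost reduction corresponds to a hit rate of about 78%.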

Currently, Alibaba Cloud, MiniMax, and others support automatic caching, transforming agents from token-guzzlers into token-savers.

The third strategy is monitoring dashboards to prevent OpenClaw heartbeat disasters.

Solutions include setting hard quota limits that stop operations when reached, built-in monitoring dashboards for real-time consumption tracking, and timeout mechanisms that cap maximum steps per task and revert to manual control when stuck.

User feedback is direct: "I used to wake up to find heartbeat burns costing hundreds overnight; now I can sleep."
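The hard limits and step caps mentioned above can be sketched as a small guard object. This is a minimal illustration, not an OpenClaw feature: `BudgetGuard`, `BudgetExceeded`, and `run_task` are hypothetical names, and each step's token cost is passed in directly rather than read from a real API response.

```python
class BudgetExceeded(Exception):
    """Raised when a task would cross its token quota or step cap."""


class BudgetGuard:
    def __init__(self, max_tokens: int, max_steps: int):
        self.max_tokens = max_tokens
        self.max_steps = max_steps
        self.tokens_used = 0
        self.steps_taken = 0

    def charge(self, tokens: int) -> None:
        """Record one model call, refusing it if a limit would be crossed."""
        if self.steps_taken + 1 > self.max_steps:
            raise BudgetExceeded("step limit reached")
        if self.tokens_used + tokens > self.max_tokens:
            raise BudgetExceeded("token quota reached")
        self.steps_taken += 1
        self.tokens_used += tokens


def run_task(step_costs, guard: BudgetGuard):
    """Run steps until done or the guard trips; return (completed_steps, status)."""
    done = 0
    try:
        for cost in step_costs:
            guard.charge(cost)
            done += 1
    except BudgetExceeded as exc:
        # Stop the agent and hand control back to a human.
        return done, f"manual control: {exc}"
    return done, "finished"


guard = BudgetGuard(max_tokens=10_000, max_steps=5)
print(run_task([3_000, 3_000, 3_000, 3_000], guard))
print(guard.tokens_used)
```

Checking the limit before recording the call means the guard stops at the last affordable step rather than after the overshoot, which is what keeps an unattended overnight run bounded.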

These self-help solutions work but impose hidden costs: Users become their own cost engineers.

A developer who implemented all these optimizations admitted in a blog: "I now spend 30 minutes daily monitoring token consumption, adjusting model routing parameters, and clearing historical contexts. The savings from OpenClaw became my time cost. Is this progress or regression?"

There's no standard answer.

But one thing is clear: As OpenClaw evolves from an out-of-the-box tool into a system requiring precise tuning, its user base is quietly stratifying:

Casual users quit due to quota limits, returning to AI web versions;

Moderate users struggle between vendor packages and optimization tricks, becoming cost accountants;

Heavy users are forced into architectural overhauls, becoming true cost engineers.

People are still raising shrimp, but the approach has shifted from casual consumption to meticulous calculation.

05

Conclusion

The lobster industry is not cooling down.

The discussions about packages, quotas, caching, and heartbeats precisely prove one thing: users have transitioned from being early adopters to regular users.

For any tool to evolve from a geek's toy into infrastructure, there comes a moment when someone starts calculating costs seriously.

The lobster industry is currently at that moment.
