Why Has There Been a Sharp Decline in the Weekly Utilization Volume of China’s AI Large Models?

April 30, 2026

Since the beginning of the year, the utilization volume of China’s AI large models has surged, fueled by the "lobster craze." However, a notable and sudden drop in weekly utilization has recently emerged. What is driving this trend? Why have users seemingly lost interest?

I. Why Has There Been a Sharp Decline in the Weekly Utilization Volume of China’s AI Large Models?

According to recent data from OpenRouter, as analyzed by National Business Daily, the total global utilization volume of AI large models last week (April 13-19) reached 20.6 trillion tokens, marking the second consecutive week of decline.

Among the leading AI large models, China’s models saw their weekly utilization drop to 4.441 trillion tokens, also declining for two weeks in a row, with a 23.77% decrease from the previous week. In contrast, U.S. AI large models recorded a weekly utilization volume of 4.908 trillion tokens, up by 20.62% week-over-week. For the first time in nearly two months, U.S. models surpassed China’s in weekly utilization.
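As a quick sanity check on these figures, the prior week's volumes can be back-calculated from the reported week-over-week changes. The short sketch below does exactly that; the baselines are inferred from the percentages, not taken from the source data:

```python
# Back-calculate the prior week's volumes from the reported week-over-week
# changes. Figures are in trillions of tokens; the baselines are inferred
# from the percentages, not taken from the source data.

cn_current, cn_change = 4.441, -0.2377   # China: 4.441T tokens, -23.77% WoW
us_current, us_change = 4.908, +0.2062   # U.S.:  4.908T tokens, +20.62% WoW

cn_previous = cn_current / (1 + cn_change)   # ~5.83T the week before
us_previous = us_current / (1 + us_change)   # ~4.07T the week before

print(f"China: {cn_previous:.3f}T -> {cn_current}T ({cn_change:+.2%})")
print(f"U.S.:  {us_previous:.3f}T -> {us_current}T ({us_change:+.2%})")
```

The implied baselines, roughly 5.83 trillion tokens for China versus 4.07 trillion for the U.S. the week before, are consistent with the claim that U.S. models had trailed China's until this week.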

National Business Daily reported that among the top nine models globally by utilization volume last week, four were Chinese AI large models. DeepSeek V3.2 ranked second, with a weekly utilization volume of 1.28 trillion tokens. MiMo-V2-Pro came in fourth, with 1.15 trillion tokens, a 90% increase week-over-week. Two models from MiniMax also made the list: MiniMax M2.5 ranked sixth, with 1.05 trillion tokens, while MiniMax M2.7 ranked seventh, with 0.961 trillion tokens, a 19% decrease week-over-week.

According to a report by Phoenix Finance, the surge in token consumption serves as the most direct indicator of AI application expansion. Data from China’s National Bureau of Statistics reveals that the average daily domestic token utilization exceeded 140 trillion in March, a more than 40% increase from the end of last year. In scenarios with high token consumption, such as intelligent agents, cost advantages make domestic models more appealing to price-sensitive developers.
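Taken at face value, that growth figure implies a baseline of at most about 100 trillion tokens per day at the end of last year. The calculation below is a back-of-the-envelope check, not sourced data:

```python
# Back out the implied end-of-last-year baseline from the reported March
# figure: 140 trillion tokens/day, described as a more-than-40% increase.

march_daily = 140e12   # tokens per day in March (reported)
min_growth = 0.40      # reported lower bound on the growth rate

baseline = march_daily / (1 + min_growth)
print(f"Implied baseline: at most {baseline / 1e12:.0f} trillion tokens/day")
# -> at most 100 trillion tokens/day at the end of last year
```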

II. Why Have Users Suddenly Stopped Using China’s AI Large Models?

The recent news of a sharp decline in the weekly utilization volume of China’s AI large models has attracted significant attention within the industry. Many are puzzled: Why have large models, once considered a cutting-edge technological frontier, suddenly fallen out of favor? What are the underlying reasons?

First, the decline in utilization essentially reflects price leverage reasserting itself. In the early stages, the AI large model market was still nascent. To expand market share rapidly and attract developers, companies generally adopted low-price or even free strategies. This approach yielded immediate results, drawing in large numbers of developers and producing explosive growth in utilization volumes. However, market conditions are constantly evolving. As the technology matured and the market stabilized, domestic large model companies began adjusting their strategies by raising prices. After all, the internet-era playbook of offering a service free at first and charging later has become commonplace, applying to everything from sharing-economy services to large-model products.

While the price hikes are logical, the increased costs have undoubtedly placed a heavy burden on developers. For many small and medium-sized enterprises (SMEs) and individual developers with limited financial resources, high utilization costs have forced a reevaluation of development plans and budgets. Where large models could once be used freely for experimentation and development, teams now proceed far more cautiously. A friend's company recently faced exactly this situation: earlier in the year, the company required everyone to deploy their own lobster applications, but it later found the token costs unsustainable and quickly issued a notice requiring employees to report in advance before using lobster applications and to account for token consumption against actual output.

After all, in the business world, the return on investment (ROI) is a critical factor that every decision-maker must consider. When the cost of utilizing large models exceeds the expected benefits, willingness to pay naturally declines, leading to a reduction in utilization volumes. This change is an inevitable result of market forces and reflects the profound impact of corporate strategy adjustments on the downstream industrial chain at different stages of market development.

Second, shifts in application patterns have triggered a sharp increase in token consumption. The profound transformation of AI application scenarios has simultaneously raised the utilization costs and technical barriers of large models, compelling developers to adopt a more cautious approach. This is an inevitable challenge of technological iteration. In the past, AI applications were concentrated in traditional question-and-answer interactions: the scenarios were simple, the logic was clear, and the efficiency and resource consumption of large models were relatively controllable, with each conversation typically consuming only a small number of tokens.

However, since this year, with the rise of intelligent agents exemplified by OpenClaw, AI applications have shifted from passive responses to proactive task completion. This leap in scenario complexity has fundamentally altered the utilization patterns of large models. Intelligent agents need to handle complex task chains involving multi-round reasoning, dynamic decision-making, and cross-scenario collaboration. Behind each task execution lies an exponential increase in token consumption, meaning that utilization costs no longer rise linearly but explode exponentially. The author has personally experienced this multiple times, where even a few simple tasks quickly consumed millions of tokens in a short period, incurring costs ranging from tens to hundreds of dollars.
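To see why agent workloads burn tokens so much faster than ordinary Q&A, consider a rough, purely illustrative model: each reasoning step re-sends the accumulated context (prompt, prior reasoning, tool outputs), so total consumption grows superlinearly with the number of steps. Every number below is a hypothetical assumption, not measured data:

```python
# Rough, illustrative comparison of token consumption: one Q&A turn versus a
# multi-step agent that re-sends its accumulated context at every step.
# Every parameter below is a hypothetical assumption, not measured data.

PRICE_PER_M_TOKENS = 2.0   # assumed blended price, USD per million tokens

def qa_tokens(prompt=500, answer=500):
    """A single question-and-answer exchange."""
    return prompt + answer

def agent_tokens(steps, base_context=2_000, per_step_output=1_500):
    """Each step re-reads the whole accumulated context (prompt, prior
    reasoning, tool outputs), so input size grows with every step."""
    total, context = 0, base_context
    for _ in range(steps):
        total += context + per_step_output   # this step's input + output
        context += per_step_output           # output is appended to context
    return total

print(f"single Q&A turn: {qa_tokens():,} tokens")
for steps in (1, 10, 50):
    t = agent_tokens(steps)
    cost = t / 1e6 * PRICE_PER_M_TOKENS
    print(f"{steps:>2}-step agent task: {t:>9,} tokens (~${cost:.2f})")
```

Under these assumptions, a 50-step task already exceeds two million tokens while a single Q&A turn stays around a thousand, a gap that matches the order of magnitude of the costs described above.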

For developers, this cost pressure has far exceeded expectations. Even teams with strong technical capabilities must weigh the ROI and no longer dare to use large models as freely for trial-and-error exploration. The technological shift brought about by intelligent agents has, in effect, moved large models from "easy to use" to "unaffordable." This imbalance between cost and efficiency has directly suppressed developers' enthusiasm, making the decline in utilization volumes inevitable.

Third, the market's novelty has faded, and traffic has returned to rational levels. The decline in popularity of new applications like intelligent agents essentially reflects the market's spontaneous filtering of "pseudo-needs" and a return to value after the novelty effect wears off, confirming the objective laws of industrial development. When OpenClaw intelligent agents first emerged, their novelty in autonomously completing tasks quickly ignited the market. Both individual users and enterprise developers viewed them as the ultimate form of AI implementation, flocking to experience them. This explosive attention brought short-term traffic peaks and drove up the utilization volumes of large models.

However, as usage deepened, most people gradually realized that intelligent agents are not universal tools for everyone. Their use requires a certain level of technical expertise and scenario adaptation. For ordinary users, the operational barriers are too high. Currently, the problems that OpenClaw can solve are still relatively basic. Even the recently popular "Hermes for horse breeding" is not suitable for everyone. For most enterprises, the implementation costs do not match the actual benefits. Both OpenClaw and Hermes can only meet the needs of a few specific scenarios, making it difficult to achieve large-scale applications. This gap between ideal and reality has quickly dissipated the novelty effect of intelligent agents, causing traffic and utilization volumes to naturally return to rational levels.

Fourth, the logic of the AI large model industry is undergoing a comprehensive reshaping. As the market enters a new development stage, a full return to rationality has become an irreversible trend. For domestic large model companies, finding a balance between price and traffic has become a critical issue determining their long-term development.

If companies blindly pursue high prices, they may achieve higher profits in the short term but risk losing developers and users, leading to a decline in utilization volumes and ultimately affecting their market share and brand influence. Conversely, if companies overly reduce prices to attract traffic or continue using low-price or free strategies, they may increase utilization volumes to some extent but could face issues like excessive cost pressures and insufficient profitability, which are detrimental to sustainable development.

Fortunately, there is also new hope. The newly released DeepSeek V4 large model has not raised prices across the board; instead, it has cut them against the market trend. The core logic is that DeepSeek has driven down its costs at scale through technical optimizations, such as sparse attention architectures and mixture-of-experts designs, which sharply reduce the compute consumed per inference. This points to the inevitable direction of AI large model development. Physically, leveraging the cheap green energy and low ambient temperatures of central and western China, which cut cooling costs, can significantly reduce computing costs and thereby lower prices. Technologically, innovations that optimize the full inference pipeline can reduce the cost of each inference. This dual-pronged approach is the key to the future.
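For readers unfamiliar with the term, here is a minimal sketch of the mixture-of-experts idea referred to above: a router activates only a top-k subset of expert networks for each token, so expert compute scales with k rather than with the total number of experts. This is a generic illustration, not DeepSeek's actual implementation:

```python
import numpy as np

# Generic mixture-of-experts routing sketch: a router scores the experts and
# only the top-k run for each token, so expert compute scales with k rather
# than with the total number of experts. Illustrative only.

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

router_w = rng.normal(size=(d_model, n_experts))           # router weights
experts = rng.normal(size=(n_experts, d_model, d_model))   # one simplified FFN per expert

def moe_layer(x):
    """x: (d_model,) activation for one token; runs only top_k experts."""
    logits = x @ router_w                               # one score per expert
    chosen = np.argsort(logits)[-top_k:]                # indices of top-k experts
    weights = np.exp(logits[chosen] - logits[chosen].max())
    weights /= weights.sum()                            # softmax over chosen experts
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

y = moe_layer(rng.normal(size=d_model))
print(f"output shape: {y.shape}; active experts: {top_k}/{n_experts} "
      f"(~{top_k / n_experts:.0%} of expert FLOPs per token)")
```

With 2 of 8 experts active, each token touches roughly a quarter of the expert parameters, which is the basic mechanism by which such architectures cut per-token inference cost.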

Therefore, domestic large model companies need to formulate appropriate pricing strategies based on factors such as their technological capabilities, market positioning, and cost structures. These pricing strategies must cover the company's costs, ensure a certain profit margin, and remain competitive in the market to attract developers and users. At the same time, companies must continuously optimize their products and services, improve the performance and stability of large models, reduce token consumption, and provide a better experience for developers and users. Only in this way can they find the optimal balance between price and traffic and ensure their long-term advantages.
