The First AI Inflation We've Experienced: Public Cloud Price Hikes

04/09/2026

Price increases for daily consumer goods such as rice, pork, utilities, and other essentials represent economic inflation that we are all familiar with. However, in 2026, global developers and enterprise users felt the reality of AI inflation for the first time.

Previously, AI cloud computing power had long enjoyed a low-price honeymoon. In January 2026, however, the global cloud giants, led by Google and Amazon AWS, raised prices on AI-related products. Domestic cloud providers then broke with the convention of "operating at a loss for visibility" and only ever cutting prices, and followed suit with AI cloud price hikes: on March 18, Alibaba Cloud announced price increases for core products such as AI computing power and storage; the same day, Baidu Intelligent Cloud raised prices on AI computing-power products; and Tencent Cloud was the first to end the limited-time free public beta of some large models and raise its model-invocation prices.

This round of collective price hikes marks the formal transmission of global computing-power inflation into China's public cloud market.

According to macroeconomic theory, inflation is essentially price adjustment under conditions of supply falling short of demand. However, for a long time in the past, the scarcity of AI computing power was not reflected in cloud service pricing. High-end GPUs were in short supply, and NVIDIA's high-end graphics cards remained expensive in the domestic market, yet cloud providers continued to attract developers through low-priced Token and API services. It can be said that the previous pricing mechanism for GPU clouds did not reflect the true supply and demand for computing power at all.

This raises a new question: why were cloud providers willing to absorb computing-power costs themselves before, but are now choosing to pass cost pressures on to the market, making AI inflation a reality?

Through the act of raising prices, we can understand the changes taking place in the public cloud market.

Many developers report frequent throttling, quota limits, and slower real-time throughput when using the MaaS services offered by model vendors. Sometimes a task assigned to the "lobster" intelligent agent sits unexecuted for half a day, and normal use resumes only after topping up the account, a daily annoyance for many developers.

The inconvenience perceived by developers stems from the imbalance between supply and demand for upstream Tokens, which is the ultimate result of inflation being passed downstream.

On the supply side, prices for high-end chips and high-performance storage skyrocketed in 2025, with supply remaining tight; on the demand side, there was explosive growth in intelligent agent applications, with Token consumption per task increasing by over a hundredfold compared to traditional conversational AI, leading to a significant surge in resource consumption. Additionally, multimodal applications such as video generation, digital humans, and real-time communication became widely popular among the general public in 2025, further exacerbating the demand for Tokens. This is entirely consistent with the logic of inflation in macroeconomics: excessive demand chasing limited resources inevitably leads to price increases.
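The hundredfold gap between agent tasks and traditional conversational AI can be sanity-checked with back-of-envelope arithmetic: a multi-step agent re-sends its growing context on every step and folds tool outputs back in, so consumption compounds per step. The step counts and token sizes below are illustrative assumptions, not measured figures.

```python
# Back-of-envelope estimate (hypothetical numbers) of why an agent task
# consumes roughly a hundredfold more tokens than a single chat turn.

CHAT_TURN_TOKENS = 1_500          # one prompt plus one reply (assumed)

STEPS = 12                        # planning / tool-use iterations (assumed)
CONTEXT_GROWTH_PER_STEP = 2_000   # prior context re-sent each step (assumed)
TOOL_OUTPUT_TOKENS = 1_000        # tool results folded back in (assumed)

# Each step re-sends all accumulated context, so the cost grows linearly
# per step and the total grows quadratically with step count.
agent_tokens = sum(
    CONTEXT_GROWTH_PER_STEP * step + TOOL_OUTPUT_TOKENS
    for step in range(1, STEPS + 1)
)

print(f"chat turn:  {CHAT_TURN_TOKENS:>9,} tokens")
print(f"agent task: {agent_tokens:>9,} tokens")
print(f"ratio:      {agent_tokens / CHAT_TURN_TOKENS:.0f}x")
```

Under these assumptions a single agent task burns about 168,000 tokens, roughly 112 times a chat turn, which matches the order of magnitude the article describes.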

In summary, the collective price hikes by public cloud providers are a self-correction of the pricing mechanism. Over the past two years, demand for computing power has far outstripped supply, while successive price wars steadily squeezed cloud providers' margins. Pricing is now beginning to genuinely reflect hardware and resource costs, and that adjustment constitutes this round of AI inflation.

Developers feel changes in the price level most acutely. One comic-drama author revealed that API invocation costs have risen severalfold. Before the Spring Festival, producing a comic drama with AI cost about 200 yuan; now it costs 300. The increase is not extreme, but it signals that the era of "powering AI with love" — running services at a loss out of goodwill — has definitively ended.

This raises a question: The demand for AI computing power and Tokens has existed since the explosion of large models in 2023. Why were cloud providers able to maintain low prices in the previous two years, but began to abandon the tradition of "only lowering, never raising" prices and actively ended the price war in 2026?

It is worth noting that the cloud providers' hikes are not across the board. Prices for general-purpose basic cloud services, such as ECS general-purpose instances, OSS standard storage, and VPC networking, are still falling. Meanwhile, a domestic cloud provider unaffected by overseas chip costs has likewise raised prices on instances running domestic chips.

This indicates that price wars have not disappeared, and the price hikes for AI products are not solely due to cost pressures.

The core logic of this round of price hikes is to adopt a divide-and-conquer approach for different computing power customers: In the fiercely competitive general-purpose computing market where users can freely migrate, small and medium-sized enterprises are highly price-sensitive. Cloud providers continue to defend their market base through price wars and dare not easily raise prices, after all, there is no shortage of low-cost alternative resources in the market.

In fact, many government and enterprise units have begun to deploy localized solutions, reducing their reliance on public clouds and avoiding the risk of rising Token costs by building small models and private computing power pools. The popularity of DeepSeek all-in-one machines is an attempt by government and enterprise units to reduce cloud API invocation costs through local deployment.

The ones really paying for inflation are heavy users in the AI sector: AI developers, model vendors, startups, and R&D teams in autonomous driving and robotics. Their common characteristics are:

1. High migration costs. The businesses of these users are highly dependent on cloud-based GPUs, whether for training large models, running agents, or real-time inference. Once they switch platforms, issues such as service queuing, speed limiting, and degradation are likely to occur, damaging the business experience.

2. Difficulty in building self-owned computing power. AI inference clusters (especially GPU clusters) are scarce resources, and chip suppliers prioritize ensuring supply for their largest and most stable customers. Small and medium-sized vendors and enterprises find it difficult to obtain stable supply chain support and can only rely on leading cloud providers for sufficient computing power.

3. High technical dependency. Users are deeply bound to cloud platforms, with cloud providers solving the technical challenges of integrating diverse computing power. A research institute revealed to us that when building their own clusters in the past, they avoided mixing different types of chips as much as possible to prevent cluster failures. However, to avoid risks associated with overseas supply chains, they now must deploy diversely. For most organizations, building clusters that integrate diverse computing power is impractical. The cloud is much more worry-free, eliminating the need to concern themselves with the operational challenges of mixing different chip clusters and significantly reducing their technical pressure.

Therefore, these users are deeply bound to AI cloud services, giving cloud providers pricing power and the core confidence to dare to raise prices.

Overall, competition among cloud providers is no longer purely about price, and AI computing-power inflation reveals a structural imbalance in the cloud market.

Tokens are becoming as essential as water, electricity, and natural gas, and no one hopes for long-term, rigid increases in cloud computing power prices. Against this backdrop, many individuals and enterprise users wonder whether this round of AI inflation will sweep across all cloud applications and whether prices will fall as computing power supply becomes abundant.

Those familiar with macroeconomics know that while vicious inflation is bad, deflation also has negative effects; mild, benign inflation is the best scenario.

In the GPU cloud market, deflation, or vicious price wars, would lead to long-term losses for cloud providers, who rely on low prices to acquire user scale. This development model is clearly unhealthy. At the same time, the era of low-priced Tokens has also fueled the AI bubble, with many small scenarios blindly using large models, resulting in inefficient consumption of computing power resources. After cloud computing power costs become explicit, developers will be forced to be more prudent, adopting optimization methods such as caching, summarization, and local small model pre-screening to design more efficient agent workflows, helping the entire industry establish a sustainable AI engineering paradigm.

Therefore, the price correction in AI clouds is essentially a return of prices to true costs and commercial sustainability. Whether future inflation turns out mild or vicious, the process resembles the pig cycle: there is a long transmission period from the shrinking of the live-hog supply to higher pork prices and finally to a broad surge in prices.

On the one hand, the overall impact of this round of GPU cloud price hikes is manageable. Although unit prices for AI-related products have risen by as much as 34%, AI accounts for a limited share of cloud providers' total revenue, so the overall rise in computing costs remains contained, and there has been no across-the-board increase. The market still offers abundant low-cost resources, and providers such as Alibaba Cloud and Baidu Intelligent Cloud have set grace periods for users who had already purchased services, cushioning the impact of the hikes.
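The "limited share" argument can be made concrete with simple arithmetic: a steep hike on AI products dilutes into a modest blended increase when AI is a minority of total cloud spend. The 15% share below is an assumed example, not a reported figure; only the 34% hike comes from the article.

```python
# Illustrative calculation: a 34% hike on AI products translates into a
# much smaller increase in the blended cloud bill when AI products make
# up only part of total spend. The share is an assumption for illustration.

ai_share = 0.15   # assumed fraction of a customer's cloud bill spent on AI products
ai_hike = 0.34    # unit-price increase on AI products (figure from the article)

blended_increase = ai_share * ai_hike
print(f"blended bill increase: {blended_increase:.1%}")
```

Under this assumption the blended bill rises about 5%, which is why the hikes feel sharp to AI-heavy users yet leave the overall market impact contained.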

Therefore, if regulation is appropriate in the coming period and cloud providers achieve breakthroughs in cost-reduction methods, it is entirely possible to control inflation and return to low prices.

Based on this, responding to this round of price hikes should be viewed from two time dimensions:

In the short term, how to help users directly affected by the price hikes alleviate pressure; in the medium to long term, how to achieve stability in computing power costs through cross-cycle adjustments.

For heavy users of AI clouds directly affected by the price hikes, the most important thing is to abandon fantasies and face reality. They should change their previous expectations of "permanently free computing power" and accept the reality that cloud market pricing mechanisms are gradually aligning with true costs. They can build their own computing power clusters, actively optimize models, explore local low-cost high-performance inference solutions, and achieve equivalent results with less computing power. In short, they should prepare for the possibility that price hikes may not be avoidable in the short term and ensure that AI-related businesses can continue normally even under computing power cost pressures.

In the medium to long term, inflation must be kept from running out of control. Once inflation becomes excessive and persistent, it imposes heavy pressure on AI users: students, independent developers, and small teams cannot afford the higher computing costs, stalling diverse innovation and contradicting the policy goal of inclusive AI.

Especially since public clouds serve as one of the computing power infrastructures, cloud providers have long surpassed their role as mere IaaS providers and bear the social responsibility of making AI computing power inclusive. Controlling AI computing power inflation is precisely the core manifestation of this responsibility.

So, how can AI inflation be effectively controlled? The core answer does not lie in reverting to price wars. From the perspective of high-quality development, the higher the computing efficiency, the more Tokens can be produced per unit of computing power, reducing cost pressures for cloud providers and effectively alleviating inflation. Cloud providers that enhance computing efficiency through technological innovation can play a significant role in controlling AI inflation and need to possess at least the following capabilities.

First, at the most fundamental level, is the development of self-developed chips. Self-developed chips play two roles in resisting inflation: first, they reduce reliance on expensive, high-performance overseas chips, giving providers autonomous control over computing-power supply and easing shortages. As the supply of domestic chips grows, the costs of domestic computing clusters will fall further.

Second is co-design. With self-developed chips, deep adaptation between model architectures and chip instruction sets lets specific models perform optimally on specific chips. For example, joint optimization of Ascend chips and the DeepSeek model can match the results achieved on NVIDIA chips.

The diversity of domestic chips also requires cloud platforms to have heterogeneous-computing fusion capabilities, such as Alibaba Cloud's BaiLian, Baidu Intelligent Cloud's BaiGe, and Lenovo's WanQuan, which enable pooled training and inference across multiple computing architectures. Sugon, for example, deeply integrates HPC high-performance computing with AI computing to address shortages while avoiding reliance on GPUs from a single vendor, further stabilizing computing-power supply.

Finally, advanced technologies such as liquid cooling can reduce cluster energy consumption and the comprehensive operational and maintenance costs of cloud providers, thereby lowering the overall costs of GPU clouds and avoiding continuous increases in computing power prices.

It is clear that public cloud providers are both transmitters of inflationary pressure and key forces in solving the inflation problem.

Providers with full-stack, closed-loop capabilities spanning chips, models, and cloud not only enjoy significant cost advantages and strong resilience to price hikes but also hold independent pricing power, allowing them to repair both prices and profit statements. This round of hikes therefore also pushes cloud providers to step up technological innovation and in-house R&D, acting as a stabilizing anchor for computing-power prices.

AI inflation is not unique to China but a global issue. Overseas cloud providers initiated price hikes as early as the fourth quarter of 2025, and the recent price increases in China are merely a follow-up response to the global trend. This means that domestic enterprises needing to conduct business and deploy AI applications overseas will face the dilemma of no inclusive cloud services being available.

Among domestic cloud providers, Alibaba Cloud and Tencent Cloud have far fewer overseas nodes than AWS, while Huawei Cloud has a relatively complete overseas node layout but still cannot match the overall computing power scale of international cloud giants.

In China, enterprises can respond to price hike pressures by building their own computing power clusters, but overseas, building computing power centers faces multiple challenges such as compliance, operations and maintenance, and optimization, with difficulty far exceeding that in China.

Therefore, when Chinese enterprises go overseas and seek cloud and intelligent services, they will likely have to rely on international cloud providers. The synchronous inflation of global computing power, coupled with price hikes by international cloud providers, will further increase overseas business costs.

This dilemma has also brought new opportunities for domestic cloud providers: there is still a market gap in providing cloud computing power support for companies going overseas. For domestic cloud providers with a well-established overseas node layout, this is undoubtedly an important opportunity to seize the overseas computing power service market and break the monopoly of international cloud giants.

The price hike of AI clouds represents the first AI inflation we have experienced, serving as a microcosm of the era marked by global resource competition and an imbalance between computing power supply and demand. It is deeply tied to the global political and economic environment and is not a phenomenon that will end in the short term.

This reality has driven cloud providers to shift from blind, cutthroat price wars to a rational path of raising prices to achieve reasonable growth and actively restore market prices.

While rejecting "low-price freeloaders," holding the line on inclusive AI computing power and continuing to offer innovators low-cost cloud computing services is the core proposition domestic cloud providers will face, and must answer, over the long term.
