One Article to Decode 'Computing Power Inflation': Why AI Costs Less for You While Computing Power Firms Rake in Profits

05/15 2026

Author | Jiang Xu
For More Financial Insights | BT Finance Data Pass

In March 2026, China's daily average Token calls reached 140 trillion, roughly a thousandfold increase over early 2024. (Source: East Money, May 2026)

Over the same period, API prices for leading large AI models, including DeepSeek, Kimi, and Tongyi Qianwen, fell by 80%-99%. The cost per million Tokens dropped from tens of yuan in 2023 to a fraction of a yuan.

On the surface, this appears to be a boon for users: AI is becoming more affordable, and computing power is more accessible.

However, another set of figures tells a different story. Computing power rental company Dongyangguang Yunzhisuan secured computing power service contracts valued at RMB 16-19 billion. (Source: Dongyangguang Announcement, May 2026) GPU rental prices climbed from $1.70 per hour in October 2025 to $2.35 per hour in March 2026, an increase of roughly 38%. (Source: Letong Electronics Annual Report, May 2026)

Yet, the most noteworthy aspect this time is not the magnitude of any single company's order, but rather the subtle yet profound shift in the pricing model within the computing power industry. This shift is reshaping the profit distribution across the entire AI industry chain.

Some might argue that this is simply a result of increased volume leading to lower prices, a classic case of economies of scale. But is it really that straightforward?

As always, I aim to elucidate this complex phenomenon in a single, comprehensive article.

1

Why Is AI Becoming Cheaper While Computing Power Costs Rise?

To grasp this, we must first understand the concept of the computing power inflation paradox.

On the surface, cheaper AI calls suggest a surplus of computing power, with prices expected to decline. However, the reality is more nuanced: cheaper AI spurs explosive growth in usage, with the rate of usage growth far outpacing the rate of price decline. This results in a surge in total demand for computing power.

A cross-industry analogy: imagine highway tolls are cut, and traffic surges so much that total toll revenue goes up rather than down. In 2023, most companies used fewer than 1 million Tokens per month. By 2026, a single ordinary AI programming tool handles over 100 million Token calls per day.

This explosion in total demand more than offsets the decline in unit price, and it is what keeps computing power scarce.

This is the essence of the computing power inflation paradox: unit prices fall, but total spending rises; users perceive it as cheaper, yet the industry's computing power costs soar.
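The arithmetic behind the paradox can be sketched in a few lines of Python. This is an illustrative calculation only: the function name `spend_multiple` is hypothetical, and the sketch pairs the 80%-99% price declines quoted earlier with the thousandfold usage growth as if both applied to the same workload, which the cited figures do not establish.

```python
def spend_multiple(price_drop: float, usage_multiple: float) -> float:
    """Return new total spend divided by old total spend.

    price_drop     -- fractional fall in unit price (0.99 means a 99% cut)
    usage_multiple -- new usage divided by old usage (1000 means thousandfold)
    """
    if not 0.0 <= price_drop < 1.0:
        raise ValueError("price_drop must be in [0, 1)")
    if usage_multiple < 0.0:
        raise ValueError("usage_multiple must be non-negative")
    # Total spend = unit price x usage, so the two ratios simply multiply.
    return (1.0 - price_drop) * usage_multiple


if __name__ == "__main__":
    # Assumption: both ends of the quoted price range, with usage up 1000x.
    for drop in (0.80, 0.99):
        multiple = spend_multiple(drop, 1000)
        print(f"price down {drop:.0%}, usage x1000 -> total spend x{multiple:.0f}")
```

Even at the steep end of the price range, total spending in this sketch still rises about tenfold, which is the sense in which users see lower prices while the industry's compute bill grows.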

2

The Shift in Pricing Models: The Real Game-Changer

But merely stating that "higher volume leads to lower prices, but greater total demand" does not fully explain why computing power companies' profits are on the rise. The deeper transformation lies in the pricing model.

Traditionally, computing power was priced on a "fixed-duration lease" basis, akin to renting a parking space where you pay a monthly fee regardless of usage.

By 2026, the computing power rental industry is rapidly transitioning to a "revenue sharing based on Token calls" model, resembling a gas station where you pay for consumption, and the more you use, the more you pay.

I refer to this shift as the transfer of pricing power from "parking spaces" to "gas stations." Parking fees buy space, while gas station fees buy consumption. When consumption grows exponentially, gas stations fare significantly better than parking spaces.

Under the fixed-duration model, a GPU server's annual revenue is capped. Under the Token revenue-sharing model, as long as the usage of AI applications running on it grows, the computing power company's revenue grows too. And in 2026, the growth in AI application usage is exponential.
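The contrast between the two pricing models can be sketched as a toy revenue comparison. Everything here is an assumption for illustration: the function names, the base Token volume, the price per million Tokens, and the 30% revenue share are hypothetical, and only the $2.35 hourly rate comes from the figures cited earlier. The sketch also ignores the throughput limit of a physical server.

```python
HOURS_PER_YEAR = 24 * 365


def fixed_lease_revenue(hourly_rate: float, utilization: float = 1.0) -> float:
    """Annual revenue for a time-based lease; it cannot exceed full utilization."""
    return hourly_rate * HOURS_PER_YEAR * min(utilization, 1.0)


def token_share_revenue(tokens_millions: float,
                        price_per_million: float,
                        share: float) -> float:
    """Annual revenue when the provider takes a share of Token billing."""
    return tokens_millions * price_per_million * share


if __name__ == "__main__":
    lease = fixed_lease_revenue(2.35)      # hourly rate quoted in the article
    base_tokens_millions = 40_000.0        # hypothetical yearly volume
    price_per_million = 0.50               # hypothetical price
    provider_share = 0.30                  # hypothetical revenue share

    print(f"fixed lease (capped): {lease:,.0f} per year")
    for usage_multiple in (1, 2, 4, 8):
        shared = token_share_revenue(base_tokens_millions * usage_multiple,
                                     price_per_million, provider_share)
        print(f"usage x{usage_multiple}: token share {shared:,.0f} per year")
```

Under these made-up inputs the lease figure stays flat while the Token-share figure scales with usage, which is the "parking space" versus "gas station" distinction in numeric form.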

3

Three Specific Consequences of Computing Power Inflation

【Consequence 1】Computing power scarcity is intensifying, not diminishing
Intuitively, with increasing production capacity for computing power chips, computing power should become less scarce. However, there's a timing mismatch: demand is growing much faster than capacity expansion. Google raised its 2026 capital expenditures to $180-190 billion, with Amazon and Microsoft following suit, totaling over $450 billion among the three. (Source: East Money, May 2026) These funds are earmarked for building data centers that won't be operational until 2028. Yet, AI application demand is exploding in 2026.

【Consequence 2】Pricing power in the AI application layer is shifting to the computing power layer
Previously, it was assumed that the majority of profits in the AI industry chain would accrue to the "model" layer. However, computing power inflation is strengthening the bargaining power of the computing power layer. A cross-era analogy: this is somewhat akin to the dot-com boom of the 1990s, when companies selling network cables and servers ultimately profited more than those creating web content. As in a gold rush, the sellers of "picks and shovels" outperformed the "gold prospectors." In the AI era, computing power is the pick and shovel.

【Consequence 3】The Token economy has forged a new profit distribution mechanism
The emergence of the Token revenue-sharing model allows computing power companies to participate directly in the commercial monetization of AI applications. Their business model shifts from "selling time-based leases" to "participating in the commercial ecosystem of AI applications," which systematically raises the computing power layer's position in the AI industry value chain.

4

Where Can the Computing Power Inflation Paradox Be Applied?

The true value of this framework lies in its transferability.

Applied to personal electricity bills: Household appliances are becoming more energy-efficient, yet household electricity consumption is rising. Why? Because energy efficiency lowers the barrier to use, leading to more devices and longer usage times. The total revenue of power companies rises instead of falls. This is structurally identical to computing power inflation.

Applied to bandwidth economics: In the 4G era, per-unit data prices fell sharply, but operators' total revenue didn't collapse because usage growth offset the price declines. The same applies to 5G: unit prices drop, but the number of connected devices and data transmission volumes expand exponentially.

Universal framework: This "inflation paradox" emerges whenever two conditions are met: unit prices fall, and usage grows faster than prices fall, so that total spending rises. Understanding this framework helps identify industries undergoing similar structural shifts on the supply side.
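The framework's condition can be stated as a short worked inequality. The symbols are introduced here for illustration and do not appear in the original: $p_0$ and $q_0$ are the starting unit price and usage, $d$ is the fractional price decline, $g$ is the usage multiple, and $S$ is total spending.

```latex
\[
S_0 = p_0\,q_0, \qquad
S_1 = (1-d)\,p_0 \cdot g\,q_0
\quad\Longrightarrow\quad
\frac{S_1}{S_0} = (1-d)\,g .
\]
\[
\frac{S_1}{S_0} > 1 \iff g > \frac{1}{1-d}, \qquad 0 \le d < 1 .
\]
```

For example, a 99% price decline ($d = 0.99$) requires usage to grow more than 100-fold before total spending rises, while an 80% decline requires more than 5-fold growth.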

5

What Does This Mean for You?

If you're an ordinary user of AI products: It's true that using AI will become cheaper. However, when computing power inflation trickles down to the application layer, you'll find premium features adopting tiered pricing (Doubao's launch of a paid version is a signal), with free offerings becoming more basic and truly valuable tasks requiring payment. This is structural.

If you're a corporate decision-maker: Computing power costs are becoming a core cost item for AI application companies and will grow linearly with usage under the Token revenue-sharing model. Effective budgeting for computing power costs is a new challenge in corporate financial planning for the AI era.
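As a rough budgeting aid, the linear relationship between usage and cost can be sketched as a month-by-month projection. This is a minimal sketch under stated assumptions: the function name `project_monthly_costs`, the starting volume, the unit price, and the growth rate are all hypothetical, and the unit price is held constant for simplicity.

```python
def project_monthly_costs(start_tokens_millions: float,
                          price_per_million: float,
                          monthly_growth: float,
                          months: int) -> list[float]:
    """Project compute cost per month when cost scales linearly with Token usage.

    monthly_growth is fractional (0.5 means usage grows 50% each month).
    """
    costs = []
    tokens = start_tokens_millions
    for _ in range(months):
        costs.append(tokens * price_per_million)  # cost is linear in usage
        tokens *= 1.0 + monthly_growth            # usage compounds monthly
    return costs


if __name__ == "__main__":
    # Hypothetical inputs: 100 million Tokens in month 1, 2.0 currency units
    # per million Tokens, usage growing 50% per month, over six months.
    costs = project_monthly_costs(100, 2.0, 0.5, 6)
    for month, cost in enumerate(costs, start=1):
        print(f"month {month}: {cost:,.2f}")
    print(f"six-month total: {sum(costs):,.2f}")
```

A projection like this makes it visible that, even at a fixed unit price, compounding usage growth dominates the budget within a few months.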

If you follow AI industry investments: The assumption that "the AI application layer will be the most profitable" needs reevaluation. The bargaining power of the computing power layer is rising, driven by changes in pricing models rather than short-lived hype. None of this constitutes investment advice on individual stocks.

This article simplifies greatly. In reality, pricing models vary widely across different types of computing power and customer segments, and the Token revenue-sharing model is still in its early stages.

What this article offers is a framework to understand the computing power inflation paradox: falling unit prices → explosive usage → rising total demand → sustained scarcity → transfer of pricing power from buyers to sellers. With this framework, when you see "AI getting cheaper" and "computing power companies securing large contracts" happening simultaneously, you'll understand these events are not contradictory.

This article is for information sharing and industry analysis only and does not constitute any investment advice, investment analysis, or trading solicitation. Markets carry risks, and investments should be made with caution. Anyone making investment decisions based on this article assumes all risks and outcomes, with the author and publishing platform bearing no legal liability.

This article is an original piece by BT Finance and may not be used, reproduced, disseminated, or adapted without permission. Infringement will result in legal action.
