03/19 2026

AI freedom is now out of reach for ordinary people
The illusion that cloud computing could only become cheaper was shattered in 2026.
Recently, Tencent Cloud's intelligent agent development platform announced optimizations to the billing strategies for certain models.
According to the announcement, the adjustments fall into two categories. First, free access to public-beta models is ending: GLM 5, MiniMax 2.5, and Kimi 2.5 transitioned from free public beta to formal commercial service on March 13. Second, prices for the Hunyuan series models, Tencent HY2.0 Instruct and Tencent HY2.0 Think, are rising, with some increases exceeding 400%.

Tencent Cloud, however, is not the first cloud platform to raise prices.
On February 11, UCloud issued an announcement regarding price increases for its product services. UCloud stated that due to ongoing global supply chain disruptions, there have been significant and structural increases in infrastructure costs, particularly in the procurement of core hardware. After careful evaluation, the company decided to implement price adjustments for all products and services for both renewing and new customers starting March 1.
This wave of price hikes is not just a collective frenzy among Chinese cloud providers. Looking across the ocean, the first domino to fall was quietly pushed by Amazon.
On January 4, without any announcement or press conference, AWS raised EC2 prices by approximately 15%, with the flagship instance price increasing from $34.61 to $39.80 per hour.
Google Cloud soon followed, announcing that starting May 1 of this year it would raise global data transfer prices, with the per-GB rate in North America doubling from $0.04 to $0.08. No press conference, no fanfare, just a brief announcement.
AWS, Google, Tencent, and UCloud, cloud providers spanning the Pacific from East to West, have all independently decided to raise prices within the span of a few months. This has shattered a nearly two-decade-long industry belief: cloud services only get cheaper, never more expensive.
So, is this price hike the end of a two-decade myth of falling cloud service prices, or just a cyclical correction? Will the unit price of tokens continue to rise? Will your total AI bill become cheaper or more expensive in the future?
01
How cloud providers slashed prices over the past two decades
The shock of this price hike stems from a habit users formed over twenty years: cloud computing prices only ever went down.
Cloud computing is fundamentally a high-stakes gamble on "scale."
When Amazon launched AWS in 2006, the logic was straightforward: with a large number of idle servers in its data centers, why not rent them out by the hour instead of letting them gather dust? This business model, akin to "renting out empty warehouse shelves," has, over the past two decades, completely overhauled the underlying infrastructure of the global IT industry.
According to incomplete statistics from Super Focus, AWS has proactively reduced prices more than a hundred times over the past twenty years. In China, Alibaba Cloud and Tencent Cloud have treated "price cuts" as a semi-annual festival. This relentless, nearly "self-destructive" pricing strategy has driven global IT infrastructure costs to historic lows.
Such a scenario would be unthinkable in any other industry—no landlord voluntarily reduces rent annually, and no supermarket consistently lowers price tags every quarter. Yet, the cloud computing industry has achieved this for two decades.
So, why has only the cloud computing industry been able to do this?
The reason is simple: physical constraints have supported these price reductions.
Traditional cloud computing, or Infrastructure as a Service (IaaS), essentially sells "server space and electricity." Whether it's compute instances, storage, or network bandwidth, these are highly standardized digital utilities. The cost of these digital resources is subject to a harsh physical law: Moore's Law.
The density of transistors on chips doubles every 18 to 24 months, meaning the physical cost per unit of computing power has been in freefall. As cloud providers purchase newer servers at lower costs, they enjoy significant technological dividends.
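As a back-of-envelope sketch of how that freefall compounds (assuming, purely for arithmetic, a clean two-year halving of unit cost; the starting cost is a placeholder, not a measured figure):

```python
# Illustrative only: how a Moore's-Law-style cost halving compounds.
# The 2-year doubling period and the $1.00 starting cost are assumptions
# made for the sake of arithmetic.

def cost_per_unit_compute(initial_cost: float, years: float,
                          doubling_period_years: float = 2.0) -> float:
    """Cost per unit of compute after `years`, if transistor density
    doubles (and unit cost halves) every `doubling_period_years`."""
    halvings = years / doubling_period_years
    return initial_cost / (2 ** halvings)

# A dollar of compute in 2006, revisited 20 years later:
# 20 / 2 = 10 halvings -> 1/1024 of the original cost.
print(cost_per_unit_compute(1.00, 20))  # ~0.00098
```

Ten halvings over twenty years is a thousandfold drop, which is why the dividend was large enough to fund continuous price cuts.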
However, these dividends have never fully benefited cloud providers, as traditional cloud services are highly commoditized.
Your servers can run code; so can your competitors'. If you seek profits and don't pass hardware cost savings to customers, competitors will undercut your prices by 30% and steal your clients.
Thus, price cuts are not a choice but a survival instinct. If you don't cut prices, others will.
Beyond technological advancements, economies of scale are another defining feature of cloud services.
Building a hyper-scale data center—acquiring land, installing power lines, constructing server rooms, and purchasing tens of thousands of servers—requires tens of billions in fixed investment.
Once operational, the marginal cost of hosting an additional startup website or processing one million more data requests is nearly zero, aside from a few cents in additional electricity costs.
This is cloud computing's most alluring commercial leverage. At scale, serving more clients dilutes fixed costs per client.
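The dilution of fixed costs can be put in concrete numbers. A toy calculation, with every figure invented for illustration:

```python
# Toy model of economies of scale: a data center's fixed build-out cost
# is spread across its customers, while the marginal cost per customer
# stays tiny. All numbers below are invented for illustration.

def cost_per_customer(fixed_cost: float, marginal_cost: float,
                      customers: int) -> float:
    """Average cost of serving one customer at a given scale."""
    return fixed_cost / customers + marginal_cost

FIXED = 10_000_000_000  # assumed $10B build-out (land, power, servers)
MARGINAL = 5.0          # assumed near-zero electricity per customer

for n in (10_000, 100_000, 1_000_000):
    print(n, cost_per_customer(FIXED, MARGINAL, n))
```

Going from ten thousand customers to a million cuts the per-customer cost a hundredfold, while the marginal term barely registers. That asymmetry is the leverage the paragraph above describes.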
Under this logic, industry giants see the bigger picture: market share trumps unit pricing. By offering rock-bottom prices to dominate the market, they can spread costs, undercut competitors, and solidify their position. This is an unbeatable virtuous cycle.
The giants' ambition is plain: use low prices to win a monopoly. The ideal scenario: with enough capital, weaker competitors collapse, leaving pricing power firmly in their hands.
But reality is harsher. The players at this table are all heavily armed "great powers."
Across the ocean stand Amazon, Microsoft, and Google—the "three mountains." In China, Alibaba, Tencent, and Huawei dominate with seemingly bottomless pockets. All have ample cash flow and strategic resolve to endure losses. Want to outlast them with low prices? Sorry, these trillion-dollar giants aren't backing down.
Thus, cloud computing's price wars have dragged on for two decades in a quagmire of attrition.
Objectively, this ruthless competition has driven technological breakthroughs in cloud computing, transforming once-expensive enterprise IT infrastructure into today's "utilities." Without this two-decade price war, today's thriving internet ecosystem would not exist.
Yet, the price-cutting vortex never stopped—until 2026, when explosive demand for large models suddenly jammed the decades-long price-cutting flywheel.
02
Short-term price hikes are a feint; long-term bills are the real threat
The sudden collective price hikes by cloud providers stem from a simple reality: their hardware is struggling to keep up with the surge in AI demand that erupted in early 2026.
When large models first gained traction, giants didn't immediately raise prices but instead offered free public betas. This was like supermarkets offering free samples of new drinks—initial usage was minimal, with users merely testing funny queries. Cloud providers' computing reserves easily handled this.
By 2026, the situation changed dramatically. Businesses discovered AI's practical value, integrating it into customer service systems, internal data analysis, and even core operations. Individual users found AI agents like Openclaw could handle simpler tasks.
Token consumption skyrocketed, resembling a tsunami.
As UCloud admitted in its announcement, "infrastructure costs, particularly for core hardware, have risen significantly and structurally." In plain terms: users are consuming resources too rapidly, and procurement and electricity costs are depleting profits.
This is the true logic behind the current price hikes—not cloud providers seizing pricing power but an awkward "supply-demand mismatch."
Many data centers still rely on general-purpose, heavy-duty GPUs designed for training large models, not for everyday token generation. Using these expensive, power-hungry devices for mass token production is financially unsustainable.
With soaring usage and inefficient hardware, cloud providers, under cash flow pressure, resorted to price hikes to stay afloat.
However, seeing Tencent Cloud end free tiers and AWS raise rates has sparked panic. Many assume that as AI becomes indispensable, cloud providers will gouge customers, making AI bills a bottomless pit.
This is unfounded. Current high computing costs stem from a transitional phase of outdated hardware. Chip giants are already addressing this.
As large models' applications expand, the market no longer needs as many training chips but desperately requires inference chips designed for token generation. Soon, data centers will be filled with next-generation inference hardware optimized for token output.
For instance, at the upcoming GTC conference, NVIDIA is expected to unveil new inference chips integrating LPU technology. Domestic players like Cambricon are also focusing on inference chips.
These new chips eliminate unnecessary computing units, focusing solely on data throughput. This means exponentially higher token output per watt, further reducing unit token costs.
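A rough way to see why per-watt throughput matters: the electricity cost of a million tokens is just power draw divided by throughput, times the electricity price. All figures below (throughputs, wattages, the $0.10/kWh rate) are hypothetical, chosen only to show the direction of the effect:

```python
# Back-of-envelope: electricity cost per million tokens as a function of
# hardware throughput. All hardware figures are hypothetical.

def electricity_cost_per_million_tokens(tokens_per_second: float,
                                        watts: float,
                                        price_per_kwh: float = 0.10) -> float:
    """Electricity cost (in dollars) to generate one million tokens."""
    seconds = 1_000_000 / tokens_per_second
    kwh = watts * seconds / 3600 / 1000  # watt-seconds -> kilowatt-hours
    return kwh * price_per_kwh

# A training-oriented GPU repurposed for inference, vs. a chip built
# for token throughput (both sets of numbers assumed):
legacy = electricity_cost_per_million_tokens(tokens_per_second=2_000, watts=700)
inference = electricity_cost_per_million_tokens(tokens_per_second=20_000, watts=500)
print(legacy, inference)
```

Under these assumed numbers the inference-optimized chip delivers the same million tokens for roughly a fourteenth of the electricity cost, which is the unit-cost decline the chip roadmaps are aiming at.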
Moreover, software engineers are pushing cost optimization to new heights.
Earlier models were inefficient—answering a simple weather query required activating billions of parameters, wasting electricity. Now, through techniques like Mixture of Experts, systems activate only relevant "brain cells," leaving the rest idle.
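A minimal sketch of the Mixture-of-Experts routing idea described above, as a toy with random weights (the layer sizes, expert count, and top-k value are arbitrary choices, not drawn from any real system):

```python
# Toy Mixture-of-Experts layer: a router scores all experts, but only the
# top-k actually run, so compute scales with k rather than with the total
# parameter count. Random weights; illustration only.

import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS, TOP_K, DIM = 8, 2, 16

router_w = rng.normal(size=(DIM, N_EXPERTS))
experts = [rng.normal(size=(DIM, DIM)) for _ in range(N_EXPERTS)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    logits = x @ router_w                  # one score per expert
    top = np.argsort(logits)[-TOP_K:]      # indices of the k chosen experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts
    # Only TOP_K of the N_EXPERTS weight matrices are touched: the other
    # "brain cells" stay idle for this input.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.normal(size=DIM))
print(y.shape)  # (16,)
```

Here only 2 of 8 experts do any work per input, so roughly three quarters of the layer's parameters draw no compute on a given query, which is the electricity saving the paragraph above refers to.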
This software-level efficiency, combined with new hardware, will continue driving down the real cost of token generation in cloud data centers.
Thus, this price hike is likely a temporary rebound due to outdated hardware being overwhelmed by new demand. Unit token prices will still plummet toward zero.
03
Tokens are cheap; "intelligence" is expensive
If unit token prices are destined to crash, will "AI freedom" soon follow? Not quite. The assumption that completing a task requires a fixed number of tokens no longer holds as AI evolves from a "Q&A tool" to an "AI agent."
In 2023, AI interactions were simple: you input a query, it returned an answer, consuming a thousand tokens at a few cents per interaction.
By 2026, AI usage has fundamentally changed. When an agent is tasked with a real business operation—analyzing a competitor report, reviewing a contract, or processing customer emails—its backend processes are far more complex.
It logically deduces steps, repeatedly queries search engines and corporate databases, and even learns new skills from skill libraries to complete tasks.
Each internal decision and tool invocation consumes tokens. Compared to two years ago, token usage has grown not just severalfold but exponentially.
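The gap between a single Q&A exchange and an agent loop can be made concrete with a toy token counter. Every step count and context size below is an assumption for illustration:

```python
# Rough illustration of why agent workflows multiply token use: a Q&A
# exchange is one model call, while an agent loop re-reads its growing
# context on every step and appends tool output each time.
# All sizes and step counts are assumptions.

def qa_tokens(prompt: int = 200, answer: int = 800) -> int:
    """Tokens consumed by a single question-and-answer exchange."""
    return prompt + answer

def agent_tokens(steps: int = 20, base_context: int = 2_000,
                 tool_output: int = 1_500, thought: int = 500) -> int:
    """Tokens consumed by a multi-step agent run on one task."""
    total, context = 0, base_context
    for _ in range(steps):
        total += context + thought         # model re-processes the whole context
        context += tool_output + thought   # and the context keeps growing
    return total

print(qa_tokens())     # 1000
print(agent_tokens())  # hundreds of thousands for a single task
```

Under these assumed numbers, one agent task consumes a few hundred times the tokens of a 2023-style exchange, because the context is re-read at every step rather than once.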
This mirrors the classic steam engine story from 150 years ago, the original Jevons paradox: when James Watt improved steam engine efficiency, coal consumption should have dropped. Instead, factories adopted steam engines en masse, and Britain's coal usage exploded.
Today's AI computing demand follows the same script: higher efficiency and lower unit costs lead to greater total consumption.
Some ask: Will algorithmic optimizations eventually offset rising consumption?
Unfortunately, no. AI computations occur in the physical world. Each silicon transistor state change and cooling system cycle consumes real electricity. With billions of agents operating 24/7, handling exponential task volumes, this translates to relentless data center activity and soaring electricity bills.
The physical world's energy constraints ensure computing power cannot grow infinitely. This answers the core question: Will your total AI bill become cheaper or more expensive?
The answer is stark and uncomfortable: Absolute costs will rise—significantly.
Extending these commercial and physical realities to their logical end leads to a disturbing conclusion:
For three decades, the classical internet era promoted a comforting narrative: Technology is a great "equalizer." Search engines democratized information access, social media amplified grassroots voices, and smartphones bridged urban-rural divides. With near-zero marginal costs for software distribution, technological benefits transcended class barriers, reaching the masses.
But in the AI era, this utopian logic is brutally breaking down.
When large models truly evolve from 'encyclopedias in dialog boxes' into 'super Agents that think and make decisions on behalf of humans,' they inherently become insatiable Token consumers. Truly powerful AI capability will never be nearly free the way web browsing once was; its cost climbs in step with the exponential rise in task complexity.
It is foreseeable that as the two-decade-long era of 'price reductions for universal access' in cloud computing reaches its end, future intelligence will inevitably exhibit an extremely rigid 'stratification.'
Individuals or enterprises at the top of the financial food chain, who can afford high-quality AI computing power, will see their productivity exponentially amplified by superior Agents. With sharper business acumen, shorter decision-making paths, and execution efficiency far surpassing ordinary people, their advantages will further compound through data flywheels generated by high-frequency usage, delivering a dimensionality-reducing blow to those below.
Meanwhile, ordinary individuals and small-to-medium enterprises unable to pay exorbitant bills will have to rely on simplified, diluted 'low-tier intelligence' dressed up as free offerings. Such versions may help draft a perfunctory weekly report or generate a couple of illustrations, but when faced with the truly complex commercial maneuvering that moves people up the social ladder, top-tier medical diagnosis, or hardcore legal analysis, they can only offer vague generalities.
This is not a dystopian fantasy from science fiction—it is the coldest, hardest reality and the future unfolding around us.
Throughout history, what has always been cheap is mere 'computation' and 'information.' True top-tier 'cognition,' however, has always been expensive and the privilege of a select few. Far from breaking down this barrier, large models have used soaring electricity meters and costly Token bills to erect this cognitive wall higher than ever—and made it even harder to detect.
This is the chilling truth of our era lurking behind the current wave of cloud service price hikes.
- END -