The Economics of Token Factories Is Restructuring the Entire AI Industry

04/17 2026 410

Author: Haishan

Source: Bowang Finance

From the “bargain-basement” token price wars of 2024 to the collective price hikes by Alibaba Cloud, Tencent Cloud, and Baidu Intelligent Cloud in 2026, the token industry pulled off an astonishing reversal in just two years, moving from money-burning cutthroat competition and overcapacity to supply shortages and simultaneous surges in price and volume.

Since the start of 2026, the A-share AI computing power sector has gained more than 55%. Leading large model companies such as Moonshot AI (Yue Zhi An Mian) and Zhipu AI have seen monthly revenues surpass 1 billion yuan, with some firms exceeding their full-year 2025 earnings in just 20 days.

This industrial revolution, defined by Jensen Huang as the “economics of token factories,” has transcended mere technological hype. It has become a definitive trend driven by an explosion in real demand, supply-demand imbalances, and global competition for energy and computing power. The restructuring of its underlying logic is reshaping the rules of the entire AI industry and overturning the fundamental operating logic of the world.

01

The New Era’s “Oil”

The essence of this industrial inflection point is the AI industry’s full-scale shift from a “model arms race” to a “token capacity race.”

Before 2024, the industry’s core narrative revolved around “whose model had more parameters and was smarter.” Major players frantically burned money to train large models and seized market share through free token giveaways and low-price dumping, even leading to the absurd situation where “selling tokens was less profitable than selling bottled water.”

However, the explosive popularity of OpenClaw (nicknamed “Lobster”) in February 2026 shattered this logic.

Traditional large models operated on a “human-to-AI” single-round interaction mode, consuming just 1,000–3,000 tokens per dialogue. In contrast, agents employ a “planning-action-observation-reflection” cyclic architecture, requiring dozens to hundreds of model calls for complex tasks. Medium tasks consume 100,000 tokens, while complex ones reach millions—earning them the industry nickname “token crushers.”
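
The gap between chat and agent consumption follows from the loop structure: each model call typically resends the accumulated context, so total tokens grow faster than the number of steps. A minimal Python sketch, assuming hypothetical sizes (a 2,000-token starting prompt, 500 output tokens per call, and 1,500 tokens of tool observations per cycle; none of these figures come from the article):

```python
def agent_task_tokens(steps, base_prompt=2_000, output_per_step=500,
                      observation_per_step=1_500):
    """Estimate total tokens consumed by one agent task.

    Assumes every model call resends the full context so far (input tokens)
    and produces a fixed-size output. All default sizes are illustrative.
    """
    context = base_prompt
    total = 0
    for _ in range(steps):
        # Tokens for this call: the whole context as input plus the new output.
        total += context + output_per_step
        # The model output and the tool observation are appended to the history.
        context += output_per_step + observation_per_step
    return total


single_chat = agent_task_tokens(steps=1)     # one-shot dialogue
medium_task = agent_task_tokens(steps=10)    # ten loop cycles
complex_task = agent_task_tokens(steps=50)   # fifty loop cycles
```

Under these assumptions a single chat turn costs 2,500 tokens, a 10-step task about 115,000, and a 50-step task about 2.6 million, which matches the orders of magnitude quoted above.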

Data from the National Data Bureau confirms this explosion: China’s daily average token calls skyrocketed from 100 billion in early 2024 to 140 trillion by March 2026, an increase of more than 1,000-fold (about 1,400 times) in roughly two years. The first quarter of 2026 alone saw a 40% rise from the end of 2025.
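
The growth figures above can be checked with a few lines of arithmetic. The sketch below computes the fold increase and the implied compound monthly growth rate; the 26-month span (January 2024 to March 2026) is an assumption, since the article only says “early 2024.”

```python
def fold_increase(start, end):
    """How many times larger `end` is than `start`."""
    return end / start


def compound_monthly_rate(start, end, months):
    """Constant monthly growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / months) - 1


start_calls = 100e9    # 100 billion daily token calls, early 2024
end_calls = 140e12     # 140 trillion daily token calls, March 2026
months = 26            # assumption: January 2024 through March 2026

fold = fold_increase(start_calls, end_calls)
monthly_rate = compound_monthly_rate(start_calls, end_calls, months)

# The 40% rise during Q1 2026, expressed as a monthly rate over three months.
q1_monthly_rate = 1.40 ** (1 / 3) - 1

print(f"fold increase: {fold:,.0f}x")
print(f"implied monthly growth: {monthly_rate:.1%}")
print(f"Q1 2026 monthly growth: {q1_monthly_rate:.1%}")
```

On these inputs the average works out to roughly 32% per month over the whole period, against about 12% per month in Q1 2026, so the quoted numbers imply that growth is still rapid but slower than the two-year average.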

The industry narrative has completely shifted: the focus is no longer on competing for the “highest IQ” of models but on who can produce massive volumes of tokens more cheaply and more reliably, and who can seize the initiative in the supply of intelligence.

Faced with overwhelming demand, rigid supply-demand mismatches form the core support for sustained strength in the token market. This imbalance is not a short-term fluctuation but a structural contradiction dictated by long-term industrial cycles.

Three insurmountable bottlenecks exist on the supply side:

First, core hardware production is monopolized, and expansion cycles are lengthy.

HBM (High Bandwidth Memory), the “heart” of AI servers, is dominated by Samsung, SK Hynix, and Micron, which control over 95% of global capacity. Their expansion cycles span 24–36 months, leading to a 40% HBM shortage in 2026.

Squeezed by this, ordinary DDR5 memory prices surged 300% in six months, with 256GB server memory modules exceeding 40,000 yuan each. AI server delivery times extended from three months to twelve.

Second, electricity and energy have become the largest hidden bottlenecks.

Per-cabinet power consumption in smart computing centers is 10–20 times that of traditional data centers, with electricity accounting for over 60% of token production costs. However, power infrastructure for large data centers takes 3–5 years to build, creating a severe shortage of computing power quotas in eastern China.

Third, infrastructure and operational capabilities cannot keep pace with exploding demand.

Liquid-cooled data center penetration rose from 15% in 2024 to 45% in 2026, but severe shortages of skilled technicians and construction capacity left many completed computing clusters operating below full capacity.

While supply-side capabilities lag, demand exhibits a “three-stage rocket” explosion with strong sustainability.

The first stage is the proliferation of consumer-facing agents. Individual users shifted from casual chat and entertainment to using AI assistants for emails, coding, and planning. Daily token consumption per user surged from dozens to thousands, with future potential to reach tens of thousands.

The second stage is the full-scale implementation of enterprise-grade production applications. Companies no longer view AI as a nice-to-have tool but incorporate tokens as core production factors. Firms like Kunlun Wanwei and 58.com consume over 1 trillion tokens monthly, while AI transformations in manufacturing, finance, and healthcare are unleashing trillion-yuan token demand.

The third stage is the global export boom. Chinese large model tokens cost just 1/5 to 1/3 of overseas alternatives like Claude and GPT, enabling rapid market seizure in Southeast Asia, the Middle East, and Latin America. In Q1 2026, overseas token revenues for Chinese cloud providers surged 320% year-on-year, becoming a new growth engine.

At a deeper level, tokens are becoming the foundational commodity of the AI era, restructuring the entire digital economy’s value system. Just as electricity powered the industrial age and traffic was the core asset of the internet era, tokens represent the core production material of the intelligence era. With their measurable, priceable, and tradable attributes, they serve as a universal value anchor connecting computing power supply with intelligent demand.

This shift has triggered a complete business model revolution: the industry has abandoned the internet-era approach of “burning money for scale” and entered a new phase of “usage-based billing and profit-driven growth.”

Major players widely adopt a strategy of “C-end subsidies to cultivate habits and B-end large-scale monetization.” They offer limited free tokens to individual users while charging enterprise clients precisely according to consumption. In Q1 2026, leading cloud providers’ AI business gross margins universally rose above 35%, achieving profitability at scale for the first time.
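
The “free allowance for individuals, metered billing for enterprises” model described above can be sketched as follows. The prices, allowances, and unit cost are hypothetical placeholders chosen for illustration, not figures from any provider.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    free_tokens: int           # monthly free allowance (hypothetical)
    price_per_million: float   # yuan per million tokens beyond the allowance


def monthly_bill(tokens_used: int, plan: Plan) -> float:
    """Usage-based bill: only tokens above the free allowance are charged."""
    billable = max(0, tokens_used - plan.free_tokens)
    return billable / 1_000_000 * plan.price_per_million


def gross_margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cost) / revenue


# Hypothetical plans: consumers get a free tier, enterprises pay from token one.
consumer_plan = Plan(free_tokens=1_000_000, price_per_million=2.0)
enterprise_plan = Plan(free_tokens=0, price_per_million=2.0)

consumer_bill = monthly_bill(800_000, consumer_plan)                # inside the free tier
enterprise_bill = monthly_bill(1_000_000_000_000, enterprise_plan)  # 1 trillion tokens

# Assumed production cost of 1.3 yuan per million tokens.
enterprise_cost = 1_000_000_000_000 / 1_000_000 * 1.3
margin = gross_margin(enterprise_bill, enterprise_cost)
```

With these placeholder numbers the consumer pays nothing, the enterprise pays 2 million yuan for the month, and the provider’s gross margin on that account is 35%.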

For China, this token industrial revolution presents a historic opportunity to leapfrog. China boasts the world’s lowest green electricity costs, the most complete computing infrastructure (over 60% of global server production), the broadest application scenarios, and the most cost-effective large models—all prerequisites to become the “world’s token factory.”

Just as China became the “world’s factory” through cost advantages, it now leads global token production and supply through integrated strengths in energy, computing power, and scenarios.

In the short term, supply-demand mismatches will persist until late 2027, keeping token prices elevated and industry concentration high.

In the long term, as chip capacity expands and model efficiency improves, tokens will enter a “bargain-basement” era, permeating every corner of the national economy and becoming the core engine of digital economic growth.

02

How Are Segmented Industries Faring?

As the token industry reverses from “cutthroat low prices” to “supply-demand tightness,” its segments have undergone structural differentiation.

Upstream price control, midstream profit enhancement, and downstream monetization characterize the divergent market conditions. The three major sectors—upstream computing hardware production, midstream token hub scheduling, and downstream application implementation—exhibit vastly different barriers, market sentiment, and value distribution logics.

First is upstream computing hardware, the core capacity of token factories, where demand is inelastic and supply is monopolized.

It covers four key segments: AI chips, computing power servers, liquid cooling, and smart computing center operations, with an oligopolistic industry structure.

AI chips are the core engines of token production. Nvidia dominates over 90% of the global high-end GPU market.

However, A-share domestic substitution leaders are breaking through: Cambricon’s Siyuan 590 chip has achieved mass production, supporting large model inference and training, with AI chip revenue surging 320% year-on-year in Q1 2026.

Hygon Information’s DCU products penetrate over 30% of domestic smart computing centers, deeply partnering with leading firms like Sugon and Inspur. Jingjia Micro’s JM9 series GPUs have been deployed in government and financial scenarios, becoming a core supplier of domestic general-purpose GPUs.

Computing power servers are the carriers of token capacity, with A-share leaders dominating globally.

Inspur Information maintains the world’s top market share in AI servers, with Q1 2026 shipments surging 180% year-on-year. Sugon’s liquid-cooled servers rank first domestically, supporting over 80% of China’s national-level smart computing centers.

Liquid cooling is essential for high-power smart computing centers, with penetration soaring from 15% in 2024 to 45% in 2026.

Envicool is the absolute leader in liquid cooling, partnering with core clients like Nvidia, Inspur, and Huawei, with liquid cooling orders jumping 210% year-on-year in 2026.

Shenling Environment’s liquid-cooled data center solutions have been deployed in multiple national-level smart computing centers, with order growth exceeding 150%.

In smart computing center operations, Baosteel Software, SinoNet, and Runze Intelligence have become China’s largest third-party operators by leveraging core locations and green power resources, with Q1 2026 computing power rental revenues all surging over 100% year-on-year.

Next is the midstream token hub, shifting from price wars to value competition.

The midstream of the token industry handles core functions like computing power scheduling, model services, and standardized token output. Players fall into two categories: large model firms and cloud service providers.

Leading A-share large model firms have established clear token monetization paths.

For example, Kunlun Wanwei’s Tiangong model sees over 1.2 trillion daily token calls, with over 120,000 B-end paying clients. Its enterprise token service is priced at just 1/4 of overseas models, driving AI business revenue up 450% year-on-year in Q1 2026.

iFLYTEK’s Spark model focuses on vertical scenarios like education, healthcare, and office work, with 70% of token consumption coming from B-end production applications.

Among cloud service providers, Alibaba Cloud, Tencent Cloud, and Volcano Engine are not A-share listed, but their growth benefits listed ecosystem firms: Yonyou Network and Kingdee International (HK-listed) build enterprise AI applications on Alibaba Cloud, becoming key channels for token consumption.

Finally, downstream application scenarios are the ultimate outlet for token value, where C-end users prioritize affordability and B-end users treat tokens as a necessity.

Downstream segments fall into three categories: C-end personal applications, B-end enterprise services, and vertical industry digitization, with stark differences in token consumption scales and monetization pace.

C-end scenarios prioritize affordability, focusing on personal AI assistants, content generation, and creative design.

A-share examples include Wondershare’s AI creative software (Filmora, Wondershare AI Painting), with over 5.5 million global paying users and Q1 2026 token consumption surging 320% year-on-year. Model optimizations have reduced per-user token costs by 40%.

Richinfo (Caixun) offers AI email and smart office assistants with over 300 million cumulative users and daily token calls exceeding 50 billion.

B-end enterprise services dominate token consumption, accounting for over 65% of total usage.

For instance, Hisense’s AI investment advisory service covers over 100 million investors, with daily token calls surpassing 80 billion. Q1 2026 AI-related revenue jumped 190% year-on-year.

Supcon’s industrial AI platform provides intelligent operations for chemical and power industries, with single factories consuming over 5 million tokens annually.

Runda Medical’s AI-assisted diagnosis system serves over 3,000 hospitals nationwide, processing over 20 billion medical text tokens daily.

Overall, B-end vertical industry scenarios represent the long-term growth engine for the token industry, with AI transformations in autonomous driving, smart manufacturing, and fintech unleashing trillion-yuan token demand.

03

Which Stocks Are at the Forefront?

From an industrial perspective, the token sector has fully pivoted from “model competition” to “capacity and monetization competition.” Supply-demand mismatches, coupled with accelerating commercial value release, position six A-share leaders as the most promising core plays across computing hardware, midstream models, and downstream applications.

First is Inspur Information, the absolute AI server leader and token capacity anchor. As the global market share leader in AI servers, Inspur is the core hardware carrier supporting global token factories. Its deep partnership with Nvidia ensures priority access to high-end GPU quotas, creating unparalleled supply chain and scale barriers.

Q1 2026 AI server shipments surged over 150% year-on-year, with global market share exceeding 25%. Backlog orders neared 40 billion yuan, with delivery schedules extending to late 2027, making it the performer with the highest earnings certainty in the industrial chain.

Second is Envicool, the liquid cooling leader and the cooling heart of token factories. As smart computing center power density soars, liquid cooling becomes essential for scalable token production, with industry penetration rising from 15% in 2024 to 45% in 2026. Q1 2026 liquid cooling revenue jumped over 210% year-on-year, with order visibility extending to 2027, making it the upstream play with the fastest earnings growth.

Third is Kunlun Tech (Kunlun Wanwei), a pioneer in large model commercialization and the benchmark for token monetization. It is the first large model company in the A-share market to achieve token profitability at scale. Its enterprise token service is priced at only 1/4 to 1/3 of overseas models, enabling it to swiftly capture the small and medium-sized enterprise market.

In Q1 2026, average daily token calls exceeded 1.2 trillion, with over 120,000 B-end paying customers. AI business revenue surged over 450% year-on-year with a gross margin above 42%, making it the purest token monetization play in the A-share market.

Fourth is iFLYTEK, the leader in vertical large models and the core carrier of industry tokens. It has deep roots in vertical sectors such as education, healthcare, and industry, with over 70% of its Spark model’s token consumption coming from B-end production applications, which makes demand highly inelastic.

Leveraging years of accumulated industry scenarios and data barriers, the company has seen rapid growth in customized token service orders from government and enterprise clients. In 2026, AI-related revenue is expected to account for over 60% of its total. As AI penetration in vertical industries continues to rise, the company will fully benefit from the long-term token demand created by industrial digitalization.

Fifth is Wondershare, the leader in overseas C-end AI applications and the core of personal token consumption. Wondershare is a global leader in C-end AI creative tools, with its video editing, AI painting, and other products boasting over 5.5 million paying users. Since AI features were fully rolled out, users’ willingness to pay and usage time have increased significantly, with token consumption surging over 320% year-on-year in Q1 2026.

Overall, the current token dividend represents a long-term, demand-driven opportunity. In the short term, it is advisable to prioritize upstream hardware leaders such as Inspur Information and Envicool. In the medium term, focus on commercialization benchmarks like Kunlun Tech. In the long term, vertical scenario leaders such as iFLYTEK are favored. High-quality enterprises are poised to improve both earnings and valuation during this high-growth cycle.
