May 1, 2026
Introduction: When mainstream models charge by the Token, companies set up dedicated Token budgets, and government policy documents speak of 'token trading,' the Token is becoming an undisputed new economic unit.

Author: Wang Jian | Publisher: Lishi Business Review
In March 2026, two seemingly unrelated events occurred.
NVIDIA CEO Jensen Huang predicted at the GTC conference that the company's revenue would reach at least $1 trillion by 2027.
In his speech, he also redefined data centers, introducing them as 'factories that produce AI intelligence Tokens.'
In the same month, Liu Liehong, Director of China's National Data Administration, stated at the China Development Forum that 'Tokens are not only the value anchor of the intelligent era but also the settlement unit connecting technological supply with commercial demand.'
Furthermore, he officially designated the Chinese translation of 'Token' as '词元' (cí yuán).
One is the leader of the world's largest chip company, and the other is China's top data official, yet both described Token as an economic unit in almost identical terms.
So what exactly is this Token, now popular around the globe and perhaps even the currency of a new era?
1
What is Token?
In 1906, American philosopher Charles Sanders Peirce was pondering a seemingly simple question: If a page has 20 instances of the word 'the,' does that count as one word or 20 different words?
This was not just Peirce's whim; he was not nitpicking over words.
As a philosopher, he believed that 'the' as an abstract concept represented a universal rule or form.
He referred to it as the 'Type'; each specific instance of 'the' in the book was a concrete manifestation of this Type, which could be called a 'Token.'
In other words, the 20 instances of 'the' were 20 different Tokens of the same Type.
He pointed out, 'The Type itself does not exist, but it determines what specific things can exist.'
This seemingly esoteric idea circulated in philosophical circles for a long time, but no one thought it would have any connection with computers in the future.
It was not until 1935 that Harvard linguist George Kingsley Zipf put the Token on a mathematical footing while studying word frequency.
At that time, while analyzing word frequencies across various languages, Zipf discovered an interesting phenomenon: the product of a word's rank and its frequency was almost constant. For example, in Chinese, '的' (de) is the most commonly used character, ranked first, with a frequency of about 6%.
Here, the rank (1) multiplied by the frequency (6%) equals approximately 6%.
The second most common character, '是' (shi), has a frequency of about 3%, and 2 multiplied by 3% also equals approximately 6%. The third-ranked character, '一' (yi), has a frequency of about 2%, and 3 multiplied by 2% likewise equals approximately 6%.
In every case, the product of rank and frequency is approximately constant.
Accordingly, the frequency of '的' (ranked first) is about twice that of '是' (ranked second) and three times that of '一' (ranked third).
This regularity, in which frequency is inversely proportional to rank, was later named 'Zipf's Law.'
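As a quick sanity check, here is a minimal Python sketch using the approximate frequencies cited above (the article's illustrative figures, not corpus measurements); it simply multiplies each character's rank by its frequency:

```python
# Zipf's law check: rank x frequency should stay roughly constant.
# Frequencies below are the article's approximate figures.
freqs = {"的": (1, 0.06), "是": (2, 0.03), "一": (3, 0.02)}

for char, (rank, freq) in freqs.items():
    print(f"{char}: rank {rank} x frequency {freq:.0%} = {rank * freq:.0%}")
# Each product comes out to about 6%, as Zipf's law predicts.
```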
No one expected that this seemingly dry mathematical theory would become an important theoretical foundation for computer language processing thirty years later.
The 1960s saw the concept of 'Token' finally applied in the computer world.
For example, when a programmer writes a line of code such as int x = 5;, an early computer acts like a meticulous 'grammar dissector,' breaking the string down character by character in order to understand it.
In this process, the computer first recognizes 'int' as a keyword indicating an integer type, then marks 'x' as a variable name, reads '=' as the assignment symbol, and finally identifies '5' as a specific numeric value.
Each of these independently identified units, labeled with clear meanings, is a Token.
Thus, the Token completed its transformation from a humanistic concept into a machine one, becoming the basic unit through which computers 'understand' instructions and information.
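To make the dissection above concrete, here is a toy lexer in Python. It is only a sketch: the token labels and rules are invented for illustration, and real compilers use far richer grammars, but it reproduces exactly the four recognitions described in the text.

```python
import re

# Each rule pairs an invented label with a regular expression.
# Order matters: keywords must be tried before generic names.
TOKEN_SPEC = [
    ("KEYWORD", r"\bint\b"),       # type keyword
    ("NUMBER",  r"\d+"),           # numeric literal
    ("ASSIGN",  r"="),             # assignment symbol
    ("SEMI",    r";"),             # statement terminator
    ("NAME",    r"[A-Za-z_]\w*"),  # variable name
    ("SKIP",    r"\s+"),           # whitespace, discarded
]

def tokenize(code):
    pattern = "|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC)
    for match in re.finditer(pattern, code):
        if match.lastgroup != "SKIP":
            yield match.lastgroup, match.group()

print(list(tokenize("int x = 5;")))
# -> [('KEYWORD', 'int'), ('NAME', 'x'), ('ASSIGN', '='), ('NUMBER', '5'), ('SEMI', ';')]
```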

From being a grammatical cornerstone silently supporting the digital world to later being endowed with new value and consensus, the meaning of Token continues to extend.
In 2017, with the rise of blockchain and ICOs, the obscure Token gained global recognition by donning the glamorous cloak of 'digital token.'
Although that wave of enthusiasm gradually cooled, and many projects quietly exited, the concept of Token firmly remained.
It was no longer just a technical term but was mentioned again with a new identity as a 'circulating digital equity certificate.'
It can be said that regardless of the context, the core of Token has always been to standardize complex things into the smallest units that a system can recognize, process, and circulate.
It is precisely this consistent DNA that, with the rise of large language models, has made the Token the most fundamental and important 'language unit' in human-computer interaction today.
So, how does AI use this 'ruler' to learn to 'understand' and 'think' when faced with human language?
2
The Underlying Logic of AI Learning to Think
We must first clarify that AI's understanding of human instructions is not just 'reading' or 'reasoning' as we imagine but a precise 'surgical operation'—'segmentation.'
This means that any sentence you input undergoes a precise 'dissection' by AI.
After the instruction is issued, all text is segmented into a series of Token fragments, which are then converted into computer data.
In other words, all of AI's 'thinking' and 'reasoning' are actually completed through complex calculations of these numbers, which are then 'translated' into language that people can understand.
This sounds simple, but the actual operation is extremely complex.
For example, the most common issue is AI's ambiguity dilemma.
Consider the Chinese sentence '羽毛球拍卖多少钱': should AI segment it as '羽毛球拍 / 卖' (how much does the badminton racket sell for) or as '羽毛球 / 拍卖' (how much does the badminton auction cost)?
The former asks the price of a piece of sports equipment; the latter describes an auction event. The meanings are vastly different, and AI cannot decide between them from the characters alone.
Therefore, the question of 'what and how to segment' becomes the most fundamental issue for AI.
More troubling, if a word never appeared in the training data, the model cannot recognize it and can only mark it as 'unknown' and skip it, in effect leaving a bug (a loophole) in the system.
Thus, enabling AI models to handle ambiguity and 'recognize' never-before-seen word combinations has been a long-standing challenge in computer language processing.
This challenge was overcome thanks to a technical paper forgotten for many years.
In 1994, American programmer Philip Gage published an article in The C Users Journal, a C-language technical magazine, introducing a compression algorithm called BPE (Byte Pair Encoding).
Gage's idea was simple: repeatedly scan the text and weld the most frequent pair of adjacent characters (such as 't' followed by 'h') into a new symbol, compressing iteratively.
After repeated passes, common combinations become increasingly compact. The decompressor only needs the saved 'packaging reference table' to restore the original, keeping the entire program extremely small.
However, because its compression efficiency was not outstanding, and no one in the industry cared about changes in a few KB of memory, the algorithm did not attract much attention at the time.
The paper was quickly forgotten, and it remained so for 22 years.
It was not until 2016 that Rico Sennrich, a researcher at the University of Edinburgh, happened upon this old paper while studying the word segmentation problem in machine translation.
He keenly realized that BPE's frequency-based merging strategy was an excellent solution for word segmentation: it did not require pre-defining a dictionary but let the data 'speak' for itself. High-frequency combinations gradually condensed into Tokens like a snowball.
In this way, even unfamiliar rare words could be broken down into finer-grained subword units, completely sidestepping the 'unknown' dilemma.
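The core of BPE fits in a few dozen lines. The sketch below, written in Python against a toy corpus, follows the merge procedure described above (count adjacent pairs, weld the most frequent pair into a new symbol, repeat); it is a simplified illustration, not Sennrich's exact implementation.

```python
from collections import Counter

def pair_counts(corpus):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in corpus.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(corpus, pair):
    """Weld every occurrence of `pair` into a single new symbol."""
    merged = {}
    for symbols, freq in corpus.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: words (split into characters) with their frequencies.
corpus = {tuple("the"): 5, tuple("there"): 3, tuple("then"): 2, tuple("apple"): 1}

for step in range(4):
    pairs = pair_counts(corpus)
    best = max(pairs, key=pairs.get)  # the most frequent adjacent pair
    corpus = merge_pair(corpus, best)
    print(f"merge {step + 1}: {best} -> '{''.join(best)}'")
# High-frequency combinations like ('t','h') and ('th','e') snowball
# into Tokens such as 'the', exactly as described above.
```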

In 2019, when OpenAI released GPT-2, it borrowed this concept.
The development team set the starting point for word segmentation directly at the 'byte'—the smallest unit of computer storage—unifying the representation of all languages from the ground up, enabling the model to theoretically process any language or script.
A short paper forgotten for more than two decades thus became one of the underlying logics driving the trillion-dollar AI industry.
This outcome was probably unexpected even to Gage himself.
However, when this ability to 'process all text' is combined with efficiency-oriented algorithms, a new form of 'algorithmic hegemony' quietly emerges.
3
Algorithmic and Encoding Hegemony
The word segmentation scheme AI uses today may seem 'fair' on the surface: the more widely a language is used, the more efficiently and completely it is processed; less common languages are chopped into more fragments and are 'harder' to process.
But this efficiency-oriented 'fairness' quietly divides the world's languages into two tiers: some languages have a 'fast lane,' while others feel like walking on a gravel road.
Simply put, because the core logic of the BPE algorithm is 'frequency first,' the most common language, English, naturally becomes the most efficiently expressed language, while all others are ranked by their 'digital visibility.'
Therefore, an implicit 'language tax' system has actually formed within AI models: expressing the same meaning costs the fewest Tokens and is cheapest in English; Chinese typically requires 1.5–2 times as many; and languages with fewer resources, such as Zulu or Tibetan, can cost 5–10 times as much as English.
This means that, under a Token-based pricing model, conversing with AI in English is not only faster but also allows for far more computational power to be called upon with the same budget compared to other languages.
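The gap is easy to observe firsthand. The sketch below uses OpenAI's open-source tiktoken tokenizer (installable via pip; cl100k_base is the GPT-4-era vocabulary); the sample sentences are ours, and the exact counts will vary by text and encoding:

```python
import tiktoken

# Count how many Tokens the same idea costs in two languages.
enc = tiktoken.get_encoding("cl100k_base")

samples = {
    "English": "Artificial intelligence is changing the world.",
    "Chinese": "人工智能正在改变世界。",
}

for lang, text in samples.items():
    tokens = enc.encode(text)
    print(f"{lang}: {len(tokens)} Tokens for {len(text)} characters")
# Under per-Token pricing, the language that needs fewer Tokens
# literally pays less to express the same meaning.
```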
This is nothing new; it has always been the case in the information age.
From Morse code to keyboard design, almost every underlying change in information technology has implicitly paved the way for English, forcing users of other languages to pay an additional 'transcoding' cost.
Thus, the efficiency gap of Tokens is merely a repetition of this historical pattern in the AI era.
What is alarming is that this 'starting line' injustice, once written into AI's initial vocabulary, is almost impossible to correct.
Because word segmentation rules are the foundation of how AI models perceive the world—the higher the building, the more unchangeable the foundation.
Fortunately, with China's rapid progress in large-scale models, even models dominated by English corpora have begun to significantly optimize their processing efficiency for Chinese.
This is evident in OpenAI's model iterations.
For example, a Chinese sentence that required 38 Tokens in GPT-3 needed only 26 in GPT-4 and just 15 in GPT-5.
This indicates that through several generations of GPT evolution, the number of Tokens required to process the same Chinese content has dropped by more than 60%, significantly improving Chinese recognition efficiency.

Domestic large-scale models like Tongyi Qianwen and DeepSeek have gone even further by incorporating high-frequency Chinese phrases and idioms as native Tokens into their vocabularies from the outset, achieving more efficient and 'native-level' processing of Chinese within the same model scale.
In other words, in the AI era, whoever holds the 'semantic segmentation authority'—the power to define the basic units of language—largely controls the expression efficiency and cost advantage of that language in the digital world.
This power to define Tokens has essentially become a 'foundational currency issuance right' in the digital age.
Its strategic significance is no less than mastering chip design and manufacturing.
This efficiency gap may seem like a hurdle, but it is more like an entrance ticket: with sufficient computing power and data, a latecomer can bypass the paths laid by others and pour its own, even more solid, foundation.
However, to truly transform this advantage of 'defining the basic units of language' into industrial influence requires a complete ecosystem of support, from energy and chips to computational power.
On this path, China happens to be standing at the starting line.
4
China Forges Token Hard Currency
If we were to map China's position in the global Token economy, the chain would start with energy and end with the global AI services market.
Imagine this scene: Wind turbines in the Gobi Desert of northwest China convert wind energy into electricity, which then flows into data centers along ultra-high-voltage transmission lines. GPUs convert this electrical energy into computational power, continuously producing Tokens.
These digital units ultimately flow through undersea fiber-optic cables to all corners of the globe, generating API call revenues denominated in U.S. dollars.
In fact, China's scale in this chain is already large enough to stand independently.
Public data shows that as of March 2026, China's daily Token call volume has reached 140 trillion, growing more than a thousandfold in two years.
Over the same period, global monitoring shows that China's large models have led the United States in weekly call volume for several consecutive weeks, by a margin of more than two to one, ranking firmly first worldwide.
So, why is China's Token economy so strong?
It starts with cost, but the most critical variable is electricity prices.

In regions rich in hydropower, such as Guizhou and Yunnan, and provinces abundant in wind and solar resources, like Gansu and Xinjiang, industrial electricity prices have long been low. Some areas even offer green power for data centers at as low as 0.15 yuan per kilowatt-hour.
In contrast, industrial electricity prices in most parts of Europe and the United States run at several times China's levels, or more.
For example, generating 1 million Tokens requires approximately 15–20 kilowatt-hours of electricity. At China's northwest low-cost green power rates, the cost is only a few yuan; the same computational task would typically cost $60–$200 in international markets.
In this way, China has built a cost moat running from 'electricity' to 'Token' on the back of its energy and computing cost advantages.
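A back-of-the-envelope calculation makes the moat visible. Both the kWh-per-million-Token figure and the electricity price below come from the article's own estimates, not measured values:

```python
# Electricity cost of producing one million Tokens, per the article's figures.
KWH_PER_MILLION_TOKENS = (15, 20)   # estimated energy per million Tokens
CNY_PER_KWH_GREEN = 0.15            # discounted northwest green power, yuan/kWh

low = KWH_PER_MILLION_TOKENS[0] * CNY_PER_KWH_GREEN
high = KWH_PER_MILLION_TOKENS[1] * CNY_PER_KWH_GREEN
print(f"Electricity cost per million Tokens: {low:.2f}-{high:.2f} yuan")
# -> 2.25-3.00 yuan, consistent with 'only a few yuan' in the text.
```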
More critically, China has precisely aligned its vast amounts of green electricity, which is difficult to fully consume, with the surging demand for computing power, forming a unique industrial closed loop.
In 2025, China's annual electricity generation exceeded 10 trillion kWh, accounting for nearly one-third of the global total.
Of this, new energy sources such as wind and photovoltaic power long suffered from significant 'wind and solar curtailment' due to insufficient energy storage and limited transmission capacity.
Data centers, as large adjustable loads, can increase their operational loads during peak wind and solar power generation periods, efficiently absorbing this green electricity that would otherwise be wasted.
This not only reduces energy costs but also improves energy efficiency, creating a systemic advantage that is difficult for other countries to replicate.
The 'Compute East, Data West' initiative, now being implemented nationwide, elevates this logic to the level of national strategy, guiding data centers toward regions rich in renewable energy such as Guizhou, Inner Mongolia, and Ningxia.
This is equivalent to directly connecting computing centers to 'green power sockets,' efficiently converting wind and solar power—which might have been discarded in the past—into usable AI computing power that continuously produces Tokens.
Therefore, while this AI competition may appear to be a contest of algorithms and models, it is actually a new answer shaped by the deep integration of energy transition and digital infrastructure.
And China happens to occupy the intersection of this trajectory.
Meanwhile, as AI moves from technological exploration into the depths of industry, scenarios such as quality inspection and production scheduling in traditional manufacturing, risk control and compliance in financial services, and document processing in government systems are rapidly emerging as new major consumers of Tokens.
These demands are massive in volume, consistently stable, and highly price-sensitive, aligning perfectly with the low-cost structure of China’s Token industry and allowing China to maintain an irreplaceable supply advantage in the global Token competition.
Precisely because of the complete support from energy and computing power to practical applications, Tokens have gradually evolved from pure technological units into universal carriers capable of bearing and exchanging value in the digital world.
This means that Tokens could very well become the 'base currency' of the digital economy in the future.
5
When Tokens Become an Irreplaceable Unit of Settlement
Looking back at history, it is clear that any new unit of measurement ultimately dominates not because it is perfect, but because it becomes indispensable—to the point where the cost of switching becomes prohibitively high.
Tokens possess precisely this 'once used, hard to abandon' characteristic.
First is their precise measurability.
Tokens are inherently the billing unit of AI services, and every invocation leaves a clear record of consumption. They are easier to meter than electricity and more directly tied to value output than data traffic, a trait built in from their inception.
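What this metering looks like in practice can be sketched in a few lines. The ledger below is entirely hypothetical: the class name, budget, and per-Token price are invented for illustration and do not reflect any vendor's actual rates:

```python
from dataclasses import dataclass

@dataclass
class TokenLedger:
    """A hypothetical department-level Token budget."""
    budget_tokens: int
    price_per_1k: float   # currency units per 1,000 Tokens (invented rate)
    used: int = 0

    def record_call(self, prompt_tokens: int, completion_tokens: int) -> float:
        """Meter one API call and return its cost."""
        consumed = prompt_tokens + completion_tokens
        self.used += consumed
        return consumed / 1000 * self.price_per_1k

    @property
    def remaining(self) -> int:
        return self.budget_tokens - self.used

ledger = TokenLedger(budget_tokens=10_000_000, price_per_1k=0.02)
cost = ledger.record_call(prompt_tokens=1200, completion_tokens=800)
print(f"Call cost: {cost:.4f}; remaining budget: {ledger.remaining} Tokens")
```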
Second is their exchangeability.
Recently, the National Data Administration proposed 'token trading' in a draft consultation document for the first time, exploring the construction of a quantifiable and priceable data value system centered on tokens.
This means that Tokens now have a 'value standard' domestically and are no longer just a unit of measurement in technical documents.
Meanwhile, a seemingly contradictory trend is unfolding: AI service prices at the user end continue to decline, yet upstream computing costs keep rising.
For example, from October 2025 to March 2026, the annual rental price of H100 chips surged by nearly 40%, with chips in short supply; major domestic and foreign cloud providers also collectively raised prices in early 2026.
Behind this lies the structural shift of AI from 'conversation' to 'autonomous execution,' driving a reconfiguration of computing demand and further highlighting Tokens' role as the core value carrier.
Critically, the way AI is used has changed.

In the past, chatting with AI assistants involved simple question-and-answer exchanges that consumed minimal resources. Today, however, when enterprises task AI with automatically writing reports or conducting analyses, the resource consumption can be hundreds of times greater than that of a chat session.
When the traditional per-use pricing model can no longer cover soaring computing costs, price hikes become inevitable—effectively a market revaluation of AI’s growing ability to 'work autonomously.'
Tokens now find themselves in a situation somewhat similar to that of the U.S. dollar in its heyday.
After the dollar abandoned the gold standard in 1971, its value essentially relied on a shared belief in its worth.
Its continued use stems from the prohibitively high coordination costs of replacement—global trade, finance, and reserve systems are all built around it.
Today, the same logic is repeating with Tokens.
When mainstream models all bill in Tokens, enterprises establish dedicated Token budgets, and policy documents incorporate 'token trading,' Tokens, like traditional currencies, become deeply embedded and irreplaceable.
Thus, there is no longer a debate over whether Tokens will become a new economic unit.
The real question is: Who will define the rules of the Token economy? Who holds pricing initiative in the global computing network?
The answers may already be unfolding within the surging data streams, written with every Token generated, traded, and consumed.
References:
1. Peirce, C. S. (1906). Prolegomena to an Apology for Pragmaticism. The Monist, 16(4), 492–546.
2. Zipf, G. K. (1935). The Psycho-Biology of Language: An Introduction to Dynamic Philology. Houghton Mifflin.
3. Zipf, G. K. (1949). Human Behavior and the Principle of Least Effort. Addison-Wesley.
4. Gage, P. (1994). A New Algorithm for Data Compression. The C Users Journal, 12(2), 23–38.
5. Sennrich, R., Haddow, B., & Birch, A. (2016). Neural Machine Translation of Rare Words with Subword Units. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016), 1715–1725. https://aclanthology.org/P16-1162
6. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language Models are Unsupervised Multitask Learners [GPT-2 Technical Report]. OpenAI. https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
7. Brown, T., et al. (2020). Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems (NeurIPS 2020), 33, 1877–1901. https://arxiv.org/abs/2005.14165
8. NVIDIA. (2026, March). NVIDIA GTC 2026 Keynote: Jensen Huang. NVIDIA Corporation. https://www.nvidia.com/gtc/
9. Liu, Liehong. (2026, March). Speech at the China Development Forum 2026 Annual Conference. National Data Administration.
10. National Data Administration. (2026, April 16). Implementation Plan for Promoting the Construction of High-Quality Industry Datasets (Draft for Comment).
11. National Development and Reform Commission. (2022, February). Notice on Issuing the Implementation Plan for the 'Compute East, Data West' Initiative. https://www.ndrc.gov.cn
12. China Electricity Council. (2026). 2025 National Power Industry Statistical Bulletin. CEC. https://www.cec.org.cn
13. J.P. Morgan. (2025). AI & Big Data: Token Demand Outlook 2025–2030. J.P. Morgan Research.
14. IDC. (2025). China AI Agents and Autonomous Task Forecast, 2026–2031. International Data Corporation.
15. Hoffmann, J., et al. (2022). Training Compute-Optimal Large Language Models. arXiv preprint arXiv:2203.15556. https://arxiv.org/abs/2203.15556
16. Touvron, H., et al. (2023). LLaMA: Open and Efficient Foundation Language Models. arXiv preprint arXiv:2302.13971. https://arxiv.org/abs/2302.13971