05/19 2026
473

After telecom operators start selling AI as data packages, large model providers collectively enter a new battleground.
Source | Silicon-Based Quadrant
When users are no longer concerned about whether to upgrade their monthly data plans, they may start worrying about how many Token services to purchase each month.
Tokens are set to be sold as standardized services by telecom operators, just like data, broadband, and SMS.
Recently, China's three major telecom operators have one after another launched Token package products: monthly subscription-based Token plans for individual users, tiered computing power packages for developers and enterprise clients, and the integration of dozens to hundreds of large models onto their platforms, allowing for "monthly purchases, multi-model access, and bill payments."
China Telecom has introduced personal and enterprise Token packages, with monthly fees starting as low as 9.9 yuan for 10 million Tokens. Local operators like Shanghai Mobile and Shanghai Telecom have introduced quota-based or universal Token billing models, with Shanghai Mobile offering 400,000 Tokens for 1 yuan.
As telecom operators begin selling Token services, the cost for users to switch between large models will significantly decrease. For large model enterprises, "user loyalty" will be weakened, and only by becoming more competitive can they retain their market share.
In the future, large model providers like Doubao, Qianwen, and DeepSeek will not only compete on price and "Token quality per unit of energy consumption" but also on "higher-value AI application solution capabilities."
01 What Are Token Services?
To understand Token services, one must first understand what a Token is.
Computers cannot directly recognize text; they can only process 0s and 1s. Therefore, every word, character, voice, or punctuation we input is converted into 0s and 1s through specific encoding mechanisms.
In the context of large models, digital encoding is also the first step, with slight variations in the number of digits each character is converted into.
Tokens are the smallest computational units for large models to process information. User input, contextual memory, and model output are all calculated in Tokens. The more complex the model call, the longer the context, and the deeper the Agent execution chain, the higher the Token consumption.
Typically, in English, one Token roughly corresponds to four letters. In Chinese, due to the higher information density of characters, one Chinese character, punctuation mark, or phrase often corresponds to 1-2 Tokens.
Since large models think and output Token by Token, the industry sells and settles the cost and usage quotas of large models in the form of "Per Million Tokens" or "quota points."
Currently, large model companies charge for Tokens on a tiered basis. Ordinary users can use models like Doubao and Qianwen in standard mode for free, while enterprise-level heavy users can purchase tiered API monthly subscriptions or metered services.
Since last year, telecom operators have opened "computing power supermarkets" for large models. Model providers act as "tenant merchants," with operators collecting "platform fees + computing power fees + channel fees." Users are not purchasing "operator models" but rather accessing any large model on the telecom platform using the operator's computing power, billed by Tokens.
In July 2025, China Mobile launched the model service platform MoMA (Mobile Model Access); in April, China Telecom released the Xingchen TokenHub operation service platform, and in May, China Unicom launched the "Unicom Star Network" Token service platform. These platforms integrate mainstream large models from Baidu, Alibaba, ByteDance, DeepSeek, and others, offering unified APIs, authentication, and billing.
Operators' platforms adapt to multiple large models internally, allowing users to smoothly switch between models by simply changing the model name (Model ID).
02 Why Are Operators Selling Tokens?
The surge in Token services is no accident.
First, changes in billing models. In the traditional cloud computing era, users were accustomed to paying for "server rental time" or "fixed bandwidth" (i.e., IaaS-layer computing power), based on bandwidth speed and time. However, with the development of large models, the capabilities provided by different models and the cost disparities for different tasks vary greatly. For example, stronger models have higher Token costs; longer contexts consume more Tokens; and higher inference complexity leads to higher actual costs. Billing by Tokens aligns "the degree of intelligence consumed by users" with "the computing power costs incurred by providers."
Second, lowering technical barriers and "trial-and-error costs." The research and deployment of large models often require tens or even hundreds of millions of dollars in investment. For most small and medium-sized enterprises and individual developers, building their own models is unrealistic. Token services package "Artificial General Intelligence (AGI)" capabilities into digestible units, allowing developers to call APIs on demand and pay Token fees without worrying about the underlying tens of thousands of GPUs consuming power.
Finally, the urgent demand driven by the explosion of application-layer scenarios. Entering 2026, AI Agents, AI-assisted programming, and multimodal content generation have exploded in popularity. These applications require frequent "throughput" interactions with underlying large models during daily operation. An automated AI code-writing tool might consume millions of Tokens overnight. This high-frequency, massive interaction necessitates more standardized, stable, and price-competitive Token package services.
Over the past two decades, telecom operators' business models have undergone three core changes in measurement units.
The first stage was the voice era, where operators sold minutes; the second stage was the mobile internet era, where they sold data in GBs; and now, in the AI era, operators are beginning to sell Tokens.
Tokens are undergoing an evolutionary process similar to that of data. Initially, they were just technical metrics; then they became billing units; and eventually, they will evolve into standardized commodities.
The entry of operators signifies that Tokens have begun to transcend their technical realm and enter the consumer system.
In the coming years, the way users purchase AI capabilities may fundamentally change: individual users buying "AI monthly packages," enterprises procuring "Token resource pools," family broadband including AI quotas, and government and enterprise dedicated lines integrating Agent services. Tokens will become as fundamental a resource as electricity, water, and data.
However, this does not mean that operators will replace large model providers.
03 How to Buy Tokens Appropriately?
Should Token services be purchased directly from native large model providers or through operator platforms? Currently, both business models have their pros and cons.
The first model is the native large model provider approach, which charges per million Tokens. Providers like OpenAI, Anthropic, DeepSeek, and Qianwen generally adopt this system, with users paying separately for input and output Tokens. Some, like Qianwen, may use a pre-purchase at the beginning of the month and settle at the end of the month.
The second model is the operator's monthly subscription for Token quotas. For example, Shanghai Telecom offers a minimum of 9.9 yuan for 10 million Tokens, with additional charges for excess usage, and plans to integrate Token benefits into the family's "Beautiful Home" digital space, supporting one-click payment via phone bills.
This "all-inclusive" or "phone bill integration" model allows Chinese users to purchase large model computing power as easily as buying data packages.
The overseas market primarily uses API tiered pricing by native large model companies, while the domestic market has pushed Token services into a "package" era similar to mobile phone bills.
Currently, both charging models have their advantages, as the Token package user base can be divided into three main types.
The first type is independent developers and tech enthusiasts (Geeks). They use API interfaces provided by various vendors to build their own personalized AI applications, such as productivity tools, automatic translation plugins, and personal knowledge bases.
The second type is small and medium-sized enterprises, startups, and B-end independent software vendors (ISVs), which are the core customer base for Token services. Whether purchasing Tokens for employee programming, developing AI Agents for specific industries, or embedding AI-assisted functions into existing enterprise ERP and CRM systems, these companies need to subscribe to "team Token packages" from cloud providers or operators.
The third type is "heavy AI-dependent" professionals and ordinary households who frequently use AI for copywriting, coding, or tutoring children at home.
For small and medium-sized enterprises and startups, from a technical economics perspective, the pure Token-based charging model of native large models is more scientific.
The operator's package model, however, has two advantages. On the one hand, independent developers are not tied to a single large model and can autonomously select from multiple models through the platform. On the other hand, Token services may reach mass consumption more quickly. Most people understand what 100GB of data means but have no concept of what 10 million Tokens represent.
By adopting monthly subscriptions, operators are essentially lowering the cognitive barrier. Users do not need to understand Tokens; they can start by purchasing a basic package like 9.9 yuan for 10 million Tokens to understand their needs.
As operators begin selling Token services, "Doubao and peers" are about to enter a three-tiered competition.
From "competing on parameters" to "competing on energy efficiency": Large model companies will no longer be able to blindly pursue larger parameters and higher energy consumption. Instead, they will focus on model distillation, quantization, and inference optimization—capabilities that deliver higher-quality Tokens with lower energy consumption.
Price competition will intensify further. After operators aggregate hundreds of models, user switching costs decrease. If Model A raises prices, users can replace it with Model B through the platform. When the differences in model capabilities are insufficient, price will become the core competitive factor.
The profit centers of large model companies will shift. Selling APIs alone offers limited profits, and future profitability may focus on Agents, industry applications, and enterprise solutions. Models themselves will gradually become infrastructure, while the application layer becomes the value center.
Perhaps a "two-sided market" is forming: operators control the entry points, while model providers control the capabilities.