04/16 2026
A bottle of mineral water retails for 1.5 yuan, enough to pay for the tokens an AI would need to help write a work the length of 'The Three-Body Problem.' In China, large model tokens are priced lower than water, whereas across the Pacific, GPT-6 charges several dozen times more for the same token volume. After two years of relentless price competition, some firms are quietly raising rates, others are gritting their teeth to stay afloat, and still others proclaim, 'Only those commanding computing power will endure beyond 18 months.' As this 'suicidal' race to the bottom nears its end, who will be caught unprepared?
01 How Many Tokens Can One Yuan Purchase?
In April 2026, a developer working through Volcengine's pricing page found that the input price for the main Doubao model stands at 0.0008 yuan per thousand tokens (0.8 yuan per million), a 99.3% reduction from the industry norm. At that rate, the tokens for an 800,000-word tome like 'The Three-Body Problem' cost less than two yuan. A bottle of Nongfu Spring retails at 1.5 yuan, so the gap is at most 50 cents. Alibaba's Qianwen 3.5 is no less aggressive, charging a mere 0.8 yuan per million tokens, 1/18th of the rate for Google's Gemini-3-Pro.
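The arithmetic behind the headline question can be sketched in a few lines. This is an illustration only: the price is the Doubao input rate quoted above, while the tokens-per-word ratio is an assumption made for this sketch, not a figure from any vendor.

```python
# Sketch: how far does one yuan go at the quoted Doubao input price?
YUAN_PER_MILLION_TOKENS = 0.8   # 0.0008 yuan per thousand tokens, as quoted
BOOK_WORDS = 800_000            # length cited for 'The Three-Body Problem'
TOKENS_PER_WORD = 1.5           # ASSUMPTION: real ratios depend on the tokenizer


def cost_yuan(tokens: float, yuan_per_million: float) -> float:
    """Price of a token volume at a per-million-token rate."""
    return tokens / 1_000_000 * yuan_per_million


book_tokens = BOOK_WORDS * TOKENS_PER_WORD              # 1.2 million tokens
book_cost = cost_yuan(book_tokens, YUAN_PER_MILLION_TOKENS)
tokens_per_yuan = 1_000_000 / YUAN_PER_MILLION_TOKENS   # about 1.25 million

print(f"Book cost: {book_cost:.2f} yuan")
print(f"Tokens per yuan: {tokens_per_yuan:,.0f}")
```

Under these assumptions the book comes to roughly 0.96 yuan, and one yuan buys about 1.25 million input tokens. Even at two tokens per word the total stays under the two-yuan mark.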
One industry insider lamented, 'Selling models now pays less than selling mineral water, and at least water bottles are recyclable.' By contrast, OpenAI's GPT-6 (slated for an April 14 release) is priced at $2.5 per million input tokens and $12 per million output tokens, many times pricier than domestic offerings once converted to RMB. Why does the same token volume command such disparate prices, like luxury goods set beside street vendor fare?
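To put a number on the gap, here is a rough conversion. The exchange rate is a placeholder assumed for this sketch, not a quoted market rate, and the comparison uses input prices only, since the domestic figure cited is an input price.

```python
# Sketch: GPT-6 input price in RMB versus a 0.8 yuan domestic input price.
CNY_PER_USD = 7.2                  # ASSUMPTION: illustrative exchange rate
GPT6_INPUT_USD_PER_MILLION = 2.5   # quoted in the article
DOMESTIC_INPUT_CNY_PER_MILLION = 0.8

gpt6_input_cny = GPT6_INPUT_USD_PER_MILLION * CNY_PER_USD       # about 18 yuan
input_price_ratio = gpt6_input_cny / DOMESTIC_INPUT_CNY_PER_MILLION

print(f"GPT-6 input: {gpt6_input_cny:.1f} yuan per million tokens")
print(f"Ratio to domestic input price: {input_price_ratio:.1f}x")
```

At the assumed rate, GPT-6 input tokens come to about 18 yuan per million, roughly 22.5 times the domestic price. A different exchange rate moves the ratio proportionally.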

What keeps developers awake is the rumor that some mid-sized AI firms, unable to secure affordable computing power, can no longer sustain themselves on API revenue alone and are quietly cutting free quotas. When industry giants wage price wars, smaller players perish first. The scenario mirrors the early days of ride-hailing apps, when hefty subsidies were used to capture market share. The real question is: whose survival is at stake in this large model price war?
02 Computing Power Chessboard: Who's Caught Off Guard, Who's Dressed for Success?
ByteDance ignited the price war. In May 2024, Doubao slashed prices by 99.3%, forcing Alibaba, Baidu, and Tencent to follow and plunging the industry into a frenzy of selling tokens at a loss. Why did ByteDance dare to start it? It had the industry's lowest computing power costs. In 2023, it amassed a substantial stockpile of NVIDIA GPUs, a move criticized then as 'overzealous' but hailed now as 'visionary leadership.' By 2026, ByteDance's capital expenditures had soared to approximately 160 billion yuan, with 85 billion yuan earmarked for AI chips. Daily token calls rose from the hundred-billion range in 2024 to 120 trillion in March 2026, and at that scale unit costs plummeted. An IDC report indicates that Volcengine, ByteDance's cloud arm, holds 49.2% of China's large model public cloud market, roughly half on its own.
Alibaba and Baidu had no choice but to follow. Alibaba's Qianwen 3.5 leaned on technical cost reductions: its MoE (mixture-of-experts) architecture activates only 17 billion of its 397 billion parameters for each token, cutting memory usage by 60% and raising inference throughput 19-fold. Baidu's Wenxin 5.0 took the technology-first route, boasting 2.4 trillion parameters and full multimodality and outperforming GPT-5-High in 40 tests, though its path to commercialization remains unclear.
Yet the returns on low-price strategies are diminishing ever faster. Global weekly large model token consumption surged from 9.8T in early February to 14.8T in early March, with agents like OpenClaw raising token consumption per task by 10 to 100 times. The balance of computing power supply and demand has flipped. On March 11, Tencent Cloud led the price hikes: the input unit price of its core Hunyuan model rose by 463%. Alibaba Cloud and Baidu Intelligent Cloud swiftly followed, raising prices on AI computing power products by 5%–34%. The price war is unsustainable.
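The percentages in this section are easier to compare once converted into multipliers. The sketch below does only that, using the figures quoted above. The helper name is made up for illustration, and nothing here reflects any vendor's actual price list.

```python
# Sketch: turning the section's percentages into multipliers and shares.
def apply_percent_change(value: float, pct_change: float) -> float:
    """Return value after a percentage change (negative for a cut)."""
    return value * (1 + pct_change / 100)


# A 99.3% cut leaves 0.7% of the old price; a 463% hike multiplies it by 5.63.
cut_multiplier = apply_percent_change(1.0, -99.3)    # about 0.007
hike_multiplier = apply_percent_change(1.0, 463)     # about 5.63

# Weekly global token consumption: 9.8T in early February, 14.8T in early March.
weekly_growth_pct = (14.8 - 9.8) / 9.8 * 100         # about 51%

# Share of Qianwen 3.5's 397B parameters that are active (17B).
active_share_pct = 17 / 397 * 100                    # about 4.3%

print(f"After the cut: {cut_multiplier:.3f}x of the old price")
print(f"After the hike: {hike_multiplier:.2f}x of the old price")
print(f"Weekly token growth: {weekly_growth_pct:.0f}%")
print(f"Active parameter share: {active_share_pct:.1f}%")
```

Read this way, a 463% increase means the new price is 5.63 times the old one, and the one-month jump in weekly token consumption is about 51%.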
More intriguingly, a ByteDance executive asserted internally, 'Only players controlling the computing power supply chain will survive the next 18 months.' In essence, the price war is mere window dressing; computing power reigns supreme. 
No company embodies this logic more than DeepSeek. During the R1 era, it achieved GPT-4-level performance at a cost of $5.86 million, with its paper gracing Nature's cover. For V4 (released in late April), however, it invested heavily in what some deem 'unprofitable ventures': trillion-parameter models, million-token contexts, and full-stack adaptation to Huawei's domestic Ascend chips. Why would a technologically idealistic firm take on such 'grunt work'? Because without computing power autonomy, even the most advanced technology risks being cut off before the 18-month countdown ends. Alibaba, ByteDance, and Tencent have reportedly pre-ordered tens of thousands of domestic AI chips and plan to integrate DeepSeek's new models via their cloud services. These giants could build DeepSeek-like models themselves, but they need this 'domestic computing power adaptation line' as a hedge against supply disruptions. DeepSeek is evolving from a 'technical benchmark' into a 'strategic reserve.'
Xiaomi, meanwhile, has chosen to sit this contest out. Luo Fuli, head of MiMo, said bluntly, 'Many vendors' ultra-low-price packages fail to even cover computing power costs. The more users they attract, the greater their losses. This is "suicidal pricing."'
The current landscape is fragmented: ByteDance hemorrhages funds on price wars, Alibaba and Baidu barely keep pace with technical cost reductions, Tencent stealthily transitions to value-based competition, DeepSeek bets on domestic adaptation, and Xiaomi observes coolly, predicting others' demise.
03 The Price War's Denouement Isn't About Affordability—It's About 'Value'
Price wars are a well-worn internet strategy. Food delivery and ride-hailing followed the same trajectory: burn cash to seize market share, eliminate rivals, then raise prices to harvest profits from users. Large models are retracing this cycle, with one fundamental difference: computing power costs will not keep falling indefinitely with scale. Moore's Law is slowing, and the physical limits of chips loom. Following the ban on NVIDIA's H20, domestic alternatives offer 70% of the performance at double the price, and Huawei's Ascend production is booked until Q2 next year. Computing power is a finite resource with a cap. Meanwhile, GPT-6 is poised to enter the fray with 'proactive alignment to user intent.' If the past was about humans learning to wield AI, GPT-6 aspires to make AI understand humans. That gap in experience cannot be closed by price wars alone. When it charges many times more than domestic models yet delivers an unparalleled experience, how will high-end clients decide?
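The '70% performance at double the price' figure implies a steeper penalty than either number suggests alone. A minimal sketch, assuming the baseline is the H20 normalized to 1.0 for both performance and price (the article does not spell out the baseline):

```python
# Sketch: cost per unit of performance for a domestic chip versus a baseline.
BASELINE_PERFORMANCE = 1.0   # ASSUMPTION: H20 normalized to 1.0
BASELINE_PRICE = 1.0         # ASSUMPTION: H20 normalized to 1.0

domestic_performance = 0.7 * BASELINE_PERFORMANCE   # '70% performance'
domestic_price = 2.0 * BASELINE_PRICE               # 'double the price'

baseline_cost_per_perf = BASELINE_PRICE / BASELINE_PERFORMANCE
domestic_cost_per_perf = domestic_price / domestic_performance
relative_cost = domestic_cost_per_perf / baseline_cost_per_perf   # about 2.86

print(f"Cost per unit of performance: {relative_cost:.2f}x the baseline")
```

On these assumptions, each unit of computing power from a domestic alternative costs about 2.86 times as much as from the baseline chip.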
04 Is the Domestic Large Model Price War 'Inclusive' or 'Suicidal'?
Calling it 'inclusive' is not wrong: low prices let small and medium-sized developers and individuals access top-tier AI, which is commendable. Calling it 'suicidal' is not off the mark either: when an entire industry descends into loss-making turmoil, who has the funds for genuine technological innovation? The ByteDance executive's assessment is unequivocal: 'Only players controlling the computing power supply chain will survive.' I would add one thing: only players who command computing power AND generate genuine value deserve to survive. As the price war concludes, the victor will be determined not by affordability but by 'value.' Mineral water bottles may not be recyclable, but at least those who sell water recognize that water is a resource, not waste.
Data and facts are sourced from outlets such as 'China Business Journal,' 'Financial Times,' and IDC.