Huawei and Cambricon Take the Lead as Chinese Chip Companies Step Up Efforts to Capture NVIDIA’s Market Share

06/19 2026

Source | Yuan Media Hub

The ranks of leading domestic GPU manufacturers are quietly expanding.

Recent media reports indicate that ByteDance is in talks with Illuvintech to purchase at least 50,000 AI chips, primarily for inference tasks. The chips under negotiation are mainly from Illuvintech’s ZhiKai series of cloud inference GPUs, while the TianGai series is intended for training scenarios.

Screenshot sourced from media reports

This news has sparked excitement in the market. After all, ByteDance is China’s top buyer of AI computing power, with plans to raise capital expenditures to over RMB 200 billion in 2026.

However, as of now, neither ByteDance nor Illuvintech has commented on these reports.

If this deal goes through, Illuvintech will become ByteDance’s third-largest domestic GPU supplier, following Huawei and Cambricon.

For Illuvintech, which went public on the Hong Kong Stock Exchange only in January 2026, the deal would mean not just a significant order but also validation as a key player in the industry.

More significant than the order itself is the signal it sends—domestic AI chips are transitioning from being driven by policy mandates and pilot projects to being genuinely adopted in the application scenarios of major internet companies, shifting from backup options to essential computing power support.

01.

No Choice But to Diversify

ByteDance does have options.

The U.S. has approved the purchase of NVIDIA’s H200 by some Chinese companies under controlled conditions. However, the “backdoor incident” involving the H20 has increased compliance and security review pressures for Chinese buyers. Not putting all their eggs in one basket has become a standard practice for major domestic players.

More importantly, ByteDance’s computing power demands are undergoing structural changes.

QuestMobile data shows that as of March 2026, Doubao, ByteDance’s AI assistant, had 345 million monthly active users. The pressure from user growth comes not only from model training but also from ongoing inference costs post-launch. Inference scenarios have less stringent requirements for chips in terms of interconnect bandwidth, memory, and ecosystem maturity compared to training. Domestic chips have reached a level where they are viable for inference tasks.

Image source: QuestMobile

As of March 2026, Doubao’s large model had surpassed 120 trillion daily token invocations, a thousandfold increase since its initial launch. Based on Volcano Engine’s pricing and user behavior, daily computing power consumption costs have reached tens of millions of yuan—not including one-time investments in smart computing centers and chip procurement.
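As a rough sanity check on these figures, the sketch below converts the reported daily token volume and an assumed daily spend into an implied cost per million tokens. The 120 trillion figure comes from the text above; the spend bounds of RMB 10 million and RMB 90 million are only an assumed reading of “tens of millions of yuan”, and the function and variable names are illustrative.

```python
def cost_per_million_tokens(daily_cost_rmb: float, daily_tokens: float) -> float:
    """Return the implied RMB cost per one million tokens."""
    if daily_tokens <= 0:
        raise ValueError("daily_tokens must be positive")
    return daily_cost_rmb / (daily_tokens / 1_000_000)


# Reported in the article: more than 120 trillion tokens per day.
DAILY_TOKENS = 120e12

# ASSUMPTION: "tens of millions of yuan" read as RMB 10 million to 90 million.
ASSUMED_DAILY_COST_RMB = (10e6, 90e6)

implied_range = tuple(
    cost_per_million_tokens(cost, DAILY_TOKENS) for cost in ASSUMED_DAILY_COST_RMB
)

if __name__ == "__main__":
    low, high = implied_range
    print(f"Implied cost: RMB {low:.3f} to {high:.3f} per million tokens")
```

Under these assumed bounds, the implied figure is well under RMB 1 per million tokens, which illustrates why the total bill is driven by sheer volume rather than by unit price.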

Although Doubao 2.0 improved inference efficiency by 43% and its token cost is just 38% of that of leading overseas models’ compliant chains, free usage by 345 million monthly active users still leaves a significant loss gap.

Under this pressure, ByteDance has embarked on a high-stakes escalation.

Multiple media outlets, citing the South China Morning Post, reported that ByteDance’s AI infrastructure capital expenditure budget for 2026 has been increased by about 25% to RMB 200 billion. This adjustment is driven by two main factors: the company’s continued investment in artificial intelligence and the rising cost of memory chips.

There are even reports that ByteDance is considering pushing its 2026 spending limit to USD 70 billion. Meanwhile, the company’s net profit shrank by over 70% year-on-year in 2025. With profits and expenditures at opposite extremes, Zhang Yiming’s bet on computing power may be a gamble for a five-year lead.

ByteDance’s computing power supply chain strategy is clear: Huawei’s Ascend and Cambricon’s high-end training cards for training, and Illuvintech’s ZhiKai series for inference, running in parallel. This approach of “training and inference on two legs, domestic and imported options in hand” is becoming the “standard configuration” for major internet companies.

02.

Biren Technology and Its Ecosystem

While the news of ByteDance’s planned purchase of domestic chips dominates headlines, another domestic GPU manufacturer’s move deserves greater attention.

On the evening of June 16, Zhipu officially open-sourced its new flagship model, GLM-5.2. The next day, Biren Technology and Moore Threads successively announced the completion of “Day-0” adaptation. Biren Technology’s Biren 166 series completed adaptation and optimization based on the vLLM inference framework, providing developers with a rapid deployment solution ahead of others. Following the announcement, Biren Technology’s stock price rose by 7.09% that day.

Screenshot sourced from a related WeChat official account post

“Day-0 adaptation” is key to understanding the competitive landscape of domestic GPUs—it means the chips are not just usable but can run on the day the model is released. This requires chip manufacturers to excel not only in hardware but also in software stacks, toolchains, and developer ecosystems. Biren Technology has already established a clear first-mover advantage in this regard.

Over 20 leading domestic large models, including Tencent’s Hunyuan Hy3 preview, Alibaba’s Tongyi Qianwen Qwen3.6, DeepSeek’s entire model lineup, MiniMax M3, Zhipu’s GLM series, and Moonshot AI’s Kimi, have completed Day-0 synchronization with Biren Technology’s chips. Notably, DeepSeek’s full series was adapted to Biren hardware in just a few hours, setting a record for domestic chips’ response speed.

When this adaptation list is viewed alongside ByteDance’s supplier list, a clear signal emerges: Biren Technology can now stand alongside Huawei and Cambricon.

Huawei has long secured its position, with Ascend’s ecosystem depth and 10,000-card cluster capabilities remaining unmatched by other domestic players. Cambricon entered the commercial market early and has steadily supplied ByteDance, making it a core player in major companies’ computing power supply chains. Biren Technology, backed by national-level certification, investor favor, and its positioning across the large model ecosystem, has secured an equal seat as a new force.

Everything seems to fall into place naturally.

In May 2026, China established an AI chip category in its security and reliability evaluation for the first time. Nine domestic chips received the highest security and reliability rating (Level I), from vendors including Huawei HiSilicon, Alibaba T-Head, Biren Technology, Hygon Information, Illuvintech, MetaX, and Moore Threads. Within this national certification framework, Biren Technology now stands alongside Huawei and Alibaba T-Head.

Capital market votes are even more direct: Biren Technology went public on the Hong Kong Stock Exchange on January 2, 2026, surging 82% on its debut and briefly exceeding a market capitalization of HKD 100 billion, becoming the first GPU stock listed in Hong Kong.

The value of this ecosystem lies in forming a virtuous cycle: the more models run on Biren Technology’s chips, the more mature its software stack becomes; the more mature the software stack, the faster new model adaptations occur; the faster the adaptations, the more model manufacturers choose Biren. This is the “flywheel effect” of ecosystems.

Of course, Biren Technology is not alone in this fight. The entire domestic GPU sector is engaged in an arms race centered around large model adaptations.

As mentioned earlier, Huawei’s Ascend ecosystem remains unmatched. This time, Zhipu’s GLM-5.2 completed inference adaptation with Ascend on Day 0; Cambricon completed Day-0 adaptation for DeepSeek-V4 on its release day. As one of ByteDance’s two existing GPU suppliers, Cambricon’s NeuWare software stack continues to expand its influence.

Moore Threads has successively completed same-day adaptations for MiniMax M3 and Zhipu’s GLM-5.2 since June, with the MTT S5000’s response speed now on par with any competitor.

Enflame Technology is focusing on clustering, jointly releasing the “Liaoyuan” smart computing cluster 3.0 commercial version with Tencent Cloud. It has adapted to mainstream large models like DeepSeek, Tencent Hunyuan, and Zhipu AI, completing deployments of thousands to tens of thousands of cards.

Additionally, it’s worth noting that Enflame Technology just passed its listing review on June 15. If successful, the “Four Little Dragons of Domestic GPUs”—Moore Threads, MetaX, Biren Technology, and Enflame Technology—will gather in the capital market for the first time.

03.

What Will Determine the Final Outcome?

Focusing on a single news report might make it seem like just a few domestic chip manufacturers are vying for orders and headlines. However, when the clues are connected, the logic changes entirely.

The golden window for domestic GPUs has opened, but it won’t remain open indefinitely. NVIDIA’s next-generation Rubin architecture is already on the way. Once the U.S. relaxes export restrictions to China, the “time difference” advantage of domestic chips may quickly disappear.

Major companies’ actions speak volumes. ByteDance’s 2026 AI infrastructure investment exceeds RMB 200 billion, Alibaba’s single-quarter capital expenditures surpass RMB 38 billion, and Tencent is set to massively adopt domestic computing power in the second half of 2026. These moves are transforming domestic chips from “backups” to “mainstays.” However, such systemic replacement hinges on ecosystem maturity. Only those who keep up with “Day-0 adaptation” can secure entry tickets to major companies’ procurement lists.

Today, Biren Technology has secured same-day adaptation for over 20 leading models, while Cambricon remains firmly on ByteDance’s supplier list. The ecosystem gap is widening, and the window for latecomers to catch up is narrowing.

Even more critical is computing power cost. ByteDance’s net profit has slumped by over 70%, yet it persists with RMB 200 billion in computing power investments, indicating the industry has reached a tipping point where cost reduction is imperative. Domestic chips’ advantage in inference lies not only in security and autonomy but also in significant cost savings for major companies.

For instance, Illuvintech’s ZhiKai series is priced at just 60-70% of NVIDIA’s equivalent products, with further price reductions possible as production capacity ramps up and yield rates improve.
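To make the pricing claim concrete, the following sketch estimates what a 60-70% price ratio would mean for an order the size of the reported 50,000-chip deal. The unit count and price ratios come from the article; the reference unit price of RMB 100,000 is a purely hypothetical placeholder, not a reported figure, and the function name is illustrative.

```python
def procurement_savings(units: int, reference_unit_price: float, price_ratio: float) -> float:
    """Savings versus buying the same number of units at the reference price."""
    if not 0 < price_ratio <= 1:
        raise ValueError("price_ratio must be in (0, 1]")
    return units * reference_unit_price * (1 - price_ratio)


UNITS = 50_000  # order size reported in the article

# HYPOTHETICAL placeholder price for an equivalent imported card (not from the article).
HYPOTHETICAL_REFERENCE_PRICE_RMB = 100_000

# Price ratios reported in the article: 60% and 70% of the equivalent product.
PRICE_RATIOS = (0.60, 0.70)

savings_by_ratio = {
    ratio: procurement_savings(UNITS, HYPOTHETICAL_REFERENCE_PRICE_RMB, ratio)
    for ratio in PRICE_RATIOS
}

if __name__ == "__main__":
    for ratio, saved in savings_by_ratio.items():
        print(f"At {ratio:.0%} of reference price: about RMB {saved / 1e9:.1f} billion saved")
```

Whatever the real reference price turns out to be, the saving scales linearly with it, so the fraction saved stays at 30-40% of the equivalent spend.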

However, beyond cost advantages, production capacity is the true bottleneck. Domestic GPUs are generally constrained by limited advanced-node capacity, and SMIC’s N+2 process is already overbooked with orders from various chip manufacturers.

Policy dividends have opened up demand-side space, but supply-side ceilings will determine who truly captures the market. In May 2026, nine domestic chips received the highest national security and reliability rating, triggering a surge in information technology innovation demand. However, delivery capabilities will determine who can cash in on these dividends.

The window for domestic GPUs won’t remain open forever. Ecosystem, cost, and production capacity: these three hurdles stand before all players. Time is limited. By year-end, when major companies light up their 10,000-card clusters, shipment volumes will reveal who is filling gaps, who is blazing the trail, and who is merely along for the ride.
