The Four Little Dragons of GPUs Enter the Scene, Cambricon is No Longer Alone


06/24 2026

The Four Little Dragons of GPUs are set to complete their assembly in the capital market.

On June 15, Shanghai Enflame Technology passed the listing review for the SSE STAR Market. According to its prospectus, Enflame Technology plans to raise 6 billion yuan through this IPO, with 3.3 billion yuan earmarked for AI software and hardware collaborative innovation projects, 1.2 billion yuan for sixth-generation chip R&D, and 1.5 billion yuan for fifth-generation chip R&D.

The capital puzzle for domestic GPUs is nearly complete: the Four Little Dragons, namely Moore Threads, MetaX, Biren Technology, and Enflame Technology, are all set to be listed.

This is a moment of acceleration.

From December 2025 to June 2026, in just half a year, at least six AI chip companies have listed or are about to list in the capital market. Including previously listed companies like Cambricon, Hygon Information, and Iluvatar CoreX, the total market capitalization of the domestic GPU contingent is approaching 2 trillion yuan.

The value behind these numbers is even more noteworthy.

Moore Threads posted a book profit of 29.35 million yuan in the first quarter, while MetaX narrowed its losses by 57.7% and laid out a clear timeline to break even in 2026. Different numbers point in the same direction:

Domestic GPUs are shortening the distance from technological breakthroughs to a virtuous commercial cycle at an unprecedented pace.

On April 24, 2026, DeepSeek released the trillion-parameter flagship model DeepSeek-V4.

Unlike a year ago when the release of V3 sparked debates about whether domestic chips could run large models, this time, multiple domestic AI chips, including Huawei Ascend, Cambricon, Hygon, MetaX, Moore Threads, Kunlunxin, T-Head Zhenwu, and Iluvatar CoreX, completed adaptation on the day of the model's release.

DeepSeek-V4 brings far more than just technical adaptation for domestic chips; it has shifted market expectations for domestic computing power.

Previously, the default yardstick for evaluating an AI chip was how closely its performance matched that of NVIDIA's contemporary products. This framing cast domestic chips as the chasers.

However, the experience with DeepSeek-V4 offers a new perspective. Zhang Dixuan, President of Huawei's Ascend Computing Business, revealed that the single-card computing power of Huawei's AI training and inference accelerator card Atlas 350 has reached 2.87 times that of NVIDIA's H20.

When trillion-parameter models can run stably on domestic chips, benchmarking against NVIDIA's most powerful cards is no longer the only standard.

This shift in perception is translating into real money. Market research firm Bernstein Research predicts that by 2026, NVIDIA's market share in China's AI chip market will plummet from 95% three years ago to just 8%, with Huawei capturing 50%, AMD around 12%, and Cambricon ranking third.

Amid this competitive landscape, the overall market share of domestic AI accelerator cards has surpassed 60%. This represents a historic reshaping of the market, as barriers deemed insurmountable just three years ago are rapidly being dismantled by domestic chips.

The rise of the Four Little Dragons of GPUs is equally noteworthy.

On the evening of March 30, 2026, Moore Threads secured a massive 660 million yuan order for its Kuae Intelligent Computing Cluster. The announcement revealed that this single order's contract value is equivalent to 55% of Moore Threads' total revenue in 2024.

This means Moore Threads has overcome the engineering barriers of a 10,000-card cluster, transitioning from chip manufacturing to delivering ultra-large-scale computing power clusters.

Enflame Technology, which is rushing to list on the STAR Market, benefits from its close ties with leading enterprises.

At Tencent's 2025 full-year earnings briefing, President Martin Lau disclosed that Tencent invested approximately 18 billion yuan in new AI products in 2025 and plans to at least double this investment to over 36 billion yuan in 2026.

The explosion in demand is just beginning, and Enflame's share within it continues to expand. In the first quarter of 2026, Enflame's revenue reached 287 million yuan, a staggering year-on-year increase of 1,474%.
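To put the growth rate in perspective, the prior-year base can be backed out from the two figures above. The following is a minimal sketch: the function name `implied_prior_value` is illustrative, and because both inputs are rounded figures from the article, the result is only an approximation.

```python
def implied_prior_value(current: float, yoy_growth_pct: float) -> float:
    """Back out the prior-period value from a current value and a YoY growth rate."""
    return current / (1.0 + yoy_growth_pct / 100.0)


# Figures quoted in the article (rounded).
q1_2026_revenue_m = 287.0  # million yuan
yoy_growth_pct = 1474.0    # percent

q1_2025_implied_m = implied_prior_value(q1_2026_revenue_m, yoy_growth_pct)
print(f"Implied Q1 2025 revenue: {q1_2025_implied_m:.1f} million yuan")
```

Under these assumptions the implied base is roughly 18 million yuan, so the four-digit growth rate reflects a small starting point as much as a large increase.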

Currently, the window of opportunity is still widening.

Take Biren Technology as an example: its 2025 revenue reached 1.035 billion yuan, a year-on-year increase of 207%, with clients covering national-level computing platforms, telecom operators, and AI large model companies. Its gross margin of 53.8% indicates strong pricing power in the market.

Behind this is the market window opened by DeepSeek-V4. From Huawei Ascend's surge in orders to Cambricon's return to profitability and the Day 0 adaptation of eight domestic chips, domestic chips can now handle production-grade inference workloads for top-tier large models.

If only one metric were used to measure the gap between domestic GPUs and NVIDIA, the most appropriate would not be chip computing power but time.

NVIDIA's CUDA ecosystem has accumulated 20 years of development. It boasts 4 million developers worldwide and default compatibility with most mainstream AI frameworks, and it forms the moat around NVIDIA's chip empire. For developers, the cost of migrating out of the CUDA ecosystem is not just monetary: it also includes years of accumulated code, debugging habits, and toolchain dependencies. In effect, it is developers' muscle memory.

What is even more noteworthy, however, is that domestic GPU companies are finding ways around NVIDIA's moat in far less than 20 years, using several distinct approaches.

The first approach is compatibility, exemplified by Moore Threads. Its self-developed MUSA architecture's software stack is highly compatible with the CUDA ecosystem, aiming to help developers migrate applications from NVIDIA platforms at the lowest possible cost.

In other words, Moore Threads provides a low-friction switching channel for the vast CUDA user base. On May 18 of this year, at Moore Threads' Beijing annual event, founder Zhang Jianzhong stated:

“The goal of MUSA has never been to create a CUDA alternative but to enable seamless migration for CUDA developers to domestic platforms, truly achieving plug-and-play.”

The second approach is circumvention, adopted by Huawei Ascend and Enflame Technology through domain-specific architectures (DSAs) tailored for AI training and inference, without pursuing general-purpose capabilities like graphics rendering.

The core idea of this path is to be "born for AI": chips are designed with dedicated computing units, such as matrix computing units and vector computing units, for the most frequent AI training scenarios. By concentrating resources on hardware optimization for AI computing, this approach aims for higher efficiency and lower power consumption than general-purpose GPUs in AI scenarios.

For example, Huawei Ascend 950PR's single-card performance surpassing NVIDIA's H20 is the best testament to the advantages of the DSA approach.

Enflame Technology's development is particularly typical. Rather than waiting for customers to purchase standard chips, it collaborates proactively and closely with model developers: Tencent proposes requirements, and Enflame optimizes accordingly. To date, three generations of Enflame chips have been adapted and deployed in hundreds of business scenarios within Tencent, ranging from WeChat voice-to-text and Tencent Meeting minutes to ad recommendations and content moderation.

This strategy has indeed proven effective within the Tencent ecosystem. Enflame's revenue jumped from 301 million yuan in 2023 to 990 million yuan in 2025, with a compound annual growth rate of 81.32%.
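The growth rate above follows the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1, with two years between 2023 and 2025. The sketch below recomputes it from the rounded revenue figures in the text; the function name `cagr` is illustrative. With these rounded inputs the result comes out at about 81.4%, close to the 81.32% cited, and the small gap presumably reflects rounding in the published revenue figures (an assumption, not something the article states).

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.5 means 50% per year)."""
    return (end / start) ** (1.0 / years) - 1.0


# Revenue figures quoted in the article, in million yuan (rounded).
revenue_2023_m = 301.0
revenue_2025_m = 990.0

rate = cagr(revenue_2023_m, revenue_2025_m, years=2)
print(f"CAGR 2023-2025: {rate * 100:.2f}%")
```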

Biren Technology has chosen a software-hardware integration model. It offers intelligent computing solutions that span self-developed chips, boards, servers, and even complete intelligent computing clusters. These are paired with its self-developed BIRENSUPA software platform, a full software stack that includes a compiler, operator libraries, and communication libraries and is compatible with mainstream AI frameworks. At the system level, Biren provides delivery capabilities for 10,000-card clusters.

The numbers bear out the strength of this combined model: in 2025, Biren's intelligent computing solutions generated 1.028 billion yuan in revenue, accounting for over 99% of its total revenue.

The growth path of domestic GPUs can be summarized simply: beyond single-card capabilities, they are building their own ecosystem moats. From general-purpose compatibility to dedicated efficiency, from chips to solutions, and from large models to scientific computing, players are pushing forward in every dimension.

China's AI chip market is transforming from a unipolar landscape dominated by NVIDIA, with others following, into a multipolar battlefield defined by sufficiency, affordability, and controllability as new benchmarks.

According to data from IDC and other institutions, China's total AI accelerator shipments reached approximately 4 million units in 2025, with NVIDIA shipping about 2.2 million units, its market share declining from a peak of 95% to around 55%. Meanwhile, domestic vendors shipped approximately 1.65 million units collectively.

In this reshuffling, the domestic camp has formed a clear hierarchy. Huawei Ascend leads with 812,000 units shipped, followed by multiple strong players like Alibaba T-Head, Baidu Kunlunxin, and Cambricon, dismantling NVIDIA's solo act.
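The shipment figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the unit counts quoted in the text; the helper `share_pct` and the variable names are illustrative. The `other_units` residual is inferred by subtraction and is not attributed to any vendor in the article.

```python
def share_pct(part: float, total: float) -> float:
    """Return part as a percentage of total."""
    return part / total * 100.0


# Approximate 2025 shipments quoted in the article (units).
total_units = 4_000_000
nvidia_units = 2_200_000
domestic_units = 1_650_000
huawei_units = 812_000

# Residual not attributed to NVIDIA or domestic vendors (inferred, not from the article).
other_units = total_units - nvidia_units - domestic_units

print(f"NVIDIA share of total:    {share_pct(nvidia_units, total_units):.1f}%")
print(f"Domestic share of total:  {share_pct(domestic_units, total_units):.1f}%")
print(f"Unattributed remainder:   {share_pct(other_units, total_units):.1f}%")
print(f"Huawei share of total:    {share_pct(huawei_units, total_units):.1f}%")
print(f"Huawei share of domestic: {share_pct(huawei_units, domestic_units):.1f}%")
```

On these figures NVIDIA's 2.2 million units correspond to the roughly 55% share cited, domestic vendors hold a little over 41%, and Huawei Ascend accounts for just under half of domestic shipments.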

In March of this year, a paper published by the Ling team, led by Ant Group CTO He Zhengyu, showed that with an optimized system built on lower-specification hardware, the cost of training on 1 trillion tokens could be reduced from 6.35 million yuan to 5.08 million yuan, a decrease of about 20%.

In other words, without NVIDIA's advanced chips, domestic chips can already support cutting-edge model training.
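The training-cost figures cited above can be verified directly. This is a minimal sketch using the two quoted costs; the function name `pct_reduction` is illustrative.

```python
def pct_reduction(before: float, after: float) -> float:
    """Percentage decrease going from `before` to `after`."""
    return (before - after) / before * 100.0


# Cost of training on 1 trillion tokens, in million yuan, as quoted in the article.
cost_before_m = 6.35
cost_after_m = 5.08

saving_m = cost_before_m - cost_after_m
reduction = pct_reduction(cost_before_m, cost_after_m)
print(f"Saving per trillion tokens: {saving_m:.2f} million yuan ({reduction:.1f}% lower)")
```

The quoted costs imply a saving of about 1.27 million yuan per trillion tokens, which matches the stated decrease of about 20%.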

According to CITIC Securities' projections, by 2026, China's domestic AI chip market size will exceed 300 billion yuan. The explosive demand for large model training and inference, the construction of intelligent computing centers, increased enterprise AI adoption, and the critical stage of import substitution will drive domestic GPUs to surpass 40% market share in the inference market and 25% in the training market by around 2028.

More critical changes are occurring at the structural level. In 2026, the AI industry is forming a dual pattern of "deep cloud cultivation + edge explosion." In edge computing, the deployment of scenarios like industrial internet, autonomous driving, and digital twins is entering an explosive phase. The demand for edge AI nodes, which are vast in number, fragmented in scenarios, and extremely sensitive to power consumption and cost, is set to explode.

This type of demand falls squarely outside NVIDIA's comfort zone, and it is a massive opportunity for domestic GPUs: one not seized from NVIDIA but left behind by it.

Looking deeper, official data from DeepSeek shows that domestic chips' computing utilization rates have increased from the industry's common 60% to 85%, with inference costs reduced to one-third of NVIDIA's solutions.

In other words, leading projects have validated that a closed loop of domestic chips + domestic models + domestic clouds can work.

However, this does not mean the window of opportunity will remain open indefinitely.

NVIDIA's Blackwell and Rubin series continue to iterate, and the lock-in effect of the CUDA ecosystem remains firm.

Three questions remain open. Can domestic GPUs venture into the deep waters of software ecosystems and build a complete native software stack, including a developer community? Can they use architectural innovation to compensate for process node disadvantages and break through the ceiling of advanced computing power? And can they transition from project-based delivery to platform-based operation, shifting from one-off deals to general-purpose operations?

These thresholds will determine whether domestic GPUs can move from a narrative of substitution to one of originality. The current IPO of Enflame Technology and the assembly of the Four Little Dragons in the capital market are just the beginning. In the future, achieving profitability and incubating native ecosystems will be the new chapter for domestic GPUs.

