Over US$10 Billion! DeepSeek Reportedly Set for First Funding Round

04/20/2026


The past week has seen the domestic AI and investment circles abuzz with a single piece of news: DeepSeek has launched its first external funding round since its inception, targeting a valuation of no less than US$10 billion and aiming to raise at least US$300 million. The news instantly ignited the venture capital community, with some exclaiming, "The moment has finally arrived," while others wondered: Why would the tech geek team, which once rejected capital and relied on its parent company for self-sufficiency, suddenly turn to embrace the capital market?

This is not the first time DeepSeek has been rumored to be raising funds. As early as February 2025, the market was rife with speculation that it was considering bringing in external capital, even naming Alibaba and state-owned funds as potential investors. At the time, the company flatly dismissed the rumors as "pure fabrication." Now, more than a year later, the funding rumors have resurfaced with more specific details and clearer signals. Behind them lies no longer a simple need for funds but a critical turning point, as China's large models move from technological breakthroughs into scaled-up competition.

As of now, DeepSeek has not officially responded, but multiple sources close to the deal have revealed that the possibility is "extremely high," and the investment circle is already in an uproar. It's worth noting that over the past year, Liang Wenfeng and his team have been the most sought-after yet hardest-to-reach individuals in the entire VC circle. Liu Qin from 5Y Capital had three separate attempts to connect with them turned down, while Chen Datong from Yuanhe Puhua only secured a meeting through their shared connections in the chip sector. Even Baidu Ventures, located in the same building, failed to get involved.

Now, this "capital insulator" has proactively opened its doors to funding. While it may seem sudden, it is, in fact, an inevitable choice under the triple pressures of technology, computing power, and competition. What the market truly cares about, however, has never been the funding itself but rather: What technological trump cards does this company, which has risen to prominence through its technical prowess, hold? How will this funding round reshape the global landscape of Chinese AI?

01

What Does DeepSeek's US$10 Billion Valuation Mean?

To understand the significance of this funding round, let's first place it within the current AI valuation landscape.

As of early 2026, the valuations of global AI unicorns have been soaring: OpenAI's latest funding round values it at US$852 billion, while Anthropic is valued at US$380 billion. The domestic market is equally hot, with Zhipu and MiniMax both surpassing US$50 billion in market capitalization after listing on the Hong Kong Stock Exchange. Moonshot AI's valuation has skyrocketed from US$4 billion to US$18 billion, and StepFun (Jieyue Xingchen) has entered its listing window.

Against this backdrop, DeepSeek's planned US$10 billion valuation may seem modest, but it hides deeper considerations. On the one hand, as its first external funding round, a low-key start leaves ample room for future growth. On the other hand, unlike companies whose valuations are propped up by scenarios and ecosystems, DeepSeek's valuation is firmly anchored in its technical prowess, free from excess hype, which is the core reason for capital's frenzied pursuit.

The intended use of the funds is nearly unanimously agreed upon within the industry: to fully support the research, development, and deployment of the V4 model. Large-model development is an inherently money-burning sector, with costs for computing power, data, and talent growing exponentially as parameters scale from hundreds of billions to trillions. Relying solely on internal support from its parent company, High-Flyer Quantitative Trading, is no longer sufficient to sustain long-term technological leadership, especially for the V4 model, which, as the next-generation flagship, requires massive capital investment in computing-power expansion, technological R&D, and team stability.

More critically, DeepSeek has recently experienced fluctuations in its core talent: Luo Fuli, a key contributor to the V2 model, joined Xiaomi, while Guo Daya, a core researcher, moved to ByteDance. Introducing external capital can not only provide more competitive salaries to retain top talent but also solidify a financial moat for long-term technological iteration, avoiding disruptions to R&D rhythms due to short-term funding pressures.

This funding round marks DeepSeek's transition from a "small but elite" technical team to a global technology giant and signifies that China's large-scale models have officially entered a new phase driven by both technology and capital.

02

What Makes DeepSeek's Technology Roadmap So Strong?

Many wonder how DeepSeek, founded just three years ago, has managed to stand out among global AI giants. The answer lies in its distinctive technological approach: using algorithmic innovation to offset computing-power gaps and leveraging foundational breakthroughs to redefine the logic of large-model R&D.

Unlike most players who "stack parameters, buy computing power, and compete on funding," DeepSeek has focused on reasoning-first, code-specialization, and open-source accessibility since its inception. All its technological iterations revolve around "improving efficiency, reducing costs, and breaking bottlenecks," which is the key to its ability to achieve significant results with a small team.

1. Architectural Revolution: Mixture-of-Experts (MoE) Redefines Large-Model Efficiency

DeepSeek's core technological breakthrough is its self-developed Mixture-of-Experts (MoE) architecture, the key to its ability to achieve high performance at low cost.

Traditional large-scale models typically use dense architectures, requiring all parameters to be activated during runtime, leading to massive computing power consumption. In contrast, DeepSeek's MoE architecture acts as an "intelligent scheduling system" for the model: while the total parameters can reach hundreds of billions, only a few expert sub-networks are dynamically activated during each inference, significantly reducing computing power consumption.
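
To make the routing idea concrete, below is a minimal PyTorch sketch of top-k expert routing. The layer sizes, gate design, and all names here are illustrative assumptions for exposition, not DeepSeek's actual implementation, which adds refinements such as shared experts and load balancing.

```python
# Minimal sketch of top-k expert routing: the core idea of an MoE layer.
# Sizes and the gate design are illustrative, not DeepSeek's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, n_experts=64, k=6):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))

    def forward(self, x):                            # x: (tokens, d_model)
        scores = self.router(x)                      # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)   # keep only k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                   # only k of n_experts ever run
            for e in idx[:, slot].unique():
                mask = idx[:, slot] == e             # tokens routed to expert e
                out[mask] += weights[mask, slot, None] * self.experts[int(e)](x[mask])
        return out

print(TopKMoE()(torch.randn(8, 512)).shape)  # torch.Size([8, 512])
```

The point of the design is visible in the loop: although 64 expert networks exist, each token only ever pays for 6 of them.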

Take the V3 model as an example: with 671 billion total parameters, it activates only about 37 billion per token during inference, cutting computational load to roughly a tenth of a comparable dense model's. Training cost just US$5.57 million, yet the model achieves GPT-4-level performance. This "large capacity, low computing power" design allows DeepSeek to maintain global top-tier performance despite limited computing power, directly addressing the pain point of scarce domestic computing resources.

Building on this, the team developed Multi-head Latent Attention (MLA), which compresses the key-value cache through low-rank factorization. When processing 128K-token long texts, memory usage is only 13% of the industry norm, addressing the industry-wide problems of slow long-text inference and insufficient memory.
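
For intuition, here is a toy sketch of the low-rank KV-compression idea behind MLA, with illustrative dimensions; the real MLA also routes positional encodings (RoPE) through a separate path, which this sketch omits.

```python
# Toy sketch of MLA's low-rank KV compression: cache one small latent per token
# and reconstruct per-head keys/values from it at attention time.
# Dimensions are illustrative; real MLA also handles RoPE separately.
import torch
import torch.nn as nn

d_model, d_latent, n_heads, d_head = 512, 64, 8, 64

W_down = nn.Linear(d_model, d_latent, bias=False)   # compress once, cache this
W_up_k = nn.Linear(d_latent, n_heads * d_head, bias=False)
W_up_v = nn.Linear(d_latent, n_heads * d_head, bias=False)

x = torch.randn(1, 128, d_model)                    # (batch, seq, d_model)
latent_cache = W_down(x)                            # (1, 128, 64): all we store

# Keys/values are rebuilt on the fly from the small latent.
k = W_up_k(latent_cache).view(1, 128, n_heads, d_head)
v = W_up_v(latent_cache).view(1, 128, n_heads, d_head)

full_kv = 2 * n_heads * d_head                      # plain KV cache: 1024 floats/token
print(f"cache per token: {d_latent} vs {full_kv} floats "
      f"({d_latent / full_kv:.1%} of a standard KV cache)")
```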

2. Training Innovation: FP8 Mixed Precision Drives Cost-Effective Efficiency

Beyond architectural innovation, DeepSeek has also achieved breakthroughs in training technology, with FP8 mixed-precision training being another major advantage.

Traditional large models typically train in FP16 or FP32 precision, which is compute-intensive and slow. DeepSeek innovatively mixes 8-bit and 32-bit floating-point precision, keeping most computation in FP8 while reserving higher precision for sensitive accumulations, boosting training speed by 50% and significantly reducing computing-power consumption without sacrificing model accuracy. Liang Wenfeng has acknowledged that domestic models lag foreign counterparts in training efficiency, requiring more computing power to achieve similar results, and FP8 technology is key to bridging this gap.
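
The mechanics can be illustrated with a small round-trip experiment. The sketch below shows per-tile scaled quantization to the FP8 E4M3 format; it needs PyTorch 2.1+ for the float8_e4m3fn dtype, the tile size and scaling policy are illustrative assumptions, and a real trainer would feed the FP8 tensors into FP8-capable matmul kernels rather than immediately dequantizing.

```python
# Round-trip sketch of per-tile scaled FP8 (E4M3) quantization.
# Requires PyTorch 2.1+ for torch.float8_e4m3fn; scaling policy is illustrative.
import torch

FP8_E4M3_MAX = 448.0  # largest finite value representable in E4M3

def quantize_tile(x: torch.Tensor):
    """Scale a tile so its max magnitude fits the FP8 range, then cast down."""
    scale = x.abs().max().clamp(min=1e-12) / FP8_E4M3_MAX
    return (x / scale).to(torch.float8_e4m3fn), scale  # 1 byte per element

def dequantize_tile(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scale

w = torch.randn(128, 128)          # one weight tile
q, s = quantize_tile(w)
err = (dequantize_tile(q, s) - w).abs().mean()
print(f"mean abs round-trip error: {err:.5f}")
```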

More impressively, DeepSeek has not hoarded its technology but has adhered to an open-source approach. From its first open-source code model, DeepSeek Coder, through its general-purpose DeepSeek LLM, to the V3.2 model, everything has been released as open source, giving domestic SMEs and developers access to top-tier technology and breaking foreign technological monopolies.

3. Computing Power Breakthrough: Abandoning NVIDIA, Embracing Ascend for a Fully Autonomous Tech Stack

If architectural and training innovations represent internal strengths, then fully adapting to domestic computing power is DeepSeek's most strategically significant move—and the core reason why Jensen Huang described it as "bad news for the U.S."

The industry's biggest recent technological earthquake was DeepSeek V4's complete abandonment of NVIDIA chips in favor of Huawei's Ascend 950PR chips, migrating from the CUDA framework to the CANN framework and becoming the world's first trillion-parameter large-scale model fully independent of U.S. technology.

This is not a simple chip swap but a full-stack autonomous reconstruction spanning hardware, frameworks, operator optimization, and distributed training. The Ascend 950PR delivers 1.56 PFLOPS of FP4-precision compute per card, 2.87 times that of NVIDIA's H20. After deep optimization by DeepSeek's team, the V4 model's inference speed improved 35-fold over its initial stage, with inference costs just 1/70th of GPT-4's.

Jensen Huang has publicly stated that DeepSeek's new model based on Huawei's platform is a "bad result" for the U.S. because once Chinese large-scale models fully adapt to domestic hardware, NVIDIA's moat will be shattered, and the global AI computing power landscape will undergo fundamental reshaping. DeepSeek V4's choice is not just a declaration of technological autonomy but also provides a replicable path for Chinese AI to escape "chip chokeholds."

4. Scenario Deep Dive: Reasoning and Code Dominance in Vertical Fields

Unlike the many large models chasing "all-scenario capability," DeepSeek has consistently focused on two core scenarios, reasoning and code, and pushed each to an extreme.

In reasoning, the DeepSeek-R1 model introduces a self-verification mechanism and GRPO algorithm optimization, enabling autonomous logical checks and error corrections. Its mathematical reasoning and logical proof capabilities rival international top models, even surpassing them in certain Chinese reasoning scenarios.
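
The group-relative trick at the heart of GRPO, as described in the DeepSeekMath paper, can be sketched in a few lines: sample a group of answers per prompt, score them, and normalize each reward against its own group instead of training a separate value network. The reward values below are made up for illustration.

```python
# Sketch of GRPO's group-relative advantage (per the DeepSeekMath paper):
# normalize each sampled answer's reward within its own group, no value network.
import torch

def grpo_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """rewards: (n_prompts, group_size), one scalar reward per sampled answer."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True).clamp(min=1e-6)
    return (rewards - mean) / std  # positive = better than the group average

# Two prompts, four sampled answers each; rewards are made-up (1.0 = correct).
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.0, 0.0, 1.0, 0.0]])
print(grpo_advantages(rewards))
```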

The code domain is DeepSeek's forte. DeepSeek Coder supports multi-language code generation and debugging and performs exceptionally well on the HumanEval-X benchmark. Its strong handling of Chinese comment-to-code conversion and API-call completion has made it one of the world's most popular open-source code models.
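
For readers who want to try it, here is a hedged quick-start using the publicly released deepseek-ai/deepseek-coder-6.7b-instruct checkpoint on Hugging Face; swap in whichever Coder variant you actually deploy, and note that the generation settings are arbitrary defaults, not recommendations.

```python
# Hedged quick-start for DeepSeek Coder via Hugging Face transformers.
# The checkpoint is the publicly released 6.7B instruct variant.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "deepseek-ai/deepseek-coder-6.7b-instruct"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

messages = [{"role": "user",
             "content": "Write a Python function that checks whether a string is a palindrome."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True,
                                 return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=256, do_sample=False,
                     eos_token_id=tok.eos_token_id)
print(tok.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```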

From architecture to training, from computing power to scenarios, DeepSeek has proven through a series of foundational technological innovations that Chinese large models are not mere "followers" but fully capable of helping define global technological rules. This is the core reason capital is willing to bet on it: investing in DeepSeek is, fundamentally, investing in the technological future of Chinese AI.

03

From Rejecting Capital to Embracing Funding: How Should We Interpret This Shift?

DeepSeek's funding pivot may seem like a company-specific choice, but it reflects broader trends in China's AI industry. Understanding this shift is key to grasping the direction of Chinese AI over the next 3–5 years.

First, even the most hardcore technology requires capital for scaling. Previously, DeepSeek relied on High-Flyer Quantitative Trading's funding to focus on technological R&D, becoming a "tech oasis." However, large-scale model competition has shifted from lab benchmarks to all-out wars involving computing power, talent, and ecosystems. OpenAI has raised US$40 billion, while domestic giants are investing hundreds of billions in computing infrastructure. The era of solo acts is over.

Liang Wenfeng's pivot is not a compromise but a sign of maturity. He understands that only by leveraging capital can DeepSeek rapidly expand computing power, stabilize its team, accelerate V4 model deployment, and gain an edge in global competition. For technical teams, adhering to technological ideals and leveraging capital are not mutually exclusive but necessary for long-term development.

Second, the core competitiveness of Chinese AI is shifting from application deployment to foundational technology. Over the past few years, domestic AI companies have risen through scenario applications and traffic monetization while relying on foreign algorithms and computing power. DeepSeek's rise proves that only by mastering foundational technology can a company gain a genuine say.

From MoE architecture to FP8 training, from Ascend adaptation to open-source ecosystems, DeepSeek has forged a path of "technological autonomy." This points the industry in a clear direction: future AI competition will hinge not on the number of applications but on technological hard power, ecosystem stability, and autonomy.

Third, the global AI landscape is being reshaped, and Chinese players are indispensable. Once dominated by OpenAI and Google, the global large-scale model market now sees Chinese firms like DeepSeek, Zhipu, and MiniMax rising, achieving breakthroughs in technology, performance, and cost—even surpassing foreign counterparts in some areas.

DeepSeek V4's adaptation to domestic computing power has broken the monopoly of overseas chips and frameworks, ushering in an era of "diversified computing power" for global AI. In the future, Chinese AI will no longer play a supporting role in the global supply chain but compete head-to-head with U.S. giants, a result of countless technical teams' relentless efforts.

Finally, funding is not the finish line but a new starting point for technological long-termism. A US$10 billion valuation and a US$300 million round are just the beginning for DeepSeek. The real challenges lie ahead: preserving the purity of technological innovation without being swayed by short-term capital returns, stabilizing core teams for sustained foundational breakthroughs, and deploying the V4 model to genuinely empower industries.

However, we have reason to believe this team, which emerged from the quantitative trading field and understands long-termism, can balance capital and technology. After all, DeepSeek's original aspiration was never to become a capital darling but to "enable machines to think like humans" and explore the boundaries of artificial general intelligence.

04

China's AI Golden Age Is Just Beginning

Looking back at DeepSeek's journey, from obscurity to global prominence, from rejecting capital to embracing funding, each step has been firm and clear. It has not relied on marketing hype or traffic monetization but on technological breakthroughs and innovative achievements to earn respect from the industry and capital.

This US$10 billion funding rumor is not just a new chapter for DeepSeek but a signal for China's AI industry: technological hard power remains a tech company's core strength, and independent innovation is the only path forward for Chinese technology.

With the V4 model set to debut, the maturation of the domestic computing power ecosystem, and capital accelerating technological deployment, we have reason to expect DeepSeek to continue writing China's AI legend. The entire Chinese AI industry, driven by both technology and capital, will shed its follower status and take the lead, securing its rightful place in the global AI wave.

For us, the focus should not be the height of the valuation or the backgrounds of the investors, but this: when Chinese technology possesses autonomous strength and Chinese teams commit to long-term innovation, no chokehold can hold us back, and no landscape is beyond rewriting.

DeepSeek's story has just begun; China's AI golden age is upon us.
