04/24/2026
The long-awaited preview version of DeepSeek V4 was finally released on the morning of April 24th, Beijing time. Media and analysts had repeatedly predicted an imminent V4 release, only to be proven wrong each time; this time the prediction finally came true. Because the release is so recent, third-party and customer evaluations are still in full swing, and solid information is scarce. Even so, the technical documentation and the preliminary discussion in overseas AI communities already offer some insights.
First, V4's strategic direction is to strengthen Agent capabilities, in line with the global trend since OpenClaw's release. The "lobster-raising craze" has brought significant token growth to domestic large-model vendors such as MiniMax, Kimi, and Zhipu, but the most valuable, professional-grade growth has been captured by Claude. Agent capability is inseparable from coding ability, and Claude Code remains, by a wide margin, the world's most powerful AI programming tool, a position GPT-Codex has been unable to shake. DeepSeek's official announcement prominently declares a "significant improvement in Agent capabilities" at the outset, but it also concedes that "(according to evaluation feedback) there is still a certain gap compared to the Opus 4.6 reasoning mode."
In overseas AI communities, some users are excited, hoping DeepSeek will become a "Claude killer." To be fair, this is more wishful thinking than prediction, fueled by the widespread resentment toward Claude and its developer Anthropic ("the world has suffered under Anthropic for too long"). Current test data shows that V4 offers strong token cost-effectiveness, but its Agent benchmark scores do not surpass Claude Opus 4.6 or GPT-5.4. Note that benchmark scores are only indicative; actual user experience is what matters. Claude often trails GPT and Gemini on many benchmarks yet remains nearly unparalleled in the Agent space. I am therefore very interested in the real-world feedback from professional users running Agent workloads on DeepSeek over the coming days.
Expanding the context window to 1M is a significant upgrade, and when combined with lower token pricing, it could unlock substantial productivity gains. However, we still need to await real-world feedback from professional clients tackling complex tasks—which may take another two to three days at least.
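To make the pricing point concrete, here is a back-of-envelope sketch. The per-million-token prices below are hypothetical placeholders, not the published rates of DeepSeek or any competitor; the point is only how the cost of a full 1M-token call scales with price.

```python
# Back-of-envelope cost of a single full-window (1M-token) request.
# All per-million-token prices are HYPOTHETICAL placeholders, not the
# published rates of DeepSeek or any competitor.

def call_cost(input_tokens: int, output_tokens: int,
              in_price: float, out_price: float) -> float:
    """USD cost of one request, given prices per million tokens."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

FULL_CONTEXT = 1_000_000   # the new 1M-token window
RESPONSE = 8_000           # a typical long answer

cheap   = call_cost(FULL_CONTEXT, RESPONSE, in_price=0.30,  out_price=1.20)
premium = call_cost(FULL_CONTEXT, RESPONSE, in_price=15.00, out_price=75.00)

print(f"cheap model:   ${cheap:.2f} per full-window call")    # ~$0.31
print(f"premium model: ${premium:.2f} per full-window call")  # ~$15.60
print(f"price gap:     {premium / cheap:.0f}x")               # ~50x
```

Under these made-up numbers, a full-window call costs about thirty cents on the cheap model versus fifteen dollars on the premium one; a gap of that order, multiplied across thousands of agent calls per day, is exactly where the "cost-effectiveness plus long context" argument bites.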
My speculation: The delay from the rumored "January/February release" to today likely stems from refining Agent capabilities, as the "lobster craze" generated far more demand than anyone anticipated. Rather than releasing a model that falls short of real-world Agentic Workflow needs, it was wiser to wait a few more months and deliver a model that fully meets those demands. Of course, this is just my personal conjecture without empirical evidence.
Second, initial feedback from overseas clients suggests that on "non-deep-reasoning, non-mathematical, non-code" tasks such as brainstorming and creative writing, V4 feels "too dry" and "overly formal," lagging behind the latest versions of Claude and GPT; some even argue it falls behind GPT-5.2. It should be emphasized that these fragmentary subjective impressions are not definitive, and creative writing is not V4's primary focus. Still, they could subtly shape consumer preferences, with knock-on effects on the current battle among internet giants for the consumer-facing (C-end) AI application market.
If V4's responses are indeed "too dry" and "too formal," it might reflect an effort to address the higher hallucination rates seen in V3/R1, as creative freedom often correlates with increased hallucinations, while restricting hallucinations tends to make responses "drier." Of course, this is just speculation, and broader testing results are needed.
Third, and what many care about most: DeepSeek V4's technical documentation discloses many training details but says nothing about training hardware (GPUs). The document mentions "Huawei" once and "NVIDIA" three times (excluding footnotes), with "GPU" appearing fourteen times (excluding footnotes), but no specific GPU models are identified, except for one instance: "We validated the fine-grained EP scheme on both NVIDIA GPUs and Huawei NPUs platforms." However, this refers to testing environments, not training (note: this is also the document's only mention of Huawei's Ascend NPU).
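For readers who want to verify such counts themselves, here is a minimal sketch. The filename "deepseek_v4_report.txt" is hypothetical: it stands for a plain-text dump of the technical report (extracted, say, with pdftotext), with footnotes removed beforehand.

```python
import re
from collections import Counter

# Minimal sketch for reproducing the keyword counts above.
# "deepseek_v4_report.txt" is a hypothetical filename: a plain-text dump
# of the technical report (e.g. extracted with pdftotext), with footnotes
# removed beforehand.

TERMS = ["Huawei", "NVIDIA", "GPU", "NPU", "CUDA", "Ascend"]

with open("deepseek_v4_report.txt", encoding="utf-8") as f:
    text = f.read()

counts = Counter()
for term in TERMS:
    # Case-insensitive, word-start match so "GPUs" counts toward "GPU".
    counts[term] = len(re.findall(rf"\b{re.escape(term)}", text, re.IGNORECASE))

for term, n in counts.most_common():
    print(f"{term:8s}{n}")
```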
Thus, we still do not know what hardware DeepSeek was trained on. The document mentions CUDA several times, but this does not confirm a purely NVIDIA-based architecture. Did DeepSeek train on a hybrid NVIDIA and Huawei Ascend architecture, as some analysts speculate, or "optimize specifically for Ascend during post-training"? Unfortunately, while anything is possible, DeepSeek has neither confirmed nor denied these claims. In contrast, the V3 technical documentation explicitly stated it was trained on NVIDIA H800 and A100.
Image created by Google Nano Banana Pro
Some have taken one line in DeepSeek's announcement, that "Pro prices will drop significantly after the Ascend 950 super-nodes become available in the second half of the year," as evidence that "V4 has been deeply optimized for Ascend," or even that "the delay from January to now was to adapt to Ascend." While not impossible, this reasoning is overly speculative. The line merely suggests that DeepSeek will procure or lease Ascend computing power and has optimized inference for Ascend (which is routine). There is no empirical evidence for how deep that optimization goes, or for the claim that the release was repeatedly delayed specifically for Ascend adaptation.
However, indirectly verifying or refuting this is straightforward. V4 is open source, and from today onward countless vendors will run inference on their own hardware. If V4 was indeed trained on Ascend or deeply optimized for it, its inference efficiency on Ascend hardware should be higher than, or at least comparable to, that on other hardware such as NVIDIA's; alternatively, Ascend might enable performance unattainable on other platforms. Simply watching the news should settle the question (and if no such news emerges, the claim is effectively disproven). A rough way to measure this directly is sketched below.
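A minimal sketch of such a probe, assuming both serving stacks expose the common OpenAI-compatible chat endpoint. The hostnames, port, and model id are placeholders, and a single unbatched request gives only a crude tokens-per-second signal, not a rigorous benchmark.

```python
import time
import requests

# Crude throughput probe: send the same request to two serving endpoints,
# one backed by NVIDIA GPUs and one by Ascend NPUs, and compare decode
# speed. Hostnames, port, and model id are placeholders; this assumes both
# stacks expose the common OpenAI-compatible /v1/chat/completions API.

ENDPOINTS = {
    "nvidia": "http://nvidia-host:8000/v1/chat/completions",
    "ascend": "http://ascend-host:8000/v1/chat/completions",
}
PAYLOAD = {
    "model": "deepseek-v4",  # placeholder model id
    "messages": [{"role": "user", "content": "Summarize the history of RISC-V."}],
    "max_tokens": 512,
    "temperature": 0.0,      # keep the two runs comparable
}

for name, url in ENDPOINTS.items():
    start = time.time()
    resp = requests.post(url, json=PAYLOAD, timeout=300).json()
    elapsed = time.time() - start
    out_tokens = resp["usage"]["completion_tokens"]
    print(f"{name}: {out_tokens} tokens in {elapsed:.1f}s "
          f"= {out_tokens / elapsed:.1f} tok/s")
```

In practice one would average many requests at realistic batch sizes, but even this crude comparison, repeated across the vendors now standing up V4 endpoints, would quickly reveal whether Ascend enjoys any special advantage.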
Finally, the impact on the industry. Besides DeepSeek itself, the company most eagerly hoping V4 would stun the world is probably Tencent. Yesterday Tencent released the preview version of Hunyuan 3.0, only to have its thunder completely stolen by today's DeepSeek V4. I haven't had a chance to test Hunyuan 3.0 thoroughly, but given how far behind Tencent is in foundation models, it seems unlikely a single version could reach world-class standards.
Thus, to succeed in AI, whether in B2B (especially MaaS token sales) or B2C markets, Tencent will likely have to rely on high-quality third-party open-source models. The most advanced and widely used open-source model in China is Qwen, developed by Tencent's rival Alibaba, which Tencent is understandably reluctant to adopt. That leaves DeepSeek and Kimi (which released a new version just days ago) as its main options. If DeepSeek ever opens itself to outside investment, Tencent will undoubtedly invest heavily. The more successful V4 is, the more time Tencent buys to position itself as the hub of China's "domestic open-source ecosystem" while racing to catch up in self-developed foundation models.
I imagine Tencent's investors and management are now eagerly awaiting positive user testing feedback for V4. Fortunately, initial feedback appears promising, with overseas communities leaning toward a positive evaluation. However, more information is needed, and "somewhat positive" is insufficient—ideally, it should be "overwhelmingly positive." We may need to wait another week for confirmation.