Tencent Unveils New Model, Yao Shunyu Steers Towards Success

04/28/2026

The release of Hy3 preview marks the commencement of a transformative journey.

In the fiercely competitive arena of large-scale AI models, Tencent has finally unveiled its latest innovation.

On April 23, Tencent officially launched and open-sourced the Hunyuan Hy3 preview language model. This mixture-of-experts (MoE) model integrates fast and slow thinking, with 295 billion (295B) total parameters, 21 billion (21B) of them active, and support for a maximum context length of 256K. Dubbed "the inaugural model post-Hunyuan's reconstruction" and "the most intelligent Hunyuan model to date," it sets a new benchmark.

Recalling the past four months, Tencent's AI odyssey has been fraught with challenges. At this year's annual event, Pony Ma candidly admitted that Tencent had been "lagging," trailing by 9 months to a year. Martin Lau reflected on Hunyuan's performance, likening it to a high school student cramming for exams—impressive on paper but lacking in real-world application. Meanwhile, ByteDance's Doubao soared to 345 million MAU, Alibaba's QianWen reached 166 million, while Tencent's Yuanbao lagged at around 57 million, with the gap widening.

Thus, when 28-year-old Yao Shunyu, a former OpenAI researcher, Tsinghua Yao Class alumnus, and China's most renowned AI prodigy, was appointed as Tencent's Chief AI Scientist in the "CEO/President Office" last year, the message was clear: Tencent is doubling down on AI.

Four months on, the Hy3 preview is here. The moment to showcase results has arrived.

01

A 'From-the-Ground-Up' Transformation

The Hy3 preview is not a mere incremental update. In Tencent's own words, it represents a foundational engineering overhaul. In February, Hunyuan revamped its infrastructure for pre-training and reinforcement learning, abandoning the original framework. Yao Shunyu accomplished this entire infrastructure transformation within a month of his arrival.

The overhaul was guided by three core principles: holistic capability development, authentic evaluation, and cost-efficiency. In simpler terms: no "specialized students," no benchmark-chasing, and no models that drain resources.

The principle of avoiding "specialized students" is particularly noteworthy. From the outset, Hy3 preview was designed with intelligent-agent scenarios in mind. Yao Shunyu recognized that even a single application, like a code agent, requires deep coordination of multiple capabilities: reasoning, long-text understanding, instruction following, dialogue, coding, and tool usage. A model that excels at coding but falters at reading documentation or at holding a conversation is inadequate.

Simultaneously, Yao pointed out that the previous Hunyuan model overemphasized benchmark rankings, contaminating training data with benchmark corpora and compromising real-world performance. He directed the team to "abandon benchmark-chasing," voluntarily stepping away from publicly available leaderboards susceptible to manipulation. Instead, they evaluate the model's "real-world effectiveness" through self-built test sets, latest exams, human evaluations, and product beta testing.

From a development timeline perspective, Hy3 preview commenced training in late January 2026 and went live in under three months. Tencent internally views this as the beginning of Hunyuan's shift from "theoretical knowledge" to "practical application"—striving to solve complex real-world problems.

Rebuilding infrastructure, setting directions, training models, and open-sourcing within three months is an ambitious feat for a corporate behemoth.

The core technical philosophy of Hy3 preview is "integrating fast and slow thinking."

This concept aligns with the dual-system theory in cognitive science: System 1 (fast thinking) represents quick, automatic, intuitive responses; System 2 (slow thinking) involves slow, deep reasoning requiring substantial computational resources. Traditional large models typically opt for one path—either fast but limited in capability or powerful but slow in response.

Hy3 preview's approach is to enable the model to automatically select its thinking mode based on task complexity: fast thinking for simple tasks, slow thinking for complex ones, striking an optimal balance between speed and capability.

Engineering-wise, this mechanism leverages the MoE architecture. Out of the total 295B parameters, only 21B are activated per inference, roughly 7.1% of the total. This means the actual computational load is significantly lower than a dense model with 295 billion parameters.

Slow thinking tasks activate more experts and consume more computational resources, while fast thinking tasks activate fewer experts to conserve computing power. The transition between fast and slow thinking is not merely stacking two models but adaptively allocating computational resources within a single model based on the task.
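The adaptive expert allocation described above can be sketched in a few lines. This is a toy illustration of top-k gating only, not Hunyuan's actual router; the function names, the 8-expert pool, and the idea of modeling "fast" versus "slow" thinking as different values of k are all assumptions made for demonstration:

```python
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_tokens(token_logits, k):
    """Pick the top-k experts per token and renormalize their gate weights.

    token_logits: per-token lists of router scores, one score per expert.
    Returns, for each token, a list of (expert_index, weight) pairs.
    """
    routed = []
    for logits in token_logits:
        probs = softmax(logits)
        topk = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        total = sum(probs[i] for i in topk)
        routed.append([(i, probs[i] / total) for i in topk])
    return routed

# A "slow thinking" pass could simply route to more experts (larger k),
# spending more compute per token; "fast thinking" routes to fewer.
random.seed(0)
logits = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]  # 4 tokens, 8 experts
fast = route_tokens(logits, k=1)   # cheap: one expert per token
slow = route_tokens(logits, k=4)   # expensive: four experts per token
print(len(fast[0]), len(slow[0]))  # prints: 1 4
```

The point of the sketch is that both modes live in one parameter set; only the number of activated experts, and hence the compute per token, changes.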

This design is not novel, but accomplishing architectural selection, training, and deployment in under three months showcases formidable engineering prowess.

For a company like Tencent, with products like WeChat, QQ, and Tencent Docs serving billions of users, controlling inference costs is crucial for model integration into products. Hy3 preview's architectural choice reflects practical business considerations.

02

What Gives It the Confidence to Eschew Benchmark-Chasing?

Since Tencent chose not to chase benchmarks, it had to develop its own evaluation system.

Tencent Hunyuan proposed two evaluation frameworks: CL-bench and CL-bench-Life, focusing on the model's ability to comprehend information in long, complex contexts, adhere to intricate rules, and complete tasks. These frameworks address problems common in real-world production and life scenarios that traditional benchmarks often overlook.

In terms of specific performance, Hy3 preview achieved competitive results in several key benchmark tests. On the programming benchmark SWE-Bench Verified, it scored 74.4%, a relative improvement of over 40% on the previous Hy2's 53.0%, nearing GLM-4.7's level.

For complex reasoning tasks, Hy3 preview excelled in high-difficulty STEM reasoning benchmarks like FrontierScience-Olympiad and IMOAnswerBench. It also performed outstandingly in challenging reasoning tasks such as the China National High School Biology League (CHSBO 2025), demonstrating strong generalization in complex logical reasoning.

While not deliberately pursuing "state-of-the-art" (SOTA) in any single dimension, Hy3 preview exhibits balanced competitiveness across all areas. This aligns with Yao Shunyu's message at the AGI-Next Summit: the industry needs to break free from "benchmark obsession" and focus on real user value.

However, it must be acknowledged that Hy3 preview's performance in some real-world tests is not flawless.

First-hand testing by an institution revealed that in a full-pipeline task encompassing data scraping, numerical computation, visualization generation, and text analysis, Hy3 preview encountered obstacles during data acquisition, switching multiple data sources after interface authentication failures. Some data had to be replaced with simulated data due to rate limits.

Most critically, despite explicit instructions to output a 500-word cross-market asset allocation memo, the model only provided a few bullet-point-style allocation ratios without any analytical paragraphs.

This indicates significant room for improvement in Hy3 preview's delivery completeness in complex real-world scenarios. Of course, as a preview version, such flaws are largely expected.

Besides performance, pricing is now a key consideration. Hy3 preview's pricing on Tencent Cloud's TokenHub platform is as follows: input starts at 1.2 RMB/million tokens, cached input at 0.4 RMB/million tokens, and output at a minimum of 4 RMB/million tokens. Additionally, Tencent Cloud and Hunyuan jointly offer customized Token Plan packages, with personal plans starting at 28 RMB/month.
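Taken at the list prices above, a back-of-the-envelope cost calculation is straightforward. The session token counts below are hypothetical, chosen only to illustrate the arithmetic, not measured from any real workload:

```python
# Prices per million tokens in RMB, as quoted in the article for Hy3 preview
PRICES = {"input": 1.2, "cached_input": 0.4, "output": 4.0}

def run_cost(tokens, prices):
    """Cost of one workload: token counts keyed the same way as the price table."""
    return sum(prices[k] * tokens[k] / 1_000_000 for k in tokens)

# Hypothetical long-context agent session, mostly served from cache
session = {"input": 200_000, "cached_input": 1_800_000, "output": 150_000}
print(f"{run_cost(session, PRICES):.2f} RMB")  # prints: 1.56 RMB
```

Cache hits dominate such agent workloads, which is why the cached-input rate matters more than the headline input price.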

In the current market landscape, Hy3 preview's pricing is not particularly aggressive.

For comparison, DeepSeek-V4-Flash costs 0.2 RMB/million tokens for input, and V4-Pro's cached input price drops to just 0.025 RMB/million tokens after limited-time discounts. On the OpenRouter platform, DeepSeek-V4-Flash's average output price per million tokens is only 1.55‰ of GPT-5.5 Pro's.

However, in the "Agent Era" of the "hundred-model war," Tencent's pricing logic is clear: not competing on absolute low prices but pursuing a triangular balance of "capability-cost-scenario."

The 21B activated parameters themselves represent a cost advantage. Combined with the MoE architecture's efficient inference, they provide a relatively controllable cost base for high-frequency, long-chain calls in Agent scenarios.

In other words, it reaches the threshold for Agent deployment.

03

Tencent AI's Ace: Its Proprietary Ecosystem

A model's true value lies in its application.

Hy3 preview is now available on Tencent Cloud, Yuanbao, ima, CodeBuddy, WorkBuddy, QQ, QQ Browser, Tencent Docs, Tencent Lexiang, and more. Mainline products like WeChat Official Accounts, Peacekeeper Elite, Tencent News, Tencent Self-Select Stocks, Tencent Customer Service, and WeChat Reading are also gradually integrating it.

Notably, it supports integration with popular open-source agent products like OpenClaw, OpenCode, and KiloCode. This means Tencent is not just equipping its own product matrix with its models but also attempting to penetrate the broader open-source agent ecosystem.

However, product-side challenges persist. After Yuanbao integrated DeepSeek-R1, its DAU surged over 20x, but the search experience was split across two systems (Hunyuan and DeepSeek), hurting retention and conversion. Whether Hy3 preview can resolve this fragmentation once fully integrated will be its first real-world test.

Currently, Tencent's largest AI application, Yuanbao, has fully integrated Hy3 preview. From WeChat to QQ, from Tencent Docs to Peacekeeper Elite, Tencent's product matrix is rallying around a unified model foundation. This "proprietary ecosystem + proprietary model" approach forms an interesting contrast to ByteDance's Doubao, which relies on Volcano Engine.

On the day of Hy3 preview's release, OpenAI unveiled GPT-5.5 that same evening. Less than 24 hours later, DeepSeek V4 preview went live.

This is a microcosm of the current landscape. In this year's large-model competition, opponents are playing their cards much faster than outsiders imagined.

Meta recently scored a comeback with Muse Spark, sending its stock soaring. Google's Gemini 3.1 series maintains strong momentum, with its AI chatbot market share climbing from under 6% to over 20%. Domestically, Alibaba's Qwen3.6-Max-Preview and Moonshot AI's Kimi K2.6 emerged earlier. Even earlier, Doubao Large Model 2.0 achieved its first major version upgrade, while Baidu released Wenxin Large Model 5.0, a native full-modal model with 2.4 trillion parameters.

As for DeepSeek, V4-Pro achieved best-in-class performance among open-source models in Agent capabilities, world knowledge, and reasoning, while slashing prices twice within two days—some prices dropped to 1/40th of the original, with V4-Flash's cached input price hitting just 0.02 RMB/million tokens.

The industry consensus is clear: competing on price with DeepSeek is a losing proposition for any vendor.

Against this backdrop, Tencent advances at its own pace with a "pragmatism + ecosystem deployment" strategy. As Dowson Tong previously judged, the capability gaps between mainstream large models are narrowing. Enterprises' core need is no longer possessing the best model but maximizing model capabilities through systems engineering. The true differentiator is "engineering delivery capability."

04

Yao Shunyu: From "Defining the Second Half" to "Delivering the Model"

The most remarkable aspect of this entire endeavor is Yao Shunyu.

In April 2025, while still at OpenAI, Yao published a blog titled "The Second Half," arguing that AI had transitioned from its first half to the second, where the focus should shift from training stronger models to defining worthwhile problems and evaluating models in more real-world ways.

This blog earned him the label "the person who defined AI's second half."

After joining Tencent, he needed to transition from making judgments to implementing them. Four months, a new infrastructure, a new model, and an open-source release. For outsiders, Hy3 preview is the beginning of an answer.

Yao himself remains clear-headed: "Hy3 preview is the first step in reconstructing the Hunyuan large model. We hope this open-source release will gather authentic feedback from the open-source community and users to help improve the practicality of the official Hy3 version."

These words contain no bravado but rather resemble a phased project report.

Public information shows that besides Yao, Tencent has recruited no fewer than 10 AI luminaries from top teams like Microsoft, Alibaba, and DeepSeek over the past year, including Hu Han, former lead researcher of Microsoft Research Asia's Visual Computing Group, and Xu Can, creator of Microsoft's WizardLM project. Tencent's investment in AI talent—from salaries and titles to responsibilities—offers candidates nearly the highest industry standards.

Hy3 preview isn't the work of a lone prodigy but the first product from a restructured team building on a rebuilt foundation.

For Tencent, Hy3 preview essentially answers one question: Can Tencent's large models still compete? Judging by parameters, architecture, evaluation data, and product integration, this answer at least clears the passing grade.

But a preview version is just the starting point. In this fiercely competitive landscape with accelerating rhythms, Tencent needs a model system that can continuously iterate, truly take root in its proprietary ecosystem, and ultimately deliver differentiated value.

That's the real question worth watching: When will the official Hy3 version arrive? Can Tencent's product matrix form a self-contained "model-application-commercial" loop around it? Can Yuanbao achieve retention and growth on Hunyuan's own foundation? And when the Agent Era truly arrives, can Tencent's ecological depth translate into actual competitive advantages?

Four months ago, Yao Shunyu sat down at a brand-new card table. The Hy3 preview is the first card he has laid down; how he plays the rest will be the real test.

This article is an exclusive piece crafted by Xinmou.

— THE END —
