06/08 2026

This is the 67th original article from the Thinking AI Club.
Approximately 1,710 words, with an estimated reading time of 6 minutes.
On June 1, "YD/T 6770-2026 Artificial Intelligence - Key Foundational Technologies - Benchmark Testing Methods for Embodied AI," an industry standard approved and issued by the Ministry of Industry and Information Technology (MIIT), officially took effect. (We discussed it previously in the article "MIIT Approves First Industry Standard for Embodied AI, Set for Official Implementation in June 2026.")
On the same day, Unitree Robotics successfully passed its IPO review for the Science and Technology Innovation Board (STAR Market).
The timing of these two events is no mere coincidence. The former establishes a "capability benchmark" for the industry, while the latter sets a "value benchmark" for the capital market.
The embodied AI industry is entering a substantive phase, moving from "storytelling" to "value realization."
The embodied AI sector has been on fire for the past two years.
By 2025, China was home to over 140 complete-machine enterprises, with more than 330 humanoid robot products launched and annual shipments reaching approximately 17,000 units.
However, behind the excitement lies an awkward reality: While companies produce dazzling product demonstration videos, real-world performance in factories often falls short. Some robots fail to recognize objects under different lighting conditions, others run out of power after moving a few boxes, and some touted as "autonomous decision-makers" actually require human remote control at every step.
What's the problem? The lack of unified evaluation standards.
Company A measures walking speed on a flat laboratory floor, Company B tests grasping success rates in specific scenarios, and Company C simply releases edited demo videos.
Incompatible data formats, inconsistent interfaces, and vastly different testing environments lead to the widespread phenomenon of "excellent laboratory performance but poor real-world adaptability." In industry terms, this is called "Demo as Capability": a polished demo is taken as proof of capability, even though performing well in a demo does not mean being functional in practice.
A deeper issue is resource misallocation.
Enterprises don't know their true technological standing, investors can't distinguish genuine capabilities from flashy gimmicks, and purchasers lack credible criteria when selecting products.
As a result, everyone competes to look "the coolest," while truly high-value industrial scenarios remain underexplored.
The core value of YD/T 6770-2026 lies in establishing a quantifiable, comparable, and reproducible evaluation framework for embodied AI.

The standard covers two major scenarios, simulated and real-world environments, and targets two types of test objects: models and complete-machine systems. It defines five core indicators.
The combination of these five indicators is noteworthy—it focuses not just on "whether it can be done" but on "how fast, stable, cost-effective, and resilient it is."
In other words, the standard measures not a robot's "talent" but its "work capability."
Testing methods are divided into four levels: static simulation, dynamic simulation, real-world environment, and combined testing, gradually increasing in complexity from virtual to real to avoid the trap of "invincible in simulation, useless in reality."
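As a rough illustration of the virtual-to-real progression described above, the sketch below orders the four levels and runs one check per level, stopping at the first failure. The stop-on-failure gating rule, the `Level` enum, and the `run_staged` helper are assumptions made for illustration; the article does not say how the standard sequences or scores the levels.

```python
from enum import IntEnum
from typing import Callable, Dict, List, Tuple


class Level(IntEnum):
    # The four levels named in the article, ordered from virtual to real.
    STATIC_SIMULATION = 1
    DYNAMIC_SIMULATION = 2
    REAL_WORLD = 3
    COMBINED = 4


def run_staged(checks: Dict[Level, Callable[[], bool]]) -> Tuple[List[Level], bool]:
    """Run one check per level in order; stop at the first failure.

    The gating rule is an assumption for this sketch, not taken from the standard.
    """
    passed: List[Level] = []
    for level in Level:  # Enum iteration follows definition order.
        if not checks[level]():
            return passed, False
        passed.append(level)
    return passed, True
```

Under this reading, a system that clears both simulation levels but fails in the real-world environment would be reported as having passed only the first two levels, which is exactly the "invincible in simulation, useless in reality" case the staged design is meant to expose.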
Notably, this standard has also been submitted for international standardization at the International Telecommunication Union's Telecommunication Standardization Sector (ITU-T), a sign that China is not only leading in technology but also gaining influence in rule-setting.
After the standard's implementation, the industry landscape will undergo substantive changes.
For R&D enterprises, the evaluation system serves as a "navigation tool" for technological iteration.
Previously, product optimization relied on intuition or executive decisions. Now, companies can precisely identify shortcomings by comparing against the five indicators—whether the decision-making algorithm is lagging, the actuator precision is insufficient, generalization capabilities are weak, or energy consumption is excessive. R&D resource allocation now has an objective basis.
A unified testing standard reduces R&D and deployment costs and pushes the industry to shift its focus from demonstration effects to actual product performance.
For downstream purchasers, selection finally has a "hard currency."
Industrial clients don't buy robots as toys; they demand ROI (Return on Investment). Previously, purchasers had to rely on experience to evaluate vendors' claims. Now, they can directly request test reports under the standard framework for clear horizontal comparison. This will largely curb the "bad money drives out good" phenomenon—gimmicky products can no longer deceive.
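A horizontal comparison of this kind could look roughly like the sketch below, which picks the best-performing vendor for each indicator. The indicator names, their directions, and the report layout are hypothetical stand-ins; the article does not list the standard's actual five indicators or its report format.

```python
from typing import Dict

# Hypothetical indicator names and directions (True means "higher is better").
# These are illustrative only and are not taken from YD/T 6770-2026.
HIGHER_IS_BETTER: Dict[str, bool] = {
    "task_success_rate": True,
    "completion_time_s": False,
    "energy_per_task_wh": False,
}

Report = Dict[str, float]


def best_per_indicator(reports: Dict[str, Report]) -> Dict[str, str]:
    """Return, for each indicator, the vendor with the best reported value."""
    winners: Dict[str, str] = {}
    for name, higher in HIGHER_IS_BETTER.items():
        pick = max if higher else min
        winners[name] = pick(reports, key=lambda vendor: reports[vendor][name])
    return winners
```

The comparison is only meaningful if every vendor's numbers come from the same test conditions, which is the point of having a shared standard in the first place.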
For the capital market, valuation logic is shifting from "concept premium" to "capability pricing."
Unitree Robotics passing its IPO review is a signal: the market now evaluates companies using quantitative metrics like "performance + shipment volume." The more mature the standard system becomes, the more pronounced this "de-bubbling" trend will be.
In Q1 2026, total financing in China's embodied AI and robotics industrial chain reached 37.3 billion yuan, but funds are increasingly concentrated in top players—companies without deployment capabilities will find it harder to secure funding.
The implementation of YD/T 6770-2026 marks the end of the "wild growth" phase for the embodied AI industry.
However, this is just the first step. The standard itself requires dynamic iteration, testing capabilities need supporting infrastructure, and industry consensus must continue to be nurtured.
What truly matters is that, with a standard for evaluation and a gateway to public listing now both in place, the industrialization of embodied AI is accelerating toward a closed commercial loop.
2026 is widely regarded as the "Year of Scaled Applications" for embodied AI, and the implementation of this "ruler" may well be the landmark event signaling its arrival.

All content is sourced from publicly available information.