Humanoid Robots in 2025: The 'Body' is Ready, But When Will the 'Soul' Be Awakened?

01/04 2026

Rapid Physical Progress, Intellectual Lag, and Capital Reallocation

The stage is set, and a model, standing at 178 cm tall and weighing 70 kg, strides confidently towards you with the poise of a seasoned catwalk performer.

But don't be misled—this isn't a runway show at an international fashion week. Instead, it's the unveiling event of XPENG Tech Day in November. The striking figure isn't an international 'supermodel' but XPENG's latest humanoid robot, IRON.

This futuristic yet slightly 'unsettling' fashion show quickly sparked a frenzy on social media. Netizens, playing the role of detectives, scrutinized every frame with meticulous attention. Some pointed out that the robot's joint reflections didn't seem metallic enough; others suspected its movements were too fluid to be purely mechanical. Some even dreamed up absurd scenarios, imagining stunt actors cramped inside the robotic shell.

Faced with a barrage of skepticism, XPENG's staff opted for a bold demonstration at that evening's press conference: they cut open the flexible 'skin' and 'muscle' layers on IRON's calf while the robot was running. The exposed mechanical skeleton continued its catwalk under the spotlight, leaving no room for doubt.

Such 'convincingly authentic' performances are becoming increasingly common worldwide as embodied intelligence makes rapid strides.

Examples range from 'Xiaoqi,' a robot with a lifelike appearance and dexterous hands that provided consultation and public services at the Zhongguancun Forum, to Protoclone, a robot from Polish startup Clone Robotics built with realistic artificial bones and muscles and capable of 'sweating' to cool down.

It's safe to assert that by 2025, humanoid robots have made significant strides in physically resembling humans. The question that lingers, however, is: How close are they to possessing the intelligent core that defines 'humanness'?

01 Rapid Physical Progress, Intellectual Lag

The 'body' of humanoid robots has unveiled more possibilities in 2025.

On one hand, the 'blockade' on core components is being dismantled.

Key components like planetary roller screws, harmonic reducers, and high-precision bearings, long monopolized by foreign manufacturers, are transitioning from 'usable' to 'affordable and high-quality' thanks to breakthroughs in the domestic supply chain.

Take planetary roller screws as an example. The market was previously dominated by European companies such as GSA, Rollvis, and Rexroth, with prices running to tens of thousands of yuan and long lead times. By 2025, domestic players such as Shuanglin and Wuzhou Xinchun had cut redundant costs through process innovation and reverse engineering while maintaining industrial-grade performance.

Localization not only promises to reshape cost curves but also ensures supply chain security, accelerating the transition of humanoid robots from 'laboratory curiosities' to 'industrial consumer goods.'

On the other hand, dexterous hands—the 'last centimeter' of interaction with the external world—have witnessed key breakthroughs by Chinese players.

A capable dexterous hand can perform more than basic operations like shaking hands or grasping objects; it can maneuver tiny screws inside precision instruments or even assist surgeons in holding sutures thinner than hair on operating tables.

However, this critical technology remained stuck in the '1.0 stage' of mere mobility for a long time, with a vast chasm separating it from the '2.0 stage' of usability, reliability, and durability.

Fortunately, leveraging a complete industrial chain and massive market demand, domestic dexterous hand manufacturers have risen rapidly. By July 2025, China was home to over 60 dexterous hand companies, capturing half of the global market share.

Some Chinese firms have transitioned from 'followers' to 'peers' and even 'leaders' in certain areas. For instance, in mid-August, Zhiyuan Robotics unveiled its OmniHand2025 series, featuring an 'Interactive' model for services and a 'Professional' model for specialized tasks. Similarly, RoboSense, a leader in LiDAR, launched its second-generation dexterous hand, Papert2.0, earlier this year. Equipped with 15 force sensors on its fingertips, palms, and fingers, it can lift 5 kg and perform complex operations...

The robot's body is becoming stronger, more agile, and more stable. Yet this increasingly powerful 'body' still awaits a sufficiently intelligent 'soul.'

After all, a vast efficiency gap remains between 'showcasing skills' and 'performing tasks.' Humanoid robots still have a long journey ahead before they can 'replace humans.'

A Morgan Stanley report points out that even in standardized tasks like 'box moving,' UBTECH's humanoid robots, while achieving a 99% success rate, still take 1.5 minutes per box—just 30% of human efficiency.
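The figures cited above can be unpacked with a few lines of arithmetic. The sketch below is only illustrative: it assumes that 'efficiency' means throughput relative to a human worker and that a failed attempt still costs a full cycle, neither of which the article states. The function and variable names are hypothetical.

```python
def boxes_per_hour(minutes_per_box: float) -> float:
    """Convert a per-box cycle time into hourly throughput."""
    return 60.0 / minutes_per_box


def implied_human_minutes_per_box(robot_minutes_per_box: float,
                                  relative_efficiency: float) -> float:
    """Cycle time a human would need if the robot runs at the given fraction
    of human throughput (assumption: efficiency == throughput ratio)."""
    return robot_minutes_per_box * relative_efficiency


ROBOT_MINUTES_PER_BOX = 1.5  # as cited in the article
RELATIVE_EFFICIENCY = 0.30   # as cited; interpreted here as a throughput ratio
SUCCESS_RATE = 0.99          # as cited in the article

robot_rate = boxes_per_hour(ROBOT_MINUTES_PER_BOX)
human_minutes = implied_human_minutes_per_box(ROBOT_MINUTES_PER_BOX,
                                              RELATIVE_EFFICIENCY)
human_rate = boxes_per_hour(human_minutes)
# Assumption: a failed attempt still consumes one full cycle.
robot_successful_rate = robot_rate * SUCCESS_RATE

print(f"robot: {robot_rate:.1f} boxes/hour "
      f"({robot_successful_rate:.1f} successful)")
print(f"implied human: {human_rate:.1f} boxes/hour, "
      f"{human_minutes * 60:.0f} s per box")
```

Under these assumptions, the implied human benchmark is a box roughly every half minute, which gives a sense of how far the robot's cycle time would have to fall to reach parity.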

Hardware constructs the 'stage,' but the 'performance's' true value always hinges on the 'soul' named 'intelligence.' Progress in 2025 also underscores that while we can build more powerful bodies, infusing them with common sense and wisdom remains a conundrum for the entire industry.

02 Humanoid Robot 'Brains' Mired in an Evolutionary Quagmire

Five years ago, hardware ceased to be a barrier for humanoid robots. Today, software capabilities—their 'brains'—are the true bottleneck.

Wang Qian, founder of Zibian Robot, told China Business Journal in an interview: "We still lack a sufficiently intelligent 'brain' that enables robots to think, judge, and operate as flexibly as humans."

On July 28, 2023, Google DeepMind unveiled RT-2, the world's first Vision-Language-Action (VLA) model for robot control, pointing the way for 'brain evolution' in humanoid robots.

VLA operates by processing human instructions and multimodal external information (sound, images, videos) through large language models for understanding and planning, ultimately outputting actions to control the robot's body.
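The data flow described above can be pictured with a toy sketch. This is not RT-2 or any real VLA model: the classes, function names, and hard-coded rules below are hypothetical stand-ins, and a real system would replace both stages with a large learned model working on raw sensor data. The sketch only shows the shape of the pipeline: instruction plus perception in, understanding and planning in the middle, robot actions out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    instruction: str    # human instruction in natural language
    image_summary: str  # stand-in for camera input (a real model sees pixels)


@dataclass
class Action:
    joint_deltas: List[float]  # stand-in for low-level motor commands
    gripper_closed: bool


def understand_and_plan(obs: Observation) -> List[str]:
    """Stage 1 (hypothetical): fuse instruction and perception into a plan.
    A real VLA does this inside a large model; here it is a fixed rule."""
    if "pick" in obs.instruction.lower() and "cup" in obs.image_summary.lower():
        return ["reach cup", "close gripper", "lift"]
    return ["hold position"]


def decode_actions(plan: List[str]) -> List[Action]:
    """Stage 2 (hypothetical): turn each planned step into a motor command."""
    table = {
        "reach cup": Action([0.10, -0.05, 0.00], gripper_closed=False),
        "close gripper": Action([0.00, 0.00, 0.00], gripper_closed=True),
        "lift": Action([0.00, 0.00, 0.15], gripper_closed=True),
        "hold position": Action([0.00, 0.00, 0.00], gripper_closed=False),
    }
    return [table[step] for step in plan]


def vla_step(obs: Observation) -> List[Action]:
    """Instruction + perception in, robot actions out."""
    return decode_actions(understand_and_plan(obs))


actions = vla_step(Observation("Pick up the cup", "a red cup on the table"))
for action in actions:
    print(action)
```

The lookup tables make the limitation plain: anything outside the hand-written cases falls back to 'hold position', whereas a learned model is meant to cover new cases through training data.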

This clear technical logic once made VLA models seem like the 'perfect bridge' between digital intelligence (embodied by ChatGPT) and the physical world, leading to widespread adoption by humanoid robot manufacturers.

However, by 2025, this bridge has revealed 'structural cracks.' The scarcity and complexity of physical-world data have emerged as the primary bottleneck limiting VLA models' capabilities.

Large VLA models typically boast billions of parameters, demanding immense computational resources during operation. Moreover, they require vast amounts of high-quality training data to function effectively—data that is difficult and costly to obtain in reality.

(Robot dancers at Wang Leehom's concert)

Wang Xingxing, founder of Unitree Robotics, once noted that VLA models represent a relatively simplistic architecture. While VLA-based robots excel at dancing or punching, teaching them an entirely new dance means training from scratch for each new move.

Evidently, current VLA models resemble 'expert systems' needing meticulous feeding rather than 'versatile students' capable of generalization.

Deeper criticism targets the architecture itself. He Xiaopeng, founder of XPENG Motors, argued that the two 'translations'—from vision to language and from language to action—involve significant information loss. Language, as an intermediary, discards vast amounts of detail from raw visual data and the continuity of the physical world.

Thus, he proposed a relatively 'radical' idea: 'eliminate the L (language) layer' and build a 'World Model' that maps vision directly to action.
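The information-loss argument can be illustrated with a deliberately simple toy. It is not XPENG's system and not a world model. It assumes a one-dimensional task (move a gripper to an object's measured position), a hypothetical three-word vocabulary, and made-up positions, purely to show how a coarse language intermediary discards detail that a direct vision-to-action mapping keeps.

```python
def describe(position_cm: float) -> str:
    """Vision -> language: compress a continuous position into a coarse word."""
    if position_cm < -10.0:
        return "left"
    if position_cm > 10.0:
        return "right"
    return "center"


# Language -> action: each word can only map to one representative target.
WORD_TO_TARGET_CM = {"left": -20.0, "center": 0.0, "right": 20.0}


def reach_via_language(position_cm: float) -> float:
    """Two-step pipeline: the gripper target is whatever the word stands for."""
    return WORD_TO_TARGET_CM[describe(position_cm)]


def reach_direct(position_cm: float) -> float:
    """Direct vision -> action mapping: the measured position is used as-is."""
    return position_cm


# Made-up object positions, in centimeters from the center of the workspace.
object_positions_cm = [-27.0, -12.5, -4.0, 3.0, 9.5, 18.0]

errors_language = [abs(reach_via_language(p) - p) for p in object_positions_cm]
errors_direct = [abs(reach_direct(p) - p) for p in object_positions_cm]

print("error via language (cm):", errors_language)
print("error direct (cm):      ", errors_direct)
```

In this toy the direct mapping is trivially exact because the position is already known. In practice, learning that mapping from raw video is the hard part, which is why the approach remains an open research direction.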

The ideal is to enable AI to learn the underlying rules and motion mappings of the physical world directly from massive video data, aiming for deep understanding and predictive capabilities rather than mere language-based reasoning. While the vision is 'ambitious,' no clear technological convergence trend has emerged yet.

Therefore, despite widespread recognition of VLA models' limitations and active exploration of new paradigms like 'World Models,' no 'standard answer' has emerged for the ultimate path of 'brain evolution' in the past year.

03 Humanoid Robots on the Eve of Value Restructuring

Although the industry has not fully dispelled the 'fog' surrounding technological paths, the market remains bullish on the robotics sector's long-term prospects.

IDC predicts that the global robotics market will exceed $400 billion by 2029, with China accounting for nearly half and a compound annual growth rate of around 15%.

Expanding demand has fueled a wave of financing. By early December, there had been over 550 investment and financing events in the domestic robotics supply chain, with total funding exceeding 83.9 billion yuan. Although their paths to deployment remain unclear, leading firms such as Zhiyuan Robotics and Unitree Robotics have reached valuations in the tens of billions of yuan.

However, as 2026 approaches, capital markets' criteria for evaluating humanoid robot companies are undergoing a fundamental shift.

In November, a startup named Lingqi Wanwu released a demo video: a modified Unitree G1 robot autonomously completed complex tasks like drawing curtains, folding clothes, watering plants, taking out trash, and organizing clutter in a real household environment. Within four months, Lingqi Wanwu secured three rounds of financing totaling nearly 100 million yuan.

To some extent, this further confirms the capital market's attitude: 'Can it form a measurable and sustainable commercial closed loop in specific scenarios?' has become a more important question than 'Can it perform a difficult backflip?'

Capital markets are now voting with real money for 'pragmatic players' demonstrating potential to solve practical problems.

First and foremost, in automotive manufacturing, 3C electronics assembly, and logistics warehousing—fields with clear processes and highly structured environments—humanoid robots are transitioning from decorative 'tech exhibits' to 'workstation employees' handling actual production tasks. The first batch of 'battle-tested' real orders and stable revenue streams has emerged here.

Penetrating household consumer scenarios is more complex. Robots must perform multiple tasks like folding clothes and organizing dishes, testing model generalization and robustness. Moreover, the user experience during face-to-face human-robot interaction is increasingly valued.

After all, a robot's fluency, human-likeness, and even emotional feedback during direct interaction will directly determine user stickiness and service value. This implies that the next phase of competition may focus less on hardware and more on the depth of human-robot interaction and scene understanding.

In summary, while 2026 may not be the long-awaited 'technological explosion singularity,' it could very well become a more critical 'differentiation singularity.'

The tide will more clearly distinguish between performers obsessed with the spotlight and value creators deeply cultivating user needs. As capital hype subsides, only enterprises that have built genuine competitiveness in core components, scene data, and commercial closed loops will be selected by the era to truly touch the 'starry sky' of humanoid robots.
