The "Android" of humanoid robots: the national team takes the lead

03/17/2025

Author | Xiang Xin

A mainstream view in the humanoid robot field today is that the hardware body is not what holds back real-world applications; the "brain" and the "cerebellum" are.

In other words, humanoid robots lack a brain capable of intelligent decision-making and a cerebellum capable of finely controlling the movement of all body joints.

These two aspects are the key technologies that enable humanoid robots to perform tasks and enter human life.

To help the humanoid robot industry address this issue, on March 12, the national team in the field of humanoid robots – the Beijing Humanoid Robot Innovation Center (the National-Local Collaborative Embodied Intelligence Robot Innovation Center, hereinafter referred to as the "Innovation Center") – released the general embodied intelligence platform "HSKW".

This is the world's first general embodied intelligence platform with "multi-functional brain" and "one brain for multiple robots", comprising a "brain" responsible for task planning and a "cerebellum" responsible for task execution.

It serves as the thinking and control center of robots, enabling robots of different configurations to adapt flexibly to scenarios such as industry, logistics, and households, and to autonomously complete complex tasks such as organizing items and logistics packaging.

The biggest feature of "HSKW" is that individuals or enterprises who want to put robots to work, but who do not understand algorithms or even robotics, can complete robot application development relatively easily and quickly, and can use multiple robots efficiently across different scenarios and tasks.

Tang Jian, Chief Technology Officer of the Innovation Center, said that "HSKW" is a disruption to the traditional robot application development model and is expected to significantly reduce the investment of human resources and time in robot application development.

Adaptable to multiple robot configurations, with task generalization capabilities

Traditional industrial and service robot application development typically requires a professional team to collect data in specific scenarios, write a dedicated program for specific tasks, and perform various debugging steps to complete the process.

This approach consumes a great deal of time and labor, and the resulting robots have almost no generalization capability: they are suited only to fixed processes and fixed operating objects.

As a result, the robot industry has long struggled with poor generalization across scenarios, tasks, and robot bodies.

"HSKW", by contrast, is a platform on which applications for the mainstream robots on the market can be developed rapidly for any scenario or task.

This is what makes it "general".

"Multi-functional brain", "one brain for multiple robots", and high data utilization are the three core highlights of "HSKW":

"Multi-functional brain": Supports robots to adapt to various scenarios from industrial manufacturing to household services and perform various complex tasks, such as industrial sorting, desktop organization, logistics packaging, etc.;

"One brain for multiple robots": Can be adapted to various robots such as robotic arms, wheeled robots, and humanoid robots;

High data utilization: Tasks are broken down into multiple meta-skills such as grasping, twisting, and picking, requiring only a small amount of data for efficient training and successful task execution.

At the press conference, "HSKW" was demonstrated on real robots in four scenarios: industrial sorting, block building, desktop cleaning, and logistics packaging.

This was the world's first live demonstration of multi-scenario, multi-task, and multi-configuration embodied intelligence robot operations.

Staff members simply gave instructions to the robotic arm through the "HSKW" app, or told the humanoid robot directly what to do. The robot then analyzed the task and the environment autonomously and executed the task flawlessly, with the whole process running smoothly.

In the block building scenario, "HSKW" achieved the intelligent decomposition and execution of a complex task for the first time.

Audience members built a block structure at random. After receiving a voice command, the humanoid robot "Tiangong" used a vision-language model (VLM) to analyze the composition of the structure, plan the construction sequence layer by layer, and rebuild an identical structure with millimeter-level precision. The demonstration showed the application potential of "HSKW" and humanoid robots in fields such as education, entertainment, and precision manufacturing.

While organizing the desktop, the robot calmly handled the interference caused by people constantly moving items, demonstrating strong autonomous error correction.

The powerful robot application development capabilities of "HSKW" originate from its brain and cerebellum:

The brain is deployed in the cloud and driven by MLLM (Multi-Modal Large Language Model) and VLM (Vision-Language Model), possessing capabilities such as natural interaction, spatial perception, intention understanding, hierarchical planning, and error reflection;

The cerebellum is deployed on the client side and driven by VA (Vision-Action) and VLA (Vision-Language-Action) models together with an LLM (Large Language Model); it is responsible for end-to-end task execution.

The cerebellum is further divided into two sub-platforms:

Embodied operation platform: Possesses a meta-skill library, enabling functions such as generalized grasping, skill invocation, and error handling;

Embodied motion control platform: Responsible for full-body control of the robot, including dual-arm collaboration, stable walking, mobile navigation, etc.

The meta-skill library is a database of the basic, general-purpose skills that robots need to complete complex tasks.
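To make the idea of a meta-skill library more concrete, here is a minimal sketch of a skill registry in Python. It is not the "HSKW" implementation, whose internals the article does not describe. The class name `MetaSkillLibrary`, the methods `register` and `invoke`, and the placeholder skill bodies are all assumptions made for illustration; only the skill names (grasping, twisting, picking) come from the text.

```python
from typing import Callable, Dict, List

# Hypothetical signature: a meta-skill takes a target object name
# and reports whether the action succeeded.
Skill = Callable[[str], bool]


class MetaSkillLibrary:
    """Toy registry of reusable skills (illustrative only)."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, name: str, skill: Skill) -> None:
        self._skills[name] = skill

    def invoke(self, name: str, target: str) -> bool:
        # An unknown skill is reported as a failure instead of raising,
        # so that a planner could react to it.
        skill = self._skills.get(name)
        if skill is None:
            return False
        return skill(target)

    def names(self) -> List[str]:
        return sorted(self._skills)


library = MetaSkillLibrary()
for skill_name in ("grasp", "twist", "pick"):
    # Placeholder body: a real skill would run a learned control policy.
    library.register(skill_name, lambda target: True)
```

In this sketch, a complex task would be expressed as a sequence of calls such as `library.invoke("grasp", "cup")`, which is one plausible reading of how a small set of general skills can be reused across many tasks.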

"HSKW" was trained using the general embodied intelligence dataset and Benchmark – RoboMIND – constructed by the Innovation Center. RoboMIND covers tasks in multiple scenarios such as industry, households, and offices, and possesses high versatility and scalability.

The operation process of "HSKW" involves the embodied "brain" planning the task, then invoking the embodied "cerebellum" skill library to execute specific actions, and transmitting execution feedback to the embodied "brain", forming a closed loop for tasks.

For example, when receiving an instruction to pack a parcel, the robot's brain will understand the instruction, plan the task, and decompose it into multiple sub-tasks, namely picking up the scanner and item, scanning the item, placing the item, closing the carton, and attaching the shipping label.

Subsequently, the task instructions are transmitted to the cerebellum, which invokes the skills required to perform these tasks from the meta-skill library, such as grasping, placing, scanning, and labeling. Finally, the embodied motion control platform of the cerebellum controls the robot's body to complete the actions.
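The plan, execute, and feedback loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the platform's actual code: the fixed plan stands in for the cloud-side brain (which the article says is driven by an MLLM and a VLM), the `Cerebellum` class stands in for the client-side skill execution, and the names `plan_parcel_packing`, `run_task`, `max_retries`, the `"close"` skill, and the simulated scan failure are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A sub-task is (skill name, target); both vocabularies are illustrative.
SubTask = Tuple[str, str]


def plan_parcel_packing() -> List[SubTask]:
    """Stand-in for the 'brain': a fixed plan instead of a model call."""
    return [
        ("grasp", "scanner"),
        ("grasp", "item"),
        ("scan", "item"),
        ("place", "item"),
        ("close", "carton"),
        ("label", "carton"),
    ]


@dataclass
class Cerebellum:
    """Stand-in for the 'cerebellum' and its meta-skill library."""
    skills: Dict[str, Callable[[str], bool]]
    log: List[str] = field(default_factory=list)

    def execute(self, sub_task: SubTask) -> bool:
        name, target = sub_task
        skill = self.skills.get(name)
        ok = skill(target) if skill else False
        self.log.append(f"{name}({target}) -> {'ok' if ok else 'failed'}")
        return ok


def run_task(plan: List[SubTask], cerebellum: Cerebellum, max_retries: int = 1) -> bool:
    """Closed loop: run each sub-task, read the feedback, retry on failure."""
    for sub_task in plan:
        attempts = 0
        while not cerebellum.execute(sub_task):
            attempts += 1
            if attempts > max_retries:
                return False  # give up here; a real system might re-plan
    return True


# Simulated skills: scanning fails once, everything else succeeds.
scan_calls = {"n": 0}


def flaky_scan(target: str) -> bool:
    scan_calls["n"] += 1
    return scan_calls["n"] > 1


skills = {name: (lambda target: True) for name in ("grasp", "place", "close", "label")}
skills["scan"] = flaky_scan
cerebellum = Cerebellum(skills=skills)
succeeded = run_task(plan_parcel_packing(), cerebellum)
```

Here the first scan attempt fails, the failure is fed back to the loop, and the sub-task is retried before the plan continues. A real system would presumably let the brain re-plan from richer feedback rather than simply retrying.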

Because it makes application development convenient and works across many scenarios, "HSKW" is already being used to support UBTECH humanoid robots. It has helped them put into practice BrainNet, the software architecture proposed by UBTECH, enabling them to perform tasks across an entire industrial production line.

Whether for customers from various industries interested in robot applications or for scientists and geeks dedicated to robot research and development, "HSKW" is a powerful tool to rapidly shorten the robot application development cycle.

Multiple key technologies open-sourced, adding fuel to the fire of embodied intelligence

Tang Jian said that the technical architecture of the "HSKW" platform will be open-sourced in stages this year as planned, including the VLM and VLA models and related code.

For the embodied intelligence industry, which is still in its early stages of development, open-sourcing is of great significance as it can break down technical barriers, quickly lower industry thresholds and research and development costs, accelerate the diffusion of cutting-edge technologies, and promote rapid and diversified development of the industry.

The Innovation Center was founded in November 2023 with a joint investment of 460 million yuan from 10 industry-leading enterprises and institutions, including Beijing Jingcheng Machinery and Electricity Holding, UBTECH, and Yizhuang Robotics. At the time it was named the Beijing Humanoid Robot Innovation Center.

In October 2024, under the guidance of the Ministry of Industry and Information Technology and the Beijing Municipal People's Government, the Innovation Center was officially upgraded to the "National-Local Collaborative Embodied Intelligence Robot Innovation Center".

This upgrade gave it the status of a national team.

Since its establishment, the Innovation Center has focused on the research and development of common technologies in the embodied intelligence and humanoid robot industries and has open-sourced multiple major technologies or resources after achieving results:

Open-source robot body: Open-sourced the humanoid robot "Tiangong", including software development documentation, software architecture, robot structure drawings, electrical systems, etc., with multiple partners conducting secondary development for application scenarios based on the "Tiangong" platform;

Open-source embodied intelligence dataset: The general embodied intelligence dataset and Benchmark – RoboMIND – initially open-sourced 100,000 pieces of data, which have been downloaded and used thousands of times by nearly a hundred enterprises, universities, and research institutions.

The humanoid robot "Tiangong" currently has two different versions: Tiangong Lite and Tiangong Pro.

Tiangong Pro is the robot demonstrated at this "HSKW" press conference. It stands 163 cm tall, weighs 56 kg, and has 42 degrees of freedom.

In terms of locomotion, "Tiangong" can handle complex outdoor terrain such as grass and sand at temperatures of up to 38°C, and can also run on snow. Its maximum running speed on ordinary roads is 12 km/h.

In February of this year, "Tiangong" climbed 134 steps, becoming the world's first humanoid robot capable of continuously climbing multi-step stairs outdoors, and it has also successfully completed power inspection tasks at the State Grid.

In the open-source community, the Innovation Center has attracted over a thousand developers to take part in optimizing the dataset and training models on it, has promoted its technical results on platforms such as GitHub and Hugging Face, and has jointly established the AGIROS open-source community with the Institute of Software, Chinese Academy of Sciences.

The Innovation Center's positioning is clear:

It hopes to become an enabler of the embodied intelligence industry, sharing leading technological achievements and injecting vitality into the entire industry.

When DeepSeek open-sourced its R1 model, it set off a wave of AI democratization, allowing excellent large models to be used widely and at low cost in industries such as energy, finance, and telecommunications. This has already shown how strongly open-sourcing can drive an industry's development.

The open ecosystem jointly built at the national and local levels may be the fulcrum that levers the robot industry from "laboratory stunts" to "social productivity".

As the "HSKW" platform is open-sourced and applied more widely, small and medium-sized enterprises will not need to reinvent the wheel, and developers can focus on scenario innovation.

The continuous open-sourcing of leading technologies will accelerate technological iteration.

In the future, robots are expected to enter industries, warehouses, logistics, households, and even disaster rescue sites with lower costs and stronger adaptability, changing human production and lifestyle.
