03/12 2025
Has DeepSeek-R1 for the robotics industry been developed by Zhiyuan Robotics?
His time in Huawei's "Genius Youngsters" program made Peng Zhihui, known online as "Zhihui Jun", hugely popular and brought Zhiyuan Robotics, the company he founded, top-tier attention in the industry. However, if technical strength does not match that attention, the attention eventually turns into public backlash, as happened with Manus, which was recently suspected of over-marketing.
Fortunately, Zhiyuan Robotics has demonstrated its strength time and again. Recently, it launched the world's first large-scale universal embodied intelligence base model, Genie Operator-1 (GO-1 for short), and plans to open source it to core users by the end of the first quarter, allowing users to deploy it on their own robots.
(Image source: Zhiyuan Robotics)
Since the release of DeepSeek-R1, multiple AI companies around the world have open-sourced their large models and acknowledged that open sourcing can accelerate progress in the AI industry. GO-1 is currently less open than DeepSeek-R1, which is released under the MIT open-source license. However, Zhiyuan Robotics has already open-sourced the AgiBot World dataset, toolchain, and pre-trained model used to train GO-1, and it is likely to open source the GO-1 core code and model in the future.
VLM+MoE, Zhiyuan leads robots into the AI era
Only by open-sourcing the model can Zhiyuan Robotics have the opportunity to become the DeepSeek of the robotics industry. However, open sourcing does not necessarily guarantee a status comparable to DeepSeek; ultimately, it all comes down to strength.
The GO-1 model developed by Zhiyuan Robotics is based on the Vision-Language-Latent-Action (ViLLA) architecture, which combines a vision-language model (VLM), a type of multimodal large model, with a Mixture of Experts (MoE). The VLM serves as the backbone of the embodied base model and inherits the weights of InternVL-2B, an open-source large model developed by the Shanghai AI Laboratory. It handles scene perception and language understanding and can be trained on video and image data from the internet. The model can also integrate multi-view vision and force signals, giving it the general scene understanding needed for more complex operations.
The MoE is divided into two parts: an implicit planner and an action expert. The implicit planner can learn from human or cross-agent videos on the internet. It generates latent action tokens from the intermediate-layer output of the VLM, forming a chain of planning that provides general action understanding and planning. The action expert is trained on simulation or real-machine data; it makes the generation and output of latent action tokens more efficient and provides high-precision action execution.
(Image source: Zhiyuan Robotics)
Its dynamic adjustment mechanism can also improve task efficiency for tasks such as image description and OCR parsing, reduce data annotation costs, and optimize resource allocation.
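The data flow described above (VLM backbone, then implicit planner, then action expert) can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: every name here (Observation, vlm_backbone, latent_planner, action_expert, run_policy), the codebook size, the number of tokens, and the arithmetic inside each function are hypothetical placeholders, not Zhiyuan Robotics' code or API. Only the order of the three stages comes from the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    """Toy stand-in for one control step's input (all fields are hypothetical)."""
    views: List[List[float]]  # one small feature vector per camera view
    instruction: str          # natural-language task description


def vlm_backbone(obs: Observation) -> List[float]:
    """Placeholder for the VLM: fuse multi-view features with the instruction."""
    dim = len(obs.views[0])
    # Average the per-view features (stands in for real visual encoding).
    fused = [sum(view[i] for view in obs.views) / len(obs.views) for i in range(dim)]
    # Fold the instruction into one extra feature (stands in for language tokens).
    text_feature = (sum(ord(ch) for ch in obs.instruction) % 100) / 100.0
    return fused + [text_feature]


def latent_planner(embedding: List[float], num_tokens: int = 4,
                   codebook_size: int = 32) -> List[int]:
    """Placeholder for the implicit planner: emit discrete latent action tokens."""
    tokens = []
    state = sum(embedding)
    for step in range(num_tokens):
        state = state * 1.7 + step + 1  # arbitrary deterministic update
        tokens.append(int(abs(state) * 1000) % codebook_size)
    return tokens


def action_expert(embedding: List[float], latent_tokens: List[int],
                  horizon: int = 3, dof: int = 7,
                  codebook_size: int = 32) -> List[List[float]]:
    """Placeholder for the action expert: decode tokens into joint commands."""
    base = sum(embedding) / len(embedding)
    chunk = []
    for t in range(horizon):
        scale = latent_tokens[t % len(latent_tokens)] / codebook_size
        chunk.append([base * scale + 0.01 * joint for joint in range(dof)])
    return chunk


def run_policy(obs: Observation) -> Tuple[List[int], List[List[float]]]:
    """Chain the three stages: perceive, plan in latent space, then act."""
    embedding = vlm_backbone(obs)
    tokens = latent_planner(embedding)
    return tokens, action_expert(embedding, tokens)


obs = Observation(views=[[0.1, 0.2], [0.3, 0.4]], instruction="pick up the cup")
tokens, actions = run_policy(obs)
print(len(tokens), "latent tokens ->", len(actions), "action steps")
```

The point of the sketch is the separation of concerns: the planner works only with compact latent tokens, so in principle it could learn from videos that contain no robot commands, while only the last stage needs to know about joints and actuators.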
Based on the ViLLA architecture, the GO-1 model builds a four-layer "digital pyramid" of training data. At the base, plain-text and image-text data from the internet help robots understand general knowledge and scenes. The second layer, large-scale human or cross-agent videos, helps robots learn how humans or other agents perform actions. Above that, simulation data enhances the versatility of robots. At the top, real-machine demonstration data trains precise action execution.
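The four layers can be written down as a small ordered structure, which makes the bottom-to-top progression explicit. This is a minimal sketch: the names PyramidLayer, DATA_PYRAMID, and bottom_up are hypothetical, and reading the layers as a strict bottom-to-top sequence is an assumption made for illustration. The text above gives the layers and their purposes but not a training schedule.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PyramidLayer:
    level: int        # 1 = base of the pyramid, 4 = top
    data_source: str  # what kind of data the layer contains
    purpose: str      # what the robot is meant to gain from it


# Layers as described in the article, from the broadest data to the most specific.
DATA_PYRAMID = [
    PyramidLayer(1, "internet text and image-text data",
                 "general knowledge and scene understanding"),
    PyramidLayer(2, "human and cross-agent videos",
                 "how humans and other agents perform actions"),
    PyramidLayer(3, "simulation data",
                 "versatility"),
    PyramidLayer(4, "real-machine demonstration data",
                 "precise action execution"),
]


def bottom_up(layers: List[PyramidLayer]) -> List[str]:
    """Return the data sources ordered from the base of the pyramid to the top."""
    return [layer.data_source for layer in sorted(layers, key=lambda l: l.level)]


for layer in sorted(DATA_PYRAMID, key=lambda l: l.level):
    print(f"Layer {layer.level}: {layer.data_source} -> {layer.purpose}")
```

Each step up the pyramid trades quantity for specificity: internet data is abundant but says nothing about motor commands, while real-machine demonstrations are scarce but map directly onto what the robot must do.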
Beyond the model architecture, the data used to train the model is equally important. The latest version of AgiBot World contains 1,001,552 trajectories covering five key scenarios: home, retail, industry, restaurants, and offices. It is currently the world's largest real machine demonstration dataset for robots.
Unlike ordinary trajectories that do not exceed 5 seconds, the trajectories in AgiBot World can span about 30 seconds, with some trajectories even reaching 2 minutes. Official data from Zhiyuan Robotics shows that the pre-training mode adopted by AgiBot World improves average performance by 30% compared to Google's Open X-Embodiment training strategy and increases the average success rate of complex operations completed by existing large robot models by 32%.
(Image source: Zhiyuan Robotics)
Compared with traditional robot models, the GO-1 model improves on architecture, data, and pre-training mode, which raises both resource utilization efficiency and model capability. It can significantly reduce the cost of training robots and executing operations, playing a role similar to that of the DeepSeek-R1 model. The only difference between the two may be their approach to open sourcing.
NVIDIA CEO Jensen Huang once predicted that the robotics industry will see a significant breakthrough within two to three years and that robots will eventually become as common as cars. As robot technology moves beyond its nascent stage, Zhiyuan Robotics is steadily pushing its products into mass production. Recently, 1,000 robots rolled off the production line, the globally open-sourced Lingxi X1 completed its first batch of deliveries, and the more powerful Lingxi X2 was launched in Shanghai on March 11. In future competition in the robotics industry, the GO-1 model may become one of Zhiyuan Robotics' core competitive advantages.
Lowering the threshold, will GO-1 become the next DeepSeek-R1?
Sun Xiaogang, CEO of Argus Intelligent Technology Co., Ltd., said that at the robotics industry's current pace of development, it is feasible to bring the price of robots below RMB 50,000 within three to five years.
Robots priced within RMB 50,000 are within the acceptable range for ordinary consumers, but the premise for purchase is that their functions are powerful enough to help us handle some daily chores. The key to determining a robot's capabilities lies in both hardware and intelligence, and the GO-1 model aims to enhance a robot's intelligence.
First, the ViLLA architecture gives the GO-1 model the ability to learn from human videos. The vast video resources on the internet will become the "nourishment" for robot evolution, enabling robots to perform complex operations more efficiently.
Second, GO-1 generalizes quickly from small samples. It can adapt to new scenarios with minimal or even zero samples, without requiring a huge amount of data, which significantly reduces training costs and lowers the barrier to entry for the robotics industry.
Most importantly, GO-1 is a "one brain, multiple forms" universal robot model that can migrate between different types of robots, such as bipedal and wheeled robots, and adapt to them quickly. Normally, different types of robots may require completely different large AI models to suit the working mode of each hardware platform, which is one of the difficulties robot developers face. The GO-1 model breaks with this convention and can adapt to various robots, further reducing the cost of developing and adapting robot models.
(Image source: Zhiyuan Robotics)
In addition, the GO-1 model can continue to evolve. Data from problems encountered in daily operation is fed back into the system for robot training and functional upgrades.
Although the GO-1 model will only be open-sourced to core users by the end of this month, its functional characteristics make it plain that GO-1 is destined to be open-sourced more widely. Almost the entire model is oriented towards improving capabilities and reducing costs. Rapid generalization from small samples and the "one brain, multiple forms" capability allow robot companies with limited resources to use GO-1 to quickly develop AI systems suited to their products, and then reach mass production with the help of China's increasingly mature robot supply chain.
As with the new energy vehicle industry, the arrival of the robot era is an opportunity. Car companies such as BYD, XPeng, and Seres have already entered the robotics market, and many lesser-known small enterprises have joined as well. It is still uncertain which companies will grow into leading brands. The well-known Unitree Robotics is expected to become the BYD of the robotics industry, while Zhiyuan Robotics, which developed the GO-1 model, has the opportunity to become the industry's "NIO, XPeng, Li Auto".
Similarly, many enterprises will fall behind on the development path of the robotics industry. The difference is that in the new energy vehicle industry, industrial strength is more important than AI capabilities, while in the robotics field, the importance of AI capabilities is at least equal to or even greater than industrial strength. Moreover, during the development of the robotics industry, the supply chain will continue to integrate, hardware will converge, and robots will ultimately compete on intelligence.
(Image source: Zhiyuan Robotics)
As Yao Maoqing, Executive Dean of the Research Institute and President of the Embodied Business Department of Zhiyuan Xinchuang Technology Co., Ltd., said, a robot company that does not develop a large model has no future in the robotics industry. Without intelligence and operational capabilities, a robot is just a piece of hardware.
Zhiyuan Robotics' own robot lines, such as Yuanzheng, Lingxi, and Juechen, cannot realize the full value of the GO-1 model by themselves. Open-sourcing the model and allowing other enterprises to modify, deploy, and commercialize it can maximize that value and help promote the development of the robotics industry.
Open-sourcing the model to core users is just the beginning. In the future, the GO-1 model is likely to be open-sourced to the entire industry, strengthening other robot enterprises. Only after being open-sourced can the GO-1 model attain the status of DeepSeek-R1 and become a driver of industry development. Zhiyuan Robotics is itself a hardware product company. GO-1's reputation may further raise its profile and product sales, so that when Zhiyuan Robotics and Peng Zhihui are mentioned, they are no longer known primarily for the connection to Huawei's "Genius Youngsters" program.