02/08 2025
The open-source DeepSeek serves as a catalyst for the evolution of humanoid robots, propelling the field forward.
If ChatGPT and Sora marked the beginning of 2023 and 2024 in the tech industry, respectively, there's no doubt that DeepSeek (R1) will be the highlight of 2025.
Both in China and abroad, technology and internet companies keep announcing that they are integrating DeepSeek or building products on top of it. Many of these announcements may be marketing moves to ride the trend, but a growing number of manufacturers genuinely recognize the potential of DeepSeek R1, including humanoid robot makers, which might seem far removed from large language models.
On February 7, UBTech, a pioneering Chinese humanoid robot company, officially stated that it is validating the effectiveness of DeepSeek technology in various humanoid robot scenarios, such as multimodal human-computer interaction, command understanding in complex environments, and task decomposition and planning in industrial settings. "We expect to leverage the deep thinking capabilities of large reasoning models to solve challenges in these complex tasks, making humanoid robots more akin to human thinking and behavior."
UBTech may not be an isolated case.
Just two days earlier, Brett Adcock, founder of Figure, the Silicon Valley humanoid robot company that once drew global attention, abruptly announced on X that Figure would end its collaboration with OpenAI and shift to developing end-to-end robot AI in-house.
Image/ X
Brett Adcock also revealed that Figure has achieved a "major breakthrough" and will showcase something unprecedented on humanoid robots within the next 30 days. Given the AI industry's recent focus on DeepSeek's open-source models, many speculate that Figure is developing its robot AI on top of open-source models like DeepSeek R1.
Even Unitree Technology, known for its affordable humanoid robots, is rumored to have reached a deep cooperation agreement with DeepSeek. It is plausible that more humanoid robot manufacturers are considering and planning new routes based on DeepSeek models, as UBTech has done.
Regardless of their current stance, the AI narrative being rewritten by DeepSeek is poised to change the direction of the humanoid robot industry.
How does DeepSeek clear the path for the popularization of humanoid robots?
The shifts by UBTech and Figure are not coincidental. DeepSeek precisely addresses the current pain points of humanoid robots, with "cost" being the most prominent.
Those familiar with humanoid robot research and development know that training an embodied intelligent robot demands significant computing power, which translates into substantial financial investment. Tech giants wield influence in AI not only due to their technological leadership but also because they have the financial resources to support exorbitant computing costs.
However, for most humanoid robot startups, the high cost of computing power is a significant burden. In addition, a shortage of data is a dilemma shared by the entire industry. Even OpenAI disbanded its robotics team four years ago, citing an extreme lack of data.
This explains why the open-sourcing of AgiBot World, a million-scale real-robot dataset from AgiBot (Zhiyuan Robotics), drew widespread attention at the end of last year.
AgiBot World dataset on GitHub, Image/ Leitech
Peng Zhihui (Zhihui Jun), co-founder and chief technology officer of AgiBot, has said that the cost of and barrier to collecting real-robot data for embodied intelligence are very high. Some insiders, however, view the "million real-machine dataset" as merely a drop in the ocean. "It can only train the generalization of one action, such as sorting, which is far from sufficient to achieve the ideal state of embodied intelligence."
It's like a talented athlete struggling without a training venue and equipment, ultimately blending into the crowd. But the emergence of DeepSeek R1 may well have changed all that.
The most direct advantage is price. DeepSeek R1 charges 4 yuan per million input tokens on a cache miss, 1 yuan per million input tokens on a cache hit, and 16 yuan per million output tokens, so its inference cost advantage is clear at a glance.
Image/ DeepSeek
In comparison, the official OpenAI o1 API charges 110 yuan per million input tokens on a cache miss, 55 yuan per million input tokens on a cache hit, and 438 yuan per million output tokens. This alone frees humanoid robot companies from the shackles of exorbitant computing costs, letting them invest more in robot body research and development and accelerate product iteration and upgrades.
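To make the price gap concrete, the short Python sketch below estimates the cost of a single request under both price lists. The prices are the ones quoted in this article. The mapping of o1's 110 yuan to uncached input and 55 yuan to cached input, the request size of 2,000 input tokens and 1,000 output tokens, and the function name request_cost are illustrative assumptions, not measurements from any real robot workload.

```python
# Prices in yuan per million tokens, as quoted in the article.
# Assumption: for o1, 110 is the cache-miss input price and 55 the cache-hit price.
PRICES = {
    "DeepSeek R1": {"input_miss": 4.0, "input_hit": 1.0, "output": 16.0},
    "OpenAI o1": {"input_miss": 110.0, "input_hit": 55.0, "output": 438.0},
}


def request_cost(model, input_tokens, output_tokens, cache_hit_ratio=0.0):
    """Return the cost in yuan of one request (hypothetical helper).

    cache_hit_ratio is the fraction of input tokens billed at the
    cache-hit price; the remainder is billed at the cache-miss price.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    price = PRICES[model]
    hit_tokens = input_tokens * cache_hit_ratio
    miss_tokens = input_tokens - hit_tokens
    total = (
        miss_tokens * price["input_miss"]
        + hit_tokens * price["input_hit"]
        + output_tokens * price["output"]
    )
    return total / 1_000_000


# Hypothetical request: 2,000 input tokens, 1,000 output tokens, no cache hits.
r1_cost = request_cost("DeepSeek R1", 2_000, 1_000)
o1_cost = request_cost("OpenAI o1", 2_000, 1_000)
print(f"DeepSeek R1: {r1_cost:.3f} yuan per request")
print(f"OpenAI o1:   {o1_cost:.3f} yuan per request")
print(f"Ratio:       {o1_cost / r1_cost:.1f}x")
```

Under these assumptions, the o1 request costs roughly 27 times as much as the R1 request. The exact ratio depends on the mix of input and output tokens and on how often the cache is hit.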
More fundamentally, DeepSeek R1 is a top-tier reasoning model comparable to OpenAI o1 in performance, and it provides powerful mathematical, coding, and natural language reasoning capabilities at low cost. As UBTech hopes, it could improve humanoid robots' command understanding and their planning and decomposition of complex tasks, "making humanoid robots more akin to human thinking and behavior."
Moreover, a significant advantage of DeepSeek R1 lies in its algorithmic improvements and optimizations, which reduce the reliance on vast amounts of data that earlier approaches required. It automatically screens high-value data through data distillation and generates synthetic data through adversarial training, reducing the cost of acquiring high-quality code data from 0.8 yuan per 100 tokens to 0.12 yuan, a drop of 85%.
The official press release of DeepSeek-R1 also mentions that DeepSeek-R1 extensively uses reinforcement learning techniques during the post-training phase, "greatly enhancing the model's reasoning ability with only minimal labeled data."
Image/ DeepSeek
This is crucial for embodied intelligence, which is grappling with data challenges, and it points a direction for the entire field. In addition, DeepSeek-R1 is open-source, so developers and manufacturers are free to modify and distill it.
In other words, every humanoid robot maker can build a top-tier reasoning model better suited to its own needs on top of DeepSeek-R1, or even train models native to embodied intelligence by following DeepSeek's technical route. Although a direct like-for-like comparison is difficult, DeepSeek R1 clearly reduces the training cost of top-tier models substantially while matching the overall performance of OpenAI o1.
The open-source DeepSeek: a catalyst for the evolution of humanoid robots
Today, the most crucial technologies for creating a humanoid robot lie in electromechanics and large models. If the former corresponds to the "body," the latter corresponds to the "brain".
Apart from "cost," technological autonomy is also a significant factor constraining the development of humanoid robots. Generally, humanoid robot manufacturers either develop models independently or cooperate with large model vendors such as OpenAI, Google, iFlytek, and Huawei. The most typical example is the cooperation between Figure and OpenAI.
The rationale behind cooperation is straightforward. Independent development entails high training costs, and it is technically hard to keep pace with the leading AI giants. However, cooperation also means technological dependence: the robot maker not only faces substantial inference costs but is also constrained by the model vendor in terms of model adaptability and development pace.
Figure 02 at BMW factory, Image/ Figure
DeepSeek's open-source strategy offers humanoid robot companies a new option. By open-sourcing its models, DeepSeek enables these companies to carry out secondary development on top of them, build their own closed technology loops, and truly take the initiative in core technologies.
Open-source means that more robot makers can directly use advanced AI technologies without having to develop their own solutions from scratch. For humanoid robot manufacturers like UBTech and Figure, this lower technological threshold allows them to focus more resources on hardware development and the optimization of practical applications.
Particularly, the emergence of DeepSeek has encouraged more humanoid robot manufacturers to see the advantages of independent research and development. It not only significantly reduces costs and technical thresholds but also enables the customization of their large models while building technological closed loops.
Take Figure as an example. One direct reason for its split with OpenAI may be that OpenAI has long been reported to be developing its own humanoid robots internally. It is foreseeable that, compared with Figure, a "stepchild" acquired through shareholding, OpenAI will prioritize the needs and development routes of its "own children" when developing models.
The deeper issue, however, is that Figure would always be constrained by OpenAI's technical routes and costs. Had the cooperation continued, Figure would likely have faced a significant cost disadvantage. By contrast, the power and economy of open-source models like DeepSeek may give it an advantage in overall costs and, more importantly, allow it to build an autonomous closed technology loop that fits its own iteration route and product rhythm.
DeepSeek-VL: One more thing for humanoid robots?
DeepSeek-R1 is not the whole story. While many are still stunned by R1's ultra-high cost-effectiveness, few have noticed that DeepSeek has also proposed a brand-new vision-language model: DeepSeek-VL.
New DeepSeek paper, Image/ DeepSeek
Although only the 1.3B and 7B versions of the model have been publicly released, DeepSeek-VL pursues leading performance in real-world scenarios while emphasizing the retention of powerful "language capabilities," something that many large model vendors and humanoid robot manufacturers may have overlooked.
The DeepSeek team emphasized in the paper, "During the training process, we not only conduct multimodal data training but also incorporate a large amount of language data into the training."
Whether this technical route proves correct will ultimately be determined by practice, just like DeepSeek-R1. But for humanoid robot manufacturers, beyond DeepSeek-R1, DeepSeek-VL indeed has the potential to become the game-changer accelerating the implementation of humanoid robots.
Source: Leitech