08/26 2025
Editor's Note:
In the surge of embodied intelligence startups, the backgrounds and journeys of the founders shape the technological trajectory and commercial ethos of these enterprises.
Zenith Ventures once categorized entrepreneurs into four types: prodigies, veterans, scientists, and strategists. This framework inspired us to explore the entrepreneurial landscape within embodied intelligence and launch a series spotlighting the collective image of these companies.
Scientists, as university professors and researchers, represent the forefront of academic prowess, dedicated to technological R&D.
Veterans are serial entrepreneurs who navigate the ups and downs with seasoned insight.
Strategists bring mature methodologies and resources from large corporations into new ventures.
Prodigies, born after 1995, infuse their startups with youthful vigor and fresh perspectives.
At the intersection of these diverse backgrounds and paths, we gain a clearer picture of the embodied intelligence entrepreneurial landscape and its future trajectory.
While these four types encapsulate many entrepreneurs, the embodied intelligence sector is teeming with more factions. This series will continue to document the comprehensive view of embodied intelligence jointly painted by these diverse groups.
Author | Xiang Xin
In the nascent stages of startups, the aura of the founding team is often paramount to investors.
In embodied intelligence, this aura is particularly pronounced among scientists who transition from universities and research institutions, where they have long been immersed in technological R&D.
According to IT Juzi statistics, the top-funded robot companies in the first half of 2025 were Universal Robotics, Neolix, Yuanding Intelligence (pool robot), Zibian Robot, and Unitree Robotics.
Excluding Neolix and Yuanding Intelligence, which have a weaker association with embodied intelligence, it's intriguing to note that Universal Robotics and Zibian Robot, both exuding a strong "scientist aura," surpassed the funding scale of the established Unitree Robotics.
This underscores the allure of scientist-led enterprises in the capital market.
Among the 32 core companies in the embodied intelligence sector we analyzed, 16 were founded or led by scientists from prestigious universities like Tsinghua, Peking, Shanghai Jiao Tong, Zhejiang, Harbin Institute of Technology, and overseas institutions such as Berkeley and Stanford.
Their previous domains were laboratories and academic conferences, where their achievements were papers, code, and prototypes. Now they lead the industry frontier, pulling robots out of the ivory tower and into factories, homes, and society.
Scientists embody the cutting edge of academic research but also face the most challenging task: transitioning scientific research into commercial applications. Their strength lies in the former, while the latter presents the greatest uncertainty on their entrepreneurial journey.
They have forged distinct technological paths, with both consensus and divergence.
Tsinghua Talent as the Mainstay
We analyzed the backgrounds of 16 scientist-led startups in embodied intelligence, identifying 32 core entrepreneurs. Their academic backgrounds are concentrated in five top domestic and international universities and research institutions: Tsinghua University, Zhejiang University, Chinese Academy of Sciences, Harbin Institute of Technology, and Stanford University.
This is not surprising. In the 1990s, China initiated research on intelligent robots, with Tsinghua University, Zhejiang University, Chinese Academy of Sciences, and Harbin Institute of Technology among the pioneers, establishing robot projects, laboratories, or research institutes focused on mechanical design, robot control, and intelligent perception.
Tsinghua University founded the country's first intelligent robot laboratory in 1985. In 2004, the Tsinghua Robot Football Team emerged, later evolving into the "Tsinghua Vulcan Team," renowned for its achievements in RoboCup. Zhao Mingguo, co-founder and chief scientist of Accelerated Evolution, is the founder of the "Tsinghua Vulcan Team" and has long led the team in competitions.
Zhejiang University began research on humanoid robots in 2006, launching the "Wukong" series and overcoming challenges such as dynamic balance and full-body coordinated control. "Wukong I" can even sustain table tennis rallies of hundreds of exchanges with humans or other robots.
The Shenyang Institute of Automation under the Chinese Academy of Sciences is known as the "cradle of China's robot industry." In 1989, the Open Research Laboratory of Robotics at the Chinese Academy of Sciences, relying on the Shenyang Institute of Automation, was formally established and later approved as a State Key Laboratory of Robotics in 2007.
Stanford University has an even longer history, establishing an artificial intelligence laboratory in the 1960s to explore the integration of robots and AI.
This technological accumulation is evident in patents. According to the New Strategic Industry Research Institute, Tsinghua University, Harbin Institute of Technology, and Zhejiang University rank first, third, and fourth, respectively, in patent applications related to humanoid robots. Data from the IncoPat global patent database show that Zhejiang University and Tsinghua University rank first and second, respectively, in AI patent applications. Judging by the current or previous employers of the founding teams, Tsinghua University and Southern University of Science and Technology have become key incubators for scientist-led enterprises.
These statistics highlight Tsinghua talent as the backbone of scientist-led enterprises.
Currently, at least four scientist-led enterprises directly originate from Tsinghua University: Accelerated Evolution, Star Evolution Epoch, Star Atlas, and Qianjue Technology. Pan Jia, chief scientist of Zhuji Power, Sun Jie, founder of Dahuan Robotics, and Wang He, founder of Universal Robotics, are also Tsinghua alumni.
Beyond Tsinghua's deep roots in robotics and talent cultivation, Tsinghua-affiliated investment institutions play a crucial role in nurturing these enterprises. For instance, the Shuimuqinghua Alumni Fund has invested multiple times in Tsinghua-background teams like Accelerated Evolution.
For scientist-led enterprises, university research resources and funds provide direct access to the latest international technology, facilitating the transformation of scientific research results.
In terms of talent, most scientist-led enterprises have a unique advantage, with PhD students, postdoctoral fellows, and laboratory assistants often becoming the initial employees, forming a natural team extension.
Deeper still is the reach of academic networks. Mentor recommendations, joint laboratories, and international academic exchanges are the invisible threads behind scientist-led entrepreneurship.
At the financing level, this network is also crucial. Investors often trust entrepreneurs from the "Tsinghua System" or the "Stanford System" due to the implied depth of research and technological accumulation.
Product Layout Driven by Technological Idealism
While all emphasize technology-driven approaches, scientist-led enterprises diverge into three main categories in terms of product and route selection:
Body/Cerebellum Faction: Focuses on the robot's body, movement, and perception capabilities. Representative enterprises include Yuequan Bionic, Accelerated Evolution, and Pacinian Perception.
Full-Stack Faction: Covers "body + cerebellum + brain" and attempts to master the complete ecosystem. Representative enterprises include Star Evolution Epoch, Star Atlas, and Zhuji Power.
Large-Model/Component Faction: Does not produce complete machines but focuses on embodied large models or key components, such as Qianjue Technology, Daimeng Robotics, and Dahuan Robotics.
Most scientist-led enterprises tend towards the full-stack route. This reflects not only the high coupling of robot system software and hardware but also the technological preferences of academia: starting from the system level to build a complete chain for technology implementation.
This "end-goal orientation" is also evident in the choice to develop full-size humanoid robots. Except for Accelerated Evolution, which focuses on 1.2-meter small-size models, most enterprises develop full-size robots over 1.6 meters tall, approximating adult size.
Behind this choice lies the hope that robots can truly adapt to human environments, possess strong load-bearing and complex operation capabilities, thus achieving versatility in broader scenarios.
Another noteworthy phenomenon is that among full-stack enterprises, teams focusing more on large models mostly choose wheeled humanoid robots, such as Universal Robotics, Zibian Robot, Kuawei Intelligence, and Star Atlas.
Compared with bipedal designs, a wheeled base lowers the R&D threshold and requires less manpower, allowing a team with limited resources to concentrate on large model development.
Overall, scientists have explored various aspects of hardware and full-stack development, but the real differentiator lies in breakthroughs in large models.
Four enterprises focused on embodied large models and receiving significant industry attention are Star Evolution Epoch, Universal Robotics, Zibian Robot, and Star Atlas.
These enterprises share a commonality: they have all developed end-to-end VLA models, synchronized with international cutting-edge research from Figure, PI (Physical Intelligence), NVIDIA, and others.
They generally believe that only end-to-end large models can achieve task generalization, avoiding the fragmentation of traditional hierarchical architectures. This approach aligns with the technological paradigm validated in autonomous driving.
Whether focusing on the body, covering the full stack, or exploring large models and components, scientist-led enterprises leverage their academic accumulations to find industrial landing points.
Despite different paths, almost all teams converge on a consensus: embodied large models are the "core battlefield" of the future. Therefore, the differences among scientist-led enterprises may ultimately concentrate on their understanding and practice of large models.
Route Differences in Embodied Large Models
The core competitiveness of embodied large models lies in model algorithms and data systems.
From a model algorithm perspective, among Star Evolution Epoch, Universal Robotics, Zibian Robot, and Star Atlas, only Zibian Robot emphasizes unifying the cerebellum and brain in a single model. The others use a similar dual-system architecture that separates high-level understanding and planning from low-level motion control, akin to Figure AI's Helix model and PI's π0 model.
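To make the dual-system idea concrete, here is a minimal sketch in Python. It is not any company's implementation: the names `slow_planner`, `fast_controller`, and `replan_every` are hypothetical, and the "robot" is a single number standing in for a position. It only illustrates the split between a slow high-level system that replans occasionally and a fast low-level controller that acts on every tick.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Plan:
    """Goal produced by the slow, high-level system (hypothetical)."""
    target: float


def slow_planner(instruction: str, observation: float) -> Plan:
    """Stand-in for a large vision-language model: expensive, runs rarely."""
    # Toy rule (assumption): the last word of the instruction is the target.
    return Plan(target=float(instruction.split()[-1]))


def fast_controller(plan: Plan, observation: float, gain: float = 0.5) -> float:
    """Stand-in for a small low-level policy: cheap, runs on every tick."""
    return gain * (plan.target - observation)


def run(instruction: str, start: float, ticks: int, replan_every: int) -> List[float]:
    position = start
    plan = slow_planner(instruction, position)
    trace = []
    for tick in range(ticks):
        # The slow system refreshes the plan only every `replan_every` ticks.
        if tick > 0 and tick % replan_every == 0:
            plan = slow_planner(instruction, position)
        # The fast system issues a motion command on every tick.
        position += fast_controller(plan, position)
        trace.append(position)
    return trace


trace = run("move to 1.0", start=0.0, ticks=20, replan_every=5)
```

In a unified design, by contrast, a single model would map observations and instructions directly to actions without this explicit two-rate split.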
Star Evolution Epoch has two distinctive features:
First, it integrates a world model into the large model to enhance the robot's understanding of the physical world.
Second, it draws on the Sora concept and utilizes AIGC generative technology to help robots predict future scenarios by generating videos, enabling robots to "look at the answer" and act, significantly enhancing generalization ability.
The current ERA-42 model integrates vision, understanding, prediction, and action, achieving full-body dexterous operation of a high-degree-of-freedom humanoid robot through a single end-to-end VLA model. Given voice commands, it can complete hundreds of complex operations, including flexible item sorting, scanning, and using screwdrivers and pipettes.
The architecture of Universal Robotics' model GraspVLA is essentially the same as PI's π0 model, also consisting of a VLM and an action expert model based on flow matching.
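For readers unfamiliar with flow matching, the sketch below shows the generic idea in Python with NumPy. It is not GraspVLA's or π0's actual code, and it uses one common straight-line convention (noise at t = 0, action at t = 1) that may differ from either model's exact formulation. The function names and the 3-dimensional action are hypothetical. A network is trained to predict the velocity that carries a noise sample to a real action, and at inference time that velocity is integrated step by step.

```python
import numpy as np


def flow_matching_pair(noise, action, t):
    """Return the point x_t on the noise-to-action path and its target velocity."""
    x_t = (1.0 - t) * noise + t * action  # straight line from noise to action
    target_velocity = action - noise      # constant velocity along that line
    return x_t, target_velocity


def flow_matching_loss(predicted_velocity, target_velocity):
    """Mean squared error between predicted and target velocities."""
    return float(np.mean((predicted_velocity - target_velocity) ** 2))


def sample_with_euler(velocity_fn, noise, steps=10):
    """Generate an action by integrating the velocity field from t=0 to t=1."""
    x = noise.copy()
    dt = 1.0 / steps
    for k in range(steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x


rng = np.random.default_rng(0)
action = np.array([0.2, -0.4, 0.7])  # hypothetical 3-DoF action
noise = rng.standard_normal(3)

x_t, target = flow_matching_pair(noise, action, t=0.3)


def oracle_velocity(x, t):
    """Exact velocity for one known action; a trained network would replace this."""
    return (action - x) / (1.0 - t)


recovered = sample_with_euler(oracle_velocity, noise)
```

In a real VLA model, the velocity network would additionally be conditioned on the VLM's representation of the image and instruction; that conditioning is omitted here.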
GraspVLA's most notable feature is its exceptional generalization ability. As the world's first model capable of zero-shot generalization through pre-training alone, it adapts seamlessly to variations in environmental conditions such as height, plane position, object category, lighting, interference objects, and background. Moreover, it boasts autonomous decision-making capabilities and robust anti-interference abilities.
Building on GraspVLA, Universal Robotics has introduced GroceryVLA, an end-to-end large model tailored for retail commercialization scenarios. In environments where shelves are densely packed with diverse goods, GroceryVLA eliminates the need to tune parameters for each product type, so the full range of goods can be grasped without prior on-site data collection, which streamlines deployment significantly.
WALL-A, Zibian Robot's end-to-end general embodied large model, also demonstrates strong generalization and versatility. At the recent WRC, it showcased tasks like making sachets, handling household chores, sorting express deliveries, and industrial assembly (learning to assemble belts in less than two days), displaying robust adaptability to soft objects, dynamic environments, and items with diverse appearances.
Compared with models from other companies, however, Zibian Robot's model integrates the diverse information received by the robot to a deeper extent, achieving end-to-end information fusion.
Zibian Robot integrates information by passing every input modality, including multi-view images, text instructions, and real-time robot states, through its own encoder and merging the results into a unified token sequence. This enables the robot to efficiently align multiple information channels such as vision, language, and actions, significantly enhancing its contextual reasoning and self-feedback capabilities in ultra-long sequence tasks.
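The unified token sequence described above can be sketched roughly as follows. This is an illustrative toy in Python with NumPy, not the company's architecture: the embedding width `D_MODEL`, the patch size, the random projection matrices, and all function names are assumptions. Each modality gets its own encoder that emits tokens of the same width, and the tokens are concatenated into one sequence that a single transformer could then attend over.

```python
import numpy as np

D_MODEL = 16   # hypothetical shared embedding width
PATCH = 4      # hypothetical image patch size
VOCAB = 100    # hypothetical text vocabulary size
STATE_DIM = 7  # hypothetical robot state size (e.g. joint angles)

rng = np.random.default_rng(0)
# Random matrices stand in for learned encoder weights.
image_projection = rng.standard_normal((PATCH * PATCH, D_MODEL))
text_embedding = rng.standard_normal((VOCAB, D_MODEL))
state_projection = rng.standard_normal((STATE_DIM, D_MODEL))


def encode_image(image):
    """Split an (H, W) image into PATCH x PATCH patches, one token per patch."""
    h, w = image.shape
    patches = image.reshape(h // PATCH, PATCH, w // PATCH, PATCH)
    patches = patches.transpose(0, 2, 1, 3).reshape(-1, PATCH * PATCH)
    return patches @ image_projection


def encode_text(token_ids):
    """Look up one embedding per text token."""
    return text_embedding[token_ids]


def encode_state(state):
    """Project the robot state vector to a single token."""
    return (state @ state_projection)[None, :]


def build_sequence(views, token_ids, state):
    """Concatenate tokens from all modalities into one sequence."""
    tokens = [encode_image(view) for view in views]
    tokens.append(encode_text(token_ids))
    tokens.append(encode_state(state))
    return np.concatenate(tokens, axis=0)


views = [rng.standard_normal((8, 8)) for _ in range(2)]  # two camera views
instruction = np.array([5, 17, 42])                      # three text tokens
state = rng.standard_normal(STATE_DIM)

sequence = build_sequence(views, instruction, state)
```

Here each 8x8 view yields four patch tokens, so the sequence holds 8 image tokens, 3 text tokens, and 1 state token, all of width 16.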
In terms of data systems, while all four companies utilize multi-modal data like video, language, actions, and teleoperation, their approaches differ, resulting in two distinct data strategies.
The first strategy is scale-prioritized, emphasizing low-cost, large-scale data accumulation to win by volume. This supports end-to-end training of embodied large models, adhering to the principle that "great effort leads to miracles." Star Evolution Epoch and Universal Robotics fall under this category.
Star Evolution Epoch uses video data for pre-training, including abundant Internet videos of humans and AI-generated videos, both of which are readily accessible.
Universal Robotics, on the other hand, relies on synthetic simulation data. It has developed a comprehensive synthetic simulation data production pipeline for pre-training end-to-end VLA models, capable of generating the world's largest billion-level robot operation dataset (encompassing video, language, and action modalities) within just one week.
The second strategy is quality-prioritized, focusing on high-quality real-machine data to enhance the model's generalization ability and learning efficiency. Star Atlas and Zibian Robot have adopted this approach.
Star Atlas has constructed the Galaxea Open-World Dataset, the world's first high-quality real-machine dataset for open scenarios. It covers 50 environments, including residences, kitchens, retail, and offices, totaling 500 hours of high-quality mobile operation data, encompassing over 150 tasks, more than 1,600 operational objects, and 58 operational skills.
Wang Qian, founder of Zibian Robot, said that the company has independently developed a series of data collection devices, with dozens of models supporting its data system.
Wang Qian believes that where the Scaling Law is concerned, data quality is paramount, followed by diversity, and finally quantity. In his experience with large model training, a few hundred or a few thousand high-quality samples can often bring significant improvements, while low-quality data, even in the billions, may degrade model performance.
The difference in data selection strategies essentially reflects distinct judgments on the bottlenecks in embodied intelligence development. The former aims to address data scarcity, while the latter focuses on data effectiveness.
This also represents two different perspectives of scientist-led enterprises on embodied intelligence large models: whether to rely on scale to stimulate intelligence or to solidify the foundation for implementation through high-quality data.
Technical Advantages and Commercial Challenges Both Amplified
Scientists, as the most technologically valuable force in the embodied intelligence race, possess distinct advantages and equally notable shortcomings.
Their advantage lies in technological foresight, often anticipating future technological directions earlier than the market.
For instance, Star Evolution Epoch applied the fast-slow system architecture of the HiRT framework (Star Evolution Epoch founder Chen Jianyu is also one of the authors of the HiRT paper) to its self-developed end-to-end native robot large model ERA-42 in 2024. In 2025, the Helix model released by the American star humanoid robot enterprise Figure AI adopted an architecture highly similar to HiRT.
Zibian Robot also anticipated technological trends in advance.
Wang Qian said that the company began developing its any-to-any model in October to November 2024 to achieve multi-modal input and output, and simultaneously completed the research and development of the embodied chain of thought (CoT). This aligns closely with the progress announced by Google's Gemini Robotics in March 2025 and the recent technical direction of the π0.5 model from Physical Intelligence (PI), keeping pace with the international forefront.
Their disadvantage lies in potential insensitivity to commercialization. Scientist founders are accustomed to academic logic and strive for perfect solutions, whereas the market often accepts solutions that are merely good enough.
Balancing technical ideals and scenario requirements poses a significant challenge.
Currently, there are primarily three commercialization paths for scientist-driven embodied intelligence enterprises.
The first is to cover multiple scenarios, including home, industrial, and commercial services, with participants such as Star Evolution Epoch, Leju Robot, and Zibian Robot. This approach is attractive for demonstrating the versatility of humanoid robots and aligns with industry expectations for their ultimate application. However, at the commercialization level, it may be challenging to find a core foothold for a closed loop, often leading to the dilemma of "being proficient in everything but not excelling in anything".
The second path involves initially focusing on specific scenarios or skill directions.
For example, Accelerated Evolution focuses on enhancing robot motion performance and sells primarily into scientific research scenarios. Universal Robotics, on the other hand, focuses on the closed-loop "move-grasp-place" operation, addressing the many such tasks found in retail and industrial scenarios.
The third path involves not directly engaging in scenario development but providing embodied intelligence infrastructure, as Star Atlas and Zhuji Power do. These enterprises primarily offer basic general software and hardware (including embodied general large models + robot bodies) to various developers, enabling individual developers or scenario parties to develop applications.
Currently, embodied intelligence applications are still in the exploratory stage, with an uncertain commercial path, making it difficult to judge which approach is better. However, the true test for an embodied intelligence enterprise is not just the technology itself but also its sustained capabilities in organizational building, fund utilization, and strategic landing points.
Technological breakthroughs are merely the starting point. Transforming these breakthroughs into scalable products and markets is the critical juncture that determines survival.
In the early stages, scientist entrepreneurs often quickly gain capital favor due to their academic aura and technical accumulation.
However, as the company progresses into the commercialization phase, investors shift their focus to more direct indicators: order quantity, customer structure, and landing cases. At this juncture, a lack of a clear business model and market foothold can tighten the financing chain and even stagnate enterprise development.
The strengths and weaknesses of scientist entrepreneurs are simultaneously amplified: they stand at the forefront of technological advancements but must also face the harsh realities of the market.
In the future, their ability to find a clear commercial anchor while maintaining scientific foresight will determine whether scientist entrepreneurs will be the "pioneers" leading embodied intelligence or merely "forerunners" confined to the laboratory and demo stage.