2026 China EV100 Forum: Tsinghua University | Information Intelligence Era Nearing Its Peak by 2028, Physical Intelligence Era Just Unfolding with a 15-Year Opportunity Window

04/14/2026

The era of Information Intelligence is expected to reach its zenith around 2028. Meanwhile, the Physical Intelligence era is just dawning, with the next 10 to 15 years offering the most promising window of opportunity. Beyond this period lies the Biological Intelligence era, characterized by profound human-machine integration.

This three-phase assessment of AI development was presented by Li Shengbo of Tsinghua University at the 2026 China EV100 Forum. The Tsinghua team, which has advocated end-to-end training since 2018, went beyond technical details this time and also offered its reading of the pace at which the industry is developing.

After reading this, you will gain insights into: Tsinghua's comprehensive assessment of AI's three-phase development, the three real challenges Chinese companies face in implementing end-to-end solutions and Tsinghua's proposed solutions, the evolutionary trajectory of three generations of simulation technology, and a comparative analysis of training challenges in intelligent driving and embodied intelligence for robots.

Three Phases of AI Development: Information Intelligence Nears Its Peak, Physical Intelligence Emerges

Tsinghua's assessment divides AI development into three sequential phases, each with distinct temporal boundaries and technological characteristics.

Phase One - Information Intelligence: From ResNet and AlphaGo to ChatGPT and DeepSeek, AI in this phase operates primarily in the digital realm, processing text, images, and voice. Tsinghua predicts that this phase will largely peak around 2028, with products like Doubao and ChatGPT already showing relatively mature forms of interaction.

Phase Two - Physical Intelligence: AI ventures into the physical world, with autonomous driving and robotics as the primary implementation avenues. This phase is just commencing, with the next 10 to 15 years representing the most fertile ground for innovation, where numerous new technologies, methodologies, and companies will emerge.

Phase Three - Biological Intelligence: Quantum computing, artificial life, and deep human-machine integration. Tsinghua anticipates this phase will materialize in about 15 to 20 years or even further into the future.

These three phases are not parallel technological options but sequential historical stages. As of 2026, the window for Physical Intelligence has already opened, yet most people have not grasped how large it is.

End-to-End Implementation: Three Real Challenges for Chinese Companies

In his speech, Li Shengbo did not shy away from a critical question: many domestic companies talk about end-to-end models and VLA (vision-language-action) models, but what are the real challenges? Tsinghua posed three direct questions.

Data Scale: Can our data volume rival Tesla's? Tsinghua's stance is clear—"Discussing model training without addressing data scale is futile." Data scale determines the fundamental performance boundary.

Computing Power Support: Can our computing power sustain training with hundreds of millions of parameters? Tesla relies on very large cloud computing platforms for continuous updates and iteration, a level of infrastructure that is hard for domestic companies to match.

Algorithm Path: Are our training paths still confined to supervised learning? The emergence of DeepSeek offers inspiration—more efficient algorithms can outperform paths reliant on data scale and computing power accumulation.

Tsinghua's Solutions: Generate data through simulation to address data scarcity; overcome performance limitations through reinforcement learning rather than imitation learning; and break free from computing power dependency through efficient algorithm design. These three points have been Tsinghua's primary research focuses since 2018.
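The contrast between imitation learning and reinforcement learning can be illustrated with a deliberately tiny sketch. It is not Tsinghua's method and does not use GOPS; the action names, reward values, and demonstration set are all hypothetical, chosen only to show that a policy copied from demonstrations is bounded by the demonstrator, while a policy trained on reward feedback can find a better action.

```python
import random

# Hypothetical toy setting: three driving actions with fixed rewards.
REWARDS = {"brake": 0.2, "follow": 0.5, "overtake": 0.9}

# Hypothetical demonstrations from a cautious human driver.
DEMOS = ["follow"] * 8 + ["brake"] * 2


def imitation_policy(demos):
    """Simplest form of imitation: copy the most frequent demonstrated action."""
    return max(sorted(set(demos)), key=demos.count)


def reinforcement_policy(rewards, episodes=200, epsilon=0.2, seed=0):
    """Epsilon-greedy value learning that uses reward feedback only."""
    rng = random.Random(seed)
    actions = list(rewards)
    value = {a: 0.0 for a in actions}
    count = {a: 0 for a in actions}

    def update(action):
        count[action] += 1
        # Incremental mean of the rewards observed for this action.
        value[action] += (rewards[action] - value[action]) / count[action]

    for action in actions:  # try every action once before exploiting
        update(action)
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)  # explore
        else:
            action = max(actions, key=value.get)  # exploit
        update(action)
    return max(actions, key=value.get)


cloned = imitation_policy(DEMOS)
learned = reinforcement_policy(REWARDS)
print("imitation:", cloned, REWARDS[cloned])
print("reinforcement:", learned, REWARDS[learned])
```

In this toy case the cloned policy settles on the demonstrator's habitual action, while the reward-driven policy settles on the highest-reward action. Real driving policies are conditioned on sensor state and trained in simulation, which this sketch leaves out entirely.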

Three Generations of Simulation Evolution: From Physics Engines to World Models, Far from Complete

Tsinghua posits a core proposition: Simulation is the key to addressing data scarcity, yet simulation itself is far from reaching its peak.

The first generation comprises simulation platforms based on physics engines, simulating sensors like LiDAR, cameras, and millimeter-wave radars. Tsinghua has been deeply involved in this direction since 2018.

The second generation features scene reconstruction technology based on neural reconstruction methods such as NeRF and 3D Gaussian Splatting (3DGS), which significantly enhances the realism of sensor simulation.

The third generation encompasses the world models currently under discussion—generative scenes capable of covering long-tail scenarios and directly producing training data for end-to-end models.

Tsinghua's GOPS reinforcement learning platform, under continuous development since 2021, integrates a series of self-developed efficient algorithms: DACC, the RAD optimizer, Lipschitz neural networks, the safe reinforcement learning algorithm RAX, the multimodal reinforcement learning algorithm DENS, the world model algorithm BOOM, the perception filter Nano, and STAP for large models. The core objective of these algorithms is to improve training performance when computing power is limited.

From Intelligent Driving to Robotics: Difficulty Increases Significantly

Tsinghua notes that many people working on autonomous driving are now moving into embodied intelligence, a direction it considers correct because the two technology stacks are closely related. However, many have underestimated the difference in difficulty.

From a data scale perspective: Autonomous driving may require around 100 million data segments to reach entry-level performance, while robotics may need billions or even tens of billions. From a model scale perspective: Intelligent driving can manage with 1B to 10B parameters, while robotics needs about 100B just to get started. From a training difficulty perspective: Whether with supervised learning or reinforcement learning, robotics is 5 to 10 times more difficult than autonomous driving.

The reason is evident: Cars operate on structured roads with limited interaction objects (pedestrians, vehicles, roadside facilities); robots must navigate arbitrary scenarios with far greater freedom, where everything within sight may be an interaction object.
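The scale figures quoted above can be put side by side with a few lines of arithmetic. This is only a back-of-the-envelope sketch: "billions" is read as 1 billion and "tens of billions" as 10 billion, which are assumptions made for illustration, and the variable names are hypothetical.

```python
# Figures as quoted in the speech summary; the lower and upper bounds for
# robotics data are illustrative readings of "billions" and "tens of billions".
driving_segments = 100_000_000
robot_segments_low, robot_segments_high = 1_000_000_000, 10_000_000_000

driving_params_low, driving_params_high = 1_000_000_000, 10_000_000_000
robot_params = 100_000_000_000

# How many times more data robotics would need than driving.
data_ratio = (robot_segments_low / driving_segments,
              robot_segments_high / driving_segments)

# How many times larger the robotics model would be than a driving model.
param_ratio = (robot_params / driving_params_high,
               robot_params / driving_params_low)

print(f"data:   {data_ratio[0]:.0f}x to {data_ratio[1]:.0f}x")
print(f"params: {param_ratio[0]:.0f}x to {param_ratio[1]:.0f}x")
```

Under these readings, both the data requirement and the model size grow by roughly one to two orders of magnitude when moving from driving to robotics.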

Tsinghua's assessment: Cars represent the initial step in embodied intelligence and the optimal engineering training ground for the robotics era. The accumulated data flywheel, algorithm experience, and simulation capabilities will all become core assets for entering the robotics domain.

Tsinghua's Research Achievements and Open-Source Tools

Tsinghua also presented its practical track record in the speech. According to Li Shengbo, the first domestic end-to-end autonomous driving model was completed by Tsinghua. It is fully neural network-based, covers the entire chain of perception, prediction, decision-making, planning, localization, and control, and was verified in real-vehicle experiments in 2023.

Additionally, Tsinghua has released its autonomous driving simulation software and reinforcement learning training software to the industry as open source, aiming to support the development of the industry as a whole. The GOPS platform integrates mainstream algorithms, test environments, and datasets, and is intended to give companies a one-stop solution to their training challenges.

The Role of Universities: Tsinghua provided a clear positioning in this speech—not to manufacture products but to offer simulation data generation capabilities, efficient algorithm research, and open-source toolchains to assist the entire industry in advancing under conditions of data scarcity and limited computing power.

After Reading This, You Have Gained

Era Judgment: Information Intelligence is expected to reach its zenith around 2028, and the 10 to 15-year window for Physical Intelligence has just opened, making this a pivotal moment for the industry.

Three Real Challenges: Data scale, computing power support, and algorithm efficiency—Tsinghua's direct inquiries into the current state of domestic end-to-end implementation, along with solutions through simulation + reinforcement learning.

Three Generations of Simulation Evolution: From physics engines to 3DGS to world models, the research thread of Tsinghua's GOPS platform and a series of efficient algorithms.

From Intelligent Driving to Robotics: Data requirements rise from about 100 million segments to billions or tens of billions, model requirements from 1B to 10B parameters up to about 100B, and training difficulty increases by 5 to 10 times. Cars represent the optimal training ground for the robotics domain.

"Physical Intelligence is actually just beginning to flourish now. The next 10 or 15 years will witness a surge of new technologies, methodologies, and companies in this field." — Tsinghua University · 2026 China EV100 Forum

This article is compiled from the transcript of the speech given by Li Shengbo of Tsinghua University at the 2026 China EV100 Forum, incorporating Jack's insights and AI skills. It strives to present the core information and industry trends of the speech objectively and to offer some information and inspiration to the industry. It does not represent the position of Vehicle.
