What Has Huawei's Pangu Large Model Done in the Autonomous Driving Field?


06/16 2026

Huawei's Pangu Large Model has recently drawn wide discussion because of certain events. As a platform focused on technical content, Intelligent Driving Frontier discusses technology rather than gossip, but the attention is a timely reason to take a closer look. Drawing on publicly available information and some official promotional material, this article reviews the Pangu Large Model and its technical layout in the field of autonomous driving. Rational comments are welcome!

Technical Architecture and Evolution of the Pangu Large Model

The Pangu Large Model was first officially released at Huawei's Developer Conference in April 2021. Initially, it mainly included three foundational models: NLP (Natural Language Processing), CV (Computer Vision), and scientific computing. The Pangu NLP Large Model was the industry's first Chinese pre-trained large model with hundreds of billions of parameters, while the CV Large Model had 3 billion parameters, both of which were leading in the industry at the time.

In April 2022, the Pangu Large Model was upgraded to version 2.0, formally establishing a hierarchical development architecture of L0, L1, and L2. In the same year, Huawei successively released industry-specific large models for vertical scenarios such as mining, meteorology, and ocean waves, marking the extension of Pangu from a general-purpose large model to industry applications. In July 2023, the Pangu Large Model 3.0 was officially released, establishing a 5+N+X hierarchical architecture and explicitly positioning itself as 'not writing poems, but getting things done,' focusing on B-end industrial scenario implementation.


Subsequently, the Pangu Large Model kept to an annual upgrade pace: version 5.0 in June 2024 introduced Spatio-Temporal Controllable Generation (STCG) technology; version 5.5 in June 2025 brought comprehensive upgrades to the five foundational models; and openPangu 2.0 was officially released in June 2026. Huawei plans to open up seven core components, including pre-training code, post-training code, and training operators, in batches starting from June 30, 2026.

The underlying training of the Pangu Large Model runs on Huawei's self-developed Ascend AI cloud services. On the hardware side, the new generation of Ascend AI cloud services released in June 2025 adopts the CloudMatrix 384 super-node architecture, which integrates 384 Ascend NPUs and 192 Kunpeng CPUs into a single super AI server through full peer-to-peer interconnection. Single-card inference throughput reaches 2300 tokens/s, approximately four times that of non-super-node architectures. The cloud service also supports mainstream AI frameworks such as PyTorch and TensorFlow and provides operator migration tools for moving most operators developed on GPU platforms to the Ascend platform.

At the software architecture level, the Pangu Large Model adopts a 5+N+X three-layer design. The L0 layer includes five foundational large models: natural language processing, computer vision, multimodal, prediction, and scientific computing, forming a general-purpose capability base through pre-training with hundreds of billions of parameters. The L1 layer builds industry-specific large models through industry data injection training on top of the foundational models, covering fields such as government affairs, finance, manufacturing, mining, and meteorology. The L2 layer focuses on fine-tuning for specific business scenarios, providing scenario-based model services. This hierarchical and decoupled design allows customers to independently load datasets, upgrade foundational models or capability sets separately, and choose deployment forms such as public cloud, large model cloud zones, or hybrid cloud based on data security and compliance requirements.
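
The layering described above can be pictured as a simple parent-child chain. The following is only an illustrative sketch, not Huawei's implementation: the `ModelLayer` class, its fields, and the L2 scenario name are hypothetical and chosen purely to show how an L2 scenario model traces back through an L1 industry model to an L0 foundational model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelLayer:
    """One model in the hierarchy, optionally derived from a parent model."""
    name: str
    level: str  # "L0", "L1", or "L2"
    parent: Optional["ModelLayer"] = None

    def lineage(self) -> List[str]:
        """Return model names from the foundational model down to this one."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return list(reversed(chain))


# L0 and L1 names follow the article; the L2 scenario is a made-up example.
cv = ModelLayer("CV foundational model", "L0")
mining = ModelLayer("Mining industry model", "L1", parent=cv)
inspection = ModelLayer("Conveyor inspection (hypothetical)", "L2", parent=mining)

print(" -> ".join(inspection.lineage()))
```

Because each layer only references its parent, swapping in a newer L0 model or retraining a single L2 scenario does not require touching the other layers, which is the decoupling the article attributes to the 5+N+X design.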


The Pangu Large Model version 5.5 was officially released in June 2025, with upgrades to all five foundational models.

The NLP Large Model introduced a 718-billion-parameter MoE deep thinking model composed of 256 experts, enhancing capabilities in knowledge reasoning, tool invocation, and mathematics. Adaptive fast-slow thinking integration allows the model to automatically switch thinking modes based on problem difficulty, providing quick responses to simple questions and mobilizing more computational power for complex reasoning, improving overall reasoning efficiency by eight times. Additionally, Pangu DeepDiver undergoes exploratory training in real internet environments through search intensity scaling technology, with the 7B-scale DeepDiver performing comparably to the 671B-scale DeepSeek-R1 in multiple benchmark tests.
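
To give a sense of why a mixture-of-experts model with 256 experts does not run all of its parameters for every token, here is a minimal sketch of generic top-k expert routing. It is not a description of Pangu's router: the article gives only the expert count, so the value of `K`, the random router scores, and the function name `top_k_route` are all assumptions made for illustration.

```python
import math
import random


def top_k_route(router_logits, k):
    """Select the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


random.seed(0)
NUM_EXPERTS = 256  # expert count stated in the article
K = 8              # assumption: the article does not say how many experts are active

# Stand-in router scores for one token; a real router would compute these
# from the token's hidden state.
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routes = top_k_route(logits, K)

print(f"{K} of {NUM_EXPERTS} experts active for this token")
for expert_id, weight in routes:
    print(f"expert {expert_id:3d}  weight {weight:.3f}")
```

In this kind of design only the selected experts process the token, and their outputs are combined using the normalized weights.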

The CV Large Model was upgraded to a 30-billion-parameter visual model with an MoE architecture, claimed to be the largest visual model in the industry at the time, supporting multi-dimensional perception, analysis, and decision-making across images, infrared, laser point clouds, spectra, and radar.

The prediction large model adopted a triplet transformer unified pre-training architecture, encoding data from different industries (such as tabular data of process parameters, time series data of equipment operation logs, and image data of product inspections) into unified triplets for efficient processing and pre-training within the same framework.
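
The article does not specify what the triplets contain, so the sketch below assumes one plausible reading: every observation becomes an `(entity, field, value)` tuple regardless of its source modality. The helper names and the sample data are hypothetical; the point is only to show how tables, time series, and images can share one record shape before being fed to a single model.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed triplet layout: (entity id, field or position, numeric value).
Triplet = Tuple[str, str, float]


def table_to_triplets(row_id: str, row: Dict[str, float]) -> List[Triplet]:
    """One triplet per column of a tabular record."""
    return [(row_id, column, float(value)) for column, value in row.items()]


def series_to_triplets(series_id: str,
                       samples: Sequence[Tuple[int, float]]) -> List[Triplet]:
    """One triplet per (timestamp, value) sample of a time series."""
    return [(series_id, f"t={t}", float(v)) for t, v in samples]


def image_to_triplets(image_id: str,
                      pixels: Sequence[Sequence[float]]) -> List[Triplet]:
    """One triplet per pixel of a small grayscale image."""
    return [(image_id, f"px={r},{c}", float(v))
            for r, row in enumerate(pixels)
            for c, v in enumerate(row)]


# Made-up sample inputs for the three modalities mentioned in the article.
triplets = (
    table_to_triplets("batch-17", {"temperature": 812.5, "pressure": 3.2})
    + series_to_triplets("pump-3", [(0, 0.91), (1, 0.93), (2, 0.97)])
    + image_to_triplets("part-42", [[0.0, 0.5], [0.5, 1.0]])
)

for entity, field, value in triplets:
    print(entity, field, value)
```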

The upgrade direction of the multimodal large model is the World Model, which will be discussed separately later.

The scientific computing large model collaborated with the Shenzhen Meteorological Bureau to upgrade the ZhiJi Large Model, achieving AI ensemble forecasting for the first time; and with the Chongqing Meteorological Bureau to create the Tianzi·12h Meteorological Large Model for intra-day forecasting and warning of disastrous weather.

Overall, the technical route of the Pangu Large Model emphasizes industry implementation rather than general-purpose dialogue capabilities, with its hierarchical design and optimization for computational efficiency centered around this goal.

Pangu World Model and STCG: A New Path for Autonomous Driving Development

In autonomous driving development, data has always been the most critical bottleneck. In theory, making an autonomous driving system sufficiently reliable requires training on hundreds of billions of kilometers of driving data collected from real roads, a prohibitive cost for any automaker. The Pangu Large Model addresses this problem in two stages: first STCG, then the World Model.

1) STCG: Enabling Models to Understand the Physical World

Spatio-Temporal Controllable Generation (STCG) technology, introduced in Pangu version 5.0, focuses on enabling large models to generate driving videos that are not only visually realistic but also comply with physical laws. Unlike traditional simulation tools that rely on game rendering engines, STCG directly embeds modeling of spatial structure and temporal changes within the model. Vehicle transitions between different camera perspectives are smooth, and vehicle behavior aligns with real-world logic under different weather and lighting conditions, such as automatically turning on taillights in rainy conditions. During the HDC 2024 live demonstration, the model generated scenarios ranging from deserted streets to complex traffic conditions with multiple vehicles and simultaneously changed vehicle details when switching between sunny and rainy conditions with a single click.


From a technical implementation perspective, Pangu added three input modules—3D bounding box encoder, BEV road network encoder, and camera trajectory encoder—to the VAE and DiT architectures of the video generation large model. Through joint processing of 3D bounding boxes and BEV road network maps, multi-view associative learning can be achieved. Its training data uses camera data from six perspectives, with a cumulative collection and processing of 200,000 high-quality frames. Combined with scenario video generation, 4D BEV video generation, autonomous driving simulation libraries, and road network information, STCG can mass-generate physically consistent driving video data and flexibly add control conditions to customize training data for different road conditions, lighting, and weather. STCG can also generate random, occasional, and adversarial scenarios—edge cases that are difficult to obtain in large quantities through real-world road collection in autonomous driving development.
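
One way to picture "flexibly adding control conditions" is as a grid of generation requests, one per combination of conditions. The sketch below is a hypothetical illustration and not an STCG interface: the article names only the three categories (road conditions, lighting, weather), so the specific values and the `build_condition_grid` helper are assumptions.

```python
from itertools import product

# Hypothetical condition vocabularies; only the category names come from the article.
ROAD_CONDITIONS = ["urban_intersection", "highway", "roundabout"]
LIGHTING = ["day", "night"]
WEATHER = ["sunny", "rainy", "foggy"]


def build_condition_grid(roads, lighting, weather):
    """Enumerate every combination of control conditions as a request dict."""
    return [
        {"road": r, "lighting": l, "weather": w}
        for r, l, w in product(roads, lighting, weather)
    ]


grid = build_condition_grid(ROAD_CONDITIONS, LIGHTING, WEATHER)
print(f"{len(grid)} condition combinations")
for request in grid[:3]:
    print(request)
```

Even this small vocabulary yields 18 distinct combinations, which hints at why conditional generation can cover scenario variety faster than waiting for each combination to occur during real-world collection.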

2) World Model: From Video Generation to Building Digital Spaces

The Pangu World Model, released in 2025, is built on the multimodal large model. It requires minimal input: in intelligent driving, only the initial driving scene, driving control information, and road network data are needed to generate driving videos from each camera's perspective along with the corresponding LiDAR point cloud data. In other words, starting from an initial state, the model can keep imagining the subsequent driving process, continuing the video at 30 frames per second.
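
The "start from an initial state and keep imagining" idea is an autoregressive rollout: each predicted frame becomes the input for the next prediction. The sketch below shows only that loop structure. It is a toy under stated assumptions: `predict_next` is a simple kinematic update standing in for the learned model, and the `State` and `Control` types are hypothetical simplifications of the scene, control, and road network inputs the article mentions.

```python
import math
from typing import List, NamedTuple

FPS = 30  # frame rate stated in the article


class State(NamedTuple):
    x: float        # metres
    y: float        # metres
    heading: float  # radians


class Control(NamedTuple):
    speed: float     # metres per second
    yaw_rate: float  # radians per second


def predict_next(state: State, control: Control, dt: float) -> State:
    """Toy stand-in for the world model: advance the ego state by one frame."""
    heading = state.heading + control.yaw_rate * dt
    return State(
        state.x + control.speed * math.cos(heading) * dt,
        state.y + control.speed * math.sin(heading) * dt,
        heading,
    )


def rollout(initial: State, controls: List[Control]) -> List[State]:
    """Feed each predicted frame back in as the input for the next step."""
    dt = 1.0 / FPS
    frames = [initial]
    for control in controls:
        frames.append(predict_next(frames[-1], control, dt))
    return frames


# One second of driving straight ahead at 10 m/s.
frames = rollout(State(0.0, 0.0, 0.0), [Control(10.0, 0.0)] * FPS)
print(f"{len(frames) - 1} frames generated, final x = {frames[-1].x:.2f} m")
```

A real world model would emit camera images and LiDAR point clouds per step rather than a pose, but the feedback loop has the same shape, and it is also why errors can accumulate over long rollouts.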

A typical application of the World Model in the autonomous driving field is the reconstruction of complex edge scenarios. GAC Group, in collaboration with Huawei Cloud, achieved pixel-level precise correspondence between 2D videos and 3D point cloud data based on the Pangu multimodal large model, enabling complex scenario restoration within minutes. GAC also developed the Shenxing Simulation Platform on this basis, improving geometric consistency in controllable video generation by 80%. Traditional simulation scenario construction requires extensive manual modeling, whereas the World Model can generate complete simulation environments directly from limited input for iterative training of end-to-end autonomous driving models. This rapid reconstruction of edge scenarios is difficult to achieve with traditional simulation tools.


The World Model also demonstrates Pangu's capabilities in broader physical simulation scenarios. In a Mars exploration demonstration, based on a single image of the Martian surface, the World Model could generate high-precision digital physical spaces for obstacle avoidance training and robotic arm operation simulation for Mars rovers. Although not directly related to autonomous driving, this reflects the model's foundational capabilities in multimodal generation and physical law modeling.

It should be added that the industry is still debating whether simulated data can fully replace data collected on real roads. Issues such as distribution bias in simulated data and overfitting of models to simulation environments have not been completely resolved. However, STCG and the World Model at least provide a way to increase data diversity and compensate for the scarcity of real-world data. Their value lies in helping developers cover more edge scenarios more efficiently, not in completely replacing real-world road testing.

Octopus Platform: Engineering Integration of Technical Capabilities

The capabilities of the Pangu Large Model do not exist independently but are open to automakers and developers through Huawei Cloud's Octopus Autonomous Driving Cloud Service Platform. Octopus is a one-stop, fully managed autonomous driving development platform that integrates toolchains for data annotation, model training, and simulation testing.


In the data annotation stage, the Pangu Large Model provides automatic annotation for 2D, 2.5D, and 3D data, with a claimed annotation accuracy exceeding 90%. For scenario understanding, the model can replace manual classification and tagging of video clips, processing tens of thousands of video segments within minutes. For data retrieval, the platform supports multimodal retrieval such as text-to-image and image-to-image search, enabling minute-level retrieval in libraries of millions of images.

Huawei's Octopus Autonomous Driving Cloud Service Platform also provides parallel simulation, using cloud resources to run 1000+ simulation nodes simultaneously and reach tens of millions of kilometers of virtual testing mileage per day. The platform includes a built-in library of 200,000+ structured simulation scenarios and lets users build custom scenario combinations and evaluation metrics through custom tag systems and programmable assessment scripts, helping automakers verify algorithm performance efficiently and accelerate the mass production of autonomous driving functions.
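
A quick back-of-the-envelope calculation helps put these throughput figures in context. The sketch below takes the lower bounds of the article's numbers (1000 nodes, 10 million km per day, and 100 billion km as the low end of "hundreds of billions"); the 40 km/h average simulated driving speed is purely an assumption for illustration.

```python
NODES = 1_000                # lower bound of "1000+" simulation nodes
KM_PER_DAY = 10_000_000      # lower end of "tens of millions of kilometers per day"
TARGET_KM = 100_000_000_000  # lower end of "hundreds of billions of kilometers"
AVG_SPEED_KMH = 40.0         # assumption: average simulated driving speed

km_per_node_per_day = KM_PER_DAY / NODES
km_per_node_per_hour = km_per_node_per_day / 24
# How much faster than a single real-time drive each node would have to run.
speedup_vs_real_time = km_per_node_per_hour / AVG_SPEED_KMH
days_to_target = TARGET_KM / KM_PER_DAY

print(f"per node: {km_per_node_per_day:,.0f} km/day, "
      f"{km_per_node_per_hour:,.0f} km/h")
print(f"implied speed-up over real time: {speedup_vs_real_time:.1f}x")
print(f"days to reach target mileage: {days_to_target:,.0f} "
      f"(about {days_to_target / 365:.0f} years)")
```

Under these assumptions each node would cover roughly 417 km per hour, which implies either faster-than-real-time simulation or several simulation instances per node. Reaching the 100-billion-kilometer figure at this daily rate would still take on the order of decades, which is consistent with treating simulation as a complement to targeted scenario coverage and not as a brute-force mileage substitute.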

It is worth mentioning that the Octopus Platform has been deeply integrated with the Pangu World Model. The World Model's multi-view video and point cloud generation capabilities are used directly for parallel simulation of end-to-end intelligent driving models, allowing automakers to test and iterate on models rapidly using generated simulation data. According to GAC, this combination supports an iteration cadence of two versions per day for end-to-end models. From an industry-wide perspective, however, this speed primarily reflects model tuning efficiency in simulation environments; real-world road testing still needs to proceed step by step according to safety regulations.

Current Status of Industry Applications and Challenges Faced

Based on publicly available information, the Pangu large model has been deployed at some scale in the automotive industry. Huawei Cloud was recognized by Frost & Sullivan as the leader in China's automotive large model market in 2024, with over 300 automotive industry clients adopting its solutions. FAW Jiefang and Huawei have conducted validation tests in multiple scenarios based on the Pangu large model; Huawei Cloud's autonomous driving development platform has been deployed at automakers such as Changan, FAW, BYD, and GAC, as well as in commercial vehicle scenarios including mining trucks, port ARTs, and dedicated logistics heavy trucks.


In the commercial vehicle sector, the Pangu large model is used for the development, validation, and optimization of autonomous driving algorithms, helping to reduce testing costs and risks. However, most of these collaborative projects are still in the validation testing phase and are some distance away from large-scale mass production applications.

In the field of autonomous driving, the domain gap between simulated and real data has always been a common challenge faced by the industry. Although the videos generated by STCG are visually close to reality, simulated environments cannot fully replicate all the uncertainties of real roads. A model performing well in simulation does not necessarily mean it is equally reliable on actual roads. Additionally, defining the boundary range for edge scenario generation is difficult. Verifying whether the generated scenarios cover a sufficient variety of dangerous situations and whether there are any uncovered blind spots incurs high costs. Furthermore, the architecture and some technical details of the Pangu large model have not been fully disclosed. The industry's assessment of its technical capabilities mainly relies on benchmark test results released by Huawei, with limited independent third-party verification.

Final Remarks

From a technological development perspective, the Pangu large model provides a technical pathway for autonomous driving development that differs from traditional reliance on large-scale road-collected data—namely, using generative simulation to drive data supplementation and model iteration. STCG and world models have demonstrated feasible methods for physically consistent generation and multimodal data alignment, and the Octopus platform integrates these capabilities into a developer-friendly toolchain.

Of course, this does not mean that road testing for autonomous driving can be replaced. A more accurate understanding is that the Pangu large model offers a method to reduce data acquisition costs and improve the efficiency of edge scenario coverage. It will play an important supporting role in the toolchain for autonomous driving development but still has a considerable way to go before becoming a complete solution for autonomous driving technology.
