02/10 2025
Source: Smart Auto Technology
The domestic smart auto industry is undergoing an AI-driven technological revolution. DeepSeek, with its open-source ecosystem and efficient training capabilities, has quickly won the attention and cooperation of domestic automakers. Amid the transition from traditional cars to "software-defined vehicles," advances in technologies such as the Internet of Vehicles and autonomous driving have raised the demand for computing power and pushed up R&D costs.
Open Source + Efficient Training: The Logic Behind Automakers Embracing DeepSeek
DeepSeek's open-source model offers fresh perspectives to the automotive industry. On February 6, Geely Automobile announced that its self-developed large model Xingrui has completed deep integration with DeepSeek. Geely will conduct distillation training for the Xingrui vehicle control FunctionCall large model and the automotive active interaction end-side large model. This will enable Geely's smart auto AI to not only precisely understand users' vague intentions, thereby accurately invoking about 2,000 in-vehicle interfaces, but also proactively analyze users' potential needs based on scenarios inside and outside the vehicle, providing proactive vehicle control, active dialogue, after-sales services, etc., significantly enhancing the intelligent interaction experience.
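To make the function-call idea concrete, here is a minimal sketch of how a model's structured output might be validated and routed to an in-vehicle interface. Everything in it is an assumption for illustration: the registry, the two interface names, the JSON shape, and the clamping ranges are hypothetical and do not describe Geely's actual interfaces.

```python
import json
from typing import Callable, Dict

# Hypothetical registry standing in for the roughly 2,000 in-vehicle interfaces.
REGISTRY: Dict[str, Callable[..., str]] = {}

def interface(name: str):
    """Register a function under the name the model is allowed to call."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        REGISTRY[name] = fn
        return fn
    return register

@interface("set_cabin_temperature")
def set_cabin_temperature(celsius: float) -> str:
    # Clamp to an assumed comfort range before acting on the request.
    celsius = max(16.0, min(30.0, float(celsius)))
    return f"cabin temperature set to {celsius:.1f} C"

@interface("open_window")
def open_window(position: str, percent: int) -> str:
    percent = max(0, min(100, int(percent)))
    return f"{position} window opened to {percent}%"

def dispatch(model_output: str) -> str:
    """Validate a JSON function call emitted by the model, then execute it."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return "rejected: model output is not valid JSON"
    if not isinstance(call, dict):
        return "rejected: expected a JSON object"
    name = call.get("name")
    args = call.get("arguments", {})
    if not isinstance(name, str) or name not in REGISTRY:
        return f"rejected: unknown interface {name!r}"
    if not isinstance(args, dict):
        return "rejected: arguments must be a JSON object"
    try:
        return REGISTRY[name](**args)
    except (TypeError, ValueError):
        return f"rejected: bad arguments for {name!r}"

# A vague request such as "it's stuffy in here" might be resolved by the
# model into a call like this one (assumed output format).
print(dispatch('{"name": "open_window", "arguments": {"position": "front-left", "percent": 30}}'))
```

The point of the design is that the model only proposes a call. The dispatcher decides whether the name exists and whether the arguments are acceptable, so an unknown or malformed call is rejected rather than executed.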
On February 7, Voyah announced that its model, the Voyah Zhiyin, had completed deep integration with the DeepSeek model, becoming the first mass-produced model in the automotive industry to adopt this technology. Previously, Voyah's cabin had completed access and deployment of the full series of DeepSeek models, with the AI agent integrating with the DeepSeek large model in the cloud to offer a precise intelligent interaction experience.
Figure 1: Geely Lingdu Cabin, Equipped with Geely and DeepSeek Integrated AI Model
There is a consensus in the industry that the automotive industry is evolving from traditional electric vehicles (EVs) to intelligent electric vehicles (EIVs). AI and cloud computing are driving the competitiveness of new energy vehicles. The integration of AI and cloud computing has become a key trend in China's new energy vehicle (NEV) market, with automakers leveraging these technologies to enhance vehicle performance and provide users with a more personalized driving experience.
At the recent World Economic Forum in Davos, Pan Jian, Co-Chairman of CATL, emphasized that the rapid growth of the Chinese electric vehicle market is inseparable from intelligence. He stated that the seamless integration of electrification and intelligence is crucial for this transformation, because electric power is what enables advanced intelligent functions in vehicles. Combining this foundation with large models like DeepSeek will further consolidate China's leading position in the global intelligent electric vehicle field.
In the pursuit of efficient model deployment, distillation technology has become another strength for DeepSeek in empowering automakers. While traditional large models are powerful, their high storage and operating costs make them challenging to adapt to in-vehicle scenarios. DeepSeek R1 uses knowledge distillation to transfer the "experience" of large language models to lightweight models, preserving core capabilities while significantly reducing hardware load. It is reported that Geely will conduct distillation training for the Xingrui vehicle control FunctionCall large model and the automotive active interaction end-side large model, infusing DeepSeek R1's semantic understanding and priority decision-making capabilities into the vehicle control module. This "human-like thinking" decision-making logic will help automakers balance cost and performance, accelerating the transition of AI models from the lab to mass-produced vehicles.
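As a rough illustration of what "transferring experience" means, the following sketch computes the classic soft-label distillation loss: the student is penalized for diverging from the teacher's temperature-softened output distribution. This is the textbook formulation, not a confirmed description of the recipe used by DeepSeek or Geely, and the logits and temperature are made-up values.

```python
import math
from typing import List

def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Softmax over logits divided by a temperature (higher = softer)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits: List[float],
                      student_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Assumed example: scores over three candidate vehicle-control actions.
teacher = [4.0, 1.0, 0.5]
student_far = [0.0, 3.0, 0.0]    # disagrees with the teacher
student_near = [3.5, 1.2, 0.4]   # close to the teacher

print(distillation_loss(teacher, student_far))
print(distillation_loss(teacher, student_near))
```

Training the small model to minimize this loss pulls its output distribution toward the large model's, which is how a lightweight end-side model can retain part of the larger model's behavior at a fraction of the hardware cost.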
Future Expansion of DeepSeek in the Smart Vehicle Industry: Brain-Like End-to-End Autonomous Driving
In the realm of autonomous driving, no autonomous driving model is universally recognized as superior to human drivers based on statistical testing with large datasets. How to make machines understand the "operational priority expressed by humans using semantics" is precisely one of the best entry points for DeepSeek and other potential large language models. Moreover, this technological entry point aligns with the latest trend of end-to-end autonomous driving technology.
Figure 2: Comparison of Three-Stage and End-to-End Autonomous Driving (image from ResearchGate)
Traditional autonomous driving systems are often composed of multiple independent modules or stages, such as perception, decision-making, and control. End-to-end autonomous driving technology aims to integrate these modules into a unified large neural network model, directly mapping raw sensor inputs (images, point clouds, ultrasonic radars, etc.) to vehicle control commands to achieve autonomous driving functionality.
Specifically, end-to-end autonomous driving technology takes raw sensor data (from cameras, radars, and LiDARs) as input and processes it through deep learning networks to directly output vehicle control commands (steering wheel angle, throttle, and brakes). This eliminates intermediate steps and modules in traditional autonomous driving systems, simplifying system design and implementation.
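The structural difference between the two approaches can be sketched in a few lines. The code below is a toy under stated assumptions: the sensor frame is reduced to two numbers, the three modular stages are hand-written rules, and the "end-to-end" function is a tiny parameterized mapping whose weights stand in for a trained neural network. None of the names or constants come from a real driving stack.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class SensorFrame:
    obstacle_distance_m: float  # toy stand-in for camera/radar/LiDAR data
    lane_offset_m: float        # positive = vehicle is right of lane center

@dataclass
class Control:
    steering: float  # -1 (full left) .. 1 (full right)
    throttle: float  # 0 .. 1
    brake: float     # 0 .. 1

def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))

# Modular pipeline: three separately designed stages with explicit interfaces.
def perceive(frame: SensorFrame) -> Dict[str, float]:
    return {"obstacle_close": float(frame.obstacle_distance_m < 10.0),
            "offset": frame.lane_offset_m}

def plan(scene: Dict[str, float]) -> Dict[str, float]:
    return {"stop": scene["obstacle_close"], "correction": -scene["offset"]}

def act(decision: Dict[str, float]) -> Control:
    stop = decision["stop"] == 1.0
    return Control(steering=clamp(0.5 * decision["correction"], -1.0, 1.0),
                   throttle=0.0 if stop else 0.3,
                   brake=1.0 if stop else 0.0)

def modular_drive(frame: SensorFrame) -> Control:
    return act(plan(perceive(frame)))

# End-to-end: one function from raw input to control, governed by parameters
# that would be learned from driving data (hypothetical values here).
def end_to_end_drive(frame: SensorFrame, weights: Dict[str, float]) -> Control:
    steering = clamp(weights["steer_gain"] * frame.lane_offset_m, -1.0, 1.0)
    urgency = clamp(weights["brake_gain"] / max(frame.obstacle_distance_m, 0.1), 0.0, 1.0)
    return Control(steering=steering, throttle=0.3 * (1.0 - urgency), brake=urgency)

frame = SensorFrame(obstacle_distance_m=5.0, lane_offset_m=0.4)
print(modular_drive(frame))
print(end_to_end_drive(frame, {"steer_gain": -0.5, "brake_gain": 5.0}))
```

In the modular version every intermediate value can be inspected and tested, while in the end-to-end version behavior changes only by changing the weights. That trade-off is where the interpretability and data requirements discussed in this section come from.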
The primary advantage of end-to-end autonomous driving technology lies in its ability to improve system performance and robustness through end-to-end training using large datasets. It can learn higher-level features and abstract representations from raw data, automatically discovering and understanding complex traffic scenarios and driving behaviors. Additionally, end-to-end autonomous driving technology can adapt to different roads and environmental conditions, exhibiting better generalization capabilities.
End-to-end autonomous driving technology also brings challenges: it needs large-scale data and substantial computing resources, and it raises interpretability and safety concerns. The biggest potential drawback is that the resulting single unified model may be extremely large, with correspondingly high storage, training, and operating overhead and energy consumption. Although the industry generally accepts that the end-to-end approach is the future of autonomous driving, Tesla remains the primary practitioner of this technology.
Generally, current autonomous driving systems often adopt a hybrid approach, combining traditional modular methods and end-to-end technologies, to achieve more reliable and safe autonomous driving functionality.
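One common way to read "hybrid" is that a learned model proposes a command and a rule-based layer checks it before it reaches the actuators. The sketch below shows that pattern with a simple stopping-distance rule, v^2 / (2a). The deceleration limit, the margin, and the command format are illustrative assumptions, not values from any production system.

```python
from dataclasses import dataclass

@dataclass
class Command:
    steering: float  # -1 .. 1
    throttle: float  # 0 .. 1
    brake: float     # 0 .. 1

def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))

def safety_envelope(proposed: Command,
                    obstacle_distance_m: float,
                    speed_mps: float,
                    max_decel_mps2: float = 6.0,  # assumed braking capability
                    margin_m: float = 2.0) -> Command:
    """Rule-based check applied to the learned model's proposed command."""
    stopping_distance_m = speed_mps ** 2 / (2.0 * max_decel_mps2)
    if stopping_distance_m + margin_m >= obstacle_distance_m:
        # The rule overrides the model: full braking, steering kept as proposed.
        return Command(steering=clamp(proposed.steering, -1.0, 1.0),
                       throttle=0.0, brake=1.0)
    # Otherwise pass the proposal through, limited to the valid ranges.
    return Command(steering=clamp(proposed.steering, -1.0, 1.0),
                   throttle=clamp(proposed.throttle, 0.0, 1.0),
                   brake=clamp(proposed.brake, 0.0, 1.0))

# Hypothetical output of an end-to-end model that wants to keep accelerating.
proposal = Command(steering=0.1, throttle=0.6, brake=0.0)
print(safety_envelope(proposal, obstacle_distance_m=30.0, speed_mps=20.0))
print(safety_envelope(proposal, obstacle_distance_m=100.0, speed_mps=20.0))
```

At 20 m/s the assumed stopping distance is about 33 m, so the first call is overridden and the second is passed through unchanged.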
What if there were a game-changer with low training and usage costs, a final model shrunk through distillation, and the ability to understand "operational priority expressed by humans using semantics"?
The answer appears to be DeepSeek R1. It has a reported training cost of about $5.6 million (the figure published for its base model, DeepSeek V3) and a per-token usage cost about one-thirtieth that of ChatGPT, and, like its predecessor DeepSeek V3, it does not rely on high-end overseas chips.
Figure 3: Academic Work Using a Multimodal Large Language Model (MLLM) to Process Multimodal Data from Autonomous Driving Sensor Inputs (from NExT-GPT)
As intelligent driving moves towards "full-scenario applications," end-to-end technology has become a competitive high ground in the industry. Traditional autonomous driving relies on a modular architecture with perception, decision-making, and control executed in stages, which has shortcomings such as system complexity and response delay. DeepSeek leverages multimodal large models to directly map sensor data to control commands, establishing a new paradigm of "input-output" integration.
End-to-end solutions represented by DriveGPT4, SenseTime's DriveAGI, and the Senna architecture have demonstrated significant advantages:
1. DriveGPT4: An important milestone in applying large models to interpretable end-to-end autonomous driving, promoting the technology towards a more intelligent direction by introducing stronger perception capabilities and higher decision-making transparency.
2. Senna Architecture: Adopting a unique decoupled behavior decision-making and trajectory planning approach, using large-scale driving data for fine-tuning to enhance understanding of driving scenarios. This architecture can output high-dimensional decision instructions through natural language, further enhancing the system's flexibility and adaptability.
3. SenseTime Jueying DriveAGI: DriveAGI, the new-generation autonomous driving large model from SenseTime Jueying (SenseAuto), has become an industry highlight thanks to its wide range of applicable scenarios, high performance, and low barrier to adoption. The model focuses in particular on improving the interpretability and interaction capabilities of end-to-end autonomous driving solutions.
Additionally, we can include Geely's hybrid model obtained by integrating its own AI model with DeepSeek as mentioned earlier.
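The decoupling described for the Senna architecture, where a language model states a high-level decision and a separate planner produces the trajectory, can be illustrated with a toy example. This is only a sketch of the general idea: the vocabulary, the lane width, the speed increments, and the linear lateral blend are all assumptions, and the code does not reproduce Senna's actual interface.

```python
from typing import List, Tuple

# Assumed decision vocabulary and its numeric meaning for the planner.
SPEED_DELTA = {"accelerate": 1.0, "keep": 0.0, "decelerate": -1.0}  # m/s per step
LATERAL_SHIFT = {"left": 3.5, "straight": 0.0, "right": -3.5}       # m, one lane

def parse_decision(text: str) -> Tuple[str, str]:
    """Extract a (speed, path) decision from a natural-language instruction."""
    words = text.lower().replace(",", " ").split()
    speed = next((w for w in words if w in SPEED_DELTA), "keep")
    path = next((w for w in words if w in LATERAL_SHIFT), "straight")
    return speed, path

def plan_trajectory(decision: Tuple[str, str], speed_mps: float,
                    steps: int = 5, dt: float = 0.5) -> List[Tuple[float, float]]:
    """Turn the high-level decision into (x, y) waypoints in the ego frame."""
    speed_word, path_word = decision
    x = 0.0
    waypoints = []
    for i in range(1, steps + 1):
        speed_mps = max(0.0, speed_mps + SPEED_DELTA[speed_word])
        x += speed_mps * dt
        y = LATERAL_SHIFT[path_word] * i / steps  # linear blend into the target lane
        waypoints.append((round(x, 2), round(y, 2)))
    return waypoints

decision = parse_decision("Decelerate, then go left")
print(decision)
print(plan_trajectory(decision, speed_mps=10.0))
```

Because the decision is plain text, it can be logged, shown to the driver, or checked by rules before any trajectory is generated, which is one reason such designs are described as more interpretable.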
An independent driving AGI from DeepSeek may emerge soon. The technical architectures of these models generally share the following basic components:
1. Multimodal Data Fusion: Multimodal large models can process data from different sensors, such as cameras, LiDARs, and millimeter-wave radars. These data are integrated into a unified representation, providing comprehensive information support for subsequent decision-making and planning.
2. End-to-End Learning: End-to-end learning refers to the process of directly outputting driving commands or trajectory planning from raw sensor data. This eliminates complex intermediate steps in traditional methods, simplifies system design, and improves system real-time performance and robustness. For example, EMMA (End-to-End Multimodal Model for Autonomous Driving) technology can directly map raw camera data to specific driving actions.
3. Natural Language Processing and Interaction: Some advanced multimodal large models are not limited to visual information but also incorporate natural language processing capabilities, enabling them to understand and generate driving intentions or instructions described in human language. This allows the system to better understand and respond to dynamic changes in complex environments while enhancing the possibility of human-computer interaction.
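The "unified representation" in the first component can be pictured as per-sensor feature vectors joined into one fixed-length vector. The sketch below uses summary statistics in place of learned encoders and appends an availability flag per sensor so that a missing sensor still yields a vector of the same shape. The feature layout and sensor list are assumptions for illustration only.

```python
from typing import Dict, List, Optional

FEATURE_DIM = 4
MODALITIES = ("camera", "lidar", "radar")  # fixed order defines the layout

def encode(readings: List[float]) -> List[float]:
    """Toy encoder: summary statistics standing in for a learned network."""
    mean = sum(readings) / len(readings)
    return [min(readings), max(readings), mean, float(len(readings))]

def fuse(frame: Dict[str, Optional[List[float]]]) -> List[float]:
    """Concatenate per-sensor features plus a 0/1 availability flag each."""
    fused: List[float] = []
    for modality in MODALITIES:
        data = frame.get(modality)
        if data:
            fused += encode(data) + [1.0]
        else:
            fused += [0.0] * FEATURE_DIM + [0.0]  # sensor missing this frame
    return fused

# Assumed frame: camera and LiDAR report, radar has dropped out.
vector = fuse({"camera": [0.2, 0.8], "lidar": [12.0, 7.5, 9.0], "radar": None})
print(len(vector), vector)
```

A downstream decision or planning model then consumes this single vector regardless of which sensors were available, which is the sense in which fusion provides "comprehensive information support".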
Their application scenarios include:
1. Passenger Vehicle Autonomous Driving: In various road conditions such as urban roads and highways, multimodal large models can help vehicles safely complete lane changes, obstacle avoidance, parking, and other operations. It is reported that Li Auto has achieved mass production and on-board applications based on the idea of a fast-slow dual system.
2. Commercial Vehicles and Logistics Distribution: Commercial vehicles and logistics distribution robots can use multimodal large models to achieve L1-L4 autonomous driving in NOA mode, especially in unstructured environments. Examples include ports (the unmanned container trucks at Xiamen Port and Mawan Port, a collaboration between Dongfeng Commercial Vehicles and China Merchants Group), mining areas (the automated mining trucks in the underground mining area of the Yuan'an Phosphate Mine), construction sites, and warehouse interiors. These models can improve transportation efficiency, reduce the need for human intervention, and lower the mental strain on drivers and passengers.
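The "fast-slow dual system" mentioned for passenger vehicles can be sketched as two loops running at different rates: a cheap policy that reacts on every frame, and an expensive reasoning model that is consulted only occasionally in complex scenes. The code below is a generic illustration of that idea, not a description of Li Auto's implementation. The complexity score, thresholds, and action names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Scene:
    lead_gap_m: float   # distance to the vehicle ahead
    complexity: float   # 0 .. 1, hypothetical scene-complexity score

def fast_system(scene: Scene) -> str:
    """Cheap reactive policy, evaluated on every frame."""
    return "brake" if scene.lead_gap_m < 15.0 else "cruise"

def slow_system(scene: Scene) -> str:
    """Stand-in for a large model that reasons about the scene."""
    return "slow down and yield" if scene.complexity > 0.8 else "proceed with caution"

class DualSystem:
    def __init__(self, threshold: float = 0.6, slow_every: int = 10):
        self.threshold = threshold
        self.slow_every = slow_every
        self.since_slow = slow_every  # lets the first complex frame trigger it
        self.advice: Optional[str] = None

    def step(self, scene: Scene) -> str:
        action = fast_system(scene)
        if scene.complexity < self.threshold:
            self.advice = None                 # simple scene: fast path only
        elif self.since_slow >= self.slow_every:
            self.advice = slow_system(scene)   # expensive call, made rarely
            self.since_slow = 0
        self.since_slow += 1
        # The cached advice can soften, but never weaken, the fast action.
        if self.advice == "slow down and yield" and action == "cruise":
            action = "coast"
        return action

driver = DualSystem()
for scene in (Scene(50.0, 0.1), Scene(50.0, 0.9), Scene(10.0, 0.9)):
    print(driver.step(scene))
```

Caching the slow system's advice between calls keeps per-frame latency bounded by the fast policy, which matters on in-vehicle hardware where a large model cannot run at sensor frame rate.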
Conclusion
DeepSeek is breaking Tesla's monopoly on end-to-end technology through its open-source ecosystem and efficient training framework. Its model supports distributed training and edge computing deployment, enabling low-latency inference even on domestic chips. With further optimization of distillation technology, the size and energy consumption of future in-vehicle models are expected to decrease further, paving the way for the large-scale deployment of L4 autonomous driving.
From open-source empowerment, to efficiency gains through distillation, to the end-to-end reconstruction of driving logic, DeepSeek is reshaping the domestic smart auto industry with a "technology for all" attitude. In this transformation, automakers are using large AI models as "intelligence leverage" to achieve a comprehensive upgrade in safety, efficiency, and a more human driving experience. When DeepSeek's "brain-like thinking" is truly built into vehicles, China's global leadership in the smart auto industry may not be far off.