Why has the autonomous driving solution stopped emphasizing maps?

03/31/2026

I wonder if you've noticed that many automakers no longer emphasize or even mention high-definition maps when promoting their autonomous driving solutions. Why has a technology once highly relied upon by the autonomous driving industry become increasingly marginalized?

How did it go from being a lifesaver to a hindrance?

In the early days of large-scale autonomous driving adoption, high-definition maps were widely regarded as essential for achieving full autonomy. These maps are vastly different from the ordinary maps used for everyday smartphone navigation. Ordinary maps typically have errors ranging from a few meters to over ten meters and are primarily used to guide human drivers through general road networks.

In contrast, high-definition maps offer centimeter-level precision, detailing lane positions, curb heights, accurate coordinates of traffic signs, and even minute features like utility poles, manhole covers, road slopes, and curvatures.

During a time when onboard sensors and computing platforms were immature, high-definition maps served as the safety foundation for intelligent driving systems. They provided vehicles with a god's-eye view, enabling them to anticipate road conditions hundreds of meters ahead and significantly reducing the burden on perception algorithms.

However, as intelligent driving functions expanded from relatively simple highway scenarios to complex urban roads, the limitations of high-definition maps became increasingly apparent.

The most pressing issue is 'freshness,' or the real-time nature of map updates. Urban roads change daily due to construction, detours, temporary traffic controls, or repainted road markings. While human drivers can adapt to these changes with a moment's observation, autonomous driving systems, which rely heavily on maps, face severe logical conflicts if even minor changes are not promptly reflected in the maps.

Currently, domestic map providers primarily rely on expensive data collection vehicles equipped with LiDAR and professional surveying equipment. This method allows most cities' high-definition maps to be updated only every three months, whereas an ideal intelligent driving system requires real-time updates on an hourly or even minute-by-minute basis.

Cost and qualification requirements also pose significant challenges for automakers. Producing a high-definition map covering all Chinese cities requires tens of billions of yuan in investment, costs ultimately passed on to automakers and consumers. Automakers must pay substantial ordering fees and annual licensing fees per vehicle, severely limiting the scalability of intelligent driving functions.

More critically, national approvals for surveying and mapping qualifications are becoming increasingly stringent. Only organizations with Class A navigation electronic map production qualifications can collect and process high-definition data. In recent years, the number of enterprises passing qualification reviews has significantly decreased, exposing automakers attempting to build their own map data to compliance risks and forcing the industry to wait in line for map providers when expanding into new cities.

What technologies are used after abandoning maps?

To reduce over-reliance on pre-made maps, the autonomous driving industry has embraced a 'perception-centric' approach. This means enabling vehicles to observe and understand the world in real-time, like humans, rather than following a pre-written 'script.'

Bird's-eye view (BEV) perception technology plays a pivotal role in this transition. Traditional intelligent driving systems could only process independent 2D images from each camera, making it difficult to form a cohesive spatial understanding—akin to viewing separate, unrelated photographs. BEV technology, however, uses large models to fuse image data from multiple cameras around the vehicle into a unified 3D top-down coordinate system in real-time.

This approach allows the vehicle to essentially 'draw' a real-time map in its 'mind' while driving. It can not only detect immediate obstacles but also identify topological relationships between lane markings—such as which lines lead where and how intersections connect—through this bird's-eye perspective.
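The coordinate idea behind BEV fusion can be sketched in a few lines. Real BEV networks fuse learned image features through neural view transforms; the simplified sketch below only illustrates the geometric core — projecting detections reported independently by each camera into one shared ego-centric top-down frame. The camera names, mounting offsets, and yaw angles are invented for illustration.

```python
import math

# Hypothetical camera extrinsics: yaw (radians) and mounting offset (meters)
# relative to the vehicle's ego frame. Values are illustrative, not from any
# real vehicle.
CAMERAS = {
    "front": {"yaw": 0.0,          "offset": (1.5, 0.0)},
    "left":  {"yaw": math.pi / 2,  "offset": (0.0, 0.8)},
    "right": {"yaw": -math.pi / 2, "offset": (0.0, -0.8)},
}

def to_bev(camera, forward, lateral):
    """Rotate and translate a detection from one camera's frame into the
    shared ego-centric bird's-eye-view frame."""
    cam = CAMERAS[camera]
    c, s = math.cos(cam["yaw"]), math.sin(cam["yaw"])
    x = c * forward - s * lateral + cam["offset"][0]
    y = s * forward + c * lateral + cam["offset"][1]
    return (x, y)

# Detections reported separately by each camera as (forward, lateral) meters.
detections = [("front", 10.0, -1.0), ("left", 4.0, 0.0), ("right", 6.0, 0.5)]

# After projection, all detections live in one top-down coordinate system,
# so the planner sees a single unified scene rather than three photographs.
bev_points = [to_bev(cam, f, l) for cam, f, l in detections]
print(bev_points)
```

In a production system this fusion happens on dense feature maps inside a neural network rather than on discrete detections, but the unifying coordinate transform is the same idea.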

Although this dynamically generated 'living map' may be slightly less precise than pre-made high-definition maps in absolute terms, its greatest strength lies in its authenticity. Since it captures real-time conditions, the system can immediately detect and respond to road construction or other changes. This capability represents an evolutionary leap for intelligent driving systems, shifting from 'following a map' to 'adapting to situations.'

To address unpredictable obstacles not recorded in maps—such as cardboard boxes on the road, oddly shaped construction barriers, or various irregular objects—occupancy network technology has emerged. This technology focuses on whether space is 'occupied' rather than identifying 'what' an object is.

It divides the space around the vehicle into countless tiny 3D voxels, like building blocks, with the system only needing to determine whether each voxel is empty or solid.

Through this physically continuous judgment, the vehicle gains a fundamental understanding of the physical world: if something occupies a space, the vehicle must avoid it. This approach perfectly compensates for the inability of pre-made maps to record dynamic changes, enabling vehicles to handle unexpected situations with remarkable resilience.
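The output of an occupancy network is essentially a 3D grid of occupied/free cells. Production systems predict this grid with a neural network from camera features; the sketch below only mimics the resulting data structure and the collision query a planner would run against it. The voxel size is an illustrative assumption.

```python
VOXEL = 0.5  # edge length of one voxel in meters (illustrative resolution)

class OccupancyGrid:
    """Minimal stand-in for an occupancy network's output: a sparse set of
    solid voxels, queried by 'is this space empty?' rather than 'what is it?'"""

    def __init__(self):
        self.occupied = set()

    def _index(self, x, y, z):
        # Map a continuous point to the integer index of its voxel.
        return (int(x // VOXEL), int(y // VOXEL), int(z // VOXEL))

    def mark(self, x, y, z):
        """Mark the voxel containing this point as solid."""
        self.occupied.add(self._index(x, y, z))

    def is_free(self, x, y, z):
        """The planner never needs to classify the object, only avoid it."""
        return self._index(x, y, z) not in self.occupied

grid = OccupancyGrid()
grid.mark(5.2, 0.1, 0.3)  # e.g. a cardboard box no classifier has a label for

print(grid.is_free(5.3, 0.2, 0.4))  # same voxel -> False: must avoid
print(grid.is_free(8.0, 0.0, 0.0))  # empty voxel -> True: safe to drive
```

Because the query is purely geometric, the system handles objects it has never seen before — exactly the resilience the text describes.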

While perception capabilities have improved, the industry has not radically abandoned all maps but instead adopted a more rational 'light map' approach.

Light maps represent a significant streamlining of traditional maps, retaining only core navigation elements such as road connectivity and beyond-line-of-sight traffic predictions. They no longer pursue centimeter-level precision in static elements but instead delegate detailed mapping tasks to the vehicle's own perception system.

This approach not only drastically reduces mapping costs but also enhances the adaptability of intelligent driving systems. With basic navigation maps, vehicles can activate intelligent driving in cities or even rural areas nationwide, eliminating the slow 'city expansion' problem.
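A light map can be thought of as little more than a road-connectivity graph. The sketch below, with invented road names, shows how topology alone is enough for route-level guidance — lane geometry is left entirely to onboard perception.

```python
from collections import deque

# Hypothetical road-segment graph: only which road connects to which,
# with no geometry, lane lines, or centimeter-level detail.
ROAD_GRAPH = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

def route(graph, start, goal):
    """Breadth-first search over road segments: pure topology, no geometry."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no connection in the graph

print(route(ROAD_GRAPH, "A", "D"))  # ['A', 'B', 'D']
```

Such a graph is orders of magnitude cheaper to build and maintain than a centimeter-accurate HD map, which is why the light-map approach scales to whole countries.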

How do end-to-end large models give cars human-like driving intuition?

After the perception system solves the 'seeing' problem, 'how to drive' becomes the next critical challenge in autonomous driving evolution. Traditionally, autonomous driving logic consisted of lines of rule-based code—an approach known as rule-driven development, essentially comprising numerous 'if-then' logic statements.

However, human driving behavior in complex urban traffic involves subtle, intuitive decisions. Code struggles to account for all traffic scenarios, such as polite negotiation at narrow intersections or finding gaps at unsignalized crossings. This mechanical logic often causes intelligent driving vehicles to hesitate in complex environments or freeze due to triggered safety protocols.
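The rule-driven style described above can be caricatured in a few branches. The scenario keys and thresholds below are invented; the point is structural — any situation without a matching rule falls through to a conservative stop, which is exactly the freezing failure mode seen in complex environments.

```python
def rule_based_decision(scene):
    """Toy hand-written 'if-then' controller in the rule-driven style.
    Scene keys and the 5-meter threshold are illustrative assumptions."""
    if scene.get("red_light"):
        return "stop"
    if scene.get("obstacle_distance_m", float("inf")) < 5.0:
        return "brake"
    if scene.get("lane_clear"):
        return "cruise"
    # No rule matched: the system defaults to stopping -- the
    # 'freeze due to triggered safety protocols' described in the text.
    return "stop"

print(rule_based_decision({"lane_clear": True}))                  # cruise
print(rule_based_decision({"obstacle_distance_m": 3.0}))          # brake
print(rule_based_decision({"unmapped_construction_zone": True}))  # stop
```

No finite set of such branches can cover polite negotiation at a narrow intersection, which is the motivation for the learned approach below.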

End-to-end intelligent driving models have gained significant traction recently, aiming to break down barriers between perception, prediction, decision-making, and control. Simply put, this approach trains a large AI system by feeding it massive amounts of high-quality human driving data. By analyzing tens of millions or even billions of kilometers of experienced drivers' footage, the system learns to determine optimal steering angles and brake pressures for various situations.

In this process, the system no longer needs to memorize every line on a map but instead develops driving intuition similar to human drivers. Given a navigation target, it can make the most appropriate driving decisions based on real-time visual inputs.
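Learning to drive from demonstrations, rather than from hand-written rules, can be reduced to a toy example. Real end-to-end models are large neural networks trained on millions of kilometers of data; here a single-parameter linear policy (steering = -k x lateral offset) is fitted to four invented "demonstration" pairs by gradient descent, purely to illustrate that the behavior is learned from data instead of coded.

```python
# Invented demonstration pairs: (lateral offset from lane center in meters,
# steering command applied by a human driver).
demos = [(-2.0, 1.0), (-1.0, 0.5), (1.0, -0.5), (2.0, -1.0)]

k = 0.0    # the policy's only parameter -- learned, never hand-written
lr = 0.05  # learning rate

# Minimize mean squared error between the policy's steering and the
# demonstrated steering, the core idea of imitation learning.
for _ in range(200):
    grad = 0.0
    for offset, steer in demos:
        pred = -k * offset
        grad += 2 * (pred - steer) * (-offset) / len(demos)
    k -= lr * grad

# The demonstrations implicitly encode steer = -0.5 * offset, so k -> 0.5.
print(round(k, 3))  # 0.5
```

Scaling this idea up — billions of parameters instead of one, camera pixels instead of a single offset — is what gives end-to-end models their human-like driving intuition.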

This end-to-end architecture transforms autonomous driving from 'coding rules' to 'teaching skills.' It not only significantly reduces system response latency but also enables vehicles to handle unknown scenarios.

When encountering an unfamiliar, complex intersection, traditional rule-based systems might fail for lack of matching code, but end-to-end models can navigate smoothly by applying the general understanding accumulated from large-scale data, reasoning and imitating much as human drivers do.

To make this intuition more reliable, visual language models have been introduced as the vehicle's 'slow thinking' system, enabling cars to understand complex traffic semantics such as traffic police gestures, temporary instructions written on signs, and even the intentions of surrounding pedestrians.

This human-like technological approach not only makes driving smoother and more natural but also fundamentally reduces reliance on high-definition maps. For a truly intelligent driving system, maps should serve merely as general directional guides rather than precise operation manuals.

As end-to-end technology matures, autonomous driving systems are evolving from 'map-dependent machines' to 'thinking drivers.' This transformation not only raises system capabilities but also enables rapid deployment across different regions and cultural contexts.

Final Thoughts

The 'map-free' trend in autonomous driving represents an inevitable technological evolution. With explosive growth in computing power and continuous algorithm refinement, vehicles' environmental understanding will increasingly approach or even surpass human capabilities. Maps will gradually be marginalized, returning to their essential role as navigation tools. This shift will dramatically reduce costs for intelligent driving functions, making advanced driver-assistance systems accessible to ordinary family cars priced around 100,000 yuan. More importantly, it will accelerate the arrival of the fully autonomous driving era.

-- END --
