When discussing autonomous driving, the technology of SLAM is often mentioned. SLAM is short for 'Simultaneous Localization And Mapping.' It addresses a crucial problem: enabling a mobile device in an unknown environment to build a map of its surroundings while simultaneously determining its own position within that map. The process is akin to drawing a map while walking and continuously marking one's current position on it.
In the field of autonomous driving, SLAM is not a specific algorithm but rather a comprehensive technical framework and engineering system encompassing sensors, state estimation, feature extraction, data association, backend optimization, and more. It typically integrates sensors such as odometers, Inertial Measurement Units (IMUs), cameras, or LiDAR, and continuously optimizes localization results and map information using graph optimization or filtering methods.
The two core tasks that SLAM can accomplish are localization and mapping. Localization is responsible for estimating the device's position and orientation in space, while mapping organizes the perceived environmental information into a navigable map format. Although these two tasks can be performed separately, SLAM enables their synchronization and interaction, improving localization accuracy through existing maps and continuously updating the map with new observations, thus forming a self-enhancing closed-loop system.
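To make this closed loop concrete, here is a minimal toy sketch in Python (fabricated numbers, translation-only 2D poses, none of the machinery of a real SLAM system): odometry predicts the pose, a re-observed landmark corrects it against the map built so far, and newly seen landmarks are added to that map.

```python
import numpy as np

# Toy illustration of the SLAM closed loop: predict pose from odometry,
# correct it against the map built so far, then update the map.
# All values are fabricated; this is a sketch, not a real SLAM system.

pose = np.array([0.0, 0.0])   # estimated (x, y) position
landmark_map = {}             # landmark_id -> (x, y) in the map frame

def slam_step(pose, odom_delta, observations, gain=0.5):
    # 1) Localization, prediction: dead-reckon with odometry.
    pose = pose + odom_delta
    # 2) Localization, correction: a re-observed landmark pulls the
    #    pose back toward consistency with the existing map.
    for lid, rel in observations:
        if lid in landmark_map:
            innovation = rel - (landmark_map[lid] - pose)
            pose = pose - gain * innovation
    # 3) Mapping: register newly seen landmarks in the map frame.
    for lid, rel in observations:
        if lid not in landmark_map:
            landmark_map[lid] = pose + rel
    return pose

pose = slam_step(pose, np.array([1.0, 0.0]),
                 [("L1", np.array([2.0, 1.0]))])   # first sighting
pose = slam_step(pose, np.array([1.2, 0.0]),      # drifting odometry
                 [("L1", np.array([1.0, 1.0]))])  # re-observation
print(pose)   # ~[2.1, 0.0]: pulled back toward the true (2.0, 0.0)
```

Even in this toy, the map improves the pose (the re-observation corrects the drifted estimate) and the pose improves the map, which is exactly the self-enhancing loop described above.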

The specific role of SLAM in autonomous driving
For autonomous driving systems, SLAM provides real-time localization and environmental modeling in scenarios that lack prior maps or involve unknown environments, enabling vehicles to operate autonomously to a certain extent. It can also apply online corrections when existing high-precision maps diverge from actual road conditions, mitigating the risks of relying entirely on offline maps.
The requirements for SLAM vary across different driving scenarios. For instance, in low-speed urban roads or enclosed campuses, visual or laser SLAM can construct detailed local maps to assist vehicles in identifying lane markings, static obstacles, and other detailed structures. However, in high-speed scenarios, SLAM is primarily used to complement inertial navigation systems, providing short-term, high-frequency positional compensation to enhance system continuity and robustness.
Furthermore, SLAM establishes a crucial link between perception and localization modules. The perception module identifies objects and determines drivable areas, while SLAM places this information within a unified spatiotemporal coordinate system, forming a stable and reusable environmental representation. The planning and control modules rely on accurate pose and map information for decision-making. Without SLAM support, vehicles are prone to localization drift in areas with poor GPS signals, affecting driving safety.
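As a small illustration of what 'a unified spatiotemporal coordinate system' means in practice, the sketch below (with made-up numbers) uses the current SLAM pose estimate to transform an obstacle detected in the vehicle frame into map coordinates via a standard SE(2) transform.

```python
import numpy as np

def vehicle_to_map(pose_xy, yaw, point_vehicle):
    """Transform a point from the vehicle frame to the map frame
    using the vehicle's current 2D pose (a standard SE(2) transform)."""
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s], [s, c]])   # rotation: vehicle -> map
    return pose_xy + R @ point_vehicle

# Example: obstacle detected 10 m ahead and 2 m to the left.
pose_xy = np.array([105.0, 42.0])    # current pose estimate from SLAM
yaw = np.deg2rad(30.0)
obstacle_map = vehicle_to_map(pose_xy, yaw, np.array([10.0, 2.0]))
print(obstacle_map)   # the obstacle's position in the shared map frame
```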
SLAM also enhances the redundancy and fault tolerance of autonomous driving systems. Autonomous driving systems typically integrate GNSS, IMUs, wheel odometers, and visual or laser SLAM for localization. If one type of sensor fails or loses signal, other sensors can take over, reducing the risk of overall localization failure due to a single component failure. Therefore, SLAM should not be viewed simply as an independent algorithm but as an indispensable key component of the localization system.
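A deliberately simplified sketch of how such redundancy might be arbitrated: each localization source reports a pose and a health flag, and the system falls back along a priority list. Real systems fuse all healthy sources probabilistically rather than hard-switching, so treat the names and numbers here as purely illustrative.

```python
# Hypothetical health-based fallback between localization sources.
# Real systems typically fuse all healthy sources instead of switching.
sources = [
    {"name": "GNSS",        "healthy": False, "pose": (105.2, 42.1)},
    {"name": "lidar_slam",  "healthy": True,  "pose": (105.0, 42.0)},
    {"name": "visual_slam", "healthy": True,  "pose": (104.9, 42.2)},
    {"name": "wheel_odom",  "healthy": True,  "pose": (104.5, 41.8)},
]

def select_pose(sources):
    for src in sources:               # ordered by priority
        if src["healthy"]:
            return src["name"], src["pose"]
    raise RuntimeError("all localization sources failed -> safe stop")

print(select_pose(sources))           # ('lidar_slam', (105.0, 42.0))
```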

Common SLAM implementation methods and sensor collaboration
SLAM can be implemented through diverse technical paths, and the choice depends on the specific scenario, cost, computational resources, and accuracy requirements. From the perspective of sensor types, mainstream solutions include visual SLAM, laser SLAM, radar SLAM, and multi-sensor fusion SLAM.
Visual SLAM primarily relies on cameras, offering advantages such as low cost and rich information capture, including color and texture, suitable for semantic understanding and detailed recognition. However, it is sensitive to lighting changes and weather conditions. Laser SLAM, based on LiDAR point cloud data, is less affected by lighting, providing clear geometric structures and accurate ranging capabilities, commonly used for constructing high-precision 3D maps, albeit with higher hardware costs and computational overhead. Millimeter-wave radar performs stably in adverse weather conditions and can detect high-speed moving objects, typically serving as an auxiliary sensor and rarely used alone for mapping.
From the perspective of backend algorithms, SLAM can be categorized into filtering-based and graph-optimization-based methods. Filtering-based methods, such as the Extended Kalman Filter (EKF), suit online real-time estimation and are computationally efficient, but their errors accumulate over time. Graph-optimization-based methods represent observations and loop-closure constraints as the nodes and edges of a 'graph' and enforce global consistency through overall optimization; they excel at correcting long-term drift via loop-closure detection but demand more computation and storage. Many current solutions combine the two, leveraging their strengths: a frontend filter guarantees real-time output, while a backend graph optimizer refines keyframes and applies loop-closure corrections in the background.
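The toy example below (1D poses, fabricated measurements) distills graph optimization to its essence: odometry edges accumulate drift, and a single loop-closure edge redistributes the error over the whole trajectory through linear least squares.

```python
import numpy as np

# Toy 1D pose graph: 5 poses, odometry edges, one loop closure.
# Each odometry reading says "moved +1.1" although the true step was
# 1.0; the loop closure says pose 4 is actually 4.0 away from pose 0.
# Edges: (i, j, measured x_j - x_i). Pose 0 is anchored at 0.
edges = [(0, 1, 1.1), (1, 2, 1.1), (2, 3, 1.1), (3, 4, 1.1),
         (0, 4, 4.0)]                     # last edge = loop closure

n = 5
A = np.zeros((len(edges) + 1, n))
b = np.zeros(len(edges) + 1)
for row, (i, j, z) in enumerate(edges):   # residual: (x_j - x_i) - z
    A[row, i], A[row, j], b[row] = -1.0, 1.0, z
A[-1, 0], b[-1] = 1.0, 0.0                # gauge constraint: x_0 = 0

x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x, 3))
```

The solved poses come out near 0, 1.02, 2.04, 3.06, 4.08: the 0.4 of drift accumulated at the last pose shrinks to roughly 0.08 once the loop-closure constraint is enforced, spread evenly across the trajectory.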
Multi-sensor fusion is key to enhancing SLAM performance and robustness. IMUs provide high-frequency attitude changes, maintaining motion prediction during brief outages of visual or laser data; wheel odometers offer relative displacement estimates; GNSS provides absolute position references. Fusing this information on the basis of time synchronization and error modeling significantly improves adaptability in complex environments. In recent years, the introduction of semantic information into SLAM has attracted growing attention: by identifying stable elements such as streetlights and building corners, SLAM can anchor the map to them while keeping temporary dynamic objects out of it, improving the semantic quality and long-term usability of the map.
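A stripped-down illustration of this division of labor, with synthetic numbers: high-rate dead reckoning (standing in for IMU prediction) keeps the estimate alive between low-rate absolute fixes (standing in for GNSS), and each fix corrects the accumulated drift. A real system would use a properly modeled Kalman filter instead of the constant gain shown here.

```python
import numpy as np

# Toy fusion: 100 Hz dead reckoning (stand-in for IMU prediction)
# corrected by a 1 Hz absolute position fix (stand-in for GNSS).
dt, gain = 0.01, 0.3
v_true, v_meas = 1.0, 1.05        # 5% velocity bias -> slow drift
x_est = 0.0
rng = np.random.default_rng(0)

for step in range(1, 301):        # 3 seconds of driving
    x_est += v_meas * dt          # high-rate prediction (drifts)
    if step % 100 == 0:           # low-rate absolute fix arrives
        gnss = step * dt * v_true + rng.normal(0.0, 0.05)
        x_est += gain * (gnss - x_est)   # pull toward the fix

print(round(x_est, 3))   # closer to the true 3.0 m than the
                         # uncorrected dead reckoning (3.15 m)
```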

Challenges in SLAM applications
Implementing SLAM technology in real-vehicle environments involves numerous considerations. The primary challenge is interference from dynamic environments. Traditional SLAM assumes static surroundings, but in real-world traffic, vehicles and pedestrians are constantly moving, which can contaminate the map and mislead localization. To address this, dynamic targets can be detected and removed, or modeled separately, preventing these 'temporary features' from affecting the construction of the static map.
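One common form of this, sketched below with fabricated data: bounding boxes of movable objects from a hypothetical upstream detector are used to mask out lidar points before they enter the map.

```python
import numpy as np

# Remove lidar points that fall inside detected dynamic objects so
# that only static structure enters the map. Boxes are axis-aligned
# (xmin, ymin, xmax, ymax) in the vehicle frame; all data is made up.
points = np.array([[5.0, 0.2], [8.0, -1.0], [12.0, 0.0], [3.0, 4.0]])
dynamic_boxes = [(7.0, -2.0, 9.0, 0.0)]   # e.g. a detected vehicle

def remove_dynamic(points, boxes):
    keep = np.ones(len(points), dtype=bool)
    for xmin, ymin, xmax, ymax in boxes:
        inside = ((points[:, 0] >= xmin) & (points[:, 0] <= xmax) &
                  (points[:, 1] >= ymin) & (points[:, 1] <= ymax))
        keep &= ~inside
    return points[keep]

print(remove_dynamic(points, dynamic_boxes))  # point (8, -1) dropped
```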
Besides dynamic objects, changes in environmental conditions directly impact sensors. Visual systems are prone to failure under strong light, in shadow, or at night, while LiDAR point-cloud quality degrades in rain or snow. Autonomous driving systems therefore need multi-sensor adaptivity, dynamically adjusting sensor weights based on real-time data quality to achieve graceful degradation and functional complementarity.
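A minimal sketch of such dynamic weighting, assuming each modality reports a scalar quality score (for instance, lidar return count or image contrast): estimates are blended with weights proportional to quality, so a degraded sensor fades out instead of failing hard. This is a heuristic illustration, not a full probabilistic fusion.

```python
import numpy as np

def fuse_weighted(estimates, qualities):
    """Blend position estimates with weights proportional to each
    sensor's current data-quality score."""
    w = np.asarray(qualities, dtype=float)
    w = w / w.sum()
    return (w[:, None] * np.asarray(estimates)).sum(axis=0)

# In heavy rain the lidar quality score drops, so the visual
# estimate dominates the fused result; scores are fabricated.
estimates = [np.array([10.0, 5.0]),    # lidar SLAM
             np.array([10.4, 5.2])]    # visual SLAM
qualities = [0.2, 0.8]
print(fuse_weighted(estimates, qualities))   # ~[10.32, 5.16]
```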
Another inescapable issue is scale uncertainty and drift accumulation. Monocular visual SLAM cannot determine true scale on its own and requires correction from IMUs or odometers. Over extended periods, even small errors accumulate into significant localization deviations. Systems rely on loop-closure detection to correct this drift, but its effectiveness is constrained by the accuracy of scene recognition and matching; visual and laser loop-closure cues are therefore typically combined, together with keyframe selection and map-management mechanisms, to balance accuracy and computational load.
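The scale issue can be illustrated in a few lines: a monocular trajectory is correct only up to an unknown factor, which can be recovered by comparing visual segment lengths against metric wheel-odometry distances (fabricated numbers below).

```python
import numpy as np

# Monocular SLAM yields positions up to an unknown scale s.
# Wheel odometry gives metric segment lengths; estimate s by least
# squares over matching segments: minimize sum_i (s*v_i - d_i)^2.
visual_seg = np.array([0.52, 0.48, 0.50, 0.51])  # unitless, from vSLAM
wheel_seg  = np.array([1.04, 0.95, 1.01, 1.02])  # meters, from odometry

s = (visual_seg @ wheel_seg) / (visual_seg @ visual_seg)
print(round(s, 3))   # ~2.0: each visual unit is about 2 meters
```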
Real-time performance and computational resources are strict constraints. Autonomous driving imposes high requirements on localization frequency and latency, so a SLAM system must complete all processing within a limited computational budget. To keep critical tasks responsive, systems often adopt acceleration techniques such as feature-point sparsification, local map optimization, and asynchronous backend processing.
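Keyframe sparsification, for example, is conceptually simple, as this sketch shows: a frame is promoted to a keyframe only when the vehicle has moved or turned enough since the last one, so the backend optimizes a sparse set of frames rather than every image. The thresholds here are illustrative, not tuned values.

```python
import numpy as np

# Keep a frame as a keyframe only if motion since the last keyframe
# exceeds a threshold; thresholds are illustrative, not tuned.
TRANS_THRESH = 2.0             # meters
YAW_THRESH = np.deg2rad(10)    # radians

def is_keyframe(last_kf_pose, pose):
    """pose = (x, y, yaw); compare against the last keyframe pose."""
    dx = np.hypot(pose[0] - last_kf_pose[0], pose[1] - last_kf_pose[1])
    dyaw = abs(pose[2] - last_kf_pose[2])
    return dx > TRANS_THRESH or dyaw > YAW_THRESH

print(is_keyframe((0, 0, 0.0), (0.5, 0.0, 0.05)))  # False: too close
print(is_keyframe((0, 0, 0.0), (2.5, 0.0, 0.05)))  # True: moved 2.5 m
```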
Time synchronization and extrinsic calibration between sensors are also common sources of failure. Minor time offsets or coordinate transformation errors can lead to mismatched observation data. Therefore, systems must support online calibration and health monitoring, promptly triggering recalibration or switching to a safe mode upon detecting parameter abnormalities.
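In its simplest form, time alignment means interpolating high-rate IMU samples onto camera timestamps, as in the sketch below with synthetic data; a real pipeline would also estimate and remove a per-sensor clock offset before interpolating.

```python
import numpy as np

# Align 200 Hz IMU yaw-rate samples to 10 Hz camera timestamps by
# linear interpolation, so each image has a matching motion sample.
imu_t = np.arange(0.0, 1.0, 0.005)          # 200 Hz timestamps (s)
imu_yaw_rate = np.sin(2 * np.pi * imu_t)    # synthetic yaw-rate signal
cam_t = np.arange(0.0, 1.0, 0.1)            # 10 Hz camera timestamps

yaw_rate_at_cam = np.interp(cam_t, imu_t, imu_yaw_rate)
print(np.round(yaw_rate_at_cam, 3))         # IMU signal at camera times
```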

When to use SLAM?
SLAM is not always the core localization method in all autonomous driving systems. In scenarios with good GPS signals and high-precision prior maps, such as highways, vehicles can primarily rely on GNSS, IMUs, and landmark matching for localization, using SLAM as a backup or local enhancement. However, in areas with limited satellite signals, such as tunnels, underground garages, and urban canyons, SLAM is crucial for maintaining localization continuity.