Xiaomi City NOA, catching up with Huawei in 8 months?

11/04 2024

In March 2021, Lei Jun launched Xiaomi's first foldable phone, the MIX FOLD, and officially announced the company's entry into the electric vehicle industry. That same year, Huawei also released the Mate 40 series, and its two-year-old Huawei Intelligent Automotive Solutions BU (IAS BU) introduced the first generation of the Huawei Qiankun intelligent driving system, Huawei ADS 1.0. Three years later, Huawei has iterated its intelligent driving system twice, and Xiaomi's electric vehicle has received its first City NOA upgrade. In practical tests, Xiaomi's system tends to drive the way a human driver would, and its results appear very close to those of Huawei's ADS 3.0. Counting from the launch of the Xiaomi SU7, it has taken nearly seven months to come close to what Huawei achieved through the past year of iteration. So the question arises: how capable is Xiaomi's new intelligent driving system, and can it catch up with Huawei next year?

No longer conservative at complex intersections, but the next generation will handle details better?

Before discussing how well Xiaomi's City NOA 1.4.0 performs, it is worth briefly reviewing Xiaomi's end-to-end large model. At present, almost all domestic automakers working on high-level autonomous driving have switched from segmented networks to integrated end-to-end large models. The technical details will not be elaborated here, but in simple terms, engineers no longer hand-write rules for the control stage; instead, perception and planning are integrated into a single large model. The goal is to reduce the time spent passing data between stages, so that the system can draw on past experience in situations it has encountered before and learn to handle new ones. This creates an active learning process that ultimately lets the system "see and execute." Xiaomi's intelligent driving technology follows the same principle.

In its underlying logic, Xiaomi's system is very similar to Li Auto's One-Model structure: both pair an end-to-end model with a Vision Language Model (VLM), and both run on two NVIDIA Orin-X chips with a combined 508 TOPS of computing power. The difference lies in Xiaomi's use of zoom technology in its BEV network, which allows dynamic adjustment at the perception level. When precision is critical, such as on narrow roads or in parking lots, the grid cell size is set to 0.05 meters, while in relatively open scenarios it expands to 0.2 meters. This is the basic condition for achieving unprotected left turns across two-way ten-lane roads. Although Li Auto does not use zoom BEV technology, it has an additional cloud-based world model on top of its fast and slow systems. When the system encounters a problem it cannot solve, it uploads the collected data, reconstructs the problematic scenario in the cloud, analyzes the root cause with big data, and then conducts targeted training. Once learned, the scenario is handed back to the end-to-end model. In short, Li Auto's three models complement one another. Comparing the two intelligent driving systems, each has its advantages in different scenarios.
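The trade-off behind the two grid sizes can be illustrated with a little arithmetic. The sketch below is not Xiaomi's implementation: the scene labels, the 40-meter area, and the 400-cell grid are assumptions chosen only for illustration, while the 0.05 m and 0.2 m cell sizes are the figures quoted above.

```python
# Cell sizes quoted in the article, stored in centimetres to keep the arithmetic exact.
FINE_CELL_CM = 5     # 0.05 m: narrow roads, parking lots
COARSE_CELL_CM = 20  # 0.2 m: relatively open scenarios

# Hypothetical scene labels, used only for this illustration.
PRECISION_SCENES = {"narrow_road", "parking_lot"}


def select_cell_cm(scene):
    """Pick the BEV cell size for a scene label."""
    return FINE_CELL_CM if scene in PRECISION_SCENES else COARSE_CELL_CM


def grid_cells(side_m, cell_cm):
    """Total number of cells in a square BEV grid with the given side length."""
    per_side = -(-side_m * 100 // cell_cm)  # ceiling division on integers
    return per_side * per_side


def coverage_side_m(cells_per_side, cell_cm):
    """Side length in metres covered by a fixed number of cells per side."""
    return cells_per_side * cell_cm / 100


# Covering the same 40 m x 40 m area (assumed size) at each resolution:
print(grid_cells(40, FINE_CELL_CM))    # 640000 cells
print(grid_cells(40, COARSE_CELL_CM))  # 40000 cells, 16 times fewer

# Keeping the grid fixed at 400 x 400 cells (assumed size) instead:
print(coverage_side_m(400, select_cell_cm("parking_lot")))  # 20.0 m per side
print(coverage_side_m(400, select_cell_cm("open_road")))    # 80.0 m per side
```

Under these assumptions, the fine grid costs 16 times as many cells for the same area, which is one plausible reason to reserve it for scenes where precision matters.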

With the previous generation of Xiaomi's intelligent driving system, many car owners reported that the system was quite cautious at complex intersections, with frequent takeovers caused by mixed pedestrian and vehicle traffic. Compared with hardware changes, which are hard to make after the fact, software-level issues like these are easier to resolve through OTA updates. Once the intelligent driving framework is in place, the next step is continuous optimization of details. The new version therefore places special emphasis on optimizing the function upgrade/downgrade logic and improving speed detection for vehicles and pedestrians. The goal is to avoid situations where, facing heavy pedestrian and vehicle traffic at an intersection, the system cannot judge when to pull away and has to downgrade proactively.

The new version adds the ability to make U-turns and to proceed independently according to traffic lights, even at intersections with no vehicle ahead. It also supports navigating around obstacles occupying the road, such as parked cars, cones, and warning triangles. Judging from most practical tests so far, the new version's path choices and detail handling at complex intersections are indeed more human-like. Take lane-changing logic as an example. The common approach among intelligent driving systems on the market is to turn on the turn signal in advance and merge into the planned lane when an opportunity appears. In this new version of Xiaomi's system, however, the turn signal comes on almost at the same moment the lane change begins, after the system has confirmed a safe distance from the following vehicle and estimated its approximate trajectory. Another example is the logic for unprotected left turns. Instead of following the trajectory of the preceding vehicle, the new version makes a wider left turn and moves into the "safe zone" behind and to the side of the preceding vehicle, cleverly using that vehicle's position to avoid mixing with oncoming vehicles, pedestrians, or non-motorized vehicles.
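The difference between the two lane-change patterns can be shown with a toy timeline. This is a simplified sketch, not Xiaomi's planner: the 20-meter threshold, the function names, and the sample gaps are all hypothetical, and a real system would also weigh the following vehicle's speed and predicted trajectory.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical safety threshold for illustration; not a figure from Xiaomi.
MIN_REAR_GAP_M = 20.0


def first_safe_step(rear_gaps_m: Sequence[float], min_gap_m: float) -> Optional[int]:
    """Index of the first time step whose rear gap meets the threshold, if any."""
    return next((i for i, gap in enumerate(rear_gaps_m) if gap >= min_gap_m), None)


def signal_first(rear_gaps_m: Sequence[float],
                 min_gap_m: float = MIN_REAR_GAP_M) -> Tuple[int, Optional[int]]:
    """Conventional pattern: signal at step 0, merge once a gap opens."""
    return 0, first_safe_step(rear_gaps_m, min_gap_m)


def confirm_then_signal(rear_gaps_m: Sequence[float],
                        min_gap_m: float = MIN_REAR_GAP_M
                        ) -> Tuple[Optional[int], Optional[int]]:
    """Pattern described above: signal and move together once the gap is confirmed."""
    step = first_safe_step(rear_gaps_m, min_gap_m)
    return step, step


# Made-up rear gaps in metres, one value per time step.
gaps = [8.0, 12.0, 25.0, 30.0]
print(signal_first(gaps))         # (0, 2): (signal step, merge step)
print(confirm_then_signal(gaps))  # (2, 2)
```

In both cases the merge happens at the same step; only the moment the signal comes on differs, which matches the behavior testers described.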

These two detail improvements resolve most lane-changing and turning challenges. However, this generation still has room for improvement. For instance, traffic efficiency is not yet well optimized. Because of the safety redundancy the system adds at complex intersections, its overall response appears slightly slow. At T-junctions, when a suitable opportunity to pull away arises, the system may miss it: if it detects moving obstacles behind, it hesitates for a few seconds over whether to proceed. Although the number of system downgrades at complex intersections has decreased, traffic efficiency remains inadequate.

Additionally, when a stationary vehicle is blocking the middle of the road, the system does not always choose to navigate around it, even if the lane markings are dashed, and may prompt a takeover instead. Yet in most lane changes and turns, it can navigate around obstacles smoothly. This discrepancy indicates that its decision-making is not yet "mature." Part of the reason may be that Xiaomi SU7 deliveries are still ramping up, so data accumulation is insufficient. Lei Jun also mentioned, after a recent City NOA drive covering 50 kilometers of urban roads in Beijing, that there were four takeovers: an accident that could not be navigated around, road construction, queuing to enter and exit a toll station, and a forced merge under a pedestrian bridge. It would therefore not be surprising if these details were optimized in the next version.

Without high-line lidar, can Xiaomi catch up with Huawei ADS 3.0 next year?

Judging from the performance of Xiaomi's new City NOA version, it is already very close to Li Auto's version 6.4 and relatively close to Huawei's ADS 2.0, which at that time had not yet integrated BEV into the GOD network. So, is there a chance of catching up with ADS 3.0 next year?

The answer is that it is highly probable. First, consider the perception hardware. At this stage, achieving the full functionality of Huawei ADS 3.0 requires one lidar (192 lines), three millimeter-wave radars (one of them 4D), 11 cameras, and 12 ultrasonic radars. Xiaomi's electric vehicle carries a similar number of each type of sensor, except that it uses a 128-line lidar supplied by Hesai, whose detection range is 50 meters shorter than that of Huawei's self-developed lidar. Additionally, Xiaomi's vehicle is not equipped with 4D millimeter-wave radar. Huawei uses units from Sunbird with a 280-meter detection range, which are good at capturing irregular obstacles and can establish 3D coordinates for the collected data from the echoes, complemented by the 192-line lidar for real-time mapping. This combination fully exploits the advantages of the GOD network. This is one of the reasons why, on October 25 this year, Huawei released only the ADS Pro V3.1 version with optimized details rather than a version 4.0.

Theoretically, the more lines a lidar has (96, 128, 192), the higher its detection accuracy. Moreover, the price of automotive-grade lidar has dropped to the thousand-yuan level. There seems to be no reason for Xiaomi not to upgrade to a high-line lidar, so why does it still use Hesai's 128-line unit? The reason is that high-line lidars inherently consume a share of computing resources, and Huawei's ADS 3.0 has 1000 TOPS of computing power to work with. Xiaomi's electric vehicle uses two NVIDIA Orin-X chips with a combined computing power of only 508 TOPS, but this is sufficient for the 128-line lidar. Crucially, as long as Xiaomi does not abandon BEV technology, it does not require a high-line lidar.
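The compute figures quoted here can be checked with simple arithmetic. In the sketch below, the 254 TOPS per-chip value is NVIDIA's published peak for Orin-X and is not stated in this article; the 1000 TOPS figure for ADS 3.0 is taken from the text as quoted. Peak TOPS is only a rough proxy for usable compute.

```python
ORIN_X_TOPS_PER_CHIP = 254  # NVIDIA's published peak figure, not from the article
XIAOMI_CHIP_COUNT = 2       # two Orin-X chips, as stated in the article
HUAWEI_ADS3_TOPS = 1000     # figure quoted in the article

xiaomi_tops = ORIN_X_TOPS_PER_CHIP * XIAOMI_CHIP_COUNT
share_of_huawei = xiaomi_tops / HUAWEI_ADS3_TOPS

print(xiaomi_tops)               # 508
print(f"{share_of_huawei:.1%}")  # 50.8%
```

On these numbers Xiaomi has roughly half the headline compute, which is consistent with the article's argument that it has less room to process a high-line lidar.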

The defining characteristic of the BEV network is that it assigns coordinates to the obstacles in the entire scene and then analyzes and predicts the motion trajectory of each obstacle individually. As mentioned earlier, Xiaomi's electric vehicle also employs zoom BEV technology to ensure precision in different scenarios. Moreover, the BEV network requires far less computing power than a high-line lidar does, which frees up a significant share of computing resources for the VLM. In Huawei's ADS 3.0 system, the perception of road details is handled by 4D millimeter-wave radar. Huawei's GOD+PDP and Xiaomi's end-to-end+VLM therefore represent two distinct technological approaches. Although their underlying logic seems similar, their perception requirements differ. As for their large models, both are currently able to move from weekly to daily updates. From here, the competition will focus on data accumulation, because richer and more effective real-world data produces a "smarter" large model after training.

Finally, a prediction can be made. Just seven months after its launch, the Xiaomi SU7 has almost reached the level of Huawei's ADS 2.0 after a year of upgrades. As deliveries and data grow, the next generation is expected to be released soon and to catch up with ADS 3.0, most likely next year. Taking Xiaomi's initial release milestone (June this year) as a reference, a new version may well also arrive in June next year, which would leave only eight months until Xiaomi catches up with Huawei.
