Is "end-to-end" the ultimate goal of intelligent driving technology, or marketing chaos?


11/13 2024 397

The pace of technological evolution has surpassed consumers' cognitive speed, and the domestic intelligent driving market is in turmoil.

Since the beginning of this year, the automotive industry, long accustomed to coining buzzwords, has hyped up a new term, "end-to-end", and drummed it into consumers' ears with unprecedented frequency. Coining buzzwords is the companies' business, but explaining them is left to salespeople.

Judging from the sales training scripts of major brands, almost all salespeople sum up "end-to-end" intelligent driving in one word: AI. "Our latest intelligent driving system uses AI technology and is currently the most advanced."

Musk once introduced the capabilities of end-to-end deep learning, calling it "image input, control output". Based on this, Tesla launched the FSD v12.3 version of its autonomous driving system in the United States, which received widespread acclaim.

This praise came not only from American users but also from leaders of emerging domestic automotive companies.

Xpeng Motors' He Xiaopeng said he tested Tesla's FSD in the United States and found it very smooth.

Lin Bin, Vice Chairman of Xiaomi, said he tested Tesla's FSD in the United States and found it very smooth.

Yu Chengdong, Chairman of Huawei's Consumer Business Group, said his team tested Tesla's FSD in the United States and found Huawei's intelligent driving far ahead.

Regardless of their public attitudes, after Tesla, all automakers began investing heavily, aiming for "end-to-end" autonomous driving in the future.

The "End-to-End Autonomous Driving Industry Research Report" jointly released by Chentao Capital and two other parties found that 90% of the over 30 frontline industry experts interviewed said their companies had invested in researching and developing end-to-end technology, and most technology companies believed they could not afford to miss out on this technological revolution.

This has formed a consensus to a certain extent, unifying the previously chaotic jargon of intelligent driving, including NOA, NGP, NCA, NOP, etc.

From which end to which end?

In fact, end-to-end is not a brand-new concept. In the field of artificial intelligence, it is a commonly used method. For example, in various AI translation and speech-to-text applications, end-to-end is used: raw data is fed into a neural network, and after a series of calculations, the final result is directly given.

The same holds in intelligent driving. Radar and the other sensors on the car perceive road information, and the system turns it directly into driving decisions, expressed as vehicle actions such as steering wheel angle and throttle pedal opening.

This is in stark contrast to almost all previous intelligent assisted driving systems that relied on predefined rules for judgment.

Before the emergence of end-to-end, intelligent driving systems first had to identify lanes, pedestrians, vehicles, signs, and other key information through sensors. Engineers then wrote hundreds of thousands of lines of C++ code to handle individual scenarios, such as stopping at red lights and proceeding at green lights. Each behavior had its own rules and conditional checks, but this approach ultimately struggled to cover complex and ever-changing real-world driving conditions.
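The rule-based approach described above can be pictured with a minimal Python sketch (the article mentions C++; Python is used here only for brevity). The `Scene` fields, the action names, and the 10-meter gap are hypothetical and are not taken from any real driving stack. The point is only that every behavior needs an explicit rule, and anything unanticipated falls through.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scene:
    """Hypothetical perception output; field names are illustrative only."""
    traffic_light: Optional[str]          # "red", "green", or None if no light is seen
    pedestrian_ahead: bool
    lead_vehicle_gap_m: Optional[float]   # distance to the car ahead, None if lane is clear


def rule_based_action(scene: Scene) -> str:
    """Pick an action from hand-written rules, checked in priority order."""
    if scene.pedestrian_ahead:
        return "brake"
    if scene.traffic_light == "red":
        return "stop"
    # Assumed threshold: keep at least 10 m to the vehicle ahead.
    if scene.lead_vehicle_gap_m is not None and scene.lead_vehicle_gap_m < 10.0:
        return "slow_down"
    if scene.traffic_light == "green" or scene.traffic_light is None:
        return "proceed"
    # Anything the engineers did not anticipate (e.g. a flashing yellow) ends up here.
    return "unhandled"
```

A light state nobody wrote a rule for, such as `Scene("flashing_yellow", False, None)`, returns `"unhandled"`, which is the coverage problem in miniature.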

In contrast, end-to-end directly responds through continuous AI learning and calculation, omitting almost all intermediate logic.
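As a rough illustration of "raw input in, control out", the toy sketch below pushes a handful of sensor values through a tiny two-layer network and reads off steering and throttle. Everything here is an assumption made for illustration: the four-value input, the layer sizes, and the hard-coded weights, which in a real system would be learned from large amounts of driving data rather than written by hand.

```python
import math
from typing import List, Tuple


def dense(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """One fully connected layer: weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def end_to_end_policy(sensor_values: List[float]) -> Tuple[float, float]:
    """Map raw sensor values straight to (steering, throttle), with no hand-written rules."""
    # Placeholder weights, purely illustrative; a trained model would supply these.
    w1 = [[0.5, -0.5, 0.0, 0.0],
          [0.0, 0.0, 0.5, 0.5]]
    b1 = [0.0, 0.0]
    w2 = [[1.0, 0.0],
          [0.0, 1.0]]
    b2 = [0.0, 0.0]

    hidden = [math.tanh(v) for v in dense(sensor_values, w1, b1)]
    steer_raw, throttle_raw = dense(hidden, w2, b2)

    steering = math.tanh(steer_raw)                     # squashed into [-1, 1]
    throttle = 1.0 / (1.0 + math.exp(-throttle_raw))    # squashed into (0, 1)
    return steering, throttle
```

There is no `if red_light:` anywhere in this function; whatever behavior it has lives entirely in the weights, which is also why such a model is hard to inspect.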

Because AI involves large models and deep learning, end-to-end naturally carries an aura of complex, cutting-edge technology. It is much like how many people have heard of ChatGPT but still do not understand what large models actually are. By analogy, ChatGPT is a typical end-to-end model: text goes in and an answer comes out directly.

So far, no company has tried to explain the basics of end-to-end to consumers in accessible language, or even to picture it with a concept like door-to-door, from starting point to endpoint: letting the vehicle automatically take you from point A to point B.

The interpretation described earlier is also the ultimate version of end-to-end, and a significant gap remains between it and the vast majority of "end-to-end" intelligent driving systems currently promoted on the market.

Xia Yiping, CEO of JiYue, said, "End-to-end is not something that can be accomplished overnight. First of all, I don't think there is any company on the market that is 100% end-to-end. No one in the world is completely end-to-end. I think whether it's end-to-end or mapless, they're all marketing gimmicks. For ordinary people, I think the most important thing is still a good experience."

From the perspective of the evolution of autonomous driving architecture, end-to-end can be divided into several stages or technical routes. In the most basic stage, "perception end-to-end", the autonomous driving architecture is split into two main modules: perception, and prediction, decision-making, and planning. Of these, the perception module achieves module-level "end-to-end" through BEV (Bird's Eye View) technology based on multi-sensor fusion. Introducing the transformer neural network model has significantly improved the accuracy and stability of recognition results compared with previous methods. However, the final planning and decision-making module is still primarily rule-based.

The second stage is end-to-end decision-making, in which the functional modules from prediction through decision-making to planning are integrated into a single neural network.

Single-model end-to-end is considered the ultimate version. In this context, there is no clear division of functions such as perception, decision-making, and planning. The same deep learning model is used directly from raw signal input to final trajectory planning output. This is truly end-to-end.
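The three stages can be summarized in a small classification sketch. The `Architecture` fields and the returned labels are hypothetical names chosen to mirror the article's wording; they are not an industry standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Architecture:
    """Which parts of a driving stack are learned (illustrative flags only)."""
    perception_learned: bool   # perception is a neural network, e.g. BEV + transformer
    planning_learned: bool     # prediction, decision-making, and planning share one network
    single_model: bool         # one network from raw signals to trajectory, no module boundary


def end_to_end_stage(arch: Architecture) -> str:
    """Name the stage, following the article's three-way split."""
    if arch.single_model:
        return "single-model end-to-end"
    if arch.perception_learned and arch.planning_learned:
        return "end-to-end decision-making"
    if arch.perception_learned:
        return "perception end-to-end"
    return "rule-based modular"
```

For example, a stack with learned perception but rule-based planning, `Architecture(True, False, False)`, is labeled `"perception end-to-end"`.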

Recently, some domestic companies have claimed to be end-to-end perception or end-to-end decision-making, but these are only minor aspects of "end-to-end" and can only be considered as purely data-driven perception and purely data-driven decision-making planning stages.

In other words, the better ones are only the fusion of the first two modules and cannot achieve the result of output control (execution).

Lou Tiancheng, CTO of Pony.ai, said, "End-to-end is not a particularly large model. For example, the end-to-end solution of Xpeng Motors can actually run on an Orin-X, and it also involves a lot of rules."

Compared to modular solutions, the single-model end-to-end solution, although more complex in training and debugging, theoretically has a higher ceiling in terms of final effects.

The "deified" end-to-end

Amid the overwhelming "end-to-end" bombardment, a divide has emerged regarding the broad and narrow definitions of the technology.

End-to-end-related companies are divided into two factions: one is the "technological fundamentalist faction" consisting mainly of technical personnel and scholars exploring and researching cutting-edge technology. They believe that the end-to-end promoted by many companies on the market is not truly end-to-end.

Zhu Xichan, a professor at the School of Automotive Studies at Tongji University, once bluntly stated, "Automakers promote end-to-end more for traffic, and in reality, few domestic automakers have the technical strength to do 'end-to-end'. But they can't afford to lose verbally; it's like fighting a war; once you lie down, you won't get up again."

The other faction is the "pragmatist faction" consisting mainly of automakers' suppliers eager to implement projects. They argue that as long as the basic principles are met and product performance is improved, the precise connotation of end-to-end is not important.

Wang Naiyan, CTO of TuSimple, called on the industry earlier this year to avoid falling into the misconception of narrow end-to-end, as it is detrimental to the mass production of intelligent driving.

After all, by adding more adjectives, any car can be the best-selling model; similarly, by defining a sufficiently narrow scope, any enterprise can claim to have end-to-end capabilities in a certain sector.

In June 2017, Musk poached a Slovak-born researcher from OpenAI: Andrej Karpathy, who went on to become Tesla's director of AI.

Karpathy then led a team at Tesla to rewrite the autonomous driving algorithm and develop BEV pure-vision perception technology, which is now the hot end-to-end technology, taking Tesla's autonomous driving to a new stage. This has also influenced the technical paths of a large number of domestic enterprises.

Seeing the future, Tesla did not hesitate to rewrite its autonomous driving algorithm and reconstruct the infrastructure for training deep neural networks. However, this does not mean that end-to-end, or Tesla's end-to-end, is the optimal solution in the field of intelligent driving at this moment.

Zhang Qi from Wenjie Automotive Intelligent Driving Academy said in a public class to BC, "End-to-end is not omnipotent. Its inherent 'black box' characteristic determines that it cannot simply constrain the safety boundaries of the system through clear and explainable rules, posing safety challenges."

To illustrate, Zhang Qi gave a few simple examples. Take Doubao, a large model that currently performs well in China: on certain specific questions, it may still give irrelevant and nonsensical answers.

"The underlying algorithm of AI is statistical logic that calculates the correlation between things. The derived causal chain may violate common sense or even provide a wrong and unpredictable answer. This is known as the 'hallucination' tendency in the industry."

Nonsense in a chat is tolerable, but in intelligent driving any incorrect output can have fatal consequences.

On the other hand, end-to-end cannot reproduce complex and occasional extreme events, challenging its interpretability and versatility. While raising the ceiling, it lowers the floor, resulting in the so-called "seesaw effect". Therefore, in addition to end-to-end, almost all automakers adopt rule-based backup methods.

Wenjie, for example, has an "instinctive safety network" to guard the red line, and Xpeng also uses a set of rules based on XNGP as a backup.

Whether because end-to-end cannot cover all extreme scenarios, or because it is unrealistic in the short term for it to learn to discriminate among options and stably output the optimal solution, relying on it alone is not yet practical, or is at least risky.
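The idea of a rule-based backup around a learned model can be sketched as follows. This is not Wenjie's or Xpeng's actual mechanism, which the article does not describe; the `Control` fields, the 2-second time gap, and the full-brake override are assumptions chosen to show how explicit rules can bound a black-box output.

```python
from dataclasses import dataclass


@dataclass
class Control:
    steering: float   # -1.0 (full left) to 1.0 (full right)
    throttle: float   # 0.0 to 1.0
    brake: float      # 0.0 to 1.0


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def apply_rule_backup(proposed: Control,
                      obstacle_distance_m: float,
                      speed_mps: float,
                      min_time_gap_s: float = 2.0) -> Control:
    """Bound the model's proposed control with explicit, explainable rules."""
    # Rule 1: never pass out-of-range commands to the actuators.
    steering = clamp(proposed.steering, -1.0, 1.0)
    throttle = clamp(proposed.throttle, 0.0, 1.0)
    brake = clamp(proposed.brake, 0.0, 1.0)

    # Rule 2 (assumed red line): if the time gap to the obstacle ahead is too
    # short, ignore the model's throttle and brake and apply full braking.
    if speed_mps > 0 and obstacle_distance_m / speed_mps < min_time_gap_s:
        return Control(steering=steering, throttle=0.0, brake=1.0)

    return Control(steering=steering, throttle=throttle, brake=brake)
```

With an obstacle 10 m ahead at 20 m/s, the time gap is 0.5 s, so a proposed `Control(0.1, 0.8, 0.0)` is overridden to full braking regardless of what the model wanted.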

In this regard, Song Yang, founder and CEO of Zhixing Technology, said, "The end-to-end solution has the characteristics of 'high ceiling but low floor'. In layman's terms, if done well, it can achieve excellent results, but if done poorly, it will be worse than traditional solutions."

The storm is coming

From the earliest intelligent driving relying on high-precision maps to later mapless intelligent driving and now to various forms of end-to-end, the pace of technological evolution has surpassed consumers' cognitive speed, stirring up turmoil in the domestic intelligent driving market.

The first to be affected are map providers supporting intelligent driving businesses. In the transition to mapless and even end-to-end intelligent driving, map providers are the first to be abandoned. High-precision maps, once considered indispensable for advanced intelligent driving, are being marginalized.

As automakers "de-map" on the way to end-to-end, Cheng Peng, CEO of NavInfo, publicly hit back: "The reason why some automakers emphasize the 'mapless' technical route is mainly due to the lack of mapping qualifications, intellectual property rights, and safety awareness."

Some automaker executives have made it clear that if map freshness cannot be guaranteed, forcing high-precision maps into urban driving only increases costs without improving results, and cannot guarantee accuracy. In Cheng Peng's view, however, everyone has been shouting about going mapless in recent years, yet in fact every automaker and every autonomous driving solution provider is still using high-precision maps.

They may refuse to concede in words, but their actions are more honest. Map providers are also responding and adjusting quickly. For example, high-precision maps used to be installed on the vehicle side, but now they are used on the training side: intelligent driving development has settled into a closed loop of cloud plus vehicle, in which the model is trained and validated in the cloud and then deployed to the vehicle for use, data collection, and feedback.

NavInfo, Gaode, and Baidu have successively launched their own lightweight map products. Compared to the centimeter-level accuracy of high-precision maps, lightweight maps generally have meter-level accuracy but can achieve higher freshness.

Compared with map providers, who are transforming themselves, intelligent driving professionals may be the group that suffers most from technological progress. If a grain of dust from an era lands as a mountain on an individual, then end-to-end is more like the mountain of an entire era for them.

The director of autonomous driving at an AI chip company once told the media that an overall end-to-end change is equivalent to starting over. A large number of once-hot intelligent driving engineers are facing the harsh reality of either relearning or leaving.

Previously, when brands rolled out city NOA, they ran into a large number of extreme scenarios, which required a sizable number of rule-writing programmers and test engineers to handle. After the switch to an end-to-end architecture, "high-quality data" and "top AI talent" may become the more important resources.

A research and development team that once had more than a thousand people now needs only two or three hundred. In 2023, NIO's intelligent driving team had over 1,000 people. Facing media questions at this year's NIO IN, Li Bin also responded candidly that although intelligent driving no longer requires such a large headcount, the company will redeploy those people internally to other areas.

Li Auto (Ideal) also began large-scale layoffs in its 1,300-person intelligent driving team in April this year, cutting so deep into the core that it then had to rehire urgently.

The product team of the Xpeng P7+ also told BC that many people have left the intelligent driving side since the end of last year, and that those who do not learn will be eliminated. Even the former head of planning and control at Tesla left earlier this year.

End-to-end teams need fewer people, but the talent threshold is higher. Large models themselves require a team with a strong background in deep learning. At the solution-building stage, there is an even greater need for infrastructure (infra) talent with a deep understanding of each module of perception, planning, and control, as well as knowledge of what the various chip compute platforms support and of their different AI inference frameworks.

After all, most of the programmers who previously wrote rules did not have a deep learning or AI background. In the surging tide of the times, if they are not swept forward, they can only be stranded on the shore.

Although from an industry-wide perspective, engineers who traditionally do rules-based algorithms have not yet encountered large-scale layoffs, it is foreseeable that the crossroads are just ahead.

