During the 19th-century gold rush, the biggest profits went not to those who dug for gold but to those who sold shovels and jeans. AI Infra plays the same shovel-seller role in today's AIGC era.
Using the three-tier cloud computing architecture as an analogy, AI Infra corresponds to the PaaS layer: the middleware infrastructure connecting computing power and applications. It spans hardware, software, toolchains, and optimization methods, providing a one-stop platform for model training and deployment along with development tools for building large-model applications. Computing power, algorithms, and data form the IaaS layer, while the various open- and closed-source models are the large-model era's evolution of SaaS, known as MaaS.

As large-model applications continue to accelerate, the value of AI Infra is being further unlocked. According to CICC, the AI Infra industry is in the early stage of rapid growth, and each sub-sector is expected to sustain a growth rate of over 30% for the next three to five years. Once large models enter large-scale deployment, the infrastructure needed to train, deploy, and apply them becomes the crucial link, making AI Infra the standout business opportunity behind the application boom: selling shovels during the gold rush.
Unlocking AI Productivity through the Middle Platform Model
Judging from the evolution of the ICT industry, a three-tier architecture seems to be the inevitable endgame. In the traditional on-premises era, foundational software such as operating systems, databases, and middleware shielded developers from the complexity of the underlying hardware, handling hardware interaction, storage management, and network communication scheduling so that application developers could focus on business logic. In the era where everything is defined by the cloud, the classic IaaS/PaaS/SaaS architecture took shape, with the PaaS layer providing development environments and data analysis and management services that laid the foundation for cloud computing's rapid adoption. After a long dormancy, AIGC has accelerated the generalization of artificial intelligence, and the industry is restructuring at speed. Computing power and applications are undoubtedly the most prominent players, but the gap between them is vast, leaving large models at risk of hanging in mid-air, connected to neither side.
In this sense, AI Infra is the bridge. Like the foundational software and PaaS layers of earlier eras, it builds a new software stack and integrated services that unlock computing power, optimize models, and support application development, making it the backbone connecting compute and applications. AI Infra encompasses all tools and processes related to development and deployment. As cloud computing evolved, concepts such as DataOps, ModelOps, DevOps, MLOps, and LLMOps emerged, each targeting efficiency in a different part of the development and deployment lifecycle: DataOps improves storage efficiency at the IaaS layer and data processing at the PaaS layer; DevOps and MLOps raise development and deployment efficiency at the PaaS layer; and LLMOps targets efficiency at the MaaS layer.
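To make the XOps idea concrete, the sketch below walks a corpus through the data, model, and deployment stages that such tooling automates. It is a minimal illustration: every name in it (the functions, the dataset, the endpoint URL) is a hypothetical placeholder, not any vendor's actual API.

```python
# Hypothetical sketch of the stages an XOps-style pipeline automates.
# All names are illustrative placeholders, not a real vendor's API.
from dataclasses import dataclass

@dataclass
class PipelineRun:
    dataset: str
    base_model: str
    endpoint: str

def prepare_data(source: str) -> str:
    """DataOps stage: clean, deduplicate, and version the corpus."""
    print(f"cleaning and versioning {source}")
    return f"{source}-v1"

def finetune(dataset: str, base_model: str) -> str:
    """ModelOps/MLOps stage: adapt the base model to the versioned data."""
    print(f"fine-tuning {base_model} on {dataset}")
    return f"{base_model}-ft"

def deploy(model: str) -> str:
    """DevOps/LLMOps stage: package and expose the model for serving."""
    print(f"deploying {model}")
    return f"https://models.example.com/{model}"

def run_pipeline(source: str, base_model: str) -> PipelineRun:
    dataset = prepare_data(source)
    model = finetune(dataset, base_model)
    return PipelineRun(dataset, base_model, deploy(model))

if __name__ == "__main__":
    run = run_pipeline("support-tickets", "open-base-7b")
    print(run.endpoint)
```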
In fact, even before the AIGC surge, theory and practice around the AI middle platform were already in full swing. The middle platform of that era, however, was more of a 'firefighter': it handled a myriad of complex, tedious tasks yet struggled to win recognition from upstream and downstream stakeholders. Large models have created a far broader stage for AI platformization, making AI Infra's shovel-selling logic more certain and its room for growth considerable. Just as countless fillings can sit between two slices of bread, AI Infra, sandwiched between computing power and applications, embraces a similar diversity. Broadly speaking, AI Infra covers all the underlying facilities for large-model training and deployment; narrowly defined, its core is the foundational software stack, whose main goals are to optimize computing power and algorithms and to ease application deployment. The relative openness of the definition leaves room to explore different paths: based on their respective resource endowments and market positions, incumbents and newcomers alike are actively expanding AI Infra's boundaries, and many of their practices are worth studying.
Will AI Infra Be the Next Application Hotspot?
Rather than competing on models themselves, competing on AI applications has become the industry consensus. Baidu's Robin Li has argued that millions of applications will be built on top of foundation models, and that their transformation of existing businesses will reach further than zero-to-one disruptive innovation.
The supply of AI applications keeps growing. IDC predicted earlier this year that more than 500 million new applications will emerge globally in 2024, equal to the total introduced over the past 40 years. Recently, video generation models have arrived in quick succession, with Kuaishou's Keling, ByteDance's Jimeng, and SenseTime's Vimi making their debuts alongside a wave of AI search and AI companionship products. The explosion of large-model applications looks like a foregone conclusion: InfoQ Research projects that the AGI application market will reach 454.36 billion yuan by 2030, and the opportunity at the application layer is drawing participants from virtually every industry. Beneath these applications, AI Infra is the hidden catalyst of the boom.
Currently, the large-model industry chain divides broadly into three tiers: data preparation, model construction, and model products. Abroad, the chain is relatively mature and populated by numerous AI Infra companies; in China, this segment remains largely undeveloped. In an uncertain landscape, what matters is identifying a clear path and planting significant milestones quickly. The AI Infra market is still in a free-for-all phase, with each tech giant trying to close the loop within its own ecosystem.

China's tech giants each run their own training architectures. Huawei's model adopts a three-tier architecture: at the bottom sit general-purpose large models with strong generalization capabilities; above them sit industry-specific large models and deployment models tailored to particular scenarios and workflows. The advantage is that a trained large model can be taken into a vertical industry without retraining, at only 5%-7% of the cost of the tier above. Alibaba, for its part, has built a unified AI foundation that hosts CV, NLP, and text-to-image models for training; training its M6 large model consumed only 1% of the energy GPT-3 required.
Baidu and Tencent have corresponding layouts as well. Baidu maintains a Chinese knowledge graph covering more than 5 billion entities, while Tencent's warm-start curriculum learning cuts the training cost of a trillion-parameter model to one-eighth that of a cold start. The giants' focuses differ, but the common thread is cost reduction and efficiency gains, achieved largely inside closed-loop training systems. The more mature industry chain abroad, by contrast, has given rise to a crowd of independent AI Infra companies.
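The economics behind these layouts (adapting a general model to a vertical at a small fraction of the cost of training from scratch) can be illustrated with parameter-efficient fine-tuning. The sketch below uses LoRA via Hugging Face's peft library as one widely used technique of this kind; it is not any of these vendors' actual pipelines, and gpt2 merely stands in for a real foundation model.

```python
# Illustrative only: LoRA-style parameter-efficient fine-tuning adapts a
# general model to a vertical domain while training a tiny fraction of
# its weights. Not any specific vendor's pipeline; gpt2 is a stand-in.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Freeze the base weights; train only small low-rank adapter matrices
# injected into the attention projections.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```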
If developing AI applications is building houses, AI Infra is the construction crew supplying the cement and steel. Its value lies in being an integrated platform: it bridges the computing-chip layer below and the AI application layer above, letting developers invoke capabilities with a single click while cutting compute costs, raising development efficiency, and preserving model performance. Simplifying development and smoothing AI adoption are AI Infra's missions; in essence, the market for AI applications dictates the opportunity for AI Infra.
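As a concrete picture of that 'single click': many serving stacks expose an OpenAI-compatible HTTP interface, so application code can stay as small as the sketch below. The endpoint URL, API key, and model name are hypothetical placeholders, not a real deployment.

```python
# Minimal sketch: calling a deployed model through an OpenAI-compatible
# endpoint, a convention many serving stacks adopt. The URL, key, and
# model name below are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://infra.example.com/v1",  # hypothetical AI Infra platform
    api_key="sk-placeholder",
)

reply = client.chat.completions.create(
    model="vertical-7b-chat",  # hypothetical deployed model
    messages=[{"role": "user", "content": "Summarize this week's work orders."}],
)
print(reply.choices[0].message.content)
```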
Some AI Infra companies specialize in data annotation, data quality, or model architecture, and within those niches their focus lets them outperform large enterprises on efficiency, cost, and quality. Data quality company Anomalo, for instance, a supplier to Google Cloud and Notion, uses machine learning to automate assessment and generalize data-quality inspection, enabling deep data insight and quality checks. Companies like this resemble Tier 1 suppliers in the automotive industry: by integrating supplier resources, large-model enterprises avoid redundant effort and reduce costs. China lags here for two main reasons. First, the major players in China's large-model ecosystem are large enterprises, each with its own training system, leaving little room for external suppliers. Second, China lacks a comparably mature ecosystem of startups and small-to-medium enterprises, so AI suppliers struggle to find a niche outside the giants.
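Anomalo's actual product is proprietary; purely to illustrate the kind of automated check such suppliers productize, the toy sketch below flags a batch of records that drifts from historical statistics. The column name and data are invented.

```python
# Toy illustration of an automated data-quality check (not Anomalo's
# method): flag values in a new batch that drift from historical stats.
import pandas as pd

history = pd.DataFrame({"order_value": [100, 102, 98, 101, 99, 103, 97]})
batch = pd.DataFrame({"order_value": [100, 250, 99]})  # 250 looks suspicious

mean = history["order_value"].mean()
std = history["order_value"].std()
z = (batch["order_value"] - mean) / std

anomalies = batch[z.abs() > 3]  # rows more than 3 standard deviations out
print(anomalies)
```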
Google, for example, willingly shares training-data results with its data-quality suppliers, helping them improve their processing capabilities; as the suppliers improve, they feed higher-quality data back to Google, a virtuous cycle. The weakness of China's AI Infra ecosystem directly raises the barrier to entry for large-model startups: if building a large model in China is like cooking a meal, one must first till the soil and plant the seeds. A salient feature of the current AI 2.0 wave is polarization, with attention concentrated on the model layer and the application layer; the middle layer, AI Infra, remains largely unexplored and could be the next opportunity.
Shovels Are Hard to Sell, Gold Mines Are Hard to Dig
Despite the immense business potential of the AI Infra layer amid the large-model application boom, AI Infra companies, however formidable in their own domains, remain vulnerable to shifting tides. NVIDIA's CUDA ecosystem has been built up over nearly two decades, and the most advanced models and applications run on CUDA first. Each piece of hardware exposes its own interfaces; CUDA unifies them, letting developers use one standard programming model across different hardware. Because developers converge on a common language for their work, the CUDA ecosystem keeps deepening, and today it commands over 90% of the AI computing market. As AI models standardize and structural differences shrink, reducing the need to orchestrate many model variants, that moat may thin; even so, NVIDIA remains the unchallenged leader, and industry insiders expect it to hold more than an 80% share of AI hardware over the next three to five years.

For AI Infra shovel sellers, then, NVIDIA guards the mine's entrance, selling both the tickets and the shovels. And after arduously finding a path into the mine, they meet miners accustomed to digging with their bare hands who reject new shovels: Chinese enterprises show low willingness to pay for software and prefer integrated services, and domestic SaaS investment has plummeted, making it hard for AI Infra companies to commercialize through hardware or software sales alone.
As AI applications proliferate, the winners will be those who can provide efficient, convenient, one-stop large-model deployment for diverse scenarios. That demands coordinated development of the underlying technology, the middle platform, and the upper-level applications; only comprehensive, balanced capabilities can carry the AI journey forward steadily. The reshaping of countless industries by artificial intelligence has only just begun, and the 'long slope and thick snow' laid down by AI Infra will keep this super-long race on a stable, sustainable track. This year, data infrastructure gained independent status in national top-level planning, foreshadowing a similar elevation of AI infrastructure's strategic position.
[Original Report by TechCloud]
For reprints, please indicate 'TechCloud' and attach this article's link