Just as bicycles, watches, and sewing machines defined the 'Big Three' of the industrial era, generative AI, data, and cloud services are emerging as the new 'Big Three' of the intelligent era. Together with the accelerated build-out of global AI infrastructure, they are speeding up humanity's digital migration and binding the new Big Three ever more tightly together. The transition from the physical world to the digital world has reached a pivotal moment.
Over the past two years, large-model-driven generative AI technology has emerged as the most significant technological singularity igniting digital nativity. We have witnessed rapid advancements in various text-to-content applications. Gartner predicts that by 2026, over 80% of enterprises will utilize generative AI APIs or models, or deploy applications supporting generative AI in production environments, presenting immense opportunities and challenges for industrial development.
The development of large models and generative AI has ushered in a paradigm shift in AI, propelling AI infrastructure into a period of intensive investment. Investment scale, policy support, and product application scale are all growing exponentially.
In the next decade, all enterprises will harness the three nativities (cloud-native, digital-native, AI-native) in their strategies to disrupt their businesses, construct their second and third growth curves, rewrite their business models, and achieve leapfrog growth in the digital era.
Generative AI Blossoms in the Cloud
Undoubtedly, generative AI has become one of the essential engines for technological development and application innovation today.
Over the past year, we have witnessed how generative AI, with its transformative and even disruptive power, has revolutionized numerous industries, reshaped productivity in enterprises, and significantly impacted the global economy.
McKinsey's report, 'The Economic Potential of Generative AI: The Next Wave of Productivity,' highlights that generative AI can significantly boost overall economic productivity, potentially adding $2.6 trillion to $4.4 trillion to the global economy annually.
While the era of Artificial General Intelligence (AGI) has yet to arrive, the future of generative AI is upon us, leading to iterative innovations in enterprise IT infrastructure. Behind the massive growth in computing power lies the upgrading of essential capabilities such as underlying servers, chips, and data, with the cloud reshaping everything.
Behind the booming large models, every component—from underlying chips to intermediate platforms to upper-level applications—is vastly different from the past. If enterprises continue to adopt traditional IT architectures, the interface between CPUs and accelerators will limit product performance, making it difficult to support the new demands of the generative AI era.
Simultaneously, the substantial resource consumption caused by AI models is a significant concern for enterprises. Therefore, enterprise architecture design that meets future needs must fully consider cost and sustainability.
In the Chinese market, intelligent computing services hosting generative AI are shaping new growth momentum for cloud computing. IDC's latest 'China Intelligent Computing Services Market (H1 2024) Tracker' report shows that China's intelligent computing services market grew by 79.6% year-on-year in the first half of 2024, reaching RMB 14.61 billion.
Among them, the intelligent computing integration services market increased by 168.4% year-on-year, with a market size of RMB 5.7 billion; the generative AI IaaS market grew by 203.6% year-on-year, with a market size of RMB 5.2 billion; the Other AI IaaS market contracted by 13.7% year-on-year, with a market size of RMB 3.71 billion.
Currently, computing power expenditure for generative AI has become the mainstay of the intelligent computing services market. Taking the AI IaaS market as an example, the generative AI IaaS market has surpassed the Other AI IaaS market in just one and a half years of development, accounting for 58% of the AI IaaS market. In the intelligent computing integration market, newly built intelligent computing centers are designed to meet the future demands of generative AI.
Generative AI cannot create value alone; its workload is highly computationally intensive, requiring more powerful underlying data and computing services. Therefore, having cost-effective infrastructure is one of the key factors for successful application development.
Moreover, intelligent systems are more disruptive because of their broader adaptability in perception, understanding, learning, reasoning, and interaction, as well as their friendlier multimodal interaction capabilities. Architectural design must therefore fully consider feasibility, controllability, and versatility to support rapid switching among multiple scenarios, demands, and tasks.
An intelligent system does not consist of a single large model. Architects need to weigh preferences against different business scenarios and provide capabilities such as multimodal indexing, model selection, model compute scheduling, and model inference. Enterprises, in turn, should choose an intelligent-architecture upgrade path suited to their business scenarios and technical capabilities.
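To make scenario-driven model selection concrete, here is a minimal, purely illustrative routing sketch; the model names, modality sets, cost figures, and latency budgets are invented for the example and do not correspond to any real product.

```python
# Purely illustrative sketch of a scenario-based model router; every name,
# cost figure, and threshold below is a hypothetical placeholder.
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str             # hypothetical model identifier
    modalities: set       # e.g. {"text"} or {"text", "image"}
    relative_cost: float  # arbitrary cost unit per request
    max_latency_ms: int   # rough latency budget the model can meet

CATALOG = [
    ModelProfile("small-text-model", {"text"}, 1.0, 300),
    ModelProfile("large-text-model", {"text"}, 8.0, 1500),
    ModelProfile("multimodal-model", {"text", "image"}, 12.0, 2500),
]

def select_model(required_modalities: set, latency_budget_ms: int) -> ModelProfile:
    """Pick the cheapest catalog entry that covers the required modalities
    and fits within the caller's latency budget."""
    candidates = [
        m for m in CATALOG
        if required_modalities <= m.modalities and m.max_latency_ms <= latency_budget_ms
    ]
    if not candidates:
        raise ValueError("No model satisfies the scenario constraints")
    return min(candidates, key=lambda m: m.relative_cost)

# An interactive text task with a tight latency budget routes to the small model.
print(select_model({"text"}, 500).name)  # -> small-text-model
```

In practice the catalog and the selection policy would be far richer, but the shape of the decision, covering the required modalities at the lowest acceptable cost within a latency budget, stays the same.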
User-friendly AI interactions, the openness of large models, falling API prices, and the flourishing application ecosystems built around plugin services are turning AI technology into a potential piece of infrastructure, like water, electricity, and the internet, permeating and transforming industry after industry.
According to the '2024 AIGC Development Trends Report,' in the medical field, AI applications can accurately assist in diagnosis. For example, Google Health's deep learning models outperform human experts in breast cancer screening. Leveraging powerful image recognition and pattern analysis capabilities, these models can identify subtle changes easily overlooked by the human eye from thousands of X-rays.
The financial sector has also undergone AI-driven transformations. Financial institutions utilize complex algorithms to predict market trends, manage risks, and even execute trades automatically. Machine learning technology analyzes large-scale historical data to identify patterns difficult for humans to detect. For instance, through deep learning, AI can capture minuscule market changes in high-frequency trading and respond within milliseconds, unmatched by any human trader.
AI applications in autonomous driving demonstrate AI's capability to perform tasks in highly complex and dynamic environments. Autonomous driving systems such as Tesla's Autopilot and Alphabet's Waymo use advanced sensor arrays and AI algorithms to enable vehicle navigation and decision-making. Their performance increasingly approaches, and in some scenarios even surpasses, that of human drivers.
Unlocking the Value of Generative AI
The development of generative AI is akin to a marathon, still in its nascent stages. It is not only a long-term competition but also a bridge for global enterprises to collaborate on technology and explore the future technological landscape together.
However, when it comes to practical applications, what should enterprises do amid this tide of change? As a leading global cloud service provider, Amazon Web Services (AWS) has its own answer.
AWS not only continuously innovates at the core cloud service level but also achieves breakthroughs in every technology stack, from chips to models to applications, enabling innovations at different levels to empower and co-evolve with each other. Only such large-scale, full-stack collaboration can truly meet the development needs of today's enterprises, accelerate the value release of cutting-edge technologies, and help various industries reshape their futures.
At the recent 2024 re:Invent China Tour Beijing event, Chen Xiaojian, General Manager of AWS China Product Team, said that almost all applications can be decomposed into several core building blocks. AWS builds excellent core units that users can freely assemble to meet their different business needs in specific scenarios.
Chen Xiaojian, General Manager of AWS China Product Team
Chen believes that a significant change will undoubtedly occur in 2025, as many enterprises will transition from the prototype verification stage to the production stage, which is inevitable. At that time, enterprise needs will become more complex, requiring not only model selection but also various technical supports.
This year, AWS has comprehensively upgraded its generative AI technology, data strategy, and cloud services. In generative AI, AWS has strengthened its three-tier technology stack of infrastructure, model tools, and applications, introducing the Amazon Nova series of foundation models, including Nova Micro, Lite, Pro, and Premier, as well as Nova Canvas for high-quality image generation and Nova Reel for video generation. These models deliver strong performance while costing at least 75% less than comparable top-performing models in Amazon Bedrock.
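As a concrete illustration, a Nova model exposed through Amazon Bedrock can be called with the standard boto3 Converse API. The model ID below is an assumption based on the Nova naming scheme; confirm the exact identifier and its regional availability in the Bedrock console before relying on it.

```python
# Minimal sketch: calling an Amazon Nova model through the Amazon Bedrock
# Converse API with boto3. The model ID is assumed, not verified.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="amazon.nova-lite-v1:0",  # assumed Nova Lite identifier
    messages=[
        {"role": "user",
         "content": [{"text": "Summarize our Q3 sales report in three bullet points."}]}
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.3},
)

# The Converse API returns the assistant message as a list of content blocks.
print(response["output"]["message"]["content"][0]["text"])
```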
In April 2023, AWS launched its first-generation Titan models, which supported only a single, text-based modality. If Titan was a trial run, the Amazon Nova series represents AWS's true capabilities and a significant stride forward. What considerations lie behind this?
Chen Xiaojian said that AWS launched six differently positioned Nova models this year and will add speech-to-speech and any-to-any models in the future. The point of offering this range is to give users better choices and allow tighter integration with their products. The lineup follows Amazon's 'working backwards' method: models are defined by first understanding customer needs, which is why they are tiered as Micro, Lite, Pro, and Premier, with more models of different capabilities and positioning to come.
Meanwhile, AWS has also enhanced core services such as Amazon SageMaker, Amazon Bedrock, and Amazon Q, offering more diverse model options, deeper integration with application scenarios, and lower training and inference costs. AWS is committed to making it easier and more economical for enterprises to integrate generative AI into their business practices, comprehensively accelerating the pace of generative AI innovation.
The Amazon Bedrock platform has added models from Luma AI and poolside, updated to the latest models from Stability AI, and now offers more than 100 popular, emerging, and specialized models through the Bedrock Marketplace. Bedrock has also introduced latency-optimized inference, model distillation, prompt caching, and other functions that significantly improve inference efficiency, and it strengthens data utilization through knowledge base capabilities such as GraphRAG. Innovations such as Automated Reasoning checks and multi-agent collaboration further enhance AI safety and agent development.
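For the knowledge base capabilities mentioned above, a typical retrieval-augmented query goes through the Bedrock agent runtime. The following is a rough sketch assuming the retrieve_and_generate API; the knowledge base ID and model ARN are placeholders to be replaced with your own resources.

```python
# Sketch: querying a Bedrock knowledge base via retrieve_and_generate.
# The knowledge base ID and model ARN below are placeholders.
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent_runtime.retrieve_and_generate(
    input={"text": "Which contract clauses mention data residency requirements?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB-EXAMPLE-ID",  # placeholder
            # Assumed foundation-model ARN form; verify for your region/model.
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-pro-v1:0",
        },
    },
)

# The generated answer is grounded in documents retrieved from the knowledge base.
print(response["output"]["text"])
```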
Regarding underlying model training, four innovations in Amazon SageMaker AI are particularly noteworthy: new training recipes for Amazon SageMaker HyperPod, flexible training plans, task governance, and curated AI applications from AWS partners. These features help customers start training popular models faster, save weeks of training time through flexible training plans, and cut costs by up to 40%, providing strong support for enterprises exploring generative AI.
Regarding data strategy, AWS has launched a series of innovations. The next generation of Amazon SageMaker brings data, analytics, and AI together in a one-stop solution with a unified studio, making it easier for teams to collaborate on data insights and AI projects. These moves align with the trend of customers combining analytics, machine learning, and generative AI to extract deeper insights, helping them gain a first-mover advantage in the data-driven era.
In the field of cloud services, AWS continues to make breakthroughs in core areas such as computing, networking, storage, and databases. In computing, AWS has launched Amazon EC2 Trn2 instances powered by Trainium2 chips, along with Amazon EC2 Trn2 UltraServers designed specifically for trillion-parameter models.
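Since Trn2 capacity is exposed as ordinary EC2 instances, requesting it looks like any other instance launch. The sketch below assumes a trn2.48xlarge instance type and uses a placeholder AMI; verify the instance sizes and images actually offered in your region.

```python
# Sketch: requesting Trainium2-based capacity with the standard EC2 API.
# Instance type string and AMI ID are assumptions / placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

result = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder Deep Learning AMI ID
    InstanceType="trn2.48xlarge",      # assumed Trn2 instance type name
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "workload", "Value": "llm-training"}],
    }],
)

print(result["Instances"][0]["InstanceId"])
```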
In storage services, Amazon S3 has gained metadata capabilities and a new S3 Tables bucket type optimized for tabular data, significantly improving query and transaction processing performance.
In database services, AWS has launched Amazon Aurora DSQL, a serverless distributed SQL database, to meet customers' demands for running workloads across multiple regions with strong cross-region consistency. Together, these updates give users more powerful computing capabilities and a more efficient, reliable cloud service experience, further consolidating AWS's leading position in cloud computing.
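Because Aurora DSQL is positioned as a PostgreSQL-compatible serverless database, existing drivers should be able to connect to it. The sketch below uses psycopg2 with a placeholder endpoint and an IAM authentication token supplied out of band; treat it as an assumption-laden illustration rather than a verified recipe.

```python
# Sketch: connecting to an Aurora DSQL cluster with a standard PostgreSQL
# driver. The endpoint is a placeholder, and the password field carries a
# short-lived IAM authentication token generated out of band (not shown).
import os
import psycopg2

conn = psycopg2.connect(
    host="my-cluster.dsql.us-east-1.on.aws",  # placeholder cluster endpoint
    port=5432,
    dbname="postgres",
    user="admin",
    password=os.environ["DSQL_AUTH_TOKEN"],   # IAM auth token, not a static password
    sslmode="require",
)

with conn.cursor() as cur:
    cur.execute("SELECT order_id, status FROM orders WHERE status = %s", ("pending",))
    for order_id, status in cur.fetchall():
        print(order_id, status)

conn.close()
```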
'Underlying Architect' in the Era of Generative AI
Cloud services are the key productivity driver supporting digital innovation. It is evident that cloud vendors play crucial roles behind every technological advancement.
Behind this AI wave, cloud vendors provide infrastructure, AI services, and application tools for AI research and development, actively promoting AI research and practical applications.
AWS is no exception. Besides AI services and application tools, AWS also provides the market with abundant computing resources and powerful cloud services.
Faced with the explosion of computing power demand in the era of generative AI, AWS provides better cost-effectiveness through self-developed chips, optimizes computing power costs through various product combinations such as computing, networking, and storage, and fully meets users' diverse computing power needs.
Cloud vendors must not only act as 'underlying architects' in the era of generative AI but also overcome challenges such as data security and privacy protection, providing users with safe and convenient services, enabling the application of generative AI to penetrate more widely and deeply into every industry and field.
Looking to the future, we expect cloud providers to continue serving as 'underlying architects,' spearheading the evolution of generative AI technology and empowering society to tap into AI's vast potential.