01/08 2025
Reflecting on 2024, the large model industry moved from frenzy to rationality: an initial surge, a mid-year return to rationality, and steady progress by year's end. Over this period, large AI models not only advanced significantly in technology but also expanded broadly across application fields, presenting unprecedented opportunities and challenges to various industries.
Here, industry experts summarize the top ten trends in the development of large AI models in 2024, highlighting the changes and the advent of industrial AGI.
Author | Dou Dou
Editor | Pi Ye
Produced by | Industrialists
In January 2025, the global technology community once again turned its attention to CES in Las Vegas. At this gathering of top technology enterprises, innovative products, and cutting-edge technologies, AI was the brightest star; large model technology in particular showcased its power and broad application prospects through a series of striking real-world applications.
This not only brought a satisfying conclusion to the 'wild' year of AI large models in 2024 but also signaled that AI technology will continue to spearhead the new wave of technological development in the coming year.
From the first inclusion of 'AI+' in the government work report in March 2024 to the rapid decline in the price of large model services, the year's developments let more enterprises acquire powerful AI capabilities at lower cost, propelling the adoption of large models across various industries.
Simultaneously, the rise of domestic open-source large models broke the market monopoly held by foreign large models, providing robust support for the independent innovation and development of domestic AI technology.
2024 was undoubtedly a year of transformation and breakthrough for AI large models.
At the beginning of the year, continuous breakthroughs in large model technology and the expansion of application scenarios drove market enthusiasm for large models to new heights. Numerous enterprises and investors poured into the field, hoping to achieve business transformation, upgrading, and innovative development by leveraging the power of large models.
However, as the market matured and competition intensified, people gradually realized that large models were not omnipotent and faced numerous challenges and limitations in practical applications, such as high training costs, data privacy and security issues, limited interpretability, and reliability concerns.
This led to a gradual return to rationality in market expectations for large models, with a greater emphasis placed on their actual application effects and commercial value in specific scenarios rather than blindly pursuing model size and performance.
Here, industry experts summarize the top ten trends in the development of large AI models over the past year, highlighting the changes and the advent of industrial AGI.
1. The 'Disillusionment' of Large Models in 2024
In 2024, the field of large models underwent significant transformations.
In terms of financing, according to Juzi IT data, 439 financing deals took place in the domestic AI field from January 1 to December 5, 2024, with a cumulative value exceeding 56.4 billion yuan, approximately 80% of the 2023 total. The average monthly financing amount was less than 5 billion yuan, indicating the market's cautious approach to AI investments.
Statistics from the Zero2IPO Research Center also showed that in the first half of 2024 the numbers of domestic early-stage investment, venture capital (VC), and private equity (PE) institutions fell by 23.9%, 19.2%, and 25.2% year-on-year, respectively, reflecting investors' increasing caution and rationality in the face of high investments and uncertain returns in the AI field.
In terms of technology application, in 2023, many large model vendors focused on optimizing model parameters, enhancing model performance, and competing for rankings, eager to become China's OpenAI.
However, entering 2024, industry participants became more pragmatic and began to prioritize real-world scenarios and commercial use of AI technology. As large models gradually lost their mystique in the market, their limitations in practical applications became apparent, and investors paid closer attention to actual application results and commercial value rather than technical indicators and rankings alone.
This trend prompted AI enterprises to focus more on the practicality and market adaptability of their products, driving the deeper integration of AI technology in various fields.
In terms of market competition, price wars for large models persisted throughout 2024, with the price per million tokens plummeting from hundreds of yuan to mere cents. This price war not only lowered the threshold for using large models but also posed new challenges to enterprises' profit models.
Overall, the AI industry in 2024 became more pragmatic and rational. The market's focus shifted from raw technical indicators to practical applications and commercial value, enterprises placed greater emphasis on the practicality and market adaptability of their products, and intensified competition pushed them to keep adjusting their profit models.
2. Continuous Emergence of Innovative AI Architectures
In 2024, numerous innovative architectures emerged in the AI field. These architectures rivaled traditional Transformer models in performance and demonstrated significant advantages in memory efficiency and scalability.
Since its inception, the Transformer architecture has achieved remarkable success in natural language processing, image generation, and other tasks through its self-attention (SA) mechanism.
However, as model parameters and context lengths continued to increase, the computational demand of Transformers grew steeply (the cost of self-attention scales quadratically with sequence length), gradually becoming a bottleneck in large-scale tasks.
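One concrete source of that cost is the attention score matrix, which compares every token with every other token. The following is a minimal sketch in plain Python, assuming a single attention head and identity query/key/value projections (a simplification made only for illustration; real Transformers learn these projections). It shows that doubling the sequence length quadruples the number of scores.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (a simplification).

    Returns the attended vectors and the attention weight matrix.
    """
    n, d = len(tokens), len(tokens[0])
    # Every token is scored against every token, so this matrix has n * n entries.
    scores = [
        [sum(a * b for a, b in zip(tokens[i], tokens[j])) / math.sqrt(d) for j in range(n)]
        for i in range(n)
    ]
    weights = [softmax(row) for row in scores]
    # Each output vector is a weighted average of all input vectors.
    outputs = [
        [sum(weights[i][j] * tokens[j][k] for j in range(n)) for k in range(d)]
        for i in range(n)
    ]
    return outputs, weights


def count_scores(weights):
    """Number of pairwise attention scores that had to be computed."""
    return sum(len(row) for row in weights)


short = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
_, w_short = self_attention(short)
_, w_long = self_attention(short * 2)  # same tokens repeated: twice the length
print(count_scores(w_short), count_scores(w_long))  # 9 36
```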
To address this challenge, scholars and researchers worldwide actively explored new architectural designs from various angles.
For instance, Meta Platforms introduced the 'memory layer' technology, significantly reducing the computational cost of models in storing and retrieving data by incorporating efficient query mechanisms. This technology added 12.8 billion memory parameters to a base model with only 130 million parameters, making the model's performance comparable to larger models but with significantly reduced computational power requirements.
Additionally, Mixture of Experts (MoE) models gradually gained traction. The MoE architecture improves computational efficiency by decomposing a model into multiple expert sub-models and activating only a few of them for each input, rather than running the whole network every time.
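The routing idea behind MoE can be sketched in a few lines. This is a toy illustration, not any vendor's implementation: the gate weights, the four scaling "experts", and the `top_k=2` setting are all assumptions chosen so the arithmetic is easy to follow. A gate scores every expert, only the top-scoring experts are run, and their outputs are mixed using the renormalized gate probabilities.

```python
import math


def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k experts chosen by a linear gate."""
    # One gate logit per expert: dot product of the expert's gate row with x.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    probs = softmax(logits)
    # Keep only the top_k experts; the rest are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = [0.0] * len(x)
    for i in chosen:
        expert_out = experts[i](x)
        for k in range(len(output)):
            output[k] += probs[i] / norm * expert_out[k]
    return output, chosen


# Hypothetical experts: each one just scales its input by a different factor.
experts = [lambda v, s=s: [s * t for t in v] for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical gate: one row of weights per expert.
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, chosen = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print(chosen)  # [0, 1]
```

In this sketch only two of the four experts run for the input, which is where the compute saving comes from as the number of experts grows.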
Besides these architectural innovations, Yuanshi Intelligence's RWKV architecture also garnered widespread attention. RWKV achieved breakthroughs in efficiency and language modeling capabilities by combining the efficient parallel training of Transformers with the efficient inference capabilities of RNNs. Although RNNs were previously considered inferior to Transformers, RWKV introduced reinforcement learning methods, enabling the model to reread previous text when necessary, thereby enhancing its memory capacity and overall performance.
The emergence of these new architectures not only opened new avenues for the technological development of AI but also provided effective solutions to address computational overhead issues. As these innovative architectures continue to mature and be applied, future AI systems will better balance performance and resource consumption, driving breakthroughs and applications of AI technology in a broader range of fields.
Image Source: Quantum Bit's '2024 Annual AI Top Ten Trends Report'
3. Reduced Model Training Costs
With the rapid evolution of AI technology, the cost of AI model training has always been a focal point in the industry. In 2024, through algorithm optimization, hardware upgrades, and the proliferation of cloud computing services, this cost was significantly reduced.
Algorithm optimization was a pivotal factor in lowering training costs. For instance, by adopting advanced algorithm optimization techniques, the DeepSeek-V3 model achieved performance comparable to top models like Claude 3.5 Sonnet at a training cost of only 5.57 million USD.
Hardware upgrades also laid a solid foundation for cost reduction. As GPUs and other hardware continued to improve, the cost per unit of computing power gradually decreased. DeepSeek-V3 was trained on 2048 H800 GPUs and completed training in less than two months. This hardware advancement made the training of large-scale models more economical and efficient.
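The cost and duration figures above can be cross-checked with simple arithmetic. The sketch below assumes a total of about 2.788 million H800 GPU-hours and a rental price of 2 USD per GPU-hour; both numbers are recalled from DeepSeek's published technical report rather than stated in this article, so treat them as assumptions.

```python
# Assumed inputs (not from this article): total GPU-hours and rental price.
GPU_HOURS = 2_788_000
PRICE_PER_GPU_HOUR_USD = 2.0
# From the article: the size of the training cluster.
NUM_GPUS = 2048


def training_cost_usd(gpu_hours, price_per_gpu_hour):
    """Rental-equivalent cost of a training run."""
    return gpu_hours * price_per_gpu_hour


def wall_clock_days(gpu_hours, num_gpus):
    """Elapsed days if all GPUs run in parallel for the whole job."""
    return gpu_hours / num_gpus / 24


cost = training_cost_usd(GPU_HOURS, PRICE_PER_GPU_HOUR_USD)
days = wall_clock_days(GPU_HOURS, NUM_GPUS)
print(f"{cost:,.0f} USD")  # 5,576,000 USD
print(f"{days:.1f} days")  # 56.7 days
```

Under these assumptions the result is consistent with both the roughly 5.57 million USD figure and the "less than two months" training time.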
The proliferation of cloud computing services provided another crucial pathway to reduce training costs. Cloud service providers optimized resource allocation and management, enabling enterprises to flexibly rent computing resources based on actual needs, thereby reducing initial investment and operating costs. Additionally, cloud platforms offered robust data storage and processing capabilities, further supporting the training and deployment of AI models.
The combined effects of algorithm optimization, hardware upgrades, and the proliferation of cloud computing services significantly reduced AI model training costs, making AI technology more affordable and enhancing model performance. This enabled more enterprises and research institutions to undertake the development and application of AI models, fostering the widespread adoption and innovation of AI technology.
In summary, the substantial reduction in AI model training costs in 2024 presented new opportunities and challenges for the development of AI. With further technological advancements, AI model training costs will continue to decline, driving the application and breakthroughs of AI technology in more fields.
4. RAG: From a 'Universal Key' to Focusing on 'Small and Difficult' Problems
In 2024, RAG (Retrieval-Augmented Generation) technology underwent significant architectural changes and market trend shifts.
Composed of retrieval and large model generation, RAG's core strengths lie in its ability to circumvent the limitation of large models' context window length, better manage and utilize customers' proprietary local data files, and control hallucinations.
As the context windows of large models grew longer, RAG's advantage in working around context window limitations gradually diminished, while its ability to manage and use proprietary knowledge files and to control hallucinations became more important.
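The retrieve-then-generate flow described above can be sketched as follows. This is a minimal illustration under stated assumptions: retrieval uses naive word overlap instead of embeddings, the three documents are invented, and `fake_llm` is a hypothetical stand-in for a real model call.

```python
def tokenize(text):
    """Lowercase bag of words; a crude stand-in for a real retriever's features."""
    return set(text.lower().split())


def retrieve(query, documents, top_k=1):
    """Return the top_k documents sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(documents, key=lambda doc: len(query_words & tokenize(doc)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, passages):
    """Ground the model by placing retrieved passages ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}"


def answer(query, documents, llm, top_k=1):
    """Full RAG step: retrieve proprietary passages, then ask the model."""
    passages = retrieve(query, documents, top_k=top_k)
    return llm(build_prompt(query, passages))


def fake_llm(prompt):
    """Hypothetical model call; a real system would send the prompt to an LLM."""
    return "stub answer based on: " + prompt


documents = [
    "Refunds are processed within 14 days of the return request.",
    "The office is closed on public holidays.",
    "Return requests must include the original order number.",
]
print(retrieve("How many days do refunds take", documents))
```

Because the retrieved passages and the final prompt are both inspectable, each step can be logged and audited, which is one reading of the 'white-box' quality enterprises value.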
In the first half of 2024, the market's expectations for AI were 'omnipotent, large, and comprehensive,' and RAG technology was viewed as a universal key to solving complex problems.
However, as the technology was applied in depth and deployed in practice, the industry gradually regained rationality and began to prioritize solving 'small and difficult' problems. The enterprises that started incorporating large model technology into their businesses had strict requirements, essential needs, and a readiness to pay promptly, and RAG's characteristics of 'many white-box processes' and 'easy to control' made it a favorite of these enterprise customers and of developers.
Data showed that the adoption rate of the RAG architecture in enterprise-level AI design patterns increased from 31% to 51%, becoming a mainstream trend.
This change reflected the gradual emergence of RAG technology's value in practical applications, particularly in fields such as enterprise knowledge management systems, online question-answering systems, and intelligence retrieval systems. The application of RAG technology not only improved the accuracy and efficiency of information retrieval but also provided enterprises with more personalized and precise solutions.
At the technical level, the architecture of RAG was also continuously optimized and deeply applied.
For example, by enhancing retrieval efficiency, expanding context length, and improving system robustness, RAG technology could better handle complex information retrieval tasks. Additionally, the advent of multimodal RAG extended RAG's capabilities to broader areas beyond text, such as images and videos, enabling seamless interaction between text and visual data.
Looking ahead, RAG's value will become more evident in practical applications, serving as a core engine driving the deployment of AI. With the continuous development of technology and evolving market demands, RAG technology will continue to play a vital role in enterprise-level AI applications, helping enterprises better manage and utilize knowledge resources, improve business efficiency, and enhance competitiveness.
5. Agents: Leading a New Wave of Change
In the second half of 2024, AI Agents emerged as a hot topic in the technology community.
Global technology giants such as Microsoft, Apple, Google, OpenAI, and Anthropic announced related progress. In the domestic market, enterprises like Baidu, Alibaba, and Tencent also successively launched their respective intelligent agent platforms.
Data showed that agent architectures already supported 12% of implemented projects.
This indicated that AI-driven solutions were increasingly being operationalized through software, improving efficiency and flexibility. With continuous technological advancements, more and more enterprises began adopting AI Agent technology to achieve higher levels of automation and more efficient operations.
For instance, in the retail industry, AI Agents can function as personal shopping assistants, offering users a tailored shopping experience. In healthcare, Agent technology facilitates the management and analysis of medical records, enhancing the efficiency of medical services.
However, despite the high hopes for AI Agent technology, its credibility issues have garnered widespread attention.
Large Language Models (LLMs) are prone to producing inaccurate information, which may lead to errors in AI Agents' task execution. To mitigate this, researchers are exploring various methods to bolster Agents' credibility.
For example, integrating Retrieval-Augmented Generation (RAG) technology and external knowledge bases to guide text generation can enhance the accuracy and reliability of models. Moreover, transparent operation processes and flexible correction mechanisms are crucial foundations for building trustworthy Agents.
6. The Rise of Multi-Model Strategies
In 2024, a notable trend emerged in the corporate world: rather than relying on a single large model, enterprises adopted a pragmatic multi-model strategy. The crux of this strategy lies in selecting the most suitable model for each usage scenario and business need. This shift not only enhanced the flexibility and adaptability of deployments but also better met the diverse business needs of enterprises.
Data revealed that OpenAI's market share declined from 50% to 34%, signaling a weakening of its first-mover advantage.
Concurrently, Anthropic's market share doubled from 12% to 24%, making it the primary beneficiary. Anthropic's Claude series models, particularly the latest Claude 3.5 Sonnet, significantly improved their capabilities in multidisciplinary comprehensive reasoning, prompting many enterprises to switch from GPT-4 to Claude.
This market shift underscores that when selecting AI vendors, enterprises prioritize model security, price, performance, and scalability.
The proliferation of multi-model strategies enabled enterprises to more flexibly select and combine different models when confronted with complex business scenarios.
For example, in the financial services sector, enterprises might require a model capable of handling complex data and adhering to stringent regulations, whereas in the media and entertainment sector, a model adept at generating high-quality content is essential. Through multi-model strategies, enterprises can choose the most appropriate model for different business departments and application scenarios, thereby enhancing efficiency and effectiveness.
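A multi-model strategy of this kind often reduces to a routing rule. The sketch below is purely illustrative: the model names, prices, and capability tags are hypothetical, and the selection rule (the cheapest model that covers every required capability) is one possible policy among many.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    cost_per_million_tokens: float
    strengths: frozenset


# Hypothetical catalog; names, prices, and tags are invented for illustration.
CATALOG = [
    ModelProfile("compliance-model", 8.0, frozenset({"regulated", "analysis"})),
    ModelProfile("creative-model", 4.0, frozenset({"content", "long_form"})),
    ModelProfile("small-general-model", 0.5, frozenset({"content"})),
]


def pick_model(required, catalog=CATALOG):
    """Return the cheapest model whose strengths cover all required capabilities."""
    candidates = [m for m in catalog if required <= m.strengths]
    if not candidates:
        raise LookupError(f"no model covers {sorted(required)}")
    return min(candidates, key=lambda m: m.cost_per_million_tokens)


print(pick_model({"regulated", "analysis"}).name)  # compliance-model
print(pick_model({"content", "long_form"}).name)   # creative-model
print(pick_model({"content"}).name)                # small-general-model
```

Keeping the catalog as data makes it straightforward to add or swap vendors without touching the business logic that calls `pick_model`.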
Furthermore, multi-model strategies fostered technological innovation and collaboration within enterprises. Enterprises can develop and optimize specific models tailored to their business needs, thus gaining a competitive edge. For instance, some enterprises achieved more efficient business processes and superior user experiences through the collaborative efforts of large and small models.
Overall, the prevalence of multi-model strategies has not only reshaped the competitive landscape of the AI market but also introduced new ideas and methodologies for enterprises' digital transformation. As technology continues to advance and applications deepen, this strategy will continue to drive enterprise innovation and development, generating substantial commercial value for enterprises.
7. Growing Focus on Embodied Intelligence
In 2024, embodied intelligence, a pivotal branch of artificial intelligence, gradually emerged as a hotspot in research and application.
In daily life, the application of embodied intelligence is becoming increasingly prevalent.
For instance, smart sweeping robots can autonomously plan cleaning routes by sensing the environment, avoid obstacles, and efficiently complete cleaning tasks. Autonomous vehicles have also demonstrated driving abilities akin to human drivers in real-world road tests, capable of recognizing traffic signals, pedestrians, and other vehicles, and making real-time driving decisions.
Humanoid robots are also considered one of the ideal platforms for realizing embodied intelligence.
Specifically, they can not only mimic human appearance but also perform intricate tasks by integrating advanced sensors and algorithms.
With continuous technological advancements, embodied intelligence has demonstrated its unique value and potential in various fields. In the industrial sector, embodied intelligent robots can boost production efficiency and safety, executing complex assembly, handling, and inspection tasks. In the service industry, they can offer more efficient and personalized services, such as hotel reception and restaurant services.
Moreover, embodied intelligence is exploring novel application scenarios, such as replacing humans in hazardous areas for search and rescue operations during disaster relief.
While embodied intelligence has made significant technological strides, it still faces some challenges.
For example, there are issues with hardware stability and cost, the integration and processing of multimodal data, and adaptability in complex environments. However, with in-depth research and technological breakthroughs, these issues are expected to be gradually resolved, enabling embodied intelligence to achieve commercial applications in a broader range of fields.
In summary, embodied intelligence, as a fusion of artificial intelligence and robotics technology, is gradually transforming people's lives and ways of working. It not only provides humans with more intelligent and convenient services but also presents new opportunities and challenges for the development of various industries.
8. The Rise of Vector Databases
Vector databases, an emerging database technology, have rapidly gained traction in the field of artificial intelligence in recent years, gradually becoming a vital supplement or even replacement for traditional databases.
Unlike traditional databases, vector databases store data as vectors that represent its characteristics or categories, enabling efficient similarity search and range queries.
With the rapid development of artificial intelligence technology, particularly the widespread adoption of large models, the demand for vector databases is also on the rise. Large models typically require processing a vast amount of high-dimensional data, and vector databases can effectively support the storage and retrieval of this data.
For example, applications such as generative AI and retrieval-augmented generation (RAG) architectures rely on vector databases to store and retrieve numerous knowledge base embeddings.
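The core operation of a vector database, similarity search over stored embeddings, can be sketched in plain Python. The `TinyVectorStore` class and the three-dimensional embeddings below are hypothetical and hand-made for illustration; the search is a brute-force scan, whereas production systems typically rely on approximate nearest-neighbor indexes to stay fast at scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (key, vector) pairs and scans them all."""

    def __init__(self):
        self._items = []

    def add(self, key, vector):
        self._items.append((key, vector))

    def search(self, query, top_k=1):
        """Return the keys of the top_k vectors most similar to the query."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query, item[1]),
            reverse=True,
        )
        return [key for key, _ in ranked[:top_k]]


store = TinyVectorStore()
# Hand-made embeddings; a real system would obtain them from an embedding model.
store.add("refund policy", [0.9, 0.1, 0.0])
store.add("holiday schedule", [0.0, 0.2, 0.9])
store.add("return form", [0.7, 0.3, 0.1])

print(store.search([1.0, 0.2, 0.0], top_k=2))  # ['refund policy', 'return form']
```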
Data revealed that in 2024, the global database market surpassed 100 billion USD for the first time, reaching approximately 101 billion USD, of which the Chinese database market was 7.41 billion USD, accounting for 7.34% of the global market.
Within this growing market, vector databases, as an emerging technology, are gradually becoming a significant component.
Currently, there are 167 global database vendors with 269 products. With the continuous development of technology and increasing market demand, it is anticipated that the market share of vector databases will continue to expand.
The market trend of vector databases also underscores their immense development potential. On the one hand, the integration of vector databases and traditional databases is continuously deepening, with many traditional database vendors gradually incorporating vector retrieval capabilities. On the other hand, the cost of vector databases is gradually decreasing and is predicted to fall by a factor of three to five in the next few years. This will further propel the application and adoption of vector databases across various industries.
In summary, with their unique advantages in processing high-dimensional and unstructured data, vector databases are gradually altering the landscape of the traditional database market. As artificial intelligence technology continues to evolve, the application prospects of vector databases will broaden, and they are expected to play a pivotal role in more fields.
9. Multimodal Capabilities Become Standard
In 2024, multimodal capabilities emerged as a fundamental standard for large AI models. Almost all major model vendors released multimodal models capable of processing image, audio, and video inputs.
For instance, ByteDance launched two video generation models, PixelDance and Seaweed, in 2024, which significantly improved the quality and efficiency of video generation.
Tencent's Hunyuan large model was also upgraded to a Mixture-of-Experts (MoE) architecture in 2024, boasting trillions of parameters and excelling in handling complex and multitask scenarios.
Zhixiang Future Technology Co., Ltd. released the Zhixiang Multimodal Generation Large Model 3.0 in December 2024, achieving a comprehensive upgrade in image and video generation capabilities.
iFLYTEK's Spark Large Model 4.0 Turbo also performed impressively in multimodal applications, supporting multilingual speech recognition and highly human-like speech synthesis.
The release of these multimodal models not only spurred technological progress but also introduced new possibilities for practical applications.
For example, multimodal models can be utilized for more complex scene understanding, such as better grasping users' needs and intentions through the combination of images and audio. Additionally, the enhancement of multimodal generation capabilities makes it easier to produce high-quality image and video content.
With the proliferation of multimodal models, their application scenarios are also continually expanding. In education, multimodal models can be used to develop more interactive learning tools, improving learning outcomes through the integration of images and audio. In healthcare, multimodal models can assist doctors in better analyzing medical images and patient data. In the entertainment and creative industries, multimodal generation models can be employed to create novel artworks and film and television content.
10. Small Models and Vertical Models Show Their Advantages
In 2024, small models demonstrated substantial advantages in specific fields.
These small models are favored due to their lower computational complexity and resource consumption, especially in resource-constrained environments such as mobile devices and edge computing nodes. They can operate efficiently and are often optimized for specific tasks, making their performance comparable to large models in certain application scenarios, sometimes even surpassing them.
Moreover, small models are more interpretable and easier for users to understand and accept. Take OpenAI's GPT-4o mini as an example; despite the reduced cost, its performance has been enhanced thanks to improvements in datasets and training methods.
In specific fields, vertical models have also demonstrated the ability to outperform general models.
For example, in domains such as legal consultation, chemical research, and medical services, customized AI models can more deeply understand and process domain-specific knowledge, providing more accurate and targeted services.
These specialized models not only help address unique industry challenges but also accelerate the development of related industries. With the growing demand for AI technology across various industries, it is anticipated that more vertical models will emerge in the future, further driving the intelligent transformation of diverse industries.
The emergence of these models signifies the evolution of AI technology towards a more refined and specialized direction, offering more efficient solutions to various industries.
From the 'disillusionment' of large models in the market to the emergence of innovative AI architectures and the reduction in model training costs, 2024 witnessed rapid advancements in AI technology and the expansion of its application scenarios.
RAG technology transitioned from being a 'universal key' to focusing on solving 'small and difficult' problems, while AI Agent technology spearheaded a new wave of change.
The prevalence of multi-model strategies and the increased focus on embodied intelligence further propelled the application of AI technology in various fields.
The rise of vector databases and the popularization of multimodal models indicate the enhanced capabilities of AI technology in processing unstructured data and multimodal information.
Lastly, from the significant advantages of small models in specific fields to the capability of vertical models to outperform general models, AI technology is evolving towards a more refined and specialized direction, offering more efficient solutions to various industries.
The developments of 2024 not only underscore the potential of AI technology but also lay a solid foundation for future intelligent transformation.