01/17 2025
In the past year, large models have become one of the tech industry's top trends, moving rapidly from concept to application. Once deployed, they were inevitably drawn into the kind of price war at which the internet industry excels. According to incomplete statistics, the companies engaged in this large model price war include ByteDance, Alibaba, Baidu, Tencent, iFLYTEK, and others.
At the end of 2024, Alibaba announced another significant price reduction for large models, with a drop exceeding 80%.
Initially, the large model price war did generate substantial traffic for enterprises within a short period. In August of last year, Baidu revealed that the average daily API calls for Baidu Wenxin Large Model surged from 200 million in May to 600 million in August. The average daily token consumption also jumped from 250 billion in May to 1 trillion in August.
Similarly, daily token usage of ByteDance's Doubao exceeded 500 billion in July, a 22-fold increase in average daily usage compared with May. However, as in any industry, a prolonged price war inevitably erodes corporate profits. Data indicates that before May 2024, the gross margin on domestic large model inference computing power was above 60%, but after successive price cuts by major companies that May, it plunged into negative territory.
Despite this daunting figure, the large model price war reignited in 2025.
Confidence in "Price Wars" Has Grown Somewhat
In the new year, the large model price war has intensified rather than waned. Besides Alibaba, ByteDance and Moonshot AI (Yuezhi Anmian) have also joined the latest wave of price cuts. Can large model enterprises, which once spent lavishly and ran up losses with no end in sight, sustain another price war in 2025?
Firstly, it is undeniable that large model enterprises have seen notable turnarounds in practical applications, financing, and profitability. Since the second half of last year, the domestic large model landscape has taken shape, with applications spanning information processing, customer service and sales, hardware terminals, AI tools, learning and education, and more, which together paint a promising future for large models.
A review of large model bid awards from January to November 2024 found 728 domestic large model projects with a total award value of 1.71 billion yuan, 3.6 times and 2.6 times the respective full-year figures for 2023. Large model enterprises have thus begun generating revenue in this area.
According to Baidu data, in the third quarter of last year, Baidu Intelligent Cloud's revenue reached 4.9 billion yuan, a year-on-year increase of 11%, with the proportion of AI-related revenue continuously rising to over 11%. Similarly, Alibaba Cloud's quarterly revenue increased to 26.549 billion yuan, a year-on-year increase of 6%, with AI-related product revenue achieving triple-digit growth.
Secondly, after a two-year decline in financing, the large model sector finally saw a recovery in 2024. Data shows that in the first nine months of 2024, total financing in the AI sector reached 37.15 billion yuan, more than double the figure for the same period in 2023. In short, small-scale revenue from deployed applications and the renewed favor of capital have given large model enterprises the financial means, and with them the confidence, to continue the price war.
But can the current profitability of large model enterprises truly support chaotic and disorderly price wars?
To date, the operating costs and resulting losses of large models remain high. Even a leading overseas player such as OpenAI was expected to incur operating costs exceeding 8.5 billion dollars in 2024, with an estimated loss of about 5 billion dollars for the year and a projected cumulative loss of 44 billion dollars from 2023 to 2028.
As for model training costs, OpenAI predicts they will reach as high as 9.5 billion dollars by 2026.
Although China has made some achievements in the large model industry, measured against the global pace of development it is still far too early to stop or scale back training and research. In September last year, the Stanford Institute for Human-Centered Artificial Intelligence released a ranking whose top ten included models from AI startup Anthropic (the Claude 3.5 series), Meta (the Llama 3.1 series), OpenAI (the GPT-4 series), and Google (the Gemini 1.5 series).
Currently, Alibaba's Tongyi Qianwen 2 Instruct is the only Chinese model in the top ten, even though the global count of large AI models exceeds 1,328. In the future, domestic investment in the large model sector will only increase. Yet in 2024, although overall financing across the AI industry doubled year-on-year, the number of financing deals rose by only about 10%.
In other words, the large model sector has entered a brutal elimination phase. As of November 2024, a total of 309 generative AI products in China had completed filing. The top enterprises are thriving, while small enterprises struggle not only to attract investment but even to survive. To stay afloat, they must either join the price war or bet on marketing.
Leading enterprises, meanwhile, are eager to capture the market early and are willing both to continue the price war and to spend lavishly on marketing.
Data shows that in early June last year, the amount of new large-scale advertising investment by Doubao soared to 124 million yuan. In the first 20 days of October, Kimi's advertising expenses amounted to 110 million yuan. In 2025, although the gradually maturing large model sector has gained a bit more confidence, the road ahead is long, and there are countless areas where money needs to be spent.
"Computing Power" is the Primary Productive Force
In 2024, large model enterprises used continuous price wars and saturation advertising to push their products into everyday use. However, as user numbers grew, service outages caused by insufficient computing power once again gave the entire large model sector pause.
According to incomplete statistics, over the past year, Kimi, ERNIE Bot, ChatGPT, and others have all experienced periods of unavailability. ChatGPT even temporarily suspended new user registrations due to excessive demand. In China, during thesis seasons, products like Kimi, which focus on text processing, often "crash".
How crucial is computing power to the development of large models? Computing power, algorithms, and data have long been regarded as the "troika" driving large model technology. Over the past two years, innovations in algorithms have kept demand for computing power growing rapidly. Comparing GPT-3 with the more recently released Llama 3.1 405B, model size grew only about 2.3-fold, yet the computing power required grew 116-fold.
Computing power has therefore gradually become the primary productive force in the large model sector, and the world's leading large model enterprises have already begun building out their computing capacity.
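The 2.3-fold versus 116-fold comparison can be roughly reproduced with a back-of-the-envelope calculation. The sketch below uses the common rule of thumb that training compute is about 6 x parameters x training tokens. The parameter and token counts are assumptions drawn from publicly reported figures, not numbers given in this article, and the rule of thumb itself is only an approximation.

```python
# Rough training-compute comparison using the rule of thumb
#   training FLOPs ~= 6 * parameters * training tokens
# ASSUMPTION: the parameter and token counts below are publicly reported
# figures used for illustration; they do not come from the article.
GPT3_PARAMS = 175e9      # GPT-3: about 175 billion parameters
GPT3_TOKENS = 300e9      # GPT-3: about 300 billion training tokens
LLAMA_PARAMS = 405e9     # Llama 3.1 405B: about 405 billion parameters
LLAMA_TOKENS = 15e12     # Llama 3.1 405B: about 15 trillion training tokens


def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * params * tokens


size_ratio = LLAMA_PARAMS / GPT3_PARAMS
token_ratio = LLAMA_TOKENS / GPT3_TOKENS
compute_ratio = (
    training_flops(LLAMA_PARAMS, LLAMA_TOKENS)
    / training_flops(GPT3_PARAMS, GPT3_TOKENS)
)

print(f"Model size ratio:       {size_ratio:.1f}x")
print(f"Training token ratio:   {token_ratio:.0f}x")
print(f"Training compute ratio: {compute_ratio:.0f}x")
```

Under these assumed inputs, the size ratio comes out at about 2.3 and the compute ratio at about 116, which suggests that most of the growth in required computing power comes from the much larger volume of training data rather than from model size alone.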
It is reported that the giant data center project between OpenAI and Microsoft is expected to cost over 115 billion dollars and be equipped with millions of GPUs. However, OpenAI seems unsatisfied and has partnered with Oracle to build a data center in Texas that will house tens of thousands of NVIDIA GPUs in the future. Meta plans to stockpile 350,000 NVIDIA H100 GPUs, with future computing power reserves reaching 600,000.
The demand for computing power in China has also further surged. On the one hand, user experience necessitates the support of computing power resources. On the other hand, products among major enterprises tend to be homogeneous, and there has been no significant technological differentiation, leading to repeated price wars. Computing power may be the key to breaking the stalemate in the future.
Some institutions have predicted that by 2030, 100% of domestic inference demand will need to be fulfilled by super-large data centers. The global large model sector has sparked a wave of enthusiasm for intelligent computing centers. As of the first half of 2024, there were over 250 intelligent computing centers that have been built or are under construction in China, with 791 bidding-related events for intelligent computing centers in the first half of 2024, a year-on-year increase of 407.1%.
However, there is a point that cannot be ignored regarding the current domestic computing power supply: chips.
Data shows that NVIDIA holds 80% of the domestic AI training chip market. Until a robust domestic computing power supply chain takes shape, this is undoubtedly a bottleneck that must be addressed. Shanghai's "Computing Power Pujiang" intelligent computing action implementation plan states that by 2025, the proportion of domestic computing power chips used in newly built intelligent computing centers will exceed 50%.
Besides chips, there are many practical issues to confront in the actual construction of the 100,000-card clusters advocated in the global large model sector.
Firstly, data centers consume a vast amount of electricity. Some data indicates that the daily power consumption of a 100,000-card cluster can reach 3 million kWh, equivalent to the average daily residential power consumption of a city. Secondly, the larger the computing power cluster, the higher the failure rate, with a 100,000-card cluster potentially experiencing a failure every 20 minutes. Thirdly, computing power is currently in short supply and expensive, but the effective utilization rate of computing power for training large models by many enterprises is often less than 50%.
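To put the cluster figures above in per-card terms, a short calculation helps. The sketch below uses only the numbers quoted in the text (3 million kWh per day, 100,000 cards, one failure every 20 minutes); the variable names are illustrative, and the per-card power is an all-in average that presumably includes cooling and other facility overhead, which is an assumption rather than something the article states.

```python
# Per-card arithmetic for a 100,000-card cluster, using the figures
# quoted in the article. Variable names are illustrative only.
DAILY_ENERGY_KWH = 3_000_000    # daily power consumption of the cluster
CARDS = 100_000                 # accelerator cards in the cluster
FAILURE_INTERVAL_MIN = 20       # one failure roughly every 20 minutes

HOURS_PER_DAY = 24
MINUTES_PER_DAY = HOURS_PER_DAY * 60

# Energy drawn per card per day, and the average power that implies.
# ASSUMPTION: this is an all-in average (cooling, networking and other
# overhead included), not the rated power of a single card.
energy_per_card_kwh = DAILY_ENERGY_KWH / CARDS
avg_power_per_card_kw = energy_per_card_kwh / HOURS_PER_DAY

# Expected number of failures the operators must handle each day.
failures_per_day = MINUTES_PER_DAY / FAILURE_INTERVAL_MIN

print(f"Energy per card per day: {energy_per_card_kwh:.0f} kWh")
print(f"Average power per card:  {avg_power_per_card_kw:.2f} kW")
print(f"Failures per day:        {failures_per_day:.0f}")
```

On these figures, each card accounts for about 30 kWh a day, or an average draw of 1.25 kW, and the cluster sees about 72 failures a day, which is why fault recovery time matters so much at this scale.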
Of course, the entire large model sector, from enterprises to the relevant authorities, is working to resolve the problems in computing power supply. Firstly, on energy consumption, many overseas companies have opted for a distributed deployment strategy, with Google and Microsoft both promoting collaborative training across multiple data centers.
As for chips, many domestic enterprises are engaged in multi-chip mixed training. For example, under the unified management of heterogeneous computing power, Baidu has achieved 95% mixed training efficiency and shortened cluster fault recovery time to minutes. Judging from the utilization rate of some domestic computing power clusters, the situation of wasted computing power is improving, with the computing power utilization rate of an artificial intelligence computing center in Xi'an reaching 98.5%.
Various signs indicate that the global large model market has "crossed the Rubicon" and there is no turning back. Fortunately, this time, the tech industry should not repeat the metaverse tragedy.
2025: Time to "Roll out Applications"?
Creating real value has become the main theme of the large model sector in 2025. Currently, large model applications have gradually penetrated into various scenarios such as finance, healthcare, education and training, search, and office work. Robin Li once bluntly stated that the industry should no longer compete in models but should directly create application value.
According to statistics from the Economic Observer, as of October 9, 2024, the Cyberspace Administration of China had approved a total of 188 generative AI filings, but over a third of the large models did not further disclose their progress after filing. Only about 10% of large models are still accelerating model training, while nearly half have directly shifted to the development of AI applications.
The reasons for this shift are not hard to guess. On the one hand, whether or not the industry's price war continues, it is no longer as effective as it once was. As the major players put pressure on one another, the market as a whole is trending towards healthier competition. On the other hand, at the current stage of technology and computing power, training a foundation model requires investing hundreds of millions of dollars at a time.
Musk once estimated that the training of GPT-5 may require 30,000 to 50,000 NVIDIA H100 chips, with chip costs alone exceeding 700 million dollars. For the many enterprises that cannot compete on technology and capital, shifting towards applications has naturally become the main roundabout route to success.
Although leading enterprises have just enough technical resources and funds, market competition has been accelerating ever since large models first took off. If they do not seize the initiative in applications, they risk being buried in the dust of history. In China alone, general-purpose and industry-specific large models have appeared in a steady stream over the past two years.
Among the models approved under the "Interim Measures for the Administration of Generative AI Services," general large models include Baidu's ERNIE Bot, SenseTime's SenseChat, Baichuan Intelligence's Baichuan Large Model, and Zhipu AI's Zhipu Qingyan. Industry-specific large models include Kunlun Wanwei's Tiangong Large Model, Zhihu's Zhihaitu AI Model, Kingsoft Office's WPS AI, TAL Education's MathGPT Large Model, and NetEase Youdao's Ziyue Education Large Model.
Some enterprises have already adopted a "sea of models" tactic. A typical example is Alibaba. At the 2024 Yunqi Conference, Alibaba not only announced another price reduction but also listed over 100 models at once, including large language models, multimodal models, mathematical models, and code models. The surge of large models may be a good thing for the sector as a whole, as it promotes a vibrant competitive landscape.
However, for a particular enterprise, the successive launch of similar products will greatly diminish the uniqueness of its offerings, especially when the large model sector is currently mired in homogeneity. Taking Baidu as an example, although Baidu's large model revenue increased last year, its growth rate dropped significantly.
Data shows that in the third quarter of 2024, Baidu Cloud's year-on-year growth rate fell from 14% in the previous quarter to 11%, and the quarter-on-quarter growth rate of generative AI cloud revenue plummeted from 95% to 17%. This decline is inseparable from intensified market competition. To maintain market share, the value of "applications" must be enhanced.
Nonetheless, should enterprises rush into applications at the expense of technological progress? It is worth noting that the ability to win orders in the large model market is still closely tied to the strength of the model itself. Over the past year, the number of large model bidding projects has risen markedly, with Alibaba Cloud, Baidu Cloud, Tencent Cloud, and ByteDance's Huoshan Cloud emerging as frequent winners.
Upon closer inspection, however, we observe distinct variations: Tencent Cloud secured a total of 28 bids amounting to 210 million yuan; Alibaba Cloud won 20 bids totaling 570 million yuan; Baidu Cloud captured 37 bids with a combined value of 500 million yuan; whereas Huoshan Cloud, despite winning 24 bids, only garnered a total of 61.86 million yuan.
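The gap becomes clearer when the totals are converted into an average value per winning bid. The sketch below does that arithmetic using only the bid counts and totals quoted above; the dictionary layout and function name are illustrative choices, not part of any published dataset.

```python
# Average value per winning bid, computed from the figures quoted in the
# article. Totals are expressed in millions of yuan.
bids = {
    "Tencent Cloud": (28, 210.0),
    "Alibaba Cloud": (20, 570.0),
    "Baidu Cloud": (37, 500.0),
    "Huoshan Cloud": (24, 61.86),
}


def average_bid_value(count: int, total_million_yuan: float) -> float:
    """Return the mean value of one winning bid, in millions of yuan."""
    return total_million_yuan / count


averages = {
    name: average_bid_value(count, total)
    for name, (count, total) in bids.items()
}

# Print the vendors from highest to lowest average bid value.
for name, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {avg:.2f} million yuan per bid")
```

On these numbers, Alibaba Cloud averages 28.5 million yuan per bid, Baidu Cloud about 13.5 million, Tencent Cloud 7.5 million, and Huoshan Cloud only about 2.6 million, roughly an eleven-fold spread between the highest and lowest.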
What accounts for such disparities among these four players? The explanation lies in the fact that while Huoshan Cloud has secured orders across diverse sub-sectors of intelligent agents, the complexity and customization challenges associated with these agents are relatively modest. Consequently, unit prices can vary significantly based on the scale of research and development. In essence, the "financial potential" of large models is invariably positively linked to technological advancement. As we progress into 2025, large models must continue to expand their applications while advancing technologically.
Consumption Frontier offers you professional, in-depth, and unbiased business insights. This article is original content, and any form of reprinting that does not retain the author's relevant information is strictly prohibited.