06/14 2024
If 2023 was the qualifying round for big models, where the amount of financing determined whether one could advance, 2024 has already fast-forwarded to the knockout stage.
ByteDance, Alibaba Cloud, Baidu Intelligent Cloud, Tencent Cloud, and others successively joined the "price war" in mid-to-late May, offering lightweight models directly for free, with API prices for their main models generally reduced by over 90%. Big models, once known for "burning money," have rapidly entered the era of "cabbage prices."
At the time, some asked: with big companies choosing to "burn money like crazy," what should big model startups do?
More than half a month later, several unicorn-level big model startups have made their positions clear: Zhipu AI decisively followed suit, cutting prices twice in a month; MiniMax quietly launched a campaign offering 100 million free tokens upon registration and verification, along with free TPM (tokens per minute) quota expansion; Wang Xiaochuan, the founder of Baichuan Intelligence, publicly stated that the company would not follow the price cuts; and Kai-Fu Lee, CEO of OneZero AI, said bluntly that "the crazy price reductions in the domestic big model market are a lose-lose strategy."
As the first real battle in the big model industry, how will big companies' price competition, waged regardless of cost, affect startups?
Three possible outcomes
As of now, the wave of price reductions for big models is still spreading, and changes in the market landscape will take at least half a year or even longer. But before the dust settles on this story, it's not difficult to speculate on some possible outcomes.
The first, a more idealized outcome: Price reductions for big models benefit developers, accelerating the explosion of AI-native applications and gradually triggering qualitative changes from quantitative changes.
Currently, most SMEs and individual developers access the capabilities of big models through APIs. Before the steep price reductions, prices were generally around 0.02 yuan per thousand tokens. For developers without a profit model, computing costs weighed on them like a mountain.
Cheaper or even free big models mean that developers' cost curves will drop significantly, letting them develop and test at lower cost, create more AI-native applications, and bring big model capabilities into more scenarios. Current applications may still be homogeneous, focusing mainly on intelligent assistants and emotional companionship, but enough of them could still turn quantitative change into qualitative change and give rise to super apps.
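To make the scale of the drop concrete, here is a rough back-of-the-envelope sketch. It uses the 0.02 yuan per thousand tokens figure cited above and treats the "over 90%" reduction as exactly 90%; the daily token volume is a purely hypothetical assumption for illustration, not a figure from any vendor.

```python
# Back-of-the-envelope API cost comparison.
# PRICE_BEFORE and REDUCTION come from figures cited in the article;
# TOKENS_PER_DAY is a hypothetical workload chosen only for illustration.

PRICE_BEFORE = 0.02          # yuan per 1,000 tokens before the price cuts
REDUCTION = 0.90             # "over 90%" cut, taken here as exactly 90%
TOKENS_PER_DAY = 50_000_000  # hypothetical: a small app's daily token usage


def monthly_cost(tokens_per_day: int, price_per_1k: float, days: int = 30) -> float:
    """Return the API bill in yuan for `days` days of usage."""
    return tokens_per_day / 1000 * price_per_1k * days


price_after = PRICE_BEFORE * (1 - REDUCTION)
before = monthly_cost(TOKENS_PER_DAY, PRICE_BEFORE)
after = monthly_cost(TOKENS_PER_DAY, price_after)

print(f"Monthly bill before cuts: {before:,.0f} yuan")
print(f"Monthly bill after cuts:  {after:,.0f} yuan")
print(f"Monthly savings:          {before - after:,.0f} yuan")
```

Under these assumptions the bill shrinks by an order of magnitude, which is the difference between a cost that forces a developer to ration experiments and one that can be absorbed while a profit model is still being worked out.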
One piece of evidence: after Baidu made ERNIE Speed and ERNIE Lite free to use, the daily call volume of the two models increased tenfold. This is somewhat reminiscent of the early days of the mobile internet, when carriers cut data fees and users began trying all kinds of novel applications, ultimately ushering in a flourishing scene of a hundred flowers blooming. In a vibrant ecosystem, every big model vendor stands to benefit.
The second, a more realistic outcome: Some big model vendors lacking a "moat" will be eliminated, with computing power, talent, and capital becoming increasingly concentrated.
A popular view is that the more users someone has, the richer the data generated, the more "feed" fed to big models, and the more likely it is to train better AI. The purpose of price reductions by big model vendors is to attract more people to use them, feeding the real usage data of users back to the big models, and then training and iterating.
The cautious attitude of the capital market made the "Hundred Model War" short-lived, but many mid-tier entrepreneurs who have obtained financing are still grinding away at training their own models. The price card played by the top players is likely to make some wavering entrepreneurs give up developing their own big models and shift their attention to the application layer or other areas. That would avoid redundant investment in the underlying hardware and software and push the whole industry in the same direction.
Whatever the motives, rising industry concentration is not bad news. Big models are a track that puts extreme demands on computing resources and talent density. Even though the "Hundred Model War" never fully unfolded, there are still more than a dozen active big model vendors, and further winnowing is needed.
The third, a relatively pessimistic outcome: Price wars are manifestations of homogeneity. When price becomes the dominant force in the market, "land grabs" will ensue.
The classic battles of the mobile internet era, such as ride-hailing, food delivery, and community group buying, all involved price wars, and these were often caused to some extent by homogeneous competition. If the returns from competing on model performance keep diminishing and the models' capabilities become more or less similar, price is undoubtedly an effective means of conquering territory.
Some argue that the key to big models lies in their effectiveness, and that customers care more about the value a deployment brings than about price. This view is not wrong, but it ignores the timing of price wars: if companies have already been using a model in depth for some time, price differences alone will struggle to pry them away; if customers are still in the selection stage, whoever attracts more of them will gain the upper hand.
In the words of Lin Yonghua, chief engineer of the Institute for AI Research, "The price reduction wave for big models is a battle for the ecosystem. When a company has already adapted to a model, they may not be willing to adapt and switch to another one. Given the objective existence of switching costs, industry companies will hope to first attract a group of users through pricing." By this logic, the price war over big models does not differ in essence from those over ride-hailing or bike-sharing; only the target audience differs.
Objectively speaking, the decline in big model prices is inevitable, because the cost of inference has dropped. As Robin Li, the founder of Baidu, said at the Create 2024 Baidu AI Developer Conference: "Compared to a year ago, the cost of inference for Wenxin's big model has dropped to 1% of its original cost."
The problem is that the prices of domestic big models have fallen almost precipitously, without giving big model entrepreneurs enough buffer time to find effective profit models.
The most active players in this price war are basically cloud vendors. Customers gained through price reductions or even free access can offset costs through model fine-tuning, model deployment, and various supporting cloud services. However, big model entrepreneurs lack a sufficiently thick ecosystem and may even need to rent computing power from cloud vendors. After the API business model is "cut off," it may be difficult for them to achieve "self-sufficiency" in the short term.
No matter which outcome prevails, big model entrepreneurs are the weaker party.
How to survive
History tells us that in a track full of uncertainty, no battle is ever a sure win. Behind big models lies a capital feast worth trillions. Compared with the handful of cloud vendors on the field, startups offer more room for imagination, and there will always be those who hope they can stay at the table.
The price war driven by cloud vendors will undoubtedly increase the sense of crisis among big model startups, but it does not mean that there is no possibility of breaking the deadlock.
Strategy One: Break the deadlock on the technical side, either by raising the ceiling on big model performance or by finding the optimal solution for big model deployment.
According to Fu Sheng, the chairman of Cheetah Mobile: "In the short term, the performance of big models has encountered a bottleneck. No one can shake off the others, and no one can pull out an ace. Reducing inference costs and lowering prices have become the top priority for everyone now." The best way to dispel the haze of a price war is precisely to outperform competitors on model performance. With the technical route still unclear, luck plays a considerable part, but startups are often more willing to gamble.
Even if a significant performance gap is out of reach, strengthening engineering capabilities around big models is also a feasible direction. At present, APIs expose "standard models," and enterprise users who want to integrate big model capabilities deeply into their scenarios still need to fine-tune the models or deploy them locally. A vendor that can further lower the threshold and cost of deployment may be able to hedge against the impact of the price war.
This is also a response strategy that some big model entrepreneurs are trying. For example, Zhipu AI has launched a one-click fine-tuning function on its MaaS 2.0 big model open platform, allowing users to complete the training of a "private big model" simply by preparing training data without needing code.
Strategy Two: Focus on building differentiated big model capabilities, serve as ISVs (independent software vendors) for cloud vendors, and keep a low profile to ride out the elimination round of the big model wave.
Due to various factors such as privacy, security, and performance, there has always been a phenomenon of "mixed model use" in the industry, i.e., calling different big models in different scenarios. When the capabilities of various models are similar and prices are comparable, creating a "comparative advantage" in a certain area is also a way to survive.
Moreover, cloud vendors such as Alibaba Cloud and Baidu Intelligent Cloud are also courting big model entrepreneurs. On the one hand, they are tying big model vendors to their computing resources: for example, Alibaba Cloud's investments in MiniMax and Dark Side of the Moon earmark a portion of the funds for purchasing Alibaba Cloud services. On the other hand, they are actively building one-stop platforms for big model development and service operation, such as Baidu Intelligent Cloud's Qianfan big model platform, where users can call the capabilities of different big models side by side. At least for now, no cloud vendor is willing to lose the "watermelon" of cloud services for the "sesame seeds" of big models.
As has happened in many industries, a price war is often a sign that the industry is entering a free-for-all. For big model entrepreneurs who have not raised huge sums, surviving as ISVs and riding out the elimination round of the big model wave is not a bad survival philosophy.
Strategy Three: Bypass or weaken the competitive pressure in the To B market, choose To C as a breakthrough direction, and attempt to become an industry giant in the next era.
Wang Xiaochuan's reason for not participating in the "price war" is: "In the domestic business environment, the To B market is ten times smaller than the To C market." Baichuan Intelligence simultaneously launched the AI assistant "Bai Xiaoying," initiating a dual-drive model of "super model + super app," hoping to compete with cloud vendors in a differentiated manner.
Wang Xiaochuan is not the only one with similar ideas. MiniMax has successively incubated products with daily active users exceeding one million, such as Xingye and Conch AI; Dark Side of the Moon has focused on the To C route from the start, launching the Kimi intelligent assistant; Kai-Fu Lee, who has been critical of price wars, has taken on the role of "Chief Experience Officer" for "Wanzhi," an AI assistant under OneZero AI; even Zhipu AI, which has the deepest To B layout, has begun to increase its promotion of Zhipu Qingyan, an AI assistant.
Will entering the To C track avoid cutthroat competition? As mentioned earlier, current To C applications focus mainly on intelligent assistants and emotional companionship. Big model startups have yet to come up with better product forms; they too are concentrated on scenarios such as chat and productivity, with similar functions. Still, compared with the passive position they occupy in the To B price war, To C is where big model entrepreneurs have pinned their hopes.
Entrepreneurship has always been a long-odds proposition in which few survive, and every path to success is strewn with thorns. No one can know in advance which strategy is the right choice.
At least for now, the situation is far from dire. ByteDance's price reduction "big move" has not affected the new financing rounds of Zhipu AI and Dark Side of the Moon, both of which are now valued at more than $3 billion; Shengshu Technology, which is building general-purpose multi-modal big models, recently completed a Pre-A round worth hundreds of millions of yuan led by Baidu, with its video big model Vidu positioned against OpenAI's Sora...
However, not all entrepreneurs are so "lucky." Zhujian Intelligence, founded by Jian Renxian, the former deputy dean of Microsoft (Asia) Internet Engineering Institute, has suspended operations for six months under cash flow pressure; Stability AI, which gained popularity with the open-source model Stable Diffusion, is rumored to be seeking a buyer...
The sudden price war over big models has, to some extent, sounded the alarm for the industry. As information gaps keep narrowing, the playbook of the internet era, simply copying foreign business models or leapfrogging ahead by catching a time window and a dividend window, no longer has the soil to survive.