12/19 2024
From a flurry of model launches last year to pragmatic competition this year.
Written by | Zhao Weiwei
Two years after the debut of ChatGPT, the race for large models continues unabated. Following its previous pricing strategy, ByteDance's Volcano Engine has extended this approach with the launch of the Doubao visual understanding model, focusing on applications in education, tourism, and e-commerce. The cost per thousand tokens has dropped to 0.003 yuan, an 85% reduction compared to the industry average, heralding the entry of visual AI into the "cent era." Across the pond, OpenAI recently unveiled the Sora video large model, but overwhelming demand swiftly caused ChatGPT to experience a global outage. Apple's latest update also saw Siri truly leveraging ChatGPT capabilities.
As a formidable competitor, Google has unveiled its latest AI model, Gemini 2.0, with enhanced multimodal capabilities. The intelligent assistant Astra can integrate Google Search and image recognition applications to complete tasks. By the end of 2024, ChatGPT surpassed 300 million weekly active users, with a billion users as the aspirational goal. However, cost pressures have emerged, leading to the introduction of various paid models. In the Chinese market, the competitive differentiation in 2024 has been strikingly evident.
First, there is fierce competition for large model talent, exemplified by companies like ByteDance and Alibaba. Second, the pace of catching up is accelerating, whether in general models or multimodal large models; technological routes may differ, but no application scenario is overlooked. Finally, large model startups must compete with major companies on the same stage, embarking on a path of differentiation and continually streamlining their offerings. In 2025, the large model market still holds great promise. Whatever their pace, large model companies must address a critical external concern: How can users perceive their value?
1 ByteDance vs. Alibaba: The Race for Acceleration
Last year, the most notable difference in large model competition strategies between Alibaba and ByteDance lay in their investment focus. Alibaba opted for a diversified approach, investing in five representative large model startups: Dark Side of the Moon (Moonshot AI), Zhipu AI, MiniMax, ZeroOne (01.AI), and Baichuan Intelligence. In contrast, ByteDance did not invest in any and instead entered the market directly, outlining a strategy spanning from models to applications.
This year, the most apparent rivalry between ByteDance and Alibaba has been in talent acquisition. Mid-year, Zhou Chang, the technical lead of Alibaba's Tongyi Qianwen large model, resigned to start his own venture. It was only revealed at year-end that Zhou Chang had joined ByteDance, where he is responsible for work related to large AI models. Alibaba subsequently sued Zhou Chang for breach of a non-compete agreement and sought compensation.
Talent acquisition has been a prominent strategy for ByteDance in the large model competition. According to the "2024 Large Model Talent Report" released in September, competition for talent in the AI large model field is fierce. ByteDance emerged as the company with the highest number of new large model job postings, followed by Xiaohongshu, surpassing internet giants like Alibaba and Meituan.
In addition to Zhou Chang, other key hires by ByteDance's large model team include Huang Wenhao from ZeroOne, Qin Yujia from Mianbi Intelligence, and Jiang Lu, the former project leader of Google's VideoPoet, all reporting to team leader Zhu Wenjia.
Alibaba is not oblivious to the issue of talent density. In October, Wang Jian, the founder of Alibaba Cloud, also discussed talent in a media interview. He cited the comparison between Google and OpenAI as the most glaring example: OpenAI has over 600 employees, while Google has several thousand, but it is ultimately the density of talent that determines whether innovation succeeds. OpenAI boasts a high density of talent, while Google has many people working independently.
"With talent density and research intensity, innovation acceleration emerges," Wang Jian said at the time. This "acceleration" is pivotal for technological innovation. With it, even those who lag can catch up; without it, even those who lead will fall behind.
Throughout 2024, "acceleration" has been the keyword for ByteDance and Alibaba's competition in the large model arena.
Alibaba's Tongyi large model has not lagged in development this year. In September, the Tongyi Wanxiang AI video generation large model was officially launched. On the model front, Tongyi Qianwen's open-source model Qwen 2.5 outperforms Meta's 405B-parameter Llama 3.1. At the infrastructure level, the AI Infra series of products has established a stable and efficient AI infrastructure for Alibaba Cloud, increasing model computing power utilization by 20%.
But more fundamentally, ByteDance's large model combined with Volcano Engine directly targets Alibaba Cloud's approach with Tongyi Qianwen. Both aim for the future B-end market.
During the earnings call, Wu Yongming, CEO of Alibaba Cloud, disclosed that in the second half of fiscal year 2025 (October 2024 to March 2025), Alibaba Cloud expects to resume double-digit revenue growth. Judging from the first three quarters of this year, one of the drivers behind Alibaba Cloud's revenue growth has been AI-related products, which have maintained triple-digit growth for five consecutive quarters and further increased their share of public cloud revenue.
ByteDance's strategy for competing in the cloud market is clear. On December 18th, Volcano Engine's FORCE Momentum Conference announced a series of updates on the Doubao large model. Zhang Nan, after a prolonged absence, also launched Jimeng AI. As the former CEO of Douyin, she now defines Douyin as a camera of the real world, while Jimeng is the camera of imagination.
Jimeng AI (formerly Dreamina) is a one-stop AI video creation platform under Jianying, targeting OpenAI's Sora. It was officially renamed in May this year and gained momentum in the second half. Starting in November, it began frequent updates, launching the S2.0 version. In December, the video generation model PixelDance also commenced internal testing on the desktop version of Doubao.
Volcano Engine has undoubtedly become a competitor to Alibaba Cloud. Both possess similar AI computing power infrastructures. For both, the continuous reduction of computing power costs and model prices is now a thing of the past. The more pressing focus now is cultivating a more robust AI application ecosystem.
2 Tencent vs. Baidu: The Starting Line Speed
From the outset, Tencent chose an open-source route for its large models, while Baidu represents the industry's closed-source models. Baidu, an early starter, and Tencent, which took its time, form another benchmark in the large model industry.
Last year, Baidu was the first domestic giant to release a large language model, while Tencent was the last among the giants, with a six-month gap between their releases.
This year, both have made progress in technological upgrades and scenario expansion, integrating with their original businesses to create new highlights. For instance, Baidu Wenku has been transformed into a "one-stop AI content acquisition and creation platform," while the Hunyuan large model has become more efficient in scenarios like WeChat search.
In the first half of the year, the most significant change in the Hunyuan large model was the release of the "Tencent Yuanbao" app. However, judging from the year's changes, it is hard to call Yuanbao's user numbers a success: Kimi, Wenxiaoyan, and Doubao firmly occupy the top three positions, with web user access exceeding 15 million, while Tencent Yuanbao, the last to launch, also ranks last in the data.
Of course, web access data is only one aspect of various large models' activity levels.
In terms of mobile AI native applications, as of October, the industry's total monthly active user scale reached 89.76 million, a 373% year-on-year increase. Among them, Doubao, Kimi, and Wenxiaoyan had monthly active users of 48.39 million, 16.5 million, and 11.79 million, respectively. In other words, Kimi dominates on the web, Doubao on mobile, with Wenxiaoyan in the middle.
However, there's a notable discrepancy between third-party data and large model companies' official data. In November, Baidu officially announced that the daily average call volume of its Wenxin large model had surpassed 1.5 billion, with over 430 million Wenxin Yiyan users. According to QbitAI's think tank data, as of late November, Doubao's cumulative user scale in 2024 had exceeded 160 million.
Statistical definitions may differ, but financial reports do not lie.
In Q3 2024, Baidu's revenue was 33.557 billion yuan, a year-on-year decline of 3%. Under non-GAAP, Baidu's net profit attributable to shareholders was 5.886 billion yuan, a decrease of 19%. During the earnings call, Baidu mentioned that AI search commercialization is still in its infancy, and in the short term, Baidu is not rushing to monetize, which will pressure its primary revenue source, online advertising.
Similarly, during Tencent's Q3 earnings call, it also took a cautious stance on embedding the Hunyuan large model into commercial search results, as the primary task at this stage is still attracting users rather than premature monetization. However, large models have indeed enhanced WeChat search efficiency, with "increased commercial inquiries, higher click-through rates, and more than double our search revenue year-on-year."
In contrast, Tencent has been continuously releasing new large models in the second half. It first unveiled the new trillion-parameter large model "Hunyuan Turbo," followed by the first open-source large language model "Hunyuan Large," the 3D image generation model "Hunyuan 3D," and the video generation large model. Together with the open-source text-to-image model launched by Tencent in May, Tencent's Hunyuan large model has released four open-source large models.
Especially notable was the release and open sourcing of the Hunyuan video generation large model in early December, which rounded out Tencent's large model offerings. This year, OpenAI released the first video generation model, Sora. Under this trend, video generation large models have become essential for internet giants. Although Tencent started late, it finally caught up with peers by year-end.
"It may take a few more quarters before we see some truly large-scale application cases of large models," Tencent responded during the earnings call regarding the Hunyuan large model's commercialization strategy. Currently, the most direct benefit of the Hunyuan large model is improved content recommendation and advertising efficiency.
Compared to the mature to-B market in the US, it's challenging for China's AI large models to penetrate the market through platform software sales. Baidu's key direction for large model commercialization remains building a larger intelligent agent ecosystem. Li Yanhong predicted this year that intelligent agents, as the most mainstream AI application form, will reach a tipping point, and Baidu will unveil a new generation of large models early next year.
3 Large Model Startups: Accumulating Moves and Simplifying Offerings
Large model startups in 2024 each have their own concerns.
Large model startups are beset by crises, and simplification has become the norm. Kimi's founder, Yang Zhilin, faced arbitration from former company shareholders like Zhu Xiaohu. In September, Dark Side of the Moon ceased updating two overseas to-C products, Ohai and Noisee. Hong Tao, co-founder and commercialization head of Baichuan Intelligence, resigned. He and Wang Xiaochuan are both Tsinghua graduates, and Hong Tao once served as Sogou's Chief Marketing Officer. In August, Huang Wenhao, ZeroOne's Vice President of Algorithms and Pre-training Model Lead, joined ByteDance, while core member Li Xiangang resigned and returned to Beike. MiniMax's AI companion apps are growing rapidly in overseas markets but may face challenges similar to TikTok's in the future.
Currently, Zhu Xiaohu has directed more conflict towards his former colleague Zhang Yutong, claiming that as the former Managing Partner of GSR Ventures, Zhang Yutong joined Dark Side of the Moon for free and acquired a 14% stake, equivalent to 9 million shares, creating a conflict of interest and violating fiduciary duties.
Regarding Yang Zhilin's previous financing to establish Dark Side of the Moon without obtaining exemption letters from five investors, Zhu Xiaohu's latest proposal was: If Kimi is willing to sever ties with Zhang Yutong, we are willing to exempt Kimi, Zhang Yutong, and Dark Side of the Moon from arbitration.
Zhang Yutong was the second-largest individual shareholder when Dark Side of the Moon was established. In his response, Yang Zhilin affirmed Zhang Yutong's insight into the company's business and strategy and her rich investment and financing experience, calling them a necessary complement to the team. Zhang Yutong's shares vested over many years, "with the condition for vesting being continuous service and performance output for the company for many years."
The latest update indicates that Zhu Xiaohu has yet to reach an agreement with Dark Side of the Moon, and the arbitration case remains unresolved. However, prior reports by TechNode suggest that both new and old shareholders, along with Dark Side of the Moon, have come to a preliminary agreement through consultation. Lin Renjun, another Managing Partner at GSR Ventures, has also given his approval, and it is possible that some former shareholders of Cyclic Intelligence may have withdrawn their arbitration cases against Dark Side of the Moon.
For large model startups, despite their valuations soaring into the tens of billions of dollars, the industry has transitioned into a phase of rationalization over the past year, compelling them to differentiate themselves.
In the AI industry, the notion of "not reinventing the wheel" has gained increasing prominence. For instance, rumors circulated in October this year that at least two of the six large model unicorns were gradually abandoning pre-trained large models, with ZeroOne being one of the potential candidates. However, founder Kai-Fu Lee subsequently denied these claims.
Compared to ZeroOne, the other four large model unicorns have each showcased unique developmental paths over the past year: MiniMax is projected to generate $70 million in revenue this year, primarily from its overseas companion product Talkie, according to foreign media reports. Dark Side of the Moon has focused on technology, releasing the Kimi Exploration Edition in the second half of this year and introducing two evolved versions, K0-math and the visual thinking model K1, at the end of the year. Zhipu AI excels in customized B-end projects and was the sole unicorn among the five to compete in China's large model bidding announcements, having just completed a new round of financing of 3 billion yuan. Baichuan Intelligence, meanwhile, has delved deeply into the medical field.
On the surface, large model unicorns continue to innovate and refine their models. However, after a year of evolution, the competition between these unicorns and internet giants has become multifaceted. It not only tests model capabilities but also organizational strength and business scenario competitiveness, posing a challenge to the comprehensive strategic prowess of each large model unicorn. This battle is destined to be fiercely competitive and eliminate weaker players.
Who will lag behind by 2025?
Proofread by Chen Qiulin
END