07/10 2024
From concept to implementation: technology keeps iterating rapidly, but the market's sense of novelty has long since faded.
@TechInsights Original
At this year's World Artificial Intelligence Conference (WAIC), every AI enterprise seems to have found its own Product-Market Fit (PMF).
2023 was the first year of domestic large language models: Tencent, ByteDance, Baidu, and many others entered the market and officially launched external services, while startups like Yuezhimian and Zhipu AI formed the "Five Tigers" landscape. After a year of iterative development, large model players have stepped into the spotlight, offering the AI industry more opportunities and answers.
Where last year's WAIC was mostly about showing up with a large model, this year's event highlights more practical applications, more disruptive technologies, and more inclusive models. Put simply, vendors used to showcase and compete on model performance; now they focus on showing the industry their latest real-world deployments and application directions, competing on their ability to put large models into practice.
From Baidu, Alibaba, and Tencent to SenseTime, Zhipu AI, Baichuan, and MiniMax; from parameters to technology; from multimodal to end-cloud collaboration; from pricing to application implementation: each company seems to have a clearer and more pragmatic direction for large models.
The clear signal this WAIC sends to the whole industry is that, although the technology is still immature and anxiety persists, AI applications are indeed being deployed in industry, away from the spotlight and in less dazzling form than one might imagine.
Part.1
AI Applications "Roll Onto" the Stage
If last year's WAIC was a revelry for enterprises, this year's WAIC is a revelry for users.
At the model level, which is far removed from users, almost all of China's Tier 1 and Tier 2 large model teams were present, including BAT, Zhipu AI, Mianbi AI, Baichuan AI, and Jieyue Xingchen. The exceptions were companies such as ByteDance, Yuezhimian, and ZeroOne, which primarily target the C-end market. Beyond these teams, however, no new Chinese large model companies emerged inside or outside the conference. Among the eight "treasures of the museum" officially disclosed by WAIC, model-level achievements accounted for only a quarter.
At the application level, Chinese enterprises like Alibaba Cloud, Huawei Cloud, Wuwenxinqiong, Biren, and Suiyuan continued to exhibit new achievements in large model training and inference, and the application of AI models across industries is showing signs of explosive growth.
For example, Alibaba not only showcased the intelligent coding assistant "Tongyi Lingma" but also, through "Tongyi Twelve Hours – Experience a Day with AI Assistant," demonstrated the four core capabilities behind the Tongyi Qianwen large model: dialogue, efficiency, AI agents, and vision, letting users see the practical role of AI assistants in life, study, and work. DingTalk and Alipay are also important showcases of Alibaba's AI capabilities in office and life-assistant scenarios.
Tencent's Yuanbao app covers two major scenarios: work efficiency and lifestyle entertainment. In addition to core functions such as AI search, AI summarization, and AI writing, it offers playful special applications such as creative painting, a spoken-English practice partner, and versatile AI avatars, as well as more user-created AI agents.
SenseTime showcased Vimi, a large model for controllable character video generation. Built on SenseTime's Ririxin large model, Vimi can generate character videos consistent with target actions from a single photo of any style, and it supports multiple driving methods, including existing human videos, animations, sounds, and text.
There are many similar efficiency-enhancing applications, ranging from Kingsoft Office's WPS AI to Zhipu AI's digital human live streaming platform and the collective appearance of eighteen humanoid robots, covering scenarios such as healthcare, education, and office work. Moreover, these applications are no longer conceptual demonstrations; they have penetrated the core business processes of various industries, becoming key drivers of efficiency improvement, cost reduction, and value creation.
As the industry evolves from a technology contest toward application deployment, both first-tier tech giants such as BAT (Baidu, Alibaba, Tencent) and second-tier vendors such as Zhipu AI, Baichuan AI, Jieyue Xingchen, and SenseTime have shifted their focus to practical applications and commercial results rather than merely showcasing model performance. Amid this prosperity, however, homogeneity is gradually emerging.
Part.2
Routes Diverge, Products Converge
The fact is that AI applications are being deployed faster, more deeply, and more widely than anticipated. Judged by results, however, there are discrepancies among the companies' development logic, their chosen routes, and what users expect from the products.
First, consider each company's business-model route. Baidu CEO Robin Li has repeatedly emphasized, at WAIC and elsewhere, that closed-source models are the most effective route to commercialization. Meanwhile, Alibaba Cloud advocates open source, and Tencent proclaims that "general large models are not the only direction for model applications."
The large players cannot even unify their routes, let alone the small players, which are even more diverse. Zhipu AI takes the B-end route, but its user base is the C-end; Baichuan AI focuses on MaaS products for enterprise applications as its key exhibit; MiniMax adheres to a development strategy that values both To B and To C businesses equally.
Different research and development directions also mean that each company has different concerns. For example, Tencent was relatively slow to launch its large model, but according to the latest disclosure by Liu Yuhong, the head of Tencent's Hunyuan large model, nearly 700 Tencent businesses have accessed Hunyuan, including Tencent Meeting, Tencent Docs, and the AI assistant in WeChat Reading. This has increased the frequency of new product and version releases for Tencent's large model over the past six months, essentially because, without exploring AI-native large model applications, it is hard to know how to work better with those businesses.
However, whatever path each company took over the past year, the results presented to users are quite similar. Put simply, before this conference countless users were full of anticipation, eager to see whether the domestic giants could bring something fresh and eye-opening.
But the results were somewhat disappointing. In a nutshell, even without attending the exhibition, one could basically guess what each vendor's main exhibits would be. For example, Baidu's focus is mainly on deployments of Wenxin Yiyan, Alibaba revolves around Tongyi Qianwen, and Zhipu AI centers on its Zhipu large model. From entertainment and social networking to office work and learning, the primary roles of large models are quite apparent.
Seemingly sophisticated implementation applications are essentially still concentrated on various dialogue assistants, text-to-image, and text-to-video products. As the industry has developed to this point, the novelty of these products has long since faded, to the extent that some attendees expressed that each company's products were more or less the same.
Of course, there are indeed some unconventional implementation ideas, but upon closer inspection, they are merely tailored to specific industries and generally lack novelty.
Last year's WAIC gave people hope for AI, with everyone eagerly anticipating the diverse products the major players might roll out. Just one year later, they have all grown into nearly identical shapes.
Giants like BAT can use large models to serve their internal business ecosystems and use those ecosystems to support their AI businesses. By comparison, companies like Kuaishou, Bilibili, Jieyue Xingchen, MiniMax, and Zhipu AI, which have been building large models for a shorter time and are only now making their debut, are more anxious about how to gain a foothold and thrive.
Part.3
Anxiety on Full Display for Small and Medium-sized Enterprises
This wave of AI implementation is not particularly friendly to small and medium-sized enterprises.
For example, Jieyue Xingchen, whose booth was next to Alibaba's, entered the field late but is eager to squeeze into the "Five Tigers of Large Models," attempting through marketing to turn the landscape into the "Six Strong Ones."
At WAIC, it released three large models: Step-2, a trillion-parameter MoE model that is currently available for trial by application only; Step-1.5V, a multi-modal model with 100 billion parameters that not only improves image understanding but also supports video understanding; and Step-1X, an image generation model with a DiT architecture in three parameter sizes: 600M, 2B, and 8B.
However, a question arises: is there still a way forward with these three models? Last year's Hundred Models War caused a huge waste of social resources, especially computing power. It has become a consensus in the industry to shift focus from models to applications, as lack of implementation makes it difficult to stand out or even secure funding.
Kuaishou is also a new face at this year's WAIC. Judging from its large model family, its focus is on deployment: opening up room for commercialization through large models and injecting new vitality into its video business through video generation technology.
It is worth noting that Kuaishou's Keling is touted as comparable to Sora in text-to-video, but its current trial scope still falls short of market demand, and even people inside Kuaishou find it hard to get access. Moreover, current conclusions are based on videos from beta testing, which means Keling's capabilities may be overestimated. More crucially, although it currently has no domestic competitors, rivals are on their way.
Pure AI startups such as Zhipu AI, surrounded by giants, urgently need to find their own market positioning. Although Zhipu AI has a presence in the B-end market, the blurred boundaries of its user groups leave its positioning unclear. Enterprises must build unique value propositions into their products and services to differentiate themselves from competitors, which is particularly difficult when technology is converging.
Most emerging AI enterprises are at the startup or growth stage, and the uncertainty of capital operations increases financial pressure. On the one hand, high research and development costs and market development expenses require continuous capital injections; on the other hand, the contradiction between investors' expectations for returns and the enterprise growth cycle makes financial planning extremely complex.
For companies participating in special exhibitions, attending WAIC costs at least several hundred thousand yuan. For small enterprises, that spending has to produce a visible return, either in orders or in brand awareness. For companies like Zhipu AI and cloud service providers that value enterprise services, spending several hundred thousand yuan to "buy" ultimate customer resources is not cost-effective. However, after the exhibition, ensuring continued follow-up and closing deals with B-end users is a significant test of the company's after-sales capabilities.
Building an open ecosystem and establishing partnerships with industry partners are crucial for emerging AI enterprises. However, how to establish an effective ecosystem chain within a short time to achieve resource sharing and value co-creation is a realistic challenge facing these enterprises.
A year ago, the AI industry was still popularizing large models and spreading the concept to ordinary people. Today, AIGC and large models have at least achieved widespread application and promising commercial prospects in industry. The so-called uniformity can also be understood as the optimal solution for the industry at present. At this critical juncture where the real and the virtual intersect, the current calm may be paving the way for future explosions. After all, in the age of AI, every day is worth looking forward to.