From short videos to AIGC, Kuaishou and ByteDance restart the game

08/28 2024

Author | Chen Wen

Source | Insight New Research Institute

From short videos to editing tools, from e-commerce to food delivery, and now to AIGC large models, the competition between Kuaishou and ByteDance has never stopped.

In terms of general large models, Kuaishou has Kuaiyi, and ByteDance has Doubao; for AI image creation, Kuaishou has Ketu, and ByteDance has Xinghui; for video generation large models, Kuaishou has Keling, and ByteDance has Jimeng. Additionally, in various fields such as AI music, editing tools, social products, special effects production, and more, both Kuaishou and ByteDance have corresponding AIGC products that compete against each other.

Table: Insight New Research Institute

On the main track of short videos, Kuaishou and ByteDance are neck and neck. Now, as they turn to the future competition in AIGC, every move they make attracts significant attention.

01 Sharp Confrontation

Public information shows that, to keep pace with the global large model industry, ByteDance reassigned senior managers and business leaders last year to form a new AI department, Flow.

Zhu Wenjia, the former head of TikTok's product technology, serves as Flow's business leader; Zhu Jun, ByteDance's vice president of products and strategy, serves as its product leader; and Hong Dingkun, ByteDance's vice president of technology, serves as its technical leader. The three work together to advance the company's AI efforts.

In terms of foundation models, ByteDance launched its first large language model, "Doubao," and the multimodal large model BuboGPT in August last year. ByteDance's foundation model work covers both the language and image modalities, with both teams reporting to Zhu Wenjia.

At the AI application level, Flow has launched three AI products: Doubao, Kouzi, and Cici.

Among them, Doubao is a GPT-like application that can handle tasks such as question answering, text generation, and language translation. It can also personalize its answers by adapting to the user's needs and the context of the conversation.

Kouzi is a one-stop AI Bot development platform where users, regardless of their programming background, can quickly build various question-and-answer Bots based on AI models. These Bots can handle simple questions and manage complex dialogues with logical processing.

Other ByteDance departments, including Jianying, Juliang Engine, the Douyin main app, Douyin E-commerce, and Feishu, have also stepped up their AI work in support of the company's AI strategy, and results are gradually emerging.

Similarly, Kuaishou initiated a new AI strategy early last year.

During the Q3 2023 earnings call, Kuaishou CEO Cheng Yixiao said that Kuaishou's large language models, the 13-billion- and 66-billion-parameter versions of Kuaiyi, had reached industry-leading levels among comparable models. Kuaishou has also begun research and development on large language models exceeding 100 billion parameters and on multimodal large models.

According to media reports, Kuaishou AI is primarily overseen by the Kuaishou AI technology team (formerly Kuaishou's Y-tech department).

Specifically, the business layout involves establishing an AI service platform based on the Kuaiyi large model, providing AI technology services to the market. These services encompass core technologies such as computer vision, computer graphics, natural language processing, audio technology, video technology, knowledge graphs, machine learning, AR/VR/MR, and multimodal capabilities.

In February this year, following ChatGPT, OpenAI's announcement of the Sora AI video generation model once again ignited the internet.

However, starting from May, domestic AI video model technologies comparable to Sora have been successively unveiled, and on June 6, Kuaishou also launched its AI video generation model Keling and invited users to test it. As Keling's generation results closely resemble those of Sora, it has garnered significant industry attention.

Fu Sheng, Chairman of Cheetah Mobile, gave Keling high praise after experiencing it, stating, "I even think it outperforms Sora. I believe that within the scope of my usage, this product is currently the best in the world."

02 Battle for Mindshare

Whether it is investment in AIGC or eye-catching product promotion, the ultimate goal is to get these products used, especially by ordinary people. From this perspective, the essence of competition among large models is the battle for end-user mindshare.

In this regard, ByteDance's Doubao demonstrates strong competitiveness.

According to QuestMobile statistics, as of March this year, Doubao had 23.282 million monthly active users, followed by Wenxin Yiyan, Tiangong, Iflytek Spark, and Kimi Smart Assistant with 14.661 million, 9.661 million, 6.204 million, and 5.897 million monthly active users, respectively.

According to official Doubao data, its monthly active users on both mobile and desktop platforms have exceeded 26 million. Meanwhile, these users have collectively created over 8 million agents.

Based on the Doubao large model, ByteDance has also created a series of products, including the AI application development platform "Kouzi," the interactive entertainment application "Maoxiang," and the AI avatar creation application "Xinghui."

Within ByteDance, over 50 businesses, including Douyin, Fanqie Novel, Feishu, and Juliang Engine, have integrated the Doubao large model to improve efficiency and optimize product experiences.

From the inside out, ByteDance's large model services have been integrated into various platforms and devices, including OPPO's Xiaobu Assistant, Honor MagicBook's YOYO Assistant, ASUS laptops' Douding AI Assistant, and Zeekr's cockpit large model.

Notably, Doubao is not only heavily used but also applied in diverse scenarios, ranging from consumer (C-end) app users to business (B-end) industries. Doubao reportedly processes 120 billion text tokens and generates 30 million images every day.

It is evident that leveraging its existing user base, ByteDance has adopted a strategy of building platform-based products and establishing an ecosystem of related products around them. This scene is reminiscent of ByteDance's early "APP factory" era.

Kuaishou's Kuaiyi large model has capabilities similar to Doubao's, although it is currently used mostly inside Kuaishou, primarily serving its short video, live streaming, advertising, e-commerce, and other businesses.

According to official Kuaishou data, nearly 20,000 merchants have used the large model's capabilities on the Kuaishou platform over the past six months to achieve intelligent operations and substantial returns. Compared with January this year, the number of monthly active AIGC customers in June increased eightfold, monthly GMV grew 64-fold, and AIGC advertising revenue grew 12-fold.

Apart from Kuaiyi, Keling is another powerful tool for Kuaishou to compete for mindshare.

Not only Fu Sheng but also many tech bloggers and AI creators have highly praised Keling's performance.

After trying Keling, AI creator Nana said that character consistency is one of its major highlights and that it excels at realistic art styles, particularly when generating beautiful women, animals, eating scenes, and similar subjects.

Compared with two foreign video generation tools, Runway and Luma, Keling offers image-to-video generation in addition to text-to-video generation, and produces more stable results than Luma.

Due to its outstanding performance, Keling has attracted numerous users to apply for its beta test. According to official Kuaishou data, over 500,000 users had applied for Keling's beta test qualification as early as July, with 7 million videos generated.

It is evident that Keling represents a technological breakthrough and that Kuaishou's positioning in the video generation race is well considered. As Cheetah Mobile's Fu Sheng puts it, "The success of Keling further proves that Sora is not a technological breakthrough but a product-oriented showcase."

03 The Decisive Point Lies in Commercialization

Kuaishou and ByteDance have different focuses in their large model business layouts, but the ultimate goal of their competition remains commercialization: not only to make users love the products but also to make them willing to pay for them.

On this front, Kuaishou and ByteDance are taking different approaches.

Let's first examine ByteDance's positioning of Doubao. From a product perspective, Doubao is a comprehensive AI agent platform, indicating that ByteDance harbors greater ambitions for Doubao – not just empowering existing businesses but also hoping it can become a new revenue growth point.

Therefore, in application design, ByteDance focuses on addressing Doubao's "anthropomorphism," "proximity to users," and "personalization." To enable more people to experience Doubao, ByteDance has not only provided basic free services but also worked to significantly reduce Doubao's pricing.

According to official ByteDance data, the inference input price of Doubao's main model, the Pro-32k version, is 0.0008 yuan per thousand tokens. In other words, processing more than 1,500 Chinese characters costs just 0.0008 yuan (0.8 li, less than a tenth of a fen), 99.3% cheaper than comparable models in the industry. The 128k model costs 0.005 yuan per thousand tokens, 95.8% below industry prices.
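The arithmetic behind these figures can be checked with a short sketch. The two Doubao prices and the two discount percentages come from the figures quoted above. The "industry" baseline price is not stated in the article, so the code only back-calculates what those percentages would imply. The function names and the one-million-token workload are hypothetical and chosen for illustration.

```python
# Back-of-the-envelope check of the quoted Doubao input prices.
# Assumption: "X% cheaper" means doubao_price = baseline * (1 - X/100).

PRO_32K_PRICE = 0.0008   # yuan per 1,000 input tokens (from the article)
PRO_128K_PRICE = 0.005   # yuan per 1,000 input tokens (from the article)


def implied_baseline(price_per_1k: float, discount_pct: float) -> float:
    """Baseline price implied by a 'discount_pct % cheaper' claim."""
    return price_per_1k / (1 - discount_pct / 100)


def input_cost(tokens: int, price_per_1k: float) -> float:
    """Cost in yuan of sending `tokens` input tokens at the given price."""
    return tokens / 1000 * price_per_1k


baseline_32k = implied_baseline(PRO_32K_PRICE, 99.3)
baseline_128k = implied_baseline(PRO_128K_PRICE, 95.8)

# Hypothetical workload: one million input tokens on the Pro-32k model.
million_token_cost = input_cost(1_000_000, PRO_32K_PRICE)

print(f"Implied baseline (32k):  {baseline_32k:.4f} yuan per 1k tokens")
print(f"Implied baseline (128k): {baseline_128k:.4f} yuan per 1k tokens")
print(f"1M input tokens on Pro-32k: {million_token_cost:.2f} yuan")
```

Under this reading, both discount claims point to a comparison price of roughly 0.11 to 0.12 yuan per thousand tokens, which suggests the two percentages were measured against a similar benchmark.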

ByteDance gives two reasons for pricing Doubao so low: it needs to, and it is able to.

The former is straightforward: Only a sufficiently low price can reduce trial-and-error costs for enterprise users, boost their confidence, and encourage them to try large models.

The latter stems from a judgment based on industry trends and ByteDance's capabilities.

Li Kaifu, CEO of Innovation Works, once stated that the inference cost of large models falls tenfold every year, a trend that has already held over the past two years. With good optimization, it can even fall by a factor of 20 to 30.

Through measures such as model structure optimization, distributed inference, and hybrid scheduling, ByteDance has greatly reduced the inference cost of large models. The higher the call volume, the more room there is to optimize costs.
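To see how quickly such a decline compounds, here is a minimal sketch. It assumes the quoted factor applies once per year and that the 20 to 30 times figure is also an annual rate, which the statement does not make explicit. The starting cost of 1.0 is a normalized placeholder, not a real price, and `cost_after` is a hypothetical helper.

```python
def cost_after(initial_cost: float, annual_factor: float, years: int) -> float:
    """Cost remaining after `years` of dividing by `annual_factor` each year."""
    return initial_cost / annual_factor ** years


# Normalized starting cost of 1.0; factors taken from the quoted statement.
for factor in (10, 20, 30):
    remaining = cost_after(1.0, factor, years=2)
    print(f"{factor}x per year -> {remaining:.5f} of the original cost after 2 years")
```

At tenfold per year, two years leaves about one hundredth of the original cost, which is the scale of reduction that makes aggressive price cuts plausible.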

Price is the most prominent label that Doubao has left on the industry. More importantly, through a series of operations, ByteDance has set an example, creating a paradigm for large model commercial operations in the industry.

In contrast, Kuaishou's AIGC business commercialization process is slower. As mentioned above, Kuaiyi is primarily used to empower Kuaishou's internal businesses. Although Keling has vast potential, Kuaishou currently has no clear business plan for it and does not provide an API externally.

Wan Pengfei, head of Kuaishou's Vision Generation and Interaction Center, mentioned Keling's future in a speech, stating, "The threshold for video creation and the ROI of effects have significantly improved, blurring the lines between video creators and consumers. More and more consumers are becoming creators, which is invaluable for the prosperity of the video creation ecosystem."

Thus, continuing to strengthen its short video ecosystem and build on its existing strengths can also be a viable commercialization path for Kuaishou. Technology products validated internally often deliver more substantial results when they are later offered externally.

04 Conclusion

Li Kaifu, co-founder of Innovation Works, once conducted a survey and found that despite the extensive promotion of many products and rapid growth in user numbers, when all applications are combined, their daily active users number only about 10 million in China, where there are 1.2 billion internet users. In contrast, the United States, with a population of 300 million, has tens of millions of daily active users, indicating a significant gap.

Robin Li also urged the industry not to compete solely on models but on applications. Without applications, a basic model, whether open-source or closed-source, is worthless.

This suggests that while the future of AI has arrived, the productization of large models is still far off, and the era of earning money with AI has just begun.

In this trend, Kuaishou has an opportunity, ByteDance has an opportunity, and so do you and I.
