11/13 2024
Following the version update in September, Guangzhui Intelligence learned from Tang Jiayu, co-founder and CEO of Shengshu Technology, that the Vidu large model will undergo another version upgrade this week, with Vidu-1.5 coming online soon.
The focus of this version update remains on extending the generalization capabilities and subject consistency of the large model. The previous version emphasized the consistency of a single subject, while the latest version can understand and integrate multiple concepts such as characters, objects, and environments, generating multiple subject-integrated video results within 30 seconds following user instructions, taking the lead in achieving multi-subject consistency in video creation.
In addition to Vidu, since September this year, according to incomplete statistics, mainstream AI video generation platforms including ByteDance's JiMeng AI, Kuaishou's Keling AI, Runway, Zhipu Qingying, Aishi Technology's PixVerse, Pika, and others have all undergone version updates.
Currently, in the booming AI video generation sector, large model startups and internet giants are entering the fray. After an initial concentrated product launch phase, the competition has now entered a stage of product iteration and upgrading.
Judging from each company's version updates, the main directions of iteration for AI video generation models remain the duration of generated videos, image stability, continuity, and subject consistency from one frame to the next.
However, at the same time, each player is beginning to "diverge" in practical functional applications, with different focuses, and some small and medium-sized players are also starting to find their niche markets.
For example, the latest version of Runway includes Act-One, which can accurately transfer human facial expressions onto AI characters, and adds 3D AI camera controls. PixVerse has launched various Halloween special effects and Venom special effects.
Regarding this round of updates by various AI video generation platforms, Chen Kun, founder of Xingxian Culture and producer of the AI original fantasy IP "Shanhai Qijing," believes: "The biggest update should be the expression transfer like Act-One, which provides the basic possibility for character performance." Regarding the consistency and stability of character subjects, "there has been progress, but it is not intergenerational progress."
In the view of AI video creator Caudal Fin Vicky, compared to the first-generation products from the first half of the year, the latest updated AI video platforms have not only iterated on the underlying model capabilities but also updated their functions, such as end-to-end stitching, image quality and frame rate enhancements, voiceover, and other functions. "The improvement of these functions is actually more comprehensive than in the first half of the year."
If the first half of 2024 was an arms race in the AI video generation sector, the second half has entered a period of small but steady version updates.
During this stage, the rivalry between ByteDance and Kuaishou remains intense, with small and medium-sized manufacturers beginning to find their unique niches, while some enterprises focus on overseas markets, achieving the effect of "blooming domestically and fragrant beyond the wall."
Undoubtedly, this stage of competition, while seemingly mild, substantially affects the platform's positioning, future development direction, and the sustainable growth of its subsequent user base and scale.
ByteDance chases fiercely, Kuaishou takes the lead
"JiMeng is lagging behind a bit." This is an objective evaluation from users of AI video generation platforms.
As one of the first batch of AI video generation platforms last year and a product under ByteDance, JiMeng AI's video generation effect has been criticized by users, lagging behind competitors like Runway and Pika.
In June this year, Kuaishou, a direct competitor of ByteDance in the short video sector, officially launched the "Keling" large video generation model on its official website and quickly gained popularity. Meanwhile, more and more AI video generation platforms have sprung up, and the AI video generation sector has completely taken off.
Under strong competitive pressure, ByteDance, which sits in the first tier of domestic AI products, urgently needed to address its shortcoming in video generation, and the speed of its catch-up has been beyond imagination.
On September 24, the 2024 Volcano Engine AI Innovation Tour was held in Shenzhen. Chen Xinran, the former head of art at Douyin, appeared in his capacity as the market and operations head of JiMeng AI and Jianying and announced that JiMeng AI had integrated Doubao's latest video generation model.
During the same period, ByteDance released two video generation models, Seaweed and PixelDance, from the Doubao model family and conducted small-scale invitation tests for creators and enterprise customers through JiMeng AI and Volcano Engine, respectively.
On November 8, ByteDance's AI content platform JiMeng AI announced that the video generation model Seaweed, independently developed by ByteDance, was officially open to platform users.
According to ByteDance, the Seaweed video generation model from the Doubao model family, which is now open for use, is the standard version of the model. It can generate a high-quality 5-second AI video in just 60 seconds, well ahead of the 3 to 5 minutes of generation time that is typical in the domestic industry.
JiMeng AI also revealed that the Pro versions of the two video generation models, Seaweed and PixelDance, will be open for use soon. The Pro versions enable natural and coherent multi-shot actions and complex multi-subject interactions, overcoming the consistency challenge of multi-camera switching while maintaining consistency in subjects, styles, and atmospheres during camera transitions. They are suitable for a range of screens, from cinema and TV to computers and mobile phones.
As ByteDance's Douyin and Kuaishou lead the domestic short video market, their competition has shifted from short videos and e-commerce to the AI sector. Objectively speaking, Douyin surpasses Kuaishou in all aspects. In the AI sector, however, Kuaishou has mounted a strong counterattack.
Since its stunning debut in June, Kuaishou's Keling has actually undergone several minor version iterations.
However, in terms of underlying large model capabilities, on September 20 this year, Kuaishou released the Keling 1.5 version, incorporating a new generation of models, achieving significant improvements in image quality and dynamic quality. The original model also added a new feature - motion brush, enhancing the controllability of generation effects.
"Keling 1.5 is very strong. It can be said to be the most realistic among all models, surpassing Runway and basically overcoming previous character deformation issues," said Yangyujiang AIgen (stage name), an AIGC entrepreneur, to Guangzhui Intelligence.
In the actual generated video effects, comparing Keling and Runway, it can be seen that for the same prompt, both have strong effects on the stability of the actual character subject, but the video effects generated by Keling can automatically unlock facial expressions.
"Runway can actually generate facial expressions autonomously, but the effect is very bizarre," said Yangyujiang AIgen. However, the ability of Keling AI and Runway is random and not fixed.
This also shows that Keling AI and Runway are superior in actual generation effects. In terms of understanding prompts, Keling AI is indeed at the forefront, but future iterations and upgrades are still needed to solidify this ability.
(Runway, prompt: a female model wearing new Chinese-style clothing is showcasing her look, with colored smoke in the background, provided by Yangyujiang AIgen)
(Keling AI, prompt: a female model wearing new Chinese-style clothing is showcasing her look, with colored smoke in the background, provided by Yangyujiang AIgen)
However, after JiMeng launched its latest large video generation model, Caudal Fin Vicky believes that there is not much difference between it and Keling in terms of model capabilities and UI design. Meanwhile, during the internal testing of the Pro version model on the JiMeng platform, it was easy to control the movement amplitude and actions of the screen.
As leading domestic short video platforms, Kuaishou and ByteDance's ultimate goal in the AI video generation sector is to attract and retain users' attention, which requires continuously producing novel, high-quality, and creative content.
Based on this, AI short dramas have also become one of the focal points of competition between ByteDance's JiMeng and Kuaishou's Keling.
In July this year, the AI short drama "Shanhai Qijing: Breaking Waves" created by "Keling AI" garnered widespread attention and became the first domestic AIGC original fantasy micro-drama.
In September, Kuaishou Xingmang Short Drama collaborated with "Keling AI" to launch the "Xing You Lingxi - AI Short Drama Creation Contest." It is reported that the contest incentivizes more people to join AI short drama creation through traffic rewards, honorary awards, content contracts, and other measures.
ByteDance is also not to be outdone. While JiMeng AI jointly released the first AIGC-generated sci-fi short drama "Sanxingdui: Revelation of the Future" with Bona Film Group, it is also collaborating with multiple "super creators" on the Douyin platform to achieve co-creation, inviting high-quality fans and influential personalities on the platform to join the "Super Creator Alliance" program, hoping to create the largest virtual creation community in China.
However, at this stage, whether it's Douyin or Kuaishou, the content created by film and television creators on their video platforms is "difficult to break the circle," as stated by Caudal Fin Vicky. "Because the entire market has not yet formed, C-end users do not know what to use it for. There will be some commercial demand at the top, but the demand is not large, and it is not stable overall."
After all, there are still relatively few professional creators globally at this stage, and AI video generation large model technology is still in its early stages.
Therefore, as leading video platforms, the competition between ByteDance and Kuaishou is intensifying. Besides the underlying AI technology and product competition, the more important aspect lies in who can take the lead in exploring the path of technology empowering content. After all, if a platform can gather more innovative content creators, it can create a community ecosystem that attracts and delights users.
Of course, besides ByteDance and Kuaishou, other players in the AI video generation sector are also beginning to "diverge." Some small and medium-sized manufacturers have also started to explore and forge their own paths of differentiated competition.
The rise of niche markets, finding the right positioning is key
On short video platforms like Douyin and Kuaishou, some creators' content may find it difficult to break the circle, but some videos with meme effects are exceptionally popular, such as the video of He Jiong and Huang Lei suddenly fighting, generated by AI.
For players in the AI video generation sector, if ByteDance and Kuaishou compete in a comprehensive technology and content ecosystem, other small and medium-sized players are more focused on niche markets. Finding the right platform and product positioning has become the foundation for survival and development.
At the end of October, the CEO of Runway explicitly stated in an open letter that Runway is not an AI company but a media and entertainment company, "I believe the era of AI companies is over."
Based on this, while major companies are competing to increase the length, realism, and smoothness of AI video generation, Runway has obviously carved out its unique niche in the AI video sector - serving art, media, and entertainment exclusively with AI.
Judging from Runway's actual video generation effects, its effects on character stability and consistency can be said to be at the forefront. Besides basic technical capabilities, in the latest version update, Runway's two new features, although small, will greatly benefit animators, game developers, and filmmakers, saving them significant costs.
Runway can be considered one of the most popular products among film and television practitioners. Besides technical strength, what is more important is its cost-effectiveness.
"Runway is just too good. We use Keling sparingly, but Runway is unlimited. It doesn't matter if we draw hundreds of times a day," said Yangyujiang AIgen. "The randomness of AI videos is still very strong. If charged per use, ordinary creators may find it difficult to afford the cost."
In contrast, for Keling, if you purchase credits with 1000 yuan, you can get 15,000 Keling points. Using 35 Keling points each time, 1000 yuan can only generate 428 videos. For true entrepreneurs, this is basically not enough. "Based on my frequency of generating over 200 videos a day on Runway, the credits purchased with 1000 yuan for Keling are basically used up in just two days," said Yangyujiang AIgen.
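The arithmetic behind this comparison can be sketched in a few lines. This is only an illustration using the figures quoted above (1000 yuan, 15,000 Keling points, 35 points per generation, roughly 200 generations a day); the function names are hypothetical and are not part of any Keling or Runway API.

```python
# Illustrative cost arithmetic using the figures quoted in the article.
# All function names are hypothetical helpers, not a real platform API.

def videos_affordable(total_points: int, points_per_video: int) -> int:
    """Number of whole videos that a points balance can pay for."""
    return total_points // points_per_video


def days_of_use(videos: int, videos_per_day: int) -> float:
    """How many days a video budget lasts at a given daily usage."""
    return videos / videos_per_day


budget_yuan = 1000        # amount spent on credits
total_points = 15_000     # Keling points received for that amount
points_per_video = 35     # points consumed per generation
videos_per_day = 200      # the creator's stated daily usage on Runway

videos = videos_affordable(total_points, points_per_video)
days = days_of_use(videos, videos_per_day)
cost_per_video = budget_yuan / videos

print(f"videos: {videos}")
print(f"days of use: {days:.2f}")
print(f"cost per video: {cost_per_video:.2f} yuan")
```

At this usage rate the balance lasts a little over two days, at a bit more than 2 yuan per generation, which is consistent with the creator's estimate.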
As mentioned in Guangzhui Intelligence's previous article "The Exploding AI Video Sector: Large Companies Go Left, Startups Go Right," for the current membership-based pricing models adopted by various platforms, the subsequent payment rate and willingness to pay for entrepreneurs who cannot achieve a commercial closed loop will not be high. Nowadays, even for entrepreneurs who can achieve a commercial closed loop, cost-effectiveness is also a key factor affecting their product use.
Besides Runway, Pika and Pixverse have also found their niches. From their latest updated versions, it can be seen that these two companies have focused on training special effects that users can directly use. "Although the metaphor may not be entirely accurate, it is somewhat similar to the stickers previously made by Douyin," said Yangyujiang AIgen.
For example, during Halloween at the end of October, the PixVerse V3 version added many Halloween-themed special effects, including zombie mode, wizard hats, monster invasions, and other themed effects. There are also special effects similar to Pika's popular AI Nienie, which allows users to extend existing videos by an additional 5-8 seconds and precisely control the content direction of the new segments.
With the recent release of the movie "Venom: The Last Dance," PixVerse has launched a new special effect based on its latest video model, PixVerse V3, called "We Are Venom," which can generate cool Venom animations from images with just one click.
Currently, such meme effects are very popular among users on social platforms. Earlier, in version 1.5, Pika introduced the AI Nienie special effect, which users loved as soon as it was released. Riding this wave of special effects, Pika managed to overtake rivals on the curve. Similarly, Hailuo AI, which started growing around the same time as Pika, also relied on character performances and meme stickers to go viral overseas and overtake rivals on the curve.
Pika's AI Pinching Effect
In fact, despite being launched later, Hailuo AI has received high praise from industry insiders. "Hailuo AI excels in depicting human movements. The recent AI-generated video of He Jiong and Huang Lei fighting was created using Hailuo AI," said Yangyujiang AIgen.
However, what is more significant about Hailuo AI is its achievement of "blooming domestically, fragrant beyond the wall." As an AI video generation platform launched overseas by domestic AI company MiniMax, its popularity has continued to rise since its launch.
According to the "AI Product Rankings," Hailuo AI's web version experienced an 860% surge in visits in September, topping the global and domestic growth charts for that month. Overseas users have been sharing their experiences on social platforms, generally agreeing that Hailuo AI is one of the best AI video generation tools currently available.
With the explosive popularity of its product in overseas markets, MiniMax has emerged as a leader in commercialization among the "Big Six" AI large-model companies.
In contrast, platforms like Vidu and Zhipu Qingying are evolving in terms of video generation time, subject consistency, and character stability, but they have yet to establish their unique styles or competitive advantages.
While AI video generation technology is evolving and branching into specialized sub-sectors, research reports from Cinda Securities also indicate that there is still room for improvement in character consistency, required generation time, and image quality to meet commercial standards.
Currently, mainstream AI video tools are still in the stage of competition for video generation, and most are single-function products. To achieve direct output of commercializable videos, multiple video creation tools need to be used in series.
In the future, AI video generation platforms based on large models will need to continue to iterate and evolve.