2025 AI Midfield Battle: Retreats and Advances

05/22/2025

From Surge to Contraction: AI Giants Encircle Emerging Players.

Author/Nan Yi

Produced by/Xinzhai Business Review

The domestic AI large model market is now transitioning from the "Hundred Models War" to a phase of elimination and integration.

After the "Hundred Models War" and capital frenzy of 2023, the industry landscape has clarified. The once-diverse market has reshuffled, with two dominant camps emerging: established giants and newly emerging unicorns.

In terms of cutting-edge technology and commercialization, established internet giants like Baidu, ByteDance, Alibaba, and Tencent (the "New BAT") have seized the initiative. Simultaneously, emerging players like StepStar, Zhipu AI, and DeepSeek are launching comprehensive attacks.

These giants and unicorns each have distinct strategies for model innovation, computing investment, ecosystem layout, and open source. Meanwhile, the fortunes of the "Six Little Dragons of AI" have diverged, with some already scaling back.

Since ChatGPT's release just over two years ago, the market has reshuffled from some 300 competing large models down to the "Six Little Dragons of Large Models": Dark Side of the Moon (Moonshot AI), MiniMax, Zhipu, StepStar, Lingyi Wanwu, and Baichuan Intelligence. These startups once ambitiously chased OpenAI's footsteps.

However, with the giants throwing their full weight behind AI, technical thresholds and cost barriers keep rising, squeezing the living space of the "Little Dragons." Just as importantly, the new round of AI advances has produced several new heavyweight players.

Internet giants like ByteDance and Alibaba have increased investment in large models, building out middle platforms and application ecosystems. Joining them are startups like StepStar, Zhipu AI, and DeepSeek. Together, these players form a new "five powers" of foundation models, sparking another round of technological and market competition. The front line has now shifted from algorithms to commercialization and ecosystem deployment, with price wars and competition for application scenarios becoming the new focal points.

1. Building High Walls, Accumulating Grains: The Fierce New and Old BAT

In this AI game, the giant camp's offensive is particularly fierce.

ByteDance has maintained a steady and aggressive stance in this reshuffle. The "Large Model Family," including the Doubao series, has continued to upgrade in multimodal understanding and generation. The latest Doubao general model (Doubao-Pro) is on par with the GPT-4 series in public evaluations. ByteDance also continuously optimizes model structures for different application scenarios. This year, it announced visual understanding models and specialized models for text, speech, music, etc., maintaining an industry-leading position in multimodal tasks like language, vision, and sound.

More importantly, ByteDance has invested heavily in AI commercialization. The Doubao series has been deployed in over 50 business scenarios within the company, with daily usage exceeding 4 trillion tokens, a 33-fold increase in just seven months.

To seize the market, ByteDance has cut prices aggressively, ushering in what it calls the "cent era." When the Doubao general model was first released externally in May 2024, it was priced at only 0.0008 yuan per thousand tokens, 99.3% below the industry average. By year's end, the visual understanding model launched at only 0.003 yuan per thousand tokens, 85% below prevailing prices. The cuts were announced at the Volcano Engine AI Conference, where a ByteDance executive revealed: "Doubao's daily token volume was 120 billion in May and has since soared to 4 trillion."
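To put the "cent era" pricing in perspective, here is a minimal back-of-the-envelope sketch. The per-thousand-token prices are the figures quoted above; the 1-million-token workload size is an illustrative assumption:

```python
def call_cost_yuan(tokens: int, price_per_1k_tokens: float) -> float:
    """Cost in yuan of processing `tokens` at a per-1,000-token list price."""
    return tokens / 1000 * price_per_1k_tokens

# Doubao general model, May 2024 list price: 0.0008 yuan per 1K tokens.
# An illustrative 1-million-token workload costs well under one yuan:
text_cost = call_cost_yuan(1_000_000, 0.0008)    # ~0.8 yuan

# Visual understanding model, end of 2024: 0.003 yuan per 1K tokens.
vision_cost = call_cost_yuan(1_000_000, 0.003)   # ~3 yuan
```

At these list prices, even a workload of billions of tokens per day stays in the low thousands of yuan, which is what makes the "cent era" label plausible for volume customers.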

The positive cycle resulting from the price drop quickly emerged. Underlying algorithm strength, coupled with a user ecosystem of hundreds of millions (Douyin, Toutiao, Feishu, etc.), enabled Doubao to quickly form a closed loop covering thousands of industries.

ByteDance also gathers developers through its open development platform "Kouzi" (known internationally as Coze). Millions of active developers have already built 2 million intelligent agents on it, spreading the AI application ecosystem across the board. Combining technical strength, capital, and ecosystem, ByteDance has become a leader on the AI track with a strategy of "stronger models + lower prices + easier landing."

Among the giants, Alibaba has the earliest and most complete open-source practice, and it is also the most determined investor in AI.

To date, the Tongyi team has cumulatively open-sourced over 200 models, covering the Qwen large language model series and the Wan visual generation series. In late April, Alibaba released the latest "Qwen family" Qwen3 series, ranging from 0.6B to 235B parameters (including two Mixture-of-Experts models), and chose to fully open-source them. In terms of performance, the small Qwen3 model (4B parameters) matches the previous generation's large Qwen2 models, while the Qwen3 series as a whole improves significantly in multimodality and reasoning. As Jack Ma put it, "AI is not a multiple-choice question, but a must-answer question for Alibaba."

In conjunction with the new models, Alibaba is aggressively cutting cloud service prices to seize market share. In 2024 it reduced the inference prices of multiple large models to about 3% of their original level, and it will continue cutting prices in 2025. Alibaba's two-pronged approach combines open models with huge subsidies: on one hand, it opens up its base models to attract developers worldwide; on the other, it lowers costs to stimulate ecosystem vitality, reflecting its determination to keep investing.

Tencent's strategy and deployment in the large model field are also evolving.

In April, Tencent comprehensively restructured its Hunyuan large model R&D system, refreshing team deployment and increasing R&D investment around computing power, algorithms, and data. After the adjustment, Tencent established two new departments: the Large Language Model Department and the Multimodal Model Department, responsible for exploring cutting-edge technologies for large language and multimodal models, iterating basic models, and improving model capabilities. Tencent officials said this aims to optimize the R&D process and integrate resources to cope with the challenges of the large model era.

Previously, Tencent integrated major AI product lines like Yuanbao, ima, QQ Browser, and Sogou Input Method, proposing an AI strategy of "core self-development + embracing open source." Tencent's self-developed Hunyuan model excels in multimodal performance; this year's "fast-thinking" Turbo S model and "deep-thinking" T1 model reached industry-leading levels on public benchmarks. In vision and 3D generation, Tencent has open-sourced multiple models (Hunyuan 3D generation, Hunyuan video generation, a text-to-image DiT, the 100-billion-parameter Hunyuan MoE model, etc.), earning nearly 30,000 stars on GitHub. The Hunyuan model is deeply embedded in products like WeChat, QQ, Tencent Meeting, and Tencent Docs, improving the user-side intelligent experience, and its capabilities are exported through Tencent Cloud to help partners innovate and improve efficiency.

Notably, Tencent's 2024 fourth-quarter and annual financial report revealed record-high R&D investment of 70.7 billion yuan, providing strong backing for large model technological breakthroughs. In the ongoing midfield game, Tencent has built unique defensive lines and offensive points with its underlying algorithm strength, open-source influence, and ecosystem output.

Similarly an established internet giant, Baidu continued to increase investment in large model R&D and open source in the first half of 2025. With the "Wenxin Large Model" 4.5 series at the core, it successively launched and made freely available ERNIE 4.5 and the deep-thinking model X1, whose performance surpassed competing products in multimodal understanding and reasoning, with inference costs cut to 0.8 yuan and 1 yuan per million tokens, respectively, via the Turbo versions.

At the market application level, the Wenxin Yiyan platform's user base soared to 430 million after becoming fully free in April, with an average daily call volume exceeding 1.5 billion. The upcoming launch of ERNIE 5.0 is expected to further innovate in multimodal fusion and reasoning efficiency.

At the product and ecosystem level, Baidu embedded large model capabilities deep into search and intelligent assistants, launching the "AI Search" intelligent retrieval service and the general intelligent agent app "Xinxiang," covering over 100 scenarios such as knowledge Q&A, document processing, and travel planning, while attracting developers to innovate through open APIs.

2. Breakthroughs by Emerging Forces: Reshuffling the Basic Large Model Landscape

In addition to the "New and Old BAT," some large model startups are also launching comprehensive attacks.

As an emerging force, Shanghai-based unicorn StepStar has been active this year. In late 2024 the company closed a Series B round worth hundreds of millions of dollars, led by Shanghai state-owned capital with strategic investors such as Tencent participating. The company said the funding will continue to go toward core large model R&D, especially strengthening multimodal and complex reasoning capabilities, and toward further penetrating the consumer market through its product ecosystem.

In February, StepStar open-sourced its two most powerful multimodal models: "Step-Video-T2V," a 30-billion-parameter model capable of generating 204-frame, 540P high-quality videos; and "Step-Audio," which surpasses comparable open-source models in multiple public speech evaluations, notably the HSK Level 6 (Chinese Proficiency Test) benchmark.

Established less than two years with a team of over 500, StepStar entered the "Six Little Dragons of AI" ranks after iterating through 11 models and was named one of the "Four AI Startups to Watch in China" by MIT Technology Review. Company executives say nearly 80% of the team are algorithm and technical staff; founder Jiang Daxin is a former Microsoft executive and IEEE Fellow, and chief scientist Zhang Xiangyu is a co-author of the ResNet paper. This "high density of talent" underpins StepStar's rapid advance in multimodal technology.

Meanwhile, Tsinghua-rooted Zhipu AI is expanding rapidly and has started the listing process. In April, Zhipu AI officially filed for pre-IPO guidance (with CICC as its sponsor), becoming the first large model startup in China to enter the IPO process.

Since its establishment in 2019, Zhipu AI has focused on cognitive intelligence large model R&D. Working with academic institutions, the company created GLM-130B, a 130-billion-parameter bilingual (Chinese-English) pre-trained model, and launched the dialogue model ChatGLM along with its open-source version ChatGLM-6B.

In addition to general models, Zhipu has also launched multimodal and industry application components, including the AI assistant "Zhipu Qingyan," the efficient code generation model CodeGeeX, the visual language understanding model CogVLM, and the text-to-image model CogView. In terms of business models, Zhipu advocates "Model as a Service (MaaS)" and has established an AI development and open platform to provide privatized deployment and intelligent agent solutions for governments and enterprises.

According to official introductions, its MaaS platform already supports millions of developers and cooperates with multiple global automakers and terminal manufacturers to guide large models from "chatting" to "acting." Relying on its profound academic background and industrial cooperation, Zhipu AI strives to become a leading enterprise in the "base models" field.

In this camp, DeepSeek cannot be ignored. The DeepSeek team is famously low-key, but its technical approach is highly impactful: it focuses on language models, especially mathematical and logical ability, and adheres to a firm open-source strategy. The DeepSeek-R1 model released around the Lunar New Year this year achieved performance comparable to OpenAI's top reasoning models with far less computational input.

Industry analysts attribute this to DeepSeek's training innovations: its Mixture-of-Experts architecture brings total parameters to 671 billion, but only about 37 billion are activated per token, dramatically reducing compute demand; techniques like multi-token prediction and multi-head latent attention (MLA) also greatly improve efficiency.
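The economics of sparse activation can be seen in a toy sketch of MoE routing. This is an illustrative simplification, not DeepSeek's actual implementation; the experts and gating weights below are made-up assumptions. The key point is that only the top-k experts execute per token, so activated parameters are a small fraction of the total:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Sparse MoE layer: score every expert, but execute only the top_k.

    `experts` are callables; `gate_weights` holds one gating vector per
    expert, dot-producted with the input to produce routing logits.
    """
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    probs = softmax(logits)
    # Route to the top_k experts by gate probability; the rest stay idle,
    # which is why activated parameters are a small fraction of the total.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        y = experts[i](x)             # only selected experts do any compute
        weight = probs[i] / norm      # renormalize over the chosen subset
        out = [o + weight * y_j for o, y_j in zip(out, y)]
    return out, chosen
```

Scaled up, the same idea means that with 671 billion total parameters and roughly 37 billion activated, each token pays for only about 1/18 of the network's compute.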

In summary, DeepSeek adopts a research-centric approach, refraining from immediate monetization. Its team, composed of elite newcomers, prioritizes algorithm optimization over size. Overlooked initially, DeepSeek garnered significant attention upon open-sourcing. Reports indicate that its app has surpassed 30 million daily active users, swiftly ascending to the top among AI applications. Investors and peers have flocked to it. For instance, Sina Finance reported that "following the open-sourcing of DeepSeek-R1, many insiders reassessed DeepSeek's technical prowess, prompting private equity funds to reach out to its founder, Liang Wenfeng." Through extreme engineering optimization and open-source innovation, DeepSeek has quietly carved out its competitive edge.

Overall, the new BAT trio of ByteDance, Alibaba, and Tencent, alongside emerging forces such as StepStar, Zhipu AI, and DeepSeek, form the current vanguard in China's AI "large model" race. They are fully committed to technological R&D, talent cultivation, capital investment, and market expansion, pushing AI technology to the forefront globally. In this landscape, large companies excel in resource and ecological integration, while startups emphasize innovation and focus, each leveraging their unique strengths.

3. Retrenchment and Focus: The Dissolution of the AI "Six Little Dragons"

Conversely, the "Six Little Dragons of AI," once high-flying startups, have experienced marked divergence.

Take Lingyi Wanwu, founded by Kai-Fu Lee, as an example. In early 2025, rumors circulated that it would sell its pre-training team to Alibaba. While Kai-Fu Lee denied the sale rumors, he acknowledged that "only large companies can currently sustain investments in ultra-large-scale model training." The company significantly shifted its focus, actively partnering with Alibaba Cloud to establish the "Industrial Large Model Joint Laboratory," transferring its pre-training algorithm and infrastructure teams to Alibaba, and concentrating on small-parameter, cost-effective models. He candidly stated: "We're no longer pursuing ultra-large models. It's not that we doubt Scaling Law; rather, we're leaving that to large companies and cooperating with them. This is our survival strategy."

Lingyi Wanwu abandoned self-training trillion-parameter models, turning instead to industry-specific intelligent applications in e-commerce live streaming and meetings, launching AI digital humans and the "Yi" series of services. While it hasn't collapsed, it has retreated from "foundation building" to focus on application deployment.

Similarly, Baichuan Intelligence has shown signs of strategic adjustment. In July 2024, it announced the completion of a Series A funding round of 5 billion yuan, valued at approximately 20 billion yuan, with investors including internet giants like Alibaba, Tencent, and Xiaomi. Baichuan focuses on the medical sector, introducing the large model "Baichuan-53B" in late 2023 and launching the AI assistant "Bai Xiaoying," which demonstrated medical consultation applications at the WAIC booth in 2024.

However, in mid-March this year, media outlets reported that two Baichuan co-founders had confirmed their resignation, amidst controversies regarding equity buybacks and financing mergers and acquisitions. In an internal letter, Wang Xiaochuan acknowledged Baichuan Intelligence's shortcomings over the past two years, stating that the company had "spread itself too thin and lacked focus." The company subsequently laid off part of its ToB financial department, refocusing on the medical expert model and Bai Xiaoying product. Despite financial support for product updates and ecological expansion, Baichuan is internally streamlining its business to sharpen its focus.

Dark Side of the Moon has recently tightened its approach, reducing its money-burning strategy. After the Spring Festival, media revealed that, under DeepSeek's pressure, Dark Side of the Moon decided to significantly cut its marketing budget, suspend multiple Android channel placements, and terminate third-party advertising cooperation. The company internally attributed this to "external environment changes and strategic adjustments," indicating a more cautious marketing approach.

Previously, Dark Side of the Moon aggressively promoted its AI assistant Kimi and launched products like Ohai and Noisee overseas. However, according to LatePost, Dark Side of the Moon decided to cease operations of Ohai and Noisee in September this year, concentrating resources on its core product, Kimi. Two senior executives responsible for overseas products left to start their own AI programming-focused company. These developments, amid cautious capital market sentiments, suggest that this once nearly $2.5 billion-valued unicorn is now adopting a more focused strategy, reducing investment in non-core business expansion.

MiniMax's path appears both cautious and dynamic. This company, which garnered attention due to its founding team's SenseTime background, quickly expanded into the overseas market with emotional companion apps like Talkie, achieving over 3 billion daily interactions. However, it now faces challenges with sluggish user growth and declining retention rates. As sector bottlenecks emerge, MiniMax has adjusted its strategy, gradually exiting the highly homogeneous emotional chat companion market, instead directing resources towards areas with higher technical barriers like video generation and music creation.

As the first domestic team to independently develop a Mixture-of-Experts (MoE) architecture, MiniMax is no longer content with single-modality competition and has built a comprehensive model system spanning text, speech, images, and video.

In terms of product layout, MiniMax pursues both To B and To C strategies. The To B side reaches 30,000 enterprises through an open platform, covering standardized scenarios like customer service and education to reduce marginal costs. The To C side relies on emotional companion apps like Xingye and Talkie for 3 billion daily interactions but faces challenges with slowing user growth, limited paying capacity, and overseas regulatory risks. The recently launched Hailuo AI aims to break through homogeneous competition with unique features like text-to-video and music generation, though its penetration rate needs improvement.

In March this year, despite receiving $600 million in Series A+ funding and Talkie exceeding 3 billion daily interactions, MiniMax hasn't yet found a stable profit model. Despite significant capital injection and user activity, MiniMax continues to explore and adjust, seeking a sustainable path in the fiercely competitive market.

Overall, significant differences have emerged in the trajectories of AI startup "little dragons." However, this trend isn't a retreat but a necessary rational correction during industry maturity, aiming to preserve a "small but beautiful" survival space amidst giants.

In contrast, heavyweights led by ByteDance, Alibaba, and Tencent occupy a proactive position in mid-field competition due to their advantages in capital, talent, ecosystem, and data accumulation. Who will break through "intelligence limits" and "multi-modal capabilities" in the future remains to be seen.

What is certain is that this "midfield battle" has shifted from rapid advances to more grounded competition. All players are honing their technology and optimizing strategy, aiming to hold their advantages and find new growth paths. As the industry refrain goes, "the path of large models must be clear and ultimately land in products." Who advances further and steadier will decide the next stage of this AI competition.

Disclaimer: the copyright of this article belongs to the original author. This reprint is for information-sharing purposes only. If author information is marked incorrectly, please contact us promptly to correct or remove it. Thank you.