Baidu and Alibaba Compete for the Right to Define AI

05/13/2025

Author | Hao Xin

Editor | Wu Xianzhi

'MCP' (Model Context Protocol) has surged in popularity.

Baidu is entering from the consumer side, with 'Xinxiang' leveraging the MCP protocol to integrate various AI models and external tools. Additionally, Baidu Maps has announced support for the MCP interface. Alibaba Cloud's Bailian platform has launched a full lifecycle MCP service, integrating the MCP protocol into products like Alipay to enable one-click invocation of AI tools. On April 29th, Alibaba's open-source Qwen3 series of models also began supporting the MCP protocol.

Upon closer inspection, the driving forces behind this trend are foreign companies such as Anthropic, OpenAI, and Google, as well as domestic giants like Baidu, Alibaba, and ByteDance.

On the surface, industry leaders and AI firms are aiming to bridge the 'last mile' of AI Agent implementation, unify industry standards, and unleash the full potential of Agent tool calling. Beneath this, however, lies a battle over who gets to define an emerging industry.

In reality, beyond the open-source MCP, companies like OpenAI and Google each maintain their own standards for Agent tool calling. Adopting Anthropic's MCP does not necessarily signify recognition of its dominance; it is rather a temporary agreement, based on open-source principles, to rapidly expand the partner ecosystem.

From a broader perspective, MCP can be seen as a crucial link in the implementation of Agents. Manus is just the beginning, and once a standard consensus is reached, large-scale Agent applications will become increasingly visible.

At that point, Agent applications will once again evolve into a competition among big companies' ecosystems.

Developing independent Agent applications carries two inherent risks: high costs and being overshadowed by market leaders. That makes integration into a big company's Agent application ecosystem a viable option. Consequently, big companies wield significant power, from defining standards to screening partners. In this scenario, the more comprehensive the ecosystem and the higher the data barriers, the greater a company's say in the industry.

Big Companies Expand Their Ecosystems

Technical staff working with MCP emphasize that its essence is to provide a standard, efficient way to connect models with external tools. They stress that 'MCP is merely a protocol and does not enhance or bring any new capabilities to large models.'

They also note that MCP is not a necessary component for building services. Even without MCP, similar outcomes can be achieved through Function Call and by tuning existing tool parameters.
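To make that point concrete, here is a minimal Python sketch that is not taken from any vendor's SDK. The tool name get_weather, the two handler functions, and the stubbed reply are hypothetical. The MCP-style request assumes the JSON-RPC 2.0 'tools/call' message shape; the function-call path assumes the common pattern of a tool name plus JSON-encoded arguments. Both routes end in the same local function, which is why the protocol by itself adds no new capability.

```python
import json

# Hypothetical tool: a stub standing in for any real capability.
def get_weather(city: str) -> str:
    return f"Weather for {city}: unknown (stub)"

# Registry of callable tools, shared by both invocation styles.
TOOLS = {"get_weather": get_weather}

def handle_function_call(name: str, arguments_json: str) -> str:
    """Function Call style: tool name plus JSON-encoded arguments."""
    return TOOLS[name](**json.loads(arguments_json))

def handle_mcp_request(request: dict) -> dict:
    """MCP style (assumed shape): a JSON-RPC 2.0 'tools/call' request."""
    if request.get("method") != "tools/call":
        raise ValueError("only tools/call is handled in this sketch")
    params = request["params"]
    text = TOOLS[params["name"]](**params["arguments"])
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": text}]},
    }

# The same tool reached through both routes.
direct = handle_function_call("get_weather", '{"city": "Beijing"}')
via_mcp = handle_mcp_request({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Beijing"}},
})
print(direct == via_mcp["result"]["content"][0]["text"])
```

What the protocol changes is the envelope: any Agent that speaks the same message format can reach the tool without custom integration work.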

From a technical standpoint, the implementation process is the same with or without MCP. A unified standard protocol, however, is crucial to big companies' influence within the industry. It can be argued that OpenAI and Google first recognized the significance of MCP, followed by Alibaba and Baidu. Through mutual recognition, they have sparked a global trend of opening up MCP services.

A representative from a company that recently launched MCP services shared that before MCP, customers seeking deep functionality in AI products were limited to SaaS tools. However, for many industry-savvy customers, general SaaS lacked certain in-depth capabilities.

Previously, customization through Open API integration into the system was the only option. Now, with the introduction of MCP services, as long as an Agent supporting the standard MCP protocol is available, it can quickly access the product platform, 'saving time, effort, and money.'

Looking ahead, to broaden the reach of its MCP services, the representative said the company is considering open-sourcing them and listing them on Alibaba's and Baidu's model service platforms. The two factors it values most are the big companies' traffic and their ecosystem support.

Huang Jizhou, Chief Architect of Baidu's Intelligent Agent Business and Head of Xinxiang APP, told us that Xinxiang supports both external MCP access and its own independent protocol. Currently, Xinxiang has integrated a total of ten agents, including AI picture book functions from Baidu's library and external health functions.

Existing cases show that how well MCP works in practice depends less on technology than on non-technical factors. From Baidu to Alibaba, the ecosystem serves as the pivotal link, letting partners adapt without disruption.

Photon Planet observed that the number of MCP Servers deployed on Alibaba Cloud's Bailian platform has reached 31, offering functions such as maps, text-to-image generation, and search, all belonging to Alibaba's ecosystem.

Big companies play a dual role, both integrating others and being integrated. On one hand, they provide mature MCP service capabilities, such as Baidu Maps and Gaode Maps opening MCP interfaces; on the other hand, they bring external third-party capabilities into their ecosystems to fill the gaps. The more comprehensive the ecosystem, the wider the range of user needs it can satisfy.

Accessing MCP equates to possessing 'atomized' capabilities that can be freely combined and embedded into business workflows. For instance, developers can access receipt and payment functions through the 'Alipay MCP Server', connecting AI application payment channels and resolving the challenge of agents being able to converse but not collect payments.

From Manus to Baidu Xinxiang

According to a study on the AI research website 'AI Digest', the length of tasks that AI Agents can complete is growing exponentially, doubling every seven months.

Following this trend, by 2026, AI Agents will be capable of completing 2-hour tasks; by 2027, 8-hour tasks, equivalent to a full workday; and by 2029, Agents will be able to handle a month's worth of work.
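The projection is compound doubling, and the arithmetic can be checked in a few lines. This is a minimal sketch, assuming the article's 2026 figure of 2-hour tasks as the baseline and a work month of roughly 167 hours; the choice of baseline and the 167-hour figure are assumptions for illustration, not numbers from the study.

```python
import math

DOUBLING_MONTHS = 7   # from the article: task length doubles every seven months
BASE_HOURS = 2.0      # assumed baseline: 2-hour tasks in 2026

def task_hours_after(months: float) -> float:
    """Task length reachable a given number of months after the baseline."""
    return BASE_HOURS * 2 ** (months / DOUBLING_MONTHS)

def months_until(target_hours: float) -> float:
    """Months after the baseline until tasks of target_hours are reachable."""
    return DOUBLING_MONTHS * math.log2(target_hours / BASE_HOURS)

workday = months_until(8)        # 8-hour task: two doublings
work_month = months_until(167)   # assumed ~167 working hours in a month

print(f"8-hour tasks: {workday:.0f} months after baseline")
print(f"month-long tasks: {work_month:.0f} months after baseline")
```

Under these assumptions, an 8-hour task sits two doublings (14 months) past the baseline, and a month of work sits a little under four years past it, broadly in line with the 2027 and 2029 dates quoted above.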

The prospects for Agent application products are boundless. The previously hyped Manus outlined a product example of multiple agents completing tasks, but the results were underwhelming. Recently, Baidu launched 'Xinxiang', a similarly positioned product, to explore general Agent offerings.

The product form has shifted from the traditional question-and-answer AI assistant to one that completes tasks and delivers results directly. Past AI assistants provided components that users had to assemble themselves, whereas current Agent products deliver the finished result. In terms of efficiency, users once had to navigate complex task flows and articulate their needs precisely; now a single sentence suffices, and all steps are completed automatically.

During task execution, a 'butler'-like role called the main agent is responsible for dissecting user needs and assigning tasks. Upon instruction, various sub-agents work concurrently.
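The division of labor described above can be sketched with Python's asyncio. This is an illustrative toy, not Xinxiang's or Manus's implementation: the sub-agent names, the hard-coded plan, and the example request are all hypothetical, and a real main agent would use a model to decompose the request.

```python
import asyncio

async def run_sub_agent(name: str, task: str) -> str:
    """Hypothetical sub-agent: a stand-in for a model or tool call."""
    await asyncio.sleep(0)  # yield control, as a real I/O-bound call would
    return f"{name}: done ({task})"

def plan(request: str) -> list[tuple[str, str]]:
    """Main agent's decomposition step; hard-coded here for illustration."""
    return [
        ("itinerary_agent", f"draft itinerary for: {request}"),
        ("booking_agent", f"find a restaurant for: {request}"),
    ]

async def main_agent(request: str) -> list[str]:
    """Dissect the request, then run all sub-agents concurrently."""
    subtasks = plan(request)
    return await asyncio.gather(
        *(run_sub_agent(name, task) for name, task in subtasks)
    )

results = asyncio.run(main_agent("a weekend trip"))
for line in results:
    print(line)
```

asyncio.gather returns results in the order the sub-tasks were submitted, so the main agent can assemble a single answer regardless of which sub-agent finishes first.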

Xinxiang is currently available on Android mobile devices and will be extended to PC in the future. The challenge with Agent products like Manus lies not in technology but in screen limitations. Mobile screens are too small for clear viewing or quick interaction, whereas PC screens, while larger, present the challenge of efficient utilization.

Huang Jizhou explained that the MCP protocol plays a crucial role in multi-agent collaboration, acting like a shared passphrase that simplifies access for everyone who knows it. However, he also noted that the industry's biggest issue so far is that many want to benefit but few contribute. 'The costs are substantial. Converting Tool Use into MCP might be manageable for 1,000 instances, but what about 10 million or 100 million?'

Currently, the ecosystem serves as the solution for cost-sharing. Huang Jizhou believes that after each company opens up MCP, the challenge lies in the ecosystem's robustness and the business model's viability. Ideally, both developers and big companies should profit, with increasing demand enhancing user experiences.

The current goal for the Xinxiang product is to evolve into a universal super agent. Horizontally, it aims to integrate as many scenarios as possible into the application; vertically, it seeks to deepen scenarios and maximize functionality.

Huang Jizhou believes that law, tourism, health, education, and research are promising scenarios, with the potential to expand into long-tail interaction scenarios. 'The longer the tail, the higher the barriers.' On Baidu's MCP ecosystem, more AI functions will be integrated to offer multi-agent solutions.

Commercialization: Advertising or Otherwise?

What is the commercialization direction for Agent applications? As of now, the most plausible answer still points to traffic and advertising.

This is largely influenced by the current Agent application mechanism. In a Xinxiang demonstration, there's an example where a user plans a trip to Harbin and asks the Agent to help create a travel itinerary, make phone calls, and use group-buying coupons to book a restaurant. Essentially, this encompasses services like maps, reviews, and travel and transportation.

To form a comprehensive service, a complete data chain is essential. Xinxiang has integrated 'Maoyan data' to enhance movie box office information accuracy. Some third parties can bridge competition among big companies, but others cannot, meaning that in the early stages, competition on ecosystem diversity and completeness is crucial.

Agent applications serve as entry points, diverting traffic to other ecosystem applications, ultimately completing the request-to-delivery closed loop. This revenue stream is akin to 'keeping profits within the family.'

Third-party Agents that once supplemented big companies' ecosystems now fall into the traffic pool. As we understand it, Baidu and other big companies run screening mechanisms for MCP services and Agents and decide for themselves which Agents to integrate. In the early stages, big companies need more developers to fill gaps in their app stores; later, it becomes a competition for traffic. As with pay-per-click advertising, higher bids bring greater exposure.

The same logic applies while users are using the application. In the travel planning example, big companies decide how restaurants and flight prices are ranked in the results. A single ad slot can pay off in more than one way: businesses pay for advertising and marketing to improve their ranking, while users can pay for ad-free or bidding services to improve their experience.

In this way, while Agent applications seem capable of replacing multiple standalone apps, they cannot disrupt the existing advertising traffic system.

Big companies divert traffic through Agent applications and charge businesses for advertising. User behavior data (like search preferences and personalized data) is utilized for targeted advertising. By integrating infrastructure such as maps, reviews, and payments, a service closed loop is formed, forcing third-party Agents to rely on their data interfaces, becoming mere traffic conduits.

Foreign companies like OpenAI and Perplexity are already demonstrating these trends, and domestic companies like Baidu, Alibaba, and ByteDance are likely to follow suit.
