The Next-Generation Battleground: Why Are Big Tech Companies Embracing Agents?

05/30/2025

With DeepSeek and Manus captivating internet audiences, more than half of the tech industry's hot topics now revolve around large models and agents.

At Skyworth Coolka's spring conference on April 22, a super agent composed of six specialized agents in audio-visual, health, life, devices, creation, and education made its debut.

Three days later, at Baidu's AI Developer Conference, Robin Li unveiled multiple AI applications, including the general super agent app Xinxiang and the content operating system Cangzhou OS.

The Sequoia AI Summit, held mid-May, predictably featured 'agents' as a core topic, boldly stating that AI possesses a market potential "10 times that of cloud computing."

Events like Google I/O 2025 and Microsoft Build 2025 also featured agents, with use cases spanning programming, healthcare, finance, and more. Global giants such as Microsoft, Google, and OpenAI, Chinese enterprises such as Alibaba, Tencent, Baidu, and Skyworth Coolka, and capital institutions represented by Sequoia are all vigorously promoting agents.

This raises some pertinent questions: What exactly is an agent? Why are big tech companies competing over agents, and what transformations will agents usher in?

01 The 'Magic' of Agents: The Next Interaction Frontier

Before delving into the discussion, let's pause to understand the concept of 'agents.'

Agents are known in English as AI Agents, where the word 'agent' implies a 'representative' acting on someone's behalf. This marks a qualitative distinction between agents and conversational AI: agents are no longer confined to question-and-answer exchanges. They are intelligent applications capable of deep thinking, autonomous planning, decision-making, and end-to-end execution.

The prospect is undeniably alluring. To understand why agents have suddenly surged in popularity, we need to look from another angle: why do enterprises and consumers need agents?

The key to any technology's widespread adoption may lie less in how high its capability ceiling is than in how low its barrier to use is. If only engineers can invoke it, only experts can configure it, and only a few people can understand it, even the most potent capabilities will remain 'lab miracles.'

Consider an analogy between the large-model stack and the layers of cloud computing:

The training and inference of large models require immense computing power and underlying architecture optimization, akin to IaaS in cloud computing, serving as the 'engine' of agents but relatively distant from business and users.

The platform capabilities and API encapsulation of large models, including MCP tools, plugin systems, development interfaces, etc., correspond to PaaS, offering a unified 'toolbox' for AI development and invocation.

Agents, closest to users and business scenarios, can be viewed as SaaS, integrating capabilities, understanding intent, and executing tasks, providing 'ready-to-use' intelligence.

Take B2B scenarios as an example. Traditional enterprise systems have numerous functional modules and complex interface logic, so users often need formal training and a command of the business rules just to complete a process. Enterprises invest significant time and money merely to 'make people adapt to the system.'

When agents possess understanding, reasoning, and execution abilities, users need not confront complex interfaces or decipher system logic. With just a natural language command, the agent automatically recognizes intent, invokes system resources, completes the task chain, and outputs results in charts, text, or notifications. Shifting from people adapting to the system to AI adapting to people's needs significantly enhances productivity.

B2C scenarios offer another illustration. In the past, a user who wanted to watch a specific movie had to laboriously type the title with a remote control to search for it. Anyone who had forgotten the title first had to search keywords on a phone and sift through dozens of links to find it, by which point the desire to watch the movie was nearly gone.

With a TV equipped with Skyworth Coolka's super agent, you simply say, 'I want to watch a movie.' If you have forgotten the title, you can describe the plot and characters instead. The super agent interprets the request, breaks down the task, and assigns it to the audio-visual agent, which searches for the content across video websites and goes straight to the playback interface. In AIoT home scenarios, the agent also dims the lights and closes the curtains automatically when it receives a movie-watching request.
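The dispatch pattern described above, in which a super agent recognizes intent and hands work to specialized agents, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Skyworth Coolka's implementation: the keyword table, the agent functions, and the `smart_home` flag are all hypothetical, and a real system would presumably use a language model rather than keyword matching for intent recognition.

```python
from typing import Callable, Dict, List

# Hypothetical keyword table standing in for model-based intent recognition.
INTENT_KEYWORDS: Dict[str, List[str]] = {
    "audio_visual": ["movie", "film", "watch"],
    "devices": ["lights", "curtains"],
}


def audio_visual_agent(request: str) -> str:
    # Placeholder: a real agent would search across video services.
    return f"audio_visual: searching video sources for '{request}'"


def devices_agent(request: str) -> str:
    # Placeholder: a real agent would call smart-home device APIs.
    return "devices: dimming lights and closing curtains"


# Registry mapping intent names to the specialized agents that handle them.
AGENTS: Dict[str, Callable[[str], str]] = {
    "audio_visual": audio_visual_agent,
    "devices": devices_agent,
}


def recognize_intents(request: str) -> List[str]:
    """Return every intent whose keywords appear in the request."""
    text = request.lower()
    return [
        name
        for name, words in INTENT_KEYWORDS.items()
        if any(word in text for word in words)
    ]


def super_agent(request: str, smart_home: bool = True) -> List[str]:
    """Break a request into intents and dispatch each to its agent."""
    intents = recognize_intents(request)
    # Assumed rule: in a connected home, a viewing request also sets the scene.
    if smart_home and "audio_visual" in intents and "devices" not in intents:
        intents.append("devices")
    return [AGENTS[name](request) for name in intents]


if __name__ == "__main__":
    for line in super_agent("I want to watch a movie"):
        print(line)
```

The user states a goal once, and the registry decides which agents act on it. Adding a new specialized agent in this sketch means adding one entry to each table, without changing what the user has to say.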

Similar examples abound.

Beyond boosting productivity, agents have further transformed the human-computer collaboration paradigm: users no longer need to actively operate tools but simply issue commands, allowing agents to complete a series of complex tasks. Whoever first receives user needs will grasp system scheduling rights and control resource allocation.

For AI enterprises, agents represent the next gateway-level opportunity, and deploying agents means seizing 'control' of the next generation of interaction.

02 On the Eve of the Agent Explosion, Three 'Schools' Emerge

It's undeniable that agents are still in their nascent stage.

However, driven by technological iteration and market demand, more and more enterprises are joining in. Because they enter from different paths, understand the value of agents differently, and build on different core strengths and resources, three distinct groups have gradually emerged.

The first group comprises typical AI vendors such as Baidu, ByteDance, Google, and OpenAI, which aim to lead the construction of the technological ecosystem.

Their approach can be summarized as follows: use large models as the foundation, open up the development toolchain and agent solutions, and attract developers to build diverse agent applications on the platform. The goal is to create an App Store for the agent era, in which agents are created, invoked, and distributed much like apps.

Under this concept, agents are no longer a product but a new 'operating system,' and these vendors aim to act as infrastructure builders and ecosystem leaders across the 'model-development-distribution' chain. After all, whoever has the most powerful development platform and the most active developer ecosystem holds the 'distribution rights' and 'scheduling rights' of the AI era. It is an enticing yet challenging endeavor.

The second group consists of enterprise service providers focusing on vertical scenarios, such as Microsoft, IBM, Alibaba Cloud, etc., constructing enterprise-level agent solutions.

Most in this group hail from cloud computing and enterprise services, with deep industry know-how and enterprise architecture understanding. They're not in a rush to create 'public entrances' but choose to start with vertical scenarios offering the most realistic value, emphasizing agents' delivery capabilities and effect verification.

Thus, strategically, they tend to integrate agent capabilities into enterprises' original system processes, addressing automation and intelligence issues in business modules like finance, sales, human resources, and warehousing. Microsoft has a bold prediction: as more agents join, every employee will become an 'agent supervisor,' responsible for establishing, delegating, and managing agents to maximize their capabilities.

The third group encompasses hardware and software vendors well versed in user experience pain points, such as Huawei, Lenovo, Skyworth Coolka, and Samsung, which embed agents directly into user 'touchpoints.'

With tens of millions of users, these vendors have long been on the front line of user experience and have natural advantages in meeting user needs, refining hardware and software, and accumulating data. They typically begin by deeply integrating agents into terminal products, using agents to resolve user experience bottlenecks.

A direct example is Skyworth Coolka, which launched an AI-capable smart screen as early as 2014. In 2025, it pioneered the 'super agent' standards of 'long memory, fast thinking, and instant action': during use, it forms an 'experience library,' enhancing the model's understanding of user habits and reducing repeated interaction costs. Simultaneously, it adopts atomic components and a multi-agent coordination framework, improving response speed to within 1.5 seconds, meeting end-users' 'faster, more accurate, and more direct' experience demands.

The above classification is not strictly rigorous: Alibaba also has a consumer-facing business, and Skyworth Coolka is expanding into the B2B market.

The term 'three schools' is used because they constitute the agent ecosystem's triangular architecture—platform, service, and experience, commencing from the technological ecosystem, industry adaptation, and terminal scenarios, respectively. There's both competition and collaboration, jointly propelling agents from concept to implementation to large-scale application.

03 Frenzy and Rationality Coexisting: Agents' Possible Trend

The convergence of multiple forces has made agents the trend with the most room for imagination today. But historical experience reminds us that trends and bubbles often arrive together.

After Manus unexpectedly went viral, top-tier companies swiftly followed, 'creating' similar products in less than a month. Beneath the fervor lie hidden concerns: many 'agents' are merely thin wrappers around large model APIs, lacking core capabilities such as task orchestration and long-term memory. They look intelligent on the surface but are not.

But this doesn't negate agents.

At the dawn of each new technology cycle, the bubble often comes first. The market's pursuit of a concept outpaces the technology's maturity, so short-term value is overestimated and long-term value is severely underestimated, and progress ultimately follows a spiral of frenzy and rationality.

At a moment when the concept is clear but the path is not, we attempt to reason about where agents may head next.

1. Vertical agents will land before general agents.

The common problem with general agents is that they are 'strong but not specialized.' Vertical agents, by contrast, sit close to the business, know its processes, and have clear goal boundaries and industry knowledge graphs, so they are the first to become 'job-ready' in scenarios such as healthcare, education, hotels, and manufacturing.

A challenge arising from this is that a single agent can handle simple tasks but must rely on multiple agents for complex task chains.

In daily life, for instance, a single request may involve travel planning, food recommendations, hotel reservations, and more. The system must accurately understand the user's intent from one command, break the request down, and assign the pieces to different agents. According to the examples reviewed here, only Skyworth Coolka's super agent has so far demonstrated this kind of integrated home service, while most other products still require the user to invoke and converse with individual agents manually.

When a user presents a complex request like 'help me plan a 3-day trip for my family of 5 in Shenzhen,' the agent connects services such as weather, transportation, food, hotels, attractions, and maps in one stop to produce a detailed travel plan, selects suitable airline tickets and hotels directly, and lets the user pay for tickets by scanning a QR code.

Integrating capabilities like personalized user intent recognition, dynamic task orchestration, and multi-agent coordination might become the agent marathon's first checkpoint.
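The task orchestration and multi-agent coordination mentioned above can be sketched with the trip-planning request as input. This is only an illustrative sketch: the `TripRequest` type, the three sub-agents, the fixed `PLAN` order, and the two-people-per-room rule are hypothetical assumptions, and a production system would presumably generate the plan dynamically and call real services.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class TripRequest:
    city: str
    days: int
    travelers: int


# A hypothetical sub-agent receives the request plus earlier results.
SubAgent = Callable[[TripRequest, Dict[str, str]], str]


def weather_agent(req: TripRequest, done: Dict[str, str]) -> str:
    return f"{req.days}-day forecast for {req.city}"


def hotel_agent(req: TripRequest, done: Dict[str, str]) -> str:
    rooms = -(-req.travelers // 2)  # ceiling division, assuming two per room
    return f"{rooms} rooms in {req.city} for {req.days - 1} nights"


def attractions_agent(req: TripRequest, done: Dict[str, str]) -> str:
    # Depends on an earlier step: the itinerary is planned around the forecast.
    return f"itinerary for {req.city} based on {done['weather']}"


# Fixed plan for illustration; order matters because later steps read
# the results of earlier ones.
PLAN: List[Tuple[str, SubAgent]] = [
    ("weather", weather_agent),
    ("hotels", hotel_agent),
    ("attractions", attractions_agent),
]


def orchestrate(req: TripRequest) -> Dict[str, str]:
    """Run each sub-agent in plan order, passing along accumulated results."""
    done: Dict[str, str] = {}
    for name, agent in PLAN:
        done[name] = agent(req, done)
    return done


if __name__ == "__main__":
    for step, result in orchestrate(TripRequest("Shenzhen", 3, 5)).items():
        print(f"{step}: {result}")
```

The shared `done` dictionary is what separates coordination from simply calling agents one after another: each sub-agent can build on what the others have already produced, so the user never has to relay results between them.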

2. Hardware opportunities might surpass software.

Current agent discussions mainly revolve around software form reconstruction: from tools to assistants, from applications to agents. A noteworthy phenomenon is that agents' impact on hardware might be far greater than on software. When agents begin dominating interaction logic, hardware itself becomes the 'service entrance.'

One can even predict that natural language interaction will reshape how much say hardware has, and that every screen might become a 'service hub.'

A similar trend is already evident in smart speakers, where users only care about results, not which platform the content is from. With agents' empowerment, service delivery rights will further shift from apps to hardware with sensing and understanding capabilities:

TVs, companion devices, and similar products are no longer mere playback tools but the family's AI control center. Learning machines extend beyond homework correction and video courses: educational agents use 'long memory' to record a child's learning trajectory accurately, 'fast thinking' to analyze weaknesses in real time, and 'instant action' to generate personalized plans, realizing an AI education paradigm tailored to each individual learner.

It should be noted that the above is merely our preliminary understanding after studying the agent strategies of companies such as Microsoft, Lenovo, Skyworth Coolka, and IBM.

However, we can be certain that agents won't be a standalone product but a comprehensive reconstruction of technology, interaction, and service methods. From general large models' 'universal engine' to vertical agents' 'industry brain' and then to hardware terminals' 'intelligent entrance,' the AI industry's structural upgrade has quietly commenced.

04 Written at the End

Agents still face numerous challenges ahead.

Whether general agents can dismantle silos and forge a sustainable, open ecosystem; whether vertical agents can pinpoint application scenarios and transition from showrooms to large-scale deployments; how to delineate the boundaries of human-computer collaboration, balance data security with personal privacy, and ensure the coordination mechanism between multiple agents operates as efficiently and orderly as real-world organizations—these are the 'ability hurdles' that agents must surmount to ascend to the industry's forefront.

Once these questions are answered comprehensively, the advent of AGI (Artificial General Intelligence) will be within reach.

Drawing from the consensus at the Sequoia AI Summit: in the AI era, victory will belong to those who not only deeply cultivate vertical scenarios and establish formidable competitive advantages but also maintain agile iteration and embrace the technological wave with open arms.
