March 15, 2026

In November 2022, ChatGPT made its debut. Just two months later, its monthly active user base soared past 100 million, marking it as one of the fastest-growing consumer applications in the annals of internet history.
Initially, many perceived this as merely an enhancement in search methodologies and content creation. However, with the benefit of hindsight, it's clear that this technological wave could fundamentally reshape the internet's operational framework.
Over the past three years, the AI landscape has evolved through three distinct phases: the model era, the application era, and the burgeoning operating system era.
While ChatGPT serves as the gateway to large models, the recently popularized OpenClaw offers a promising pathway to transition AI from a mere 'question-answering tool' to an 'operating system that acts on behalf of humans'.
As AI gains the ability to safely and reliably invoke tools, access files, operate software, and even proactively execute tasks, the architecture of future computer systems may undergo a corresponding transformation.
An AI operating system, akin to Windows in the PC era and iOS/Android in the mobile internet age, is gradually taking form.
I. The Imperfect Enlightener
Unlike traditional AI chatbots, OpenClaw can directly operate computers, invoke software, and execute tasks—key factors driving its explosive popularity.
It's important to acknowledge that the current iteration of OpenClaw is far from mature or user-friendly, plagued by numerous obvious flaws. Its deployment threshold is high, its operation is not seamless, and it faces practical risks such as permission security, privacy breaches, and rapid token consumption. These issues hinder its ability to swiftly become a productive tool for the average user.
However, the core value of this 'imperfect' product lies in achieving a pivotal industry enlightenment and cognitive breakthrough—enabling more people to intuitively perceive, for the first time, that AI can not only 'speak' to provide answers but also 'act' to complete tasks.
With OpenClaw's soaring popularity, domestic tech firms have rushed into a battle for the AI entry point, integrating 'Longxia' (the Chinese nickname for OpenClaw) into their products.
In addition to large model companies like Kimi incorporating OpenClaw, notable actions have come from industry behemoths such as Tencent and ByteDance.
Tencent, previously cautious in its AI rollouts, has this time moved with unusual intensity, launching five Longxia products in quick succession: the desktop AI agent WorkBuddy, OpenClaw integrated into Enterprise WeChat, OpenClaw integrated into QQ, OpenClaw deployed on Tencent Cloud's lightweight cloud servers, and QClaw from Tencent PC Manager.
More critically, some of these products can already link to QQ and WeChat. For instance, after installing QClaw, users can chat with Longxia directly in WeChat and have it perform tasks. In the future, when a manager assigns a task while you are off the clock, you could simply send a message in WeChat and the computer would complete the work itself: editing spreadsheets, sending emails, and operating the browser, all without interrupting your rest.
Tencent is also advancing official agents within WeChat.
According to The Information, Tencent is developing a new AI agent for WeChat that will connect to the millions of mini-programs running inside WeChat, offering services ranging from hailing taxis to ordering groceries, in a bid to outpace competitors such as Alibaba and ByteDance. The project is listed as a high-priority confidential plan, scheduled to enter gray-release (staged rollout) testing in mid-2025 and launch officially in the third quarter.
ByteDance, Baidu, and others are making similar strides.
Volcano Engine officially launched ArkClaw, described as an out-of-the-box, cloud-based SaaS version of OpenClaw. Without any complex configuration, users can access a 24/7 online AI assistant simply by opening a webpage, making it easy to 'raise a shrimp' (a playful riff on the Longxia/lobster nickname, meaning to run one's own AI assistant).
Baidu also launched the mobile app 'Red Finger Operator,' extending OpenClaw's capabilities to mobile devices, supporting users in automating cross-app tasks through natural language instructions, enabling cross-app interactions such as hailing taxis and ordering takeout.
Why are these companies acting so swiftly?
The core reason is that AI is undergoing a qualitative transformation from a productivity tool to a system-level entry point. Unlike early chat-based AI, the new generation of AI agents can invoke software, operate devices, and automatically complete complex tasks.
If the entry point in the mobile internet era was the App, then in the AI era, the entry point is likely to become the AI agent. The battle for the AI-era operating system has already commenced globally.
On one hand, AI companies are bolstering AI's system capabilities.
OpenAI is continuously expanding ChatGPT's tool invocation, task execution, and developer interfaces, enabling AI to directly connect to various software services.
Recently, OpenAI introduced GPT-5.4, which incorporates native computer usage capabilities, allowing AI agents to interact with operating systems, websites, and applications through mouse, keyboard, and visual inputs. Developers can use this model to automate multi-step workflows across various software environments.
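The loop such a computer-use agent runs, observe the screen, decide on an action, execute it via mouse or keyboard, then observe again, can be sketched roughly as follows. This is a toy illustration, not OpenAI's actual API: the environment, the `Action` schema, and the scripted policy standing in for the model are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    target: str = ""   # UI element to act on
    text: str = ""     # text to type, if any

class DesktopEnv:
    """Toy stand-in for a real OS/browser environment."""
    def __init__(self):
        self.focused = None
        self.fields = {}

    def observe(self):
        # A real agent would receive a screenshot; we return widget state.
        return {"focused": self.focused, "fields": dict(self.fields)}

    def step(self, action):
        if action.kind == "click":
            self.focused = action.target
        elif action.kind == "type" and self.focused:
            self.fields[self.focused] = action.text

def scripted_policy(obs, goal):
    """Stub for the model: pick the next action from the observation."""
    if obs["focused"] != "search_box":
        return Action("click", target="search_box")
    if obs["fields"].get("search_box") != goal:
        return Action("type", text=goal)
    return Action("done")

def run_agent(env, goal, max_steps=10):
    """Observe → decide → act until the policy signals completion."""
    for _ in range(max_steps):
        action = scripted_policy(env.observe(), goal)
        if action.kind == "done":
            return True
        env.step(action)
    return False
```

In a production system the scripted policy would be replaced by a model call that consumes a screenshot and emits the next mouse/keyboard action, but the outer observe-decide-act loop stays the same.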
Meanwhile, traditional tech giants are mounting a defensive stance at the foundational level.
Microsoft is deeply embedding AI into the Windows and Office ecosystems, hoping to make AI the new operational entry point; Apple is strengthening local AI capabilities within the iPhone and macOS, attempting to integrate AI into the system's core.
When AI can invoke applications, operate devices, and execute complex tasks, a new computing architecture is taking shape: User → AI → Application Services. The competition around this entry point is essentially a new battle for operating system dominance.
II. The Next Round of AI Competition Hinges on Behavioral Data
The explosive popularity of OpenClaw has quickly made agents one of the hottest directions in the AI industry. However, for tech companies, this race is closely tied to the current realities and pressures of the AI industry.
In the past few years, large model training has primarily relied on publicly available internet texts, such as encyclopedias, news articles, books, or forum content. However, as model scales continue to expand, the value of these data sources is declining.
Studies have pointed out that AI's demand for data is growing far faster than authentic, diverse data sources can supply it. The scarcity of naturally generated real data poses serious risks to AI's continued development.
Research institution Epoch AI released a study in 2024 predicting that tech companies will exhaust the publicly available training data for AI language models within a decade (roughly between 2026 and 2032).
In the short term, tech companies like OpenAI and Google are competing to acquire high-quality data sources, sometimes paying for them, to train their large AI language models—for example, by signing agreements for ongoing access to text from Reddit forums and news media.
In the long term, new blogs, news articles, and social media comments will not be sufficient to sustain AI's current development trajectory. This will force companies to utilize what is now considered private and sensitive data (such as emails or text messages) or rely on less reliable 'synthetic data' output by chatbots themselves.
The key to enhancing model capabilities in the next phase lies not just in more text but in data closer to real-world behavior.
When a user asks AI to complete a task, the AI goes through a series of specific steps, such as searching for information, opening webpages, invoking software, or filling out forms. These operations form a complete task chain, also known as task trajectory data in the industry.
Compared to static text, such data is closer to the action logic in the real world and holds higher value for training AI models with execution capabilities. From this perspective, tech companies' large-scale promotion of agents is also aimed at preemptively securing data sources for the next round of competition to train their models.
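A task trajectory of this kind is typically recorded as an ordered sequence of steps, each pairing the tool used, the action taken, and what the agent observed afterward. The schema below is a hypothetical illustration of what such a log might contain; the field names and example task are assumptions, not any vendor's actual format.

```python
import json
from dataclasses import dataclass, asdict
from typing import List

@dataclass
class TrajectoryStep:
    tool: str         # e.g. "browser", "spreadsheet"
    action: str       # the concrete operation taken
    observation: str  # what the agent saw after acting

@dataclass
class TaskTrajectory:
    instruction: str  # the user's original request
    steps: List[TrajectoryStep]
    succeeded: bool

    def to_json(self) -> str:
        """Serialize the whole task chain for training pipelines."""
        return json.dumps(asdict(self))

# Hypothetical example of one logged task chain.
traj = TaskTrajectory(
    instruction="Book a table for two tonight",
    steps=[
        TrajectoryStep("browser", "open reservation site", "homepage loaded"),
        TrajectoryStep("browser", "fill form: 2 people, 7pm", "form accepted"),
    ],
    succeeded=True,
)
```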
As more users complete tasks through agents, these operational processes themselves generate vast amounts of new training data.
During the use of agents, users often need to continuously give instructions, correct errors, and adjust task steps. For AI systems, these interactive processes actually constitute a type of high-quality reinforcement learning data. Every task execution and every correction records the complete trajectory of how AI gradually completes complex tasks.
Once these data are aggregated to the cloud, they could become a crucial resource for training the next generation of agent models.
Compared to traditional internet texts, such data not only contain linguistic information but also include task decomposition, tool invocation, and decision paths, holding higher value for enhancing models' reasoning and execution capabilities.
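One hedged sketch of how such interaction logs could feed reinforcement learning: treat each step the user corrected as a negative signal and each accepted step as a positive one. The reward scheme and log format here are illustrative assumptions, not a description of any company's actual training pipeline.

```python
def label_steps(steps):
    """Turn a logged interaction into (action, reward) training pairs.

    A step the user corrected gets a negative reward; an accepted step
    gets a positive one. This +1/-1 scheme is an illustrative assumption.
    """
    labeled = []
    for step in steps:
        reward = -1.0 if step.get("corrected") else 1.0
        labeled.append({"action": step["action"], "reward": reward})
    return labeled

# Hypothetical log of an agent session with one user correction.
log = [
    {"action": "open wrong spreadsheet", "corrected": True},
    {"action": "open Q3 report", "corrected": False},
    {"action": "update revenue cell", "corrected": False},
]
pairs = label_steps(log)
```

Real systems would use far richer signals (explicit feedback, task success, preference comparisons), but the core idea is the same: corrections turn everyday usage into labeled trajectories.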
III. Is AI Entering Its '1995 Moment'?
If we rewind the timeline 30 years, the internet in 1995 was in a chaotic phase.
At that time, the TCP/IP protocol stack was mature, but most companies were still working out what the internet could do, and ordinary users had to type tedious commands just to get online.
That changed with the arrival of Windows 95, which encapsulated the complexity of the underlying technology behind a graphical interface and gave developers a low-threshold creative environment through standardized APIs.
This change not only transformed 'going online' from a geek behavior into a daily activity for ordinary people clicking icons but also spurred the explosion of the PC software ecosystem, initiating a golden decade of internet popularization.
Today, 30 years later, the AI industry seems to be standing at a similar '1995 moment'.
Large models have demonstrated the ability to handle various complex tasks, such as writing reports, generating videos, coding, analyzing data, and even operating computers, invoking software, and executing tasks—almost omnipotent.
However, in practical use, ordinary users still need to learn complex prompts, switch back and forth between different webpages and applications, and find suitable models or agents to complete tasks.
In other words, AI's capabilities are sufficient, but it lacks an organizational hub that can transform various AI capabilities into systemic efficiency.
From this perspective, if Windows 95 was the operating system gateway in the PC era, then the AI era urgently needs its own 'operating system'. It will serve as a unified hub connecting users, agents, and application services, including understanding user intent, decomposing tasks, scheduling tools, and generating results. Users only need to express their needs, and the system will handle the rest automatically.
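The hub pipeline described above — understand intent, decompose into subtasks, schedule tools, assemble results — can be sketched minimally as follows. The tool registry and the keyword-based intent parser are illustrative assumptions; a real system would use a model for both steps.

```python
# Registered application services the hub can dispatch to (assumed names).
TOOLS = {
    "search": lambda q: f"results for '{q}'",
    "email":  lambda q: f"email sent: '{q}'",
}

def decompose(intent):
    """Split a user request into (tool, argument) subtasks.

    A toy keyword heuristic stands in for a model's task planner.
    """
    plan = []
    for clause in intent.split(" and "):
        tool = "email" if "email" in clause else "search"
        plan.append((tool, clause.strip()))
    return plan

def run_hub(intent):
    """User → AI → application services: run the plan, collect results."""
    return [TOOLS[tool](arg) for tool, arg in decompose(intent)]

outputs = run_hub("find flights to Beijing and email the itinerary")
```

The user expresses one intent; the hub decides which services to invoke and in what order, which is exactly the scheduling role an AI operating system would play.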
Over the past few decades, from Windows in the PC era to iOS and Android in the mobile internet era, applications have always been the fundamental units of the online world. The process of users using phones or computers has always been: opening an application and then completing various operations within it.
However, under the architecture of an AI operating system, this logic may change.
When AI can understand user needs, invoke tools, and automatically complete tasks, users no longer need to open multiple applications themselves. Instead, they only need to tell AI what they want to accomplish. AI will automatically invoke different services in the background and return the final results to the user.
Under this model, the structure of computer systems will become: User → AI → Application Services.
This means that in the AI era, computers may enter a new interaction mode: intent-driven. Users no longer need to learn how to use software; they only need to express their intent. The computer system's task is to understand the intent and automatically invoke various tools to complete the task.
So, what form will such an AI operating system take? Currently, the industry is at a crossroads of multiple evolutionary paths.
One possibility is a new hardware entry point. OpenAI has enlisted Jony Ive, the designer of the original iPhone, to develop its first AI consumer product, hoping he can repeat the success of iconic Apple products like the iPod, iPhone, and iPad.
According to foreign media reports, this product is positioned as a 'third core device' that can be carried in a pocket or placed on a desk alongside a MacBook Pro and iPhone. Moreover, this device will be compact and portable, capable of sensing its surroundings and life contexts, and completely screen-free.
Another possibility is establishing an AI entry point on top of super apps. Platform companies like Tencent and Alibaba are attempting to reintegrate their existing app ecosystems through AI, allowing users to invoke various service capabilities through a single entry point.
Regardless of the form, if this model truly matures, AI could become the core infrastructure of the next-generation computing platform after PCs and mobile internet. Under this new architecture, today's App-centric traffic distribution system may also be rewritten, with true commercial power shifting from 'app traffic' to 'intent distribution rights'.