When agents need to be ubiquitous, how should computing power flow across multiple ends?

06/03 2026

The cloud and the edge merge into a unified system, constructing a computing continuum.

A lobster sparks the token economy.

In early 2026, the agent OpenClaw (Lobster) transformed the business models of all AI companies.

AI is beginning to evolve from 'passive response' to 'autonomous action.' Users are no longer satisfied with AI merely 'answering questions'; they expect it to plan tasks, invoke APIs, collaborate across systems, and even execute actions in the physical world.

Agents are no longer a single large model in the cloud. They are splitting into groups of small assistants that run in real time across the cloud and edge devices such as smartphones, PCs, and robots, helping you book flights, drive cars, and conduct experiments.

Agents are not only ubiquitous but also operate autonomously and collaborate with each other. This brings many new computing power demands and challenges.

Firstly, agents operate autonomously, interacting with software at speeds far exceeding human capabilities. In all workflows, tokens will be generated at machine speed, not human speed.

'In 2026, global demand will be approximately 31.7 billion tokens every 10 seconds; by 2030, this figure will reach 1.27 trillion every 10 seconds, a 40-fold increase. This explosive growth is driven by the massive volume of tokens generated by agentic AI,' predicts Qualcomm CEO Cristiano Amon.
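The arithmetic behind these figures can be checked in a few lines. The sketch below recomputes the fold increase and, assuming growth compounds evenly over the four years from 2026 to 2030 (an assumption of this example, not a claim from the speech), the implied yearly growth factor.

```python
# Figures quoted in the text: tokens demanded per 10-second window.
tokens_2026 = 31.7e9
tokens_2030 = 1.27e12

# Fold increase between the two years.
fold = tokens_2030 / tokens_2026

# Assumption: growth compounds evenly over the 4 years in between.
years = 2030 - 2026
annual_factor = fold ** (1 / years)

# Per-second rates, for scale.
per_second_2026 = tokens_2026 / 10
per_second_2030 = tokens_2030 / 10

print(f"fold increase: {fold:.1f}x")
print(f"implied yearly growth factor: {annual_factor:.2f}x")
print(f"per second: {per_second_2026:.3g} (2026) -> {per_second_2030:.3g} (2030)")
```

Under that even-growth assumption, demand would need to grow roughly 2.5-fold every year to reach the 2030 figure.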

Additionally, ubiquitous agent AI, distributed across different devices, requires a ubiquitous computing platform capable of providing continuous services.

'We will no longer talk about the cloud and the edge separately because they will merge into a unified system,' Qualcomm believes. The cloud and the edge are not mutually exclusive. In his speech, Amon stated, 'Whether AI runs in the cloud or on the edge is determined by the agent. Computing resources will be fully utilized, and AI will run on all devices.'

This means that inference capabilities will be distributed to the most suitable locations, achieving an optimal balance between token cost, power consumption, latency, and privacy.

The era of agents has arrived—all devices will become endpoints for AI.

Robots are considered one of the best carriers for agents in the physical world.

In 2026, robot intelligence is divided into three layers. The first layer is immediate execution, corresponding to actions humans perform without thinking, such as standing steady; the second layer is specific action execution, corresponding to the robot's interaction with the scene, such as dancing; the third layer is logical reasoning, corresponding to understanding physical relationships in the real world, such as inferring a complete task and understanding cause and effect.

These three layers of intelligence are progressive and complementary. Amon summarized, 'Developing robots is not just about developing their 'brains'; it also requires equipping them with core computing units, motion control modules, and various drive execution capabilities. To succeed in the robotics field, you must know how to design it as a hierarchical computing system.'

The same phenomenon can be observed in intelligent vehicles. Previously, intelligent driving and the smart cabin were separate systems, but in 2026, with 'cabin-driving integration' as the keyword, automotive intelligence has also become a single system that combines both levels of intelligence. Users can now activate intelligent driving directly with voice commands.

The reason for this change is that agents allow AI, previously limited to acting as a chatbot, to handle tasks directly. 'AI is evolving from a tool that simply responds to instructions and assists human-machine interaction into a system capable of autonomous action. This is the direction AI is evolving, and it will usher in unprecedented large-scale adoption,' Amon said. This is why Qualcomm considers 2026 the 'Year of the Agent.'

At the same time, seeing the growing demand for running agents on end devices, Amon also recognizes that every computing vendor now faces unprecedented opportunities and challenges. 'It is reshaping computing architectures and will drive huge demand for new devices and computing capabilities. This upgrade cycle is expected to become one of the largest in the industry's history,' he said.

In fact, over the past three years, end-device hardware vendors have laid the groundwork for AI implementation.

For example, since the end of 2024, the smartphone and PC industries have widely integrated NPUs into devices, and the computing power of intelligent cabin and intelligent driving chips has also increased significantly. In reality, however, many PCs still make limited use of their NPUs, and most AI functions on smartphones still require an internet connection by default, so users rarely notice on-device AI at work.

But in the agent era, this groundwork may finally be put to real use.

'These devices will all become endpoints for agents. Moreover, agents will not be limited to any single device or ecosystem. Everything that can connect users with agents will become an endpoint for AI,' Amon said.

As Amon stated, when agents need to respond instantly, plan autonomously, and execute complex tasks across applications on devices, end-side platforms must become a native operating environment for agents. This way, workflows led by agents can distribute tasks to the most suitable locations—end devices, the edge, local servers, and the cloud.

This means that once AI takes the lead in its own work and mobilizes multiple cloud-side and end-side devices to jointly provide computing power, the previous approach of simply stacking AI accelerators will no longer work. AI computing platforms must undergo a system-level redesign.

Computing Continuum: New Infrastructure for Agent AI

What kind of new infrastructure is needed when agents replace humans in control?

How should computing be collaboratively allocated across the cloud, smartphones, PCs, and vehicles for agents?

Qualcomm's answer is the 'Computing Continuum,' enabling agents to freely flow across the entire computing power chain.

This solution is not simply about sharing computing power so that AI devices at different tiers can jointly produce tokens. Instead, it is a new infrastructure system that distributes inference capabilities to the most suitable locations, achieving an optimal balance between per-token cost, power consumption, latency, and privacy.
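To make the idea of "the most suitable location" concrete, here is a minimal placement sketch. It is not Qualcomm's scheduler: the tier names, every number, the weights, and the `place` function are hypothetical, chosen only to show how hard constraints (model size, latency budget, privacy) can filter the tiers before a weighted score of cost, energy, and latency picks one.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Tier:
    name: str
    cost_per_1k_tokens: float    # hypothetical USD
    energy_per_1k_tokens: float  # hypothetical joules
    latency_ms: float            # hypothetical round-trip latency
    keeps_data_local: bool       # True if user data never leaves the premises
    max_model_params_b: float    # largest model (billions of params) it can host

# Hypothetical tiers; all values are illustrative assumptions.
TIERS = [
    Tier("device", 0.000, 5.0, 20.0, True, 8.0),
    Tier("edge", 0.002, 8.0, 40.0, False, 30.0),
    Tier("local_server", 0.004, 12.0, 60.0, True, 70.0),
    Tier("cloud", 0.010, 20.0, 200.0, False, 1000.0),
]

@dataclass(frozen=True)
class Request:
    min_model_params_b: float  # smallest model able to do the job
    latency_budget_ms: float
    private: bool              # True if the data must stay local

def place(req: Request, tiers=TIERS, w_cost=1.0, w_energy=0.001,
          w_latency=0.0001) -> Optional[Tier]:
    """Return the cheapest feasible tier, or None if no tier qualifies."""
    feasible = [
        t for t in tiers
        if t.max_model_params_b >= req.min_model_params_b
        and t.latency_ms <= req.latency_budget_ms
        and (t.keeps_data_local or not req.private)
    ]
    if not feasible:
        return None
    # Weighted sum of cost, energy, and latency; lower is better.
    return min(
        feasible,
        key=lambda t: (w_cost * t.cost_per_1k_tokens
                       + w_energy * t.energy_per_1k_tokens
                       + w_latency * t.latency_ms),
    )

small_private = place(Request(3.0, 100.0, True))
medium_open = place(Request(20.0, 100.0, False))
large_open = place(Request(400.0, 500.0, False))
print(small_private.name, medium_open.name, large_open.name)
```

Treating privacy and latency as hard filters rather than weighted terms is a design choice of this sketch: a private request is never traded away for a lower token price.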

This understanding is also the consensus among current computing infrastructure vendors.

For example, Tia White, General Manager of AWS OpenSearch Service, recently posted on LinkedIn, stating that agents' communication needs are entirely different from humans'. 'They may experience sudden traffic peaks without warning or quietly enter idle states.' In response to the impact of agent-generated data traffic on existing computing networks, Lai Yi Ohlsen, Senior Product Manager at edge computing company Cloudflare, said, 'Non-human traffic will surpass human traffic at some point in the first half of 2027.'

Omdia pointed out in a report that by distributively deploying AI capabilities across devices, an 80% local processing rate can reduce cloud operating costs from $5.5 billion to approximately $1.2 billion, saving $4.3 billion annually while improving latency, energy efficiency, and reliability.
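The Omdia figures can be sanity-checked with simple arithmetic. The sketch below recomputes the savings and compares the reported cost with a naive estimate; the linear cost model is an assumption of this example, not something the report states.

```python
# Figures quoted from the Omdia report (USD billions per year).
cloud_only_cost = 5.5
distributed_cost = 1.2
local_share = 0.80  # share of processing handled on devices

savings = cloud_only_cost - distributed_cost
reduction = savings / cloud_only_cost

# Assumption (not from the report): cloud cost scales linearly with
# the share of work that still runs in the cloud.
naive_cost = cloud_only_cost * (1 - local_share)

print(f"savings: ${savings:.1f}B ({reduction:.0%} lower)")
print(f"naive linear estimate: ${naive_cost:.1f}B")
```

The naive estimate lands at about $1.1 billion, close to the reported $1.2 billion, so the quoted savings are at least consistent with moving 80% of processing off the cloud.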

How is this transformation specifically implemented? Qualcomm provided a solution with 'three pillars' at this conference.

The first pillar is 'scalable coverage,' which answers the question of 'where agents ultimately run.'

In Qualcomm's vision, agents will not be confined to a single location but will dynamically migrate between end devices, the edge, local environments, and data centers. Qualcomm is responsible for providing a unified architecture, with computing power platforms covering from milliwatt-scale (end-side) to kilowatt-scale (data centers), enabling seamless flow of inference and planning workloads across all levels.

This agent-native infrastructure can support traffic surges during agent operation. Based on unified scheduling by Qualcomm's computing platform, token generation can avoid traditional cloud-side single-point bottlenecks or the limited computing power ceilings of end devices. This reduces overall costs while maintaining AI performance and response speed.

The entire process can be likened to solving a math problem. Previously, a PhD student provided the complete answer. In Qualcomm's approach, the PhD student outlines the problem-solving approach, the graduate student lists the formulas, and the undergraduate student performs each step of the calculation.

More importantly, since the entire computing platform is within Qualcomm's architecture, it is like the entire team has 'telepathy.' After integrating the end-side and cloud-side, agents can also achieve cross-level consistency, further ensuring fast and accurate computation without 'going astray.'

The second pillar provided by Qualcomm is 'native AI devices and systems,' which answers the question of 'how agent applications are implemented.'

After all, users need to use AI through specific end devices. This means that agents providing always-on services on devices must meet three constraints: instant performance, privacy protection, and reliability. In other words, to achieve AI's intelligent 'presence everywhere' (scalable coverage), devices must inherently 'understand AI' (native systems).

Qualcomm's layout at this level covers smartphones, AI PCs, wearable devices, intelligent vehicles, robots, edge inference devices, and system-level designs for new forms.

On PCs, Qualcomm introduced the Snapdragon C platform for entry-level PCs; in the field of embodied intelligence, Qualcomm showcased the high-performance robot reference design platform Qualcomm Dragonwing IQ10 RRD. Most surprisingly, Qualcomm also officially announced a new brand for data centers: Qualcomm Feilong (Dragonfly). From personal devices to data centers, Qualcomm's computing platform is about to close the loop.

Notably, Qualcomm has implemented high-performance CPUs and inference-specific accelerators in these business areas. This enables each product to provide the general-purpose computing and artificial intelligence neural network inference capabilities required for agent planning.

Moreover, Qualcomm's computing platforms consistently keep cost and power consumption in check, so consumers can have an always-available agent at a cost-effective price.

'Each device needs the right AI platform because they are all different and have different uses. The key is to maximize intelligence and energy efficiency everywhere,' Amon summarized.

The last pillar is 'intelligent connectivity,' which answers the question of 'how agents collaborate with each other.'

When end-side devices all have AI capabilities, connectivity itself must also be intelligent. If the previous two pillars were at the computing level for agents, then connectivity enables agents to chat with each other and collaborate in real-time, like a group of small teams instantly coordinating on the battlefield.

Omdia also pointed out in its report that to achieve effective large-scale AI implementation, the tech industry needs to prioritize developing cross-device collaborative planning capabilities and integrating edge systems with cloud services. After all, communication is the most critical part of agent collaboration. If information cannot be exchanged in a timely manner, distributed computing would be meaningless.

Meeting this demand may depend on the upcoming 6G network.

Qualcomm has already made moves in this area. Amon envisioned, 'The network itself is an AI-native network, with distributed AI computing and inference capabilities extending from wireless base stations to central offices and even data centers.' This means that the era of universal intelligence has finally arrived.

The 'Computing Continuum' shows that Qualcomm's footprint in the AI era extends far beyond personal end devices and communications, covering a complete computing power network from wearable devices to data centers. From the smallest edge devices to cloud-side infrastructure, Qualcomm can provide relevant products and is attempting to connect them into a whole with a set of interconnected solutions.

'Qualcomm has leading performance per watt in smartphones, PCs, in-vehicle computing, and robotics, and is extending this advantage to data centers,' Amon said.

As Amon stated, Qualcomm's label is about to change in the agent era, from a chip company to an AI agent solution company.

The Depth of the Moat Depends on the Breadth of AI Applications

Qualcomm's transformation is a microcosm of this year's Computex conference.

Before the era of large AI models, Computex was like a 'computer accessories marketplace.' During the early stage of large AI models from 2023 to 2025, the conference mainly focused on hardware vendors attempting to integrate with AI, a 'carnival' of building large models by stacking infrastructure. This year, however, the conference has shifted completely to being AI-centric, with every vendor at the event discussing the specific implementation of AI in the 'physical world + agent era.'

In particular, many vendors this year announced plans to expand their AI presence.

For example, this year, NVIDIA released several products at once, including CPUs, PC chips, and humanoid robots, preparing to expand from its position in AI infrastructure to a full stack of physical AI. AMD's main focus this time was shifting from pure hardware to creating Agent Computers (local agents), attempting to capture NVIDIA's share in developers and edge AI. After Intel 'turned around' with its 18A process, it began planning to create AI full-stack hardware 'from handheld devices to data centers.'

'The focus of discussion has never been about what can run in the cloud can also run on the edge. What we need is for things that should run in the cloud to run in the cloud, and things that should be on the edge to stay on the edge. This is a completely different concept,' Amon said.

As Amon stated, at this stage, mere algorithms or hardware can no longer answer the question of whether agents can be used effectively. To truly enable agents to flow freely across full-stack computing power hardware, vendors not only need breadth in their AI layouts but also need to thoroughly understand every aspect of agent applications.

To this end, Qualcomm prepared numerous real-world demonstrations at the conference, attempting to verify the feasibility of the Computing Continuum's end-to-end system-level layout.

Some of the demos were crafted by Qualcomm itself, serving as a technological benchmark to showcase its depth. For example, the AI Invoice Assistant can scan receipts via a camera and automatically complete tasks ranging from translation to file generation, demonstrating a complete workflow of on-device multimodal AI. Other demos came from partners, covering the full workflow from creative production to software development, showcasing the breadth of the ecosystem.

Regarding the benefits of on-device-cloud collaboration, Amon gave an example, saying, 'Take a real-world Claude Code operation scenario as an example. The planner intelligently schedules workloads: some tasks are kept for local computation on the device, while necessary content is uploaded to the cloud. Through this distributed AI agent architecture, which fully leverages all computing resources across the computing continuum, approximately 1.4 million tokens can be saved, reducing costs by 60% while achieving the same results.'

For agents, which burn through tokens and money like water, this cost reduction is timely relief.
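How keeping some tasks on the device reduces cloud token usage can be illustrated with a small tally. This is only a sketch: the task names, token counts, and the `needs_cloud` flag are made-up values, not data from the demonstration, and it assumes that on-device tokens incur no cloud charge.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    name: str
    tokens: int        # hypothetical token count for the task
    needs_cloud: bool  # True if the task exceeds what the device can do

# Hypothetical coding session; names and numbers are illustrative only.
TASKS = [
    Task("index repository", 500_000, needs_cloud=False),
    Task("summarize build logs", 400_000, needs_cloud=False),
    Task("plan refactor", 350_000, needs_cloud=True),
    Task("generate patch", 250_000, needs_cloud=True),
]

def cloud_tokens(tasks, hybrid):
    """Tokens processed in the cloud, with or without on-device offload."""
    if not hybrid:
        return sum(t.tokens for t in tasks)
    return sum(t.tokens for t in tasks if t.needs_cloud)

baseline_total = cloud_tokens(TASKS, hybrid=False)
hybrid_total = cloud_tokens(TASKS, hybrid=True)
saved = baseline_total - hybrid_total
saved_fraction = saved / baseline_total

print(f"cloud tokens: {baseline_total:,} -> {hybrid_total:,} "
      f"(saved {saved_fraction:.0%})")
```

If cloud cost is billed per token, the fraction of tokens kept on the device translates directly into the fraction of cost saved, which is the relationship the quoted example relies on.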

To achieve this effect, Qualcomm has made extensive behind-the-scenes arrangements.

For example, in June last year, Qualcomm announced the acquisition of Alphawave Semi, a company specializing in high-speed wired connectivity. This filled the gap in data center AI inference accelerator cards for both scale-up (PCIe) and scale-out (Ethernet) needs, a key factor enabling Qualcomm to further bet on data center computing cards.

Additionally, Qualcomm has sought to cover computing platforms from the industrial edge to robot brains under the Dragonwing brand. At this conference, Qualcomm introduced a full-stack reference design for robot computing platforms, the Qualcomm Dragonwing IQ10 RRD. It reportedly supports industrial robots, autonomous mobile robots (AMRs), and humanoid robots, and also integrates an end-to-end software stack, emphasizing 'out-of-the-box' usability.

On the consumer side, building on its previous Snapdragon X platform focused on high-performance PCs, Qualcomm introduced the Snapdragon C platform for entry-level laptops. In terms of product expectations, PCs equipped with the Snapdragon C platform may resemble Chromebooks, focusing on lightweight applications and office needs. However, the difference is that the Snapdragon C platform integrates an NPU, meaning that for the next generation of entry-level PCs, in addition to power consumption and performance, the ability to utilize on-device AI becomes a product threshold.

From chips to connectivity, from wearables to data centers, from consumer to industrial scenarios, Qualcomm has built a complete computing continuum that traverses the physical world.

This comprehensive layout allows Qualcomm to seize a truly fully integrated opportunity in the age of AI agents. While other computing vendors are still emphasizing 'brute-force computing' and parameter scale, Qualcomm proposes a fundamentally different systemic solution with its edge AI network, performance per watt, and superior TCO (Total Cost of Ownership) advantages.

In 2026, when AI agents become the keyword, AI has significantly lowered the barrier for humans to access higher intelligence. And what Qualcomm is doing is further reducing this barrier to make it accessible to everyone.
