March 27, 2026
These days, discussion of NVIDIA's GTC conference has been dominated by Jensen Huang's 'Token Economics.'
"The data centers of the future are not storage warehouses but factories producing intelligent tokens; and performance per watt is the only hard metric in this race." With this statement, Jensen Huang outlines a new paradigm for future corporate competition.
From compute costs to inference efficiency, from token pricing to AI business models, market attention remains fixed on a familiar question: how to produce and consume 'intelligence' more efficiently. But shift our gaze slightly away from the cloud, and another, easily overlooked message from NVIDIA comes into view: on March 16, NVIDIA announced a collaboration with T-Mobile and Nokia to deploy Physical AI applications on a distributed edge AI network, attempting to upgrade wireless communication networks into high-performance edge AI computing platforms.
Compared with the incremental efficiency and cost gains promised by 'Token Economics,' this announcement points to a more fundamental question: when AI is no longer just generating content but is entering the real world and participating in real-time decisions, do the network and computing architectures we rely on to run AI need to be rewritten?
Jensen Huang's answer to this question is straightforward: "The network is evolving into an AI infrastructure, enabling billions of devices—from visual AI agents to robots and autonomous vehicles—to see, hear, and act in real time. By collaborating with T-Mobile and Nokia to transform 5G networks into distributed AI computers, we are creating a scalable blueprint for global edge AI infrastructure."
For practitioners who have long focused on IoT and edge computing, this may be the most noteworthy signal to come out of this GTC.
Breaking Through the Key Bottleneck of Scalable Physical AI Development
In multiple keynotes, Jensen Huang has laid out his predicted stages of AI development: having passed through the Perception AI and Generative AI stages, AI has now entered the Agentic AI stage and will next move into the Physical AI era. While Generative AI addresses the challenge of 'understanding and generating information,' Physical AI faces a more complex proposition: understanding the world and acting within it.
According to NVIDIA's definition, "Physical AI is a model that uses motor skills to understand and interact with the real world, typically embodied in autonomous machines such as robots and autonomous vehicles." Large language models like GPT and Llama are remarkably capable of generating human language and abstract concepts, but their grasp of the physical world and the rules that govern it is limited. Physical AI, by contrast, can understand the spatial relationships and physical behaviors of the three-dimensional world we inhabit, extending today's Generative AI into the physical domain.
With Physical AI, autonomous machines can perceive, understand, and perform complex operations in the real (physical) world. For example, autonomous vehicles can use sensors to perceive and understand their surroundings to make informed decisions in various environments (from open highways to urban landscapes), including but not limited to more accurately detecting pedestrians, responding to traffic or weather conditions, and automatically changing lanes. In industrial and logistics scenarios, autonomous mobile robots (AMRs) in warehouses can navigate complex environments and avoid obstacles, including humans, using direct feedback from onboard sensors. Robotic arms can adjust their grip and position based on the pose of objects on a conveyor belt to achieve precise operations. In urban spaces, systems composed of numerous cameras and sensors are attempting to understand and respond to environmental changes in real time.
It is precisely in this transformation that AI's requirements for underlying infrastructure change completely: once AI enters the physical world, latency, reliability, and real-time performance shift from being 'experience issues' to 'life-and-death issues.'
Many such systems cannot tolerate high latency, nor can they rely on the classic path of 'upload to the cloud first, process later.' As current industry practice shows, scenarios such as autonomous driving, robotics, and smart cities demand millisecond-level response times and highly reliable connectivity; the rough latency arithmetic sketched below makes the constraint concrete. The problem thus becomes clear: a key bottleneck to the scalable development of Physical AI is the lack of low-latency, secure, and ubiquitous connectivity.
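To make 'millisecond-level' concrete, here is a back-of-the-envelope latency budget in Python. The round-trip and inference figures are illustrative assumptions, not measurements from any deployment discussed here:

```python
# Back-of-the-envelope latency budget for a perception-to-action loop.
# All figures are illustrative assumptions, not measurements.

CONTROL_LOOP_BUDGET_MS = 100  # e.g., a robot that must react within ~100 ms

def loop_latency_ms(network_rtt_ms: float, inference_ms: float,
                    sensor_ms: float = 10, actuation_ms: float = 10) -> float:
    """Total time from sensing an event to acting on it."""
    return sensor_ms + network_rtt_ms + inference_ms + actuation_ms

# Assumed round trips: a distant cloud region vs. an edge node
# co-located with the base station.
cloud = loop_latency_ms(network_rtt_ms=80, inference_ms=30)  # -> 130 ms
edge = loop_latency_ms(network_rtt_ms=8, inference_ms=30)    # -> 58 ms

for name, total in (("cloud", cloud), ("edge", edge)):
    verdict = "within" if total <= CONTROL_LOOP_BUDGET_MS else "blows"
    print(f"{name}: {total:.0f} ms -> {verdict} the {CONTROL_LOOP_BUDGET_MS} ms budget")
```

Under these assumed numbers, the cloud round trip alone consumes most of the budget, while the edge path leaves comfortable headroom.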
Under traditional architectures, there are two approaches to this problem, but neither is ideal—
"Everything in the cloud": Terminal devices collect data, upload it to the cloud for processing, and then return the results. The issue with this model is that the link is too long, and latency and stability are uncontrollable, making it nearly unusable in critical scenarios.
"Everything on the device": Stacking as much computing power as possible onto the device itself, but this also faces bottlenecks. Terminal devices are limited in power consumption, cost, and size, making it impossible to sustain the operation of complex models. At the same time, the isolation of computing power on devices makes it difficult to support continuous model iteration and unified scheduling.
Precisely between these two paths, a new architecture is emerging: shifting computing power 'down' from the cloud, but not all the way onto the terminal, placing it instead 'within the network.' This is the core logic of the AI-RAN architecture promoted by NVIDIA, T-Mobile, and Nokia: deploying AI inference capabilities at network edge nodes close to terminals, so that Physical AI systems can offload much of their computation from the device to the nearest base station or edge data center.
The direct result of this change is that developers no longer need to stack expensive computing power on every camera, robot, or terminal device. Instead, they can rely on distributed computing resources on the network side to deploy more complex AI capabilities at a lower cost. Under this architecture, communication networks are no longer just 'data transmitters' but become computing platforms that carry intelligence, thereby supporting the deployment of AI applications at the scale of billions of devices.
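As a rough illustration of what this offloading looks like from the device side, consider the minimal sketch below. The endpoint, request shape, and fallback function are hypothetical placeholders, not part of NVIDIA's AI-RAN stack; a real deployment would use whatever inference-serving API the edge node actually exposes:

```python
# Minimal sketch of device-side offload logic under an AI-RAN-style setup.
# EDGE_ENDPOINT and the request/response shapes are hypothetical.
import requests

EDGE_ENDPOINT = "http://edge-node.local:8000/v1/infer"  # hypothetical address
LATENCY_BUDGET_MS = 50

def run_tiny_local_model(frame: bytes) -> dict:
    # Placeholder for a small quantized model that fits the device's
    # power and cost envelope.
    return {"detections": [], "source": "on-device-fallback"}

def infer(frame: bytes) -> dict:
    """Try the nearby edge node first; fall back to a small on-device model."""
    try:
        resp = requests.post(EDGE_ENDPOINT, data=frame,
                             timeout=LATENCY_BUDGET_MS / 1000)
        resp.raise_for_status()
        return resp.json()  # full model, running on network-side GPUs
    except requests.RequestException:
        return run_tiny_local_model(frame)  # degraded but bounded-latency path
```

The design point is that the expensive model lives on the network side, shared across many devices, while each device keeps only a cheap, bounded-latency fallback.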
Leading Developers Deploy Inference and Visual AI to the Edge
To turn the network into a distributed AI computing platform, ultra-low latency and spatiotemporal consistency must be delivered to billions of terminals at the network edge, and this is precisely the core capability of NVIDIA's collaborator T-Mobile. Unlike Wi-Fi, with its limited coverage and security, T-Mobile's standalone 5G network provides wide-area coverage and quality-of-service guarantees, enabling complex AI agents to operate at busy urban intersections, in industrial facilities, and in remote areas.
According to the official press release, T-Mobile is collaborating with NVIDIA-certified Physical AI developers (including Fogsphere, LinkerVision, Levatas, Vaidio, and Siemens Energy) to demonstrate "how base stations and mobile switching centers can support distributed edge AI workloads" while fully leveraging public 5G network connections. On this platform, they will integrate NVIDIA's Metropolis Blueprint for video search and summarization (VSS).
NVIDIA's latest VSS (3) Blueprint introduces multimodal visual understanding and intelligent search capabilities in a modular architecture that can be reconfigured for different environments ("from retail stores to warehouses"). NVIDIA states that there are 1.5 billion cameras worldwide, yet less than 1% of their video is ever reviewed by humans. The VSS (3) Blueprint can "break down complex natural language queries and search video clips within five seconds to find specific events" and "summarize long videos 100 times faster than manual review."
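As a purely illustrative sketch of how an application might issue such a natural-language video query, consider the snippet below. The endpoint path, payload fields, and response shape are hypothetical stand-ins, not the documented VSS Blueprint API:

```python
# Hypothetical client for a natural-language video search service.
# The URL, fields, and response format are illustrative assumptions.
import requests

VSS_URL = "http://vss-host:8100/query"  # hypothetical deployment address

payload = {
    # A query the service would decompose and match against indexed video,
    # per the capabilities described above.
    "query": "show clips where a forklift enters the loading bay after 6 pm",
    "max_clips": 5,
}

resp = requests.post(VSS_URL, json=payload, timeout=10)
resp.raise_for_status()
for clip in resp.json().get("clips", []):
    print(clip["camera_id"], clip["start"], clip["end"])
```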
Currently, many leading developers are working with NVIDIA and T-Mobile to bring Physical AI agents capable of driving real-time actions onto T-Mobile's distributed edge network, built on the NVIDIA Metropolis Blueprint for video search and summarization (VSS). Pilot application scenarios include:
Smart City Operations: LinkerVision, Inchor, and Voxelmaps are testing an integrated "urban operations agent" based on computer vision and digital twins, which can perceive, simulate, and optimize traffic signal timing, aiming to improve accident response speed in San Jose by five times.
Utility (Power) Facility Automated Inspections: Levatas is leveraging NVIDIA computing power to run automated inspections over the 5G network across hundreds of thousands of miles of transmission lines, detecting and quickly addressing issues such as leaning, corrosion, and abnormal heating of utility poles, five times faster than before. The partners are currently evaluating AI-RAN infrastructure to further reduce costs, shorten fault recovery times, and accelerate the transition from reactive to predictive maintenance.
Vision-Based Facility Management: Developers such as Vaidio are building facility management agents based on the VSS Blueprint for threat detection and fault prediction, triggering automated workflows to improve facility management efficiency.
Real-Time Industrial Safety: Fogsphere provides safety AI agents for SAIPEM to detect and respond in real time to hazardous events in high-risk onshore, offshore, and drilling construction environments, such as workers standing under suspended loads or hydrocarbon leaks.
How is AI Reshaping the Role of Communication Networks?
From a broader perspective, the changes described above also mean that the telecommunications industry itself is undergoing a fundamental transformation in its role.
For a long time, communication networks have been viewed as 'connectivity infrastructure'—their core task is to efficiently transmit data between devices. However, the scale of this infrastructure is vast enough to rival the entire IT industry: the global telecommunications industry is worth nearly $2 trillion, with base stations spread across cities and villages, making it one of the most widely distributed technological systems in human society. In the past, they carried information flows; under the AI-RAN architecture, these nodes, which were originally primarily responsible for 'transmission,' will be redefined as distributed computing nodes, becoming the infrastructure platform for AI to operate at the edge.
In fact, the reshaping of the entire communication network's role by AI has already been quietly happening. Previously, in "Is LoRa Vying for 'Discourse Power' in the New Development Cycle of IoT?" I mentioned that it is no coincidence that LPWAN camps, represented by the LoRa Alliance, are beginning to emphasize concepts such as 'Physical AI' and 'action closure.' In the past competitive landscape of LPWAN, whether it was NB-IoT, LTE-M, or satellite IoT, technological narratives have long revolved around coverage capabilities, power consumption performance, and cost advantages. LoRaWAN was also widely recognized for its 'low power consumption, low cost, private network flexibility, and strong deployment elasticity.' However, in the AI era, it is attempting to redefine its role: not just a data connectivity protocol but an AI data entry point, an action exit point, and the communication nervous system for Physical AI.
This trend will become even more pronounced in future network architectures. The design philosophy of 6G points toward being 'born for AI,' not just faster. In February 2026, the 3GPP SA2 #173 meeting concluded in Goa, India, and the panoramic report on its Release 20 (R20) architecture sent an important signal: industry consensus has moved beyond mere 'connectivity pipelines' to 'native intelligent platforms.' Under this architecture, the core network element AIMF (AI Management Function) changes the way terminals interact with the network: where the core network was previously responsible only for bit transmission, the R20 architecture begins to provide MaaS (Model as a Service). Through a gradient-splitting mechanism, the terminal computes only the bottom-layer gradients, keeping raw data private, while the core network takes over the high-layer gradient computation (see the sketch below). This means that network computing power will directly participate in the training and optimization of user-side large models, rather than serving as a passive pipeline for transmitting information.
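The gradient-splitting mechanism described above closely resembles what the research literature calls split learning. The sketch below shows a generic split-learning step in PyTorch under that assumption; it is a conceptual illustration, not the 3GPP-specified protocol:

```python
# Generic split-learning step illustrating the division of labor described
# above: the terminal runs only the bottom layers (raw data stays local),
# the network side runs the top layers. Conceptual sketch only.
import torch
import torch.nn as nn

bottom = nn.Sequential(nn.Linear(32, 64), nn.ReLU())  # on the terminal
top = nn.Sequential(nn.Linear(64, 10))                # in the core network

x = torch.randn(8, 32)                  # raw data, never leaves the device
y = torch.randint(0, 10, (8,))          # labels (sent upstream in this variant)

# Terminal: forward through the bottom layers; only activations go upstream.
smashed = bottom(x)
sent = smashed.detach().requires_grad_(True)  # what actually crosses the network

# Network side: forward and backward through the top layers.
loss = nn.functional.cross_entropy(top(sent), y)
loss.backward()                               # fills in sent.grad

# Terminal: finish the backward pass locally from the returned gradient.
smashed.backward(sent.grad)                   # computes bottom-layer gradients
```

The privacy property the report alludes to comes from the fact that only intermediate activations and their gradients cross the network; raw inputs never leave the terminal.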
Looking at the big picture, it is clear that AI is consuming communication networks, and communication networks are also reshaping themselves. Whether it is edge computing, Physical AI, or the future 6G native intelligent network, all herald the formation of a new paradigm: from 'transmitting bits' to 'providing intelligence,' from 'passive pipelines' to 'active computing platforms.' Under this new paradigm, AI will not only be software but will also become an inherent attribute of telecommunications networks; networks will not only be infrastructure but real-time ecosystems that carry intelligence.
Today, we may truly be standing at the starting point of an intelligent world that is 'touchable everywhere and intelligent everywhere.'
References:
Nvidia positions AI-RAN with Nokia, T-Mobile in (its) $1tn AI infrastructure market (RCR Wireless)
Agents, inference and token economics – Nvidia pitches the AI future (RCR Wireless)
State of enterprise IoT 2026: The shift from IoT to autonomous connected operations (IoT Analytics)
NVIDIA, T-Mobile, and Partners Integrate Physical AI Applications on AI-RAN-Ready Infrastructure (NVIDIA Official Website)
Physical AI (NVIDIA Official Website)
3GPP Latest Meeting Review: The Latest Evolution Trends of 6G Architecture (Wireless AI Perspective)