05/15 2025
Original by IoT Intelligence
This marks my 371st column article. "Physical AI" has recently come into sharp focus, progressing faster than anticipated. NVIDIA's bold bet on Physical AI is already taking concrete form. In 2025, AI is poised to move beyond the virtual realm and into the physical world. NVIDIA is driving this transformation by building platform-level infrastructure for "Physical AI" and reimagining the entire chain, from training and simulation to deployment. From its inception to current industrial collaborations, NVIDIA's "Physical AI" strategy has begun to take tangible, verifiable shape. Leading global industrial players, including Siemens, BMW, Foxconn, Schneider, Omron, SAP, and General Motors, are partnering with NVIDIA to bring AI into complex physical systems such as manufacturing, warehousing, autonomous driving, and robotics. This article looks beyond NVIDIA's Physical AI itself to explore three pivotal questions:
Amidst the proliferation of buzzwords—notably Physical AI, Embodied Intelligence, and Spatial Intelligence—what are the similarities and differences among these terms? Are they synonyms, an evolutionary sequence, or competing paradigms?
How does NVIDIA's current bet stack up against GE's Predix industrial internet platform, launched a decade ago? Why did GE falter, and is NVIDIA at risk of a similar fate?
For developers, hardware companies, and system integrators across the AI industry chain, how should they navigate the new wave of "Physical AI"? How can they avoid fads and cultivate truly sustainable innovation capabilities?
As AI transitions from conversational models to embodied agents, we need not only technological fervor but also strategic foresight and historical perspective. Physical AI, Embodied Intelligence, and Spatial Intelligence represent AI's divergent paths into the real world. NVIDIA's aggressive and systematic push has made "Physical AI" a frequent fixture in tech media and industry reports.
However, this is not a standalone concept but intersects intricately with two key directions explored by the AI academic and industrial communities: Embodied Intelligence and Spatial Intelligence. These terms share inheritance relationships and paradigm differences, reflecting diverse perspectives on how AI perceives, integrates into, and alters the world. Clarifying these concepts is crucial for the industry to discern trends, and understanding their distinctions is essential for enterprises to make informed technology selections and strategic investments.
Before analyzing NVIDIA's platform strategy, we must answer a fundamental question: What is Physical AI, and how does it differ from Embodied Intelligence and Spatial Intelligence? Who stands as the true gateway to the physical world? In essence, Spatial Intelligence is perception and cognition, Embodied Intelligence is the acting body, and Physical AI is the central nervous system connecting perception and action, enabling AI to truly inhabit the physical world.
1. Spatial Intelligence: AI's Perceptual Organ for Comprehending the 3D World
Spatial Intelligence focuses on how AI interprets three-dimensional spatial structures, object relationships, and environmental geometry. Proposed by Stanford professor Fei-Fei Li in 2023, it signifies a leap from computer vision to cognitive intelligence. The crux of Spatial Intelligence lies in constructing a world model, enabling AI to recognize not just "this is a cat" but also "this cat is on the table, moving, and might fall off soon." Spatial Intelligence serves as the perceptual and cognitive foundation for Embodied Intelligence and Physical AI.
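The cat-on-the-table example can be made concrete. Below is a minimal, illustrative sketch of the kind of spatial predicate a world model must support: not just labeling an object, but reasoning about where it is and where it is headed. All names here (`Obj`, `is_on`, `will_fall_off`) are hypothetical illustrations, not any real spatial-AI API.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    name: str
    x: float    # horizontal position (m)
    z: float    # height of the object's base (m)
    vx: float   # horizontal velocity (m/s)

def is_on(a: Obj, top_z: float, xmin: float, xmax: float) -> bool:
    """True if object a rests on a surface spanning [xmin, xmax] at height top_z."""
    return abs(a.z - top_z) < 0.01 and xmin <= a.x <= xmax

def will_fall_off(a: Obj, xmax: float, horizon: float = 2.0) -> bool:
    """Predict whether a's motion carries it past the edge within `horizon` seconds."""
    return a.vx > 0 and a.x + a.vx * horizon > xmax

# "This cat is on the table, moving, and might fall off soon."
cat = Obj("cat", x=0.8, z=0.75, vx=0.2)
table_top, xmin, xmax = 0.75, 0.0, 1.0

print(is_on(cat, table_top, xmin, xmax))   # True: the cat is on the table
print(will_fall_off(cat, xmax))            # True: its motion carries it past the edge
```

The point of the sketch is the shift it illustrates: recognition answers "what is this?", while a world model also answers "where is it, relative to what, and what happens next?"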
2. Embodied Intelligence: Endowing AI with a "Body" and "Experience"
Embodied Intelligence draws from philosophy and cognitive science, emphasizing that intelligence is acquired through body-environment interaction. Promoted by scholars and entrepreneurs, with significant research and practice at DeepMind, OpenAI, and Stanford University, it holds that intelligence resides not just in algorithms but in the perception-movement-feedback loop. Robots, reinforcement learning, and meta-learning are its common technical carriers. In this paradigm, an intelligent agent is trained through experiential learning rather than data feeding.
Embodied Intelligence extends Spatial Intelligence's action capabilities and forms the "life core" of Physical AI.
3. Physical AI: From Understanding the World to Transforming It
Proposed by NVIDIA CEO Jensen Huang, Physical AI aims to create a holistic intelligent system that not only comprehends the physical world but also acts within it, transforms it, and deploys at scale. It rests on capability loops running from data to models, from simulation to deployment, and from real-world feedback back into training.
In summary, Physical AI is the systematic and platform-driven evolution of Embodied Intelligence and the industrial-scale implementation of Spatial Intelligence. It signifies a leap from AI's "semantic understanding" to "physical control." While these terms have distinct origins and technical paths, they converge on enabling AI's transition from "linguistic intelligence" to "physical intelligence." Their shared goal is to empower intelligent agents to perceive, cognize, and act within the world, evolving from symbolic "speaking" to real-world "doing." Technologically, they are interwoven, relying on multimodal learning, 3D simulation, synthetic data generation, reinforcement learning, and digital twins. Their application scenarios also overlap, focusing on autonomous driving, robotics, intelligent manufacturing, logistics, and healthcare. However, their positioning and evolution differ.
Spatial Intelligence stems from the fusion of computer vision and cognitive science, emphasizing AI's understanding of 3D spatial structures, object relationships, and real-world dynamics. It enables intelligent agents to "understand the world" and is widely used in autonomous driving, SLAM, navigation, and AR/VR. Its focus is on enhancing spatial perception and modeling, though it remains tool- and middleware-centric with localized industrialization.
Embodied Intelligence, evolving from cognitive science and robotics, argues that intelligence emerges from body-environment interaction, highlighting the synergy between perception and action and experiential learning. It is crucial for general-purpose robots and virtual agents but remains primarily academic with limited commercialization. It serves more as the "philosophy and methodology" of physical intelligence than a comprehensive industrial solution.
In contrast, Physical AI is a new paradigm with strong systems engineering and industrialization traits. Driven by companies like NVIDIA, it arises from the integrated demand for training, deployment, and feedback of real-world intelligent systems. It integrates Spatial Intelligence's cognitive abilities and Embodied Intelligence's behavioral mechanisms, emphasizing the construction of a closed-loop platform system from data to models, simulation to deployment. Its core technologies include synthetic data generation, virtual simulation platforms (like Omniverse), model generalization, and edge deployment, targeting large-scale physical systems like industrial robots, autonomous driving, and smart factories.
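The closed loop described above, from synthetic data through training and deployment back to feedback, can be illustrated with a deliberately tiny sketch. Everything here is a stand-in under stated assumptions: a linear relation plays the role of the simulated "physics," gradient descent stands in for a real training stack, and a second noisy sampler stands in for field data. None of this uses actual NVIDIA (Omniverse, Cosmos) APIs.

```python
import random

random.seed(0)

def generate_synthetic_data(n, noise):
    """Stand-in for a simulator: noisy samples of a 'physical' relation y = 2x + 1."""
    data = []
    for _ in range(n):
        x = random.uniform(-1, 1)
        y = 2 * x + 1 + random.gauss(0, noise)
        data.append((x, y))
    return data

def train(data, epochs=200, lr=0.1):
    """Fit y = w*x + b by per-sample gradient descent: a stand-in for model training."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

def deploy_and_collect(w, b, n=50):
    """'Deployment': run the model against fresh (cleaner) data and log it as feedback."""
    field_data = generate_synthetic_data(n, noise=0.05)
    mse = sum(((w * x + b) - y) ** 2 for x, y in field_data) / n
    return field_data, mse

# Three turns of the data flywheel: simulate -> train -> deploy -> feed data back.
data = generate_synthetic_data(200, noise=0.2)
for round_ in range(3):
    w, b = train(data)
    field_data, mse = deploy_and_collect(w, b)
    data += field_data   # feedback loop: field data enters the next training round
    print(f"round {round_}: w={w:.2f}, b={b:.2f}, field MSE={mse:.4f}")
```

However toy-like, the structure is the point: the platform value NVIDIA is chasing lies not in any single stage but in making this loop turn continuously.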
Thus, Physical AI is not just a technical path but a strategic choice for platformization and ecosystem building. In forecasting future trends, while Embodied Intelligence and Spatial Intelligence represent significant perception and behavior directions, they are confined to localized capabilities or research projects, lacking a unified system organization. Physical AI, however, integrates these capabilities through a platform closed loop, rapidly forming implementation cases across key industries. With continued investment from giants like NVIDIA, Siemens, BMW, Foxconn, and General Motors, Physical AI has moved from labs to real-world scenarios, becoming a more systematic and commercially viable AI paradigm for the physical world.
Comprehensively, Physical AI is not only closer to industrial needs but also has the capacity to build an ecosystem, making it more likely to lead the next AI trend. It embodies the fusion of "Embodied Intelligence + Spatial Intelligence + Industrial Platformization + Data Closed Loop," transcending an academic concept to a commercializable "operating system" strategy.
A Decade's Cycle: Two Bold Moves in Industrial Intelligence
The core of Physical AI lies in NVIDIA's "Three Computers" architecture: a training computer (DGX) that builds the models, a simulation computer (OVX, running Omniverse) that tests and refines them in virtual worlds, and a deployment computer (Jetson AGX) embedded in the robot or vehicle itself.
This is not a closed system or an "ultimate product" but a full-stack toolkit empowering developers. From GPU chips to CUDA libraries, AI models to synthetic data platforms, simulation engines to deployment hardware, NVIDIA is constructing a Physical AI development universe spanning virtual and real worlds. If "Physical AI" is the more systematic and implementable intelligent paradigm today, it must also confront historical challenges.
Once we understand Physical AI's definition and its distinctions from Embodied Intelligence and Spatial Intelligence, we cannot overlook a poignant fact: this is not the first attempt to reshape the industrial world with software and data. In 2012, General Electric (GE) launched its industrial internet strategy globally with the Predix platform, one of the earliest systematic integrations of IoT, cloud computing, big data, and AI, heralding the rise of the industrial internet concept.
GE positioned Predix as an operating system for industrial equipment, centered on "three transformations": digitization, fully sensorizing industrial equipment and processes; intelligence, applying cloud computing and machine learning for prediction and optimization; and platformization, providing a standardized platform supporting device access, data processing, and application development. GE's vision mirrored NVIDIA's today: creating an "operating system" for the physical world, transforming traditional industries like energy, manufacturing, healthcare, and transportation through data collection, edge computing, cloud platforms, digital twins, and predictive maintenance.
GE even proposed an industrial "brain-nerve-muscle" metaphorical framework, aiming to infuse intelligence into every turbine, oil well, and power station. How similar this sounds to today's "Physical AI"! Both involve data-to-models, simulation-to-deployment, and closed systems evolving into platform ecosystems. Yet, a decade later, Predix has faded from mainstream view, and GE has significantly downsized its digital business, even selling some Predix assets.
Predix was the "pioneer" of industrial internet platforms, aiming to reshape industrial equipment intelligence with software. Though it failed to fully commercialize, its concept profoundly influenced later strategies like NVIDIA's Physical AI. In the current AI-industry integration trend, reviewing GE's industrial internet strategy successes and failures and comparing them to NVIDIA's Physical AI strategy provides clear directions and warnings for industry chain enterprises.
The question arises: Why did GE's industrial internet strategy falter, and is NVIDIA at risk of a similar outcome? Despite superficial similarities, the answer lies in fundamental differences in strategic logic, technological timing, system architecture, and ecosystem path. GE's Predix was a typical case of an industrial company building a platform. The problem was that it aimed to be a platform while remaining rooted in its identity as a device manufacturer. Despite GE's professed openness, Predix primarily served GE's own equipment and failed to build an open, scalable, self-evolving ecosystem for third parties. It attempted to standardize an extremely complex and diverse industrial world from the top down, overlooking the diversity and complexity of developers, customers, and scenarios.
Predix was primarily a "self-use-oriented" digital engineering project, rather than a technology ecosystem embodying true platform attributes. Conversely, while NVIDIA's Physical AI strategy also harbors platform ambitions, its approach is centered on "developer-first + tool chain first." NVIDIA clearly states that its goal is not to create an ultimate solution but to offer a comprehensive suite of tools and computing infrastructure, ranging from Omniverse's digital twin modeling to Cosmos's synthetic data links, and extending to DGX and Jetson's training and deployment platforms. It empowers the entire Physical AI industry chain, refraining from monopolizing any single value chain.
More importantly, NVIDIA has emerged in a more favorable era. When GE was promoting Predix, cloud computing was still in its nascent stages, AI algorithms were just beginning to delve into deep learning, simulation technology was immature, and data closed loops were difficult to establish. Today, however, AI large models, generative simulations, hardware acceleration, data flywheels, and open-source ecosystems have collectively laid the groundwork for a seamless process from "data generation → model training → virtual-real mapping → entity deployment."
Physical AI is not a solitary technological leap but a systematic iteration standing at the convergence of multiple mature technologies. Naturally, NVIDIA is not immune to risks. Every platform builder confronts the same challenge: how to foster a self-sustaining ecosystem without relying on external push; how to maintain developer engagement beyond short-term migrations; and how to effectively abstract and customize the myriad differences in industrial scenarios on a unified technological foundation. Unlike GE's closed, single-point, asset-heavy approach, NVIDIA's platform architecture boasts the inherent advantages of openness, modularity, and lightweight deployment. Instead of aiming to "replace" industrial systems, it "embeds" within industrial processes; rather than adopting a top-down approach, it fosters "ecological penetration" through development tools and underlying chips; and instead of emphasizing "platform lock-in," it prioritizes "tool empowerment."
This fundamental difference underpins their divergent fates. GE, a pioneer of the industrial internet, faltered on the island of a "closed platform." Meanwhile, NVIDIA, with its open ecosystem and system closed loop, is steering Physical AI to become a bridge connecting the virtual and real worlds. Regardless of whether it's termed "Physical AI" or something else, it is evident that future AI will transcend screens, infiltrating factories, warehouses, cities, and hospitals. Those who control intelligent agents in the physical world will wield the dominant power of future computing platforms.
Tools, Platforms, and Ecosystems: Who Truly Owns the Future in Physical AI?
The success or failure of a platform hinges on people and the ecosystem. Therefore, in this race for Physical AI, how should other industry chain participants position themselves? And how can they avoid becoming the next "GE"? These questions will be the focus of our discussion in this section.
In the "physical AI operating system" that NVIDIA is constructing, every node enterprise confronts fresh strategic choices. Whether it's a robotics startup, an industrial automation integrator, or a traditional manufacturing giant, they all sense the shift in this technological paradigm. However, being at the forefront of a trend doesn't guarantee success; being swept up in the wave doesn't equate to having direction. If the first two parts delved into the technical logic and platform landscape of physical AI, the question now is: In this seemingly inevitable systemic evolution, what strategies should industry chain participants adopt? And what pitfalls should they avoid?
I. The Biggest Pitfall of Physical AI: Blindly Following the Trend
A perilous trend in the current market is treating physical AI as a marketing gimmick rather than a technical system to be built. Some enterprises rush to adopt the label and seize the moment, only to superficially integrate with NVIDIA's tool chain, hastily launch a simulation interface, and release an AI demonstration video, with no underlying capabilities, data closed loops, or system architecture behind them. In essence, they use demonstrations to circumvent implementation and concepts to conceal structural flaws. The greater risk lies in the fact that physical AI isn't about "module stacking" but "paradigm shifting." It requires enterprises to reevaluate their entire systems engineering, encompassing R&D processes, data management, model training, deployment architectures, and hardware-software coordination. Any company attempting to ride the wave with superficial integration will ultimately expose systemic deficiencies during implementation. In this sense, physical AI isn't a technological sprint but a competition in sustained systems capability. It is closer to a major industrial revolution than a minor product upgrade.
II. True Participation Begins with "Endogenous Capability Building"
For developers and robotics enterprises, the core strategy revolves around endogenous capability building. This implies not merely "using NVIDIA's tools" but understanding the underlying logic these tools embody and constructing unique capability closed loops within them. For instance, developers should focus on building model training architectures with transferability and generalization capabilities, rather than optimizing for isolated scenarios; robotics companies should establish autonomous data collection and learning loops across perception, control, and motion planning, rather than relying on static modeling; and automation solution providers should shift from a "project-based" mindset to a reusable structure of "productization + platformization." Furthermore, enterprises should prioritize building "virtual-real fusion" capabilities within their systems: capable of simulation, training, deployment, and feedback. Once this capability is established, it holds the potential for "self-evolution," avoiding platform dependency and tool lock-in.
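One concrete route to the transferability and generalization called for above is domain randomization: training and evaluating across many simulated variations of the physical world rather than one fixed setup. The sketch below is a hypothetical illustration with toy physics, not any real robotics or simulator API; the functions `randomized_world` and `required_force` are invented for the example.

```python
import random

random.seed(42)

def randomized_world():
    """One simulated environment with randomized physics parameters
    (domain randomization): each sample varies the conditions a policy
    must cope with instead of tuning to a single fixed scenario."""
    return {
        "friction": random.uniform(0.2, 1.0),     # surface friction coefficient
        "mass": random.uniform(0.5, 2.0),         # payload mass (kg)
        "sensor_noise": random.uniform(0.0, 0.05) # observation noise level
    }

def required_force(world, accel=1.0):
    """Toy physics: force needed to push the payload at a target acceleration,
    against friction (F = m*a + mu*m*g)."""
    g = 9.81
    return world["mass"] * accel + world["friction"] * world["mass"] * g

# Sampling many randomized worlds exposes the spread of conditions a
# transferable controller must handle, rather than one optimized point.
worlds = [randomized_world() for _ in range(1000)]
forces = [required_force(w) for w in worlds]
print(f"force range the policy must handle: {min(forces):.2f}-{max(forces):.2f} N")
```

The design point is the contrast the text draws: a project-based solution tunes to one world, while a reusable, productized capability is validated against a distribution of worlds.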
III. The Key to Building a Moat: Data, Scenarios, and Feedback Loops
In the era of physical AI, the genuinely scarce resources are no longer models or computing power but high-quality, structured, training-ready data from the physical world. Whoever controls real interactive data controls the fuel for model evolution; whoever establishes scenario-level data closed loops possesses an irreplaceable feedback mechanism. Industry chain enterprises should therefore ask: Can my system continuously generate training data? Does my scenario offer value that transfers to other settings? Can my feedback loop contribute to model optimization? This requires enterprises to move beyond merely "accessing platforms" and become nodes in the platform ecosystem that can produce, feed back, and accumulate. Otherwise, for all their surface-level innovation, they will inevitably be reduced to mere "data suppliers" or "capability intermediaries" for the platform.
Closing Thoughts
From "Industrial Internet" to "Physical AI," from Predix to Omniverse, the evolution of technological history has never been devoid of grand narratives. Whenever a paradigm shift arrives, it's invariably accompanied by capital and media hype, blind corporate follow-ups, and utopian technological visions. When the bubble bursts, what truly endures are those who maintain composure amidst the frenzy and continue to accumulate during the cooling-off period. Physical AI indeed represents a leap in technological paradigms.
It not only redefines the development of AI but also challenges our fundamental understanding of how "intelligent systems" engage with the real world. It converges multiple cutting-edge technologies, including embodied intelligence, spatial understanding, multimodal learning, simulation training, and system deployment, offering new possibilities in fields such as robotics, autonomous driving, and smart manufacturing. However, due to its complexity, physical AI is unlikely to be a short-term trend but a systemic project spanning over a decade. It necessitates coordinated evolution across multiple levels, including models, data, computing power, interfaces, standards, and ecosystems, and the maturity of each link isn't achieved overnight.
Taking large models as an example, it took six years from the publication of the Transformer paper in 2017 to the widespread application of GPT-4; the system complexity involved in physical AI is far greater than that of language models, and its development cycle will only be longer. Therefore, for the industry, what truly matters isn't "whether to enter the field" but with what posture, rhythm, and structural capabilities. Blindly chasing platform narratives will only result in becoming marginal nodes in the giant's ecosystem; hastily launching "physical AI products" often leads to unsustainable experimental quagmires devoid of data closed loops, computing power accumulation, or system understanding.
We need to lower short-term expectations and embrace more patient long-term preparations. Abandon the illusion of "rapid disruption" and adopt a strategy of "gradual evolution." Invest resources in foundational data, virtual-real fusion capabilities, model training closed loops, and developer ecosystems, and accumulate scenario understanding and system integration capabilities from a five- or even ten-year perspective. Only then can physical AI evolve from today's technological concept to tomorrow's industrial reality.