05/22 2026
As "beating expectations" becomes the new normal, the market eagerly seeks the next catalyst to fuel investor sentiment.
On the morning of May 21 (Beijing Time), NVIDIA released its Q1 FY2027 financial results, once again surpassing market forecasts amid sustained strong demand for AI infrastructure and Blackwell systems.
Revenue surged 85% year-on-year to $81.6 billion, with GAAP net profit soaring 211% to $58.3 billion. Gross margin held steady at 74.9%. That profit works out to roughly $650 million per day, or about $27 million per hour.
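The per-day and per-hour figures follow from the quarterly profit by simple division. A minimal check, assuming a 13-week (91-day) fiscal quarter, which the report as quoted here does not state:

```python
# Back-of-the-envelope check of the daily and hourly profit figures.
# Assumption: the fiscal quarter spans 13 weeks (91 days).
QUARTERLY_NET_PROFIT_USD = 58.3e9  # GAAP net profit quoted in the article
DAYS_IN_QUARTER = 91               # assumed, not stated in the article


def per_day(quarterly_profit: float, days: int = DAYS_IN_QUARTER) -> float:
    """Average profit per calendar day."""
    return quarterly_profit / days


def per_hour(quarterly_profit: float, days: int = DAYS_IN_QUARTER) -> float:
    """Average profit per hour, counted around the clock."""
    return per_day(quarterly_profit, days) / 24


daily = per_day(QUARTERLY_NET_PROFIT_USD)
hourly = per_hour(QUARTERLY_NET_PROFIT_USD)
print(f"per day:  ${daily / 1e6:,.0f} million")
print(f"per hour: ${hourly / 1e6:,.1f} million")
```

Under that assumption the result lands slightly below the rounded $650 million and $27 million quoted above; a 90-day quarter moves it a little closer.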
NVIDIA CEO Jensen Huang emphasized during the earnings call: "The pace of AI factory construction is accelerating at an unprecedented rate. This represents the largest-scale infrastructure expansion in human history."
However, the market reaction reveals a more nuanced reality: for the global AI leader, robust growth alone no longer guarantees a rising share price. NVIDIA’s results remained strong, but investors are increasingly scrutinizing whether the margin by which it beats already lofty expectations can be sustained, and whether that margin justifies current valuations after significant prior gains.
The market is now asking: What is NVIDIA’s growth ceiling? Can this titan, already commanding over 90% of the AI computing market, unlock new avenues for expansion?

Over the past two years, capital markets have centered their AI narrative on large-scale model training and the high-stakes race for premium GPU computing power.
A deeply entrenched perception has taken hold: AI equals GPUs, with computational might directly correlating to value. During the era of large model training, the industry prioritized extreme floating-point performance and massive parameter stacking—areas where GPUs excelled.
NVIDIA, leveraging its unparalleled ecosystem and hardware advantages, dominated this phase. However, entering 2026, research from leading investment banks and shifting capital flows signal a paradigm shift: The era of pure GPU training dividends has peaked, and the AI industry is entering its next chapter.
The new growth narrative revolves around three core pillars: the reevaluation of CPU computing power, the commercial explosion of AI inference, and the widespread deployment of intelligent agents.
As the industry moves from the resource-intensive model training phase to an era of cost optimization, efficiency gains, and monetization, AI’s computing architecture, value distribution, and business models are undergoing a comprehensive transformation. A market once defined by standalone GPU performance now emphasizes shared computing resources, scenario-specific deployments, and value realization, and Wall Street is eagerly investing in that shift.
HSBC’s Frank Lee noted that GPU-centric momentum has become a "relatively less compelling" investment thesis, as cloud providers’ AI capital expenditures diversify into memory, networking, and server CPUs.
NVIDIA is responding by charting a new growth trajectory. As AI evolves from training to inference and intelligent agent deployment, CPUs gain strategic importance. Huang highlighted the potential for billions of AI agents, driving demand for inference hardware: "Thinking happens on GPUs, while orchestration primarily runs on CPUs."

NVIDIA CFO Colette Kress revealed that the Vera CPU has unlocked a $200 billion addressable market for the company. With nearly $20 billion in CPU revenue already visible for the year, NVIDIA is positioning itself as a global CPU leader.
From a technical standpoint, CPUs offer three irreplaceable advantages in the AI landscape:
1. Complex Logic Processing: Intelligent agents execute tasks autonomously, which requires large numbers of conditional judgments, scheduled loops, and branches. CPUs handle this kind of branch-heavy, sequential control flow better than GPUs, whose architecture is built for massively parallel arithmetic.
2. System-Level Orchestration: Agent workflows involve multi-model collaboration, external tool integration, and cross-platform system management. CPUs serve as the central hub for this complex orchestration.
3. Low-Latency Responsiveness: Financial trading, autonomous driving, and real-time enterprise applications demand millisecond-level responses. CPUs’ cache architecture and instruction set optimizations make them ideal for these high-stakes scenarios.
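The division of labor described in the list above can be illustrated with a toy agent loop. This is a minimal sketch, not any vendor's implementation: `stub_model`, the `TOOLS` table, and `run_agent` are hypothetical names, and the model call that would run on a GPU is replaced by a fixed rule. Everything else is the branching, looping, and tool dispatch that falls to the CPU.

```python
from typing import Callable, Dict, List, Tuple


def stub_model(goal: str, history: List[str]) -> Tuple[str, str]:
    """Hypothetical stand-in for a GPU-hosted model: picks the next action.

    Returns (tool_name, argument); the tool name 'finish' ends the run.
    """
    if not history:
        return ("search", goal)
    if len(history) == 1:
        return ("summarize", history[-1])
    return ("finish", history[-1])


# Hypothetical tools; in practice these would be API calls, database queries, etc.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for '{query}'",
    "summarize": lambda text: f"summary of [{text}]",
}


def run_agent(goal: str, max_steps: int = 5) -> List[str]:
    """CPU-side orchestration: loop, branch on the model's choice, call tools."""
    history: List[str] = []
    for _ in range(max_steps):                 # loop scheduling
        tool, arg = stub_model(goal, history)  # the "thinking" step
        if tool == "finish":                   # branch: the task is complete
            break
        if tool not in TOOLS:                  # branch: unknown tool requested
            history.append(f"error: unknown tool {tool}")
            continue
        history.append(TOOLS[tool](arg))       # external tool integration
    return history


trace = run_agent("server CPU demand")
print(trace)
```

Only the call to `stub_model` stands in for accelerator work; the rest of the loop is ordinary control flow, which is the sense in which orchestration "runs on CPUs."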
These technological strengths are driving incremental CPU market growth. Morgan Stanley projects that with the AI agent boom, infrastructure will shift from GPU-centric to a "CPU + memory + system collaboration" model. By 2030, the global server CPU market could exceed $100 billion—double its 2025 size—with a 35% CAGR, making it the fastest-growing AI computing segment.
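The doubling and the 35% growth rate quoted in this projection can be compared using the standard compound-growth formula. The sketch below assumes a five-year horizon (2025 to 2030); the function names are illustrative.

```python
def implied_cagr(growth_multiple: float, years: int) -> float:
    """Annual growth rate that produces the given multiple over `years`."""
    return growth_multiple ** (1 / years) - 1


def compounded_multiple(cagr: float, years: int) -> float:
    """Total growth multiple produced by a constant annual rate."""
    return (1 + cagr) ** years


YEARS = 5  # assumed horizon: 2025 to 2030

doubling_rate = implied_cagr(2.0, YEARS)
multiple_at_35 = compounded_multiple(0.35, YEARS)

print(f"CAGR implied by doubling in {YEARS} years: {doubling_rate:.1%}")
print(f"Multiple implied by 35% CAGR over {YEARS} years: {multiple_at_35:.2f}x")
```

A doubling over five years corresponds to roughly 15% a year, whereas 35% a year would compound to about 4.5 times the starting size. The two figures therefore probably refer to different bases, for example the whole server CPU market versus an AI-related slice of it, though the projection as quoted does not say which.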
While CPUs reshape AI’s foundational architecture, the commercial explosion of AI inference forms the bedrock of this industrial shift. As large models mature and deployment scenarios proliferate, inference demand has skyrocketed, transforming from a cost center to a profit engine.
The most visible change lies in market demand. IDC forecasts 2.216 billion active AI agents globally by 2030, with annual Token consumption surging from 0.0005 Peta Tokens in 2025 to 152,000 Peta Tokens—a 300 million-fold increase. The computing power "ceiling" is nowhere in sight.
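The growth multiple in the IDC forecast can be reproduced directly from the two consumption figures. The per-agent number in the sketch below is a derived illustration only: it assumes consumption is spread evenly across all forecast agents, which the forecast does not claim.

```python
PETA = 10**15

tokens_2025 = 5 * 10**11             # 0.0005 peta tokens, as quoted
tokens_2030 = 152_000 * PETA         # 152,000 peta tokens, as quoted
active_agents_2030 = 2_216_000_000   # 2.216 billion agents, as quoted

# Growth multiple between the two years.
growth_factor = tokens_2030 / tokens_2025

# Illustrative only: assumes tokens are spread evenly over agents and days.
tokens_per_agent_per_day = tokens_2030 / active_agents_2030 / 365

print(f"growth factor: {growth_factor:,.0f}x")
print(f"tokens per agent per day: {tokens_per_agent_per_day:,.0f}")
```

The ratio comes to about 304 million, consistent with the "300 million-fold" figure, and the even-split assumption implies on the order of a few hundred million tokens per agent per day.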
This demand surge reflects AI’s deepening penetration. Over 40% of enterprise digital systems now embed task-oriented intelligent agents, while smartphones, automobiles, and smart home devices deploy lightweight inference models, accelerating edge intelligence adoption. Scenarios like AI-assisted programming, intelligent offices, and data-driven decision-making have become industry standards, driving inference computing demand across sectors.
Historically, AI companies focused on model training and high-end GPU procurement. Today, inference accounts for over 70% of costs, becoming the primary expenditure for enterprise AI commercialization. The competitive logic has shifted: Success now hinges on deploying inference at lower costs and higher efficiency, making inference capabilities the key to AI profitability.
Mature business models are further unlocking inference’s commercial potential. The traditional power-leasing model is evolving into a diversified profit system. Token-based billing, which aligns pricing with inference complexity, has become mainstream. Subscription services for office and enterprise digital agents are gaining traction, offering stable recurring revenue. Cloud-based inference solutions provide integrated computing, optimization, and maintenance services, cementing inference as AI’s most stable and lucrative revenue stream.
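Token-based billing is easy to express in code. The sketch below is illustrative only: the per-million-token prices and the token counts are hypothetical placeholders, not any provider's actual rates, and `TokenPrice` and `request_cost` are names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Price per one million tokens, in US dollars (hypothetical values)."""
    input_per_million: float
    output_per_million: float


def request_cost(price: TokenPrice, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: tokens consumed times the per-token rate."""
    total = (input_tokens * price.input_per_million
             + output_tokens * price.output_per_million)
    return total / 1_000_000


PRICE = TokenPrice(input_per_million=1.0, output_per_million=4.0)  # hypothetical

# A short chat reply versus a multi-step agent task that reads and writes far more.
simple_cost = request_cost(PRICE, input_tokens=500, output_tokens=200)
agentic_cost = request_cost(PRICE, input_tokens=20_000, output_tokens=8_000)

print(f"simple request:  ${simple_cost:.4f}")
print(f"agentic request: ${agentic_cost:.4f}")
```

Because a complex agent task consumes many more tokens than a single reply, its bill scales accordingly, which is the sense in which this pricing tracks inference complexity.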
2026 is widely regarded as the "first year of AI agent commercialization." After years of technological refinement, AI agents have shed their experimental labels, evolving from lab concepts into "digital employees" capable of autonomous complex task execution. They now represent the core deployment vehicle for AI’s next phase and the focal point of Wall Street’s new narrative.
Goldman Sachs predicts that within two years, over 50% of U.S. enterprises will adopt intelligent agents, making them standard tools for office operations, production, and management.
Technological breakthroughs are accelerating agent deployment. The industry has moved from single-agent systems to multi-agent collaboration, with standardized interaction protocols enabling cross-platform communication. This allows agents to divide roles, collaborate autonomously, and link tasks seamlessly.
Simultaneously, embodied intelligence technologies are maturing. Physical agents equipped with multi-dimensional sensors can navigate complex industrial environments, performing high-risk tasks like inspections and maintenance. This bridges the gap between virtual and physical AI applications, driving industrial digital transformation.
From Computing Power Platform to Token Factory
NVIDIA’s $200 billion CPU strategy aims to demonstrate to Wall Street the sustainability of its high-growth narrative amid the intelligent agent boom.
AI is no longer merely a productivity enhancer but a necessity across industries. This drives sustained investment in energy, chips, infrastructure, models, and applications. NVIDIA projects AI infrastructure spending could reach $3–4 trillion annually by 2030.
Management highlighted during the call that AI infrastructure demand is expanding at an unprecedented pace, with AI factory construction accelerating. Two drivers fuel this growth: First, hyperscale cloud providers are shifting core workloads from CPUs to GPU-accelerated computing, spanning search, advertising, recommendation systems, and content understanding. Second, AI-native products are reaching an inflection point, evolving from one-time inference to logical reasoning and intelligent agent capabilities.
Investors now focus on whether the next-gen AI architecture, codenamed "Vera Rubin," can enter mass production as scheduled in late 2026. Goldman Sachs reiterated its "buy" rating on NVIDIA, citing the Vera Rubin timeline as a key valuation catalyst. The market has fully priced in the growth limits of the current Blackwell architecture, requiring next-gen products to drive incremental gains.
As NVIDIA’s AI superchip platform, Vera Rubin—named after dark matter pioneer Vera Rubin—targets high-performance computing (HPC) and large-scale AI training. It aims to bridge the computing gap between the H100 series and future hyperscale models, solidifying NVIDIA’s data center dominance. Initial customers include North American cloud leaders like Amazon AWS and Microsoft Azure.
Global AI is entering the "quadrillion Token era." In March, China’s daily Token invocations exceeded 140 trillion, with global annualized inference usage surpassing 1,000 trillion Tokens—marking AI’s transition from an interactive tool to a continuously operating intelligent infrastructure. The Vera Rubin platform is designed not as a general-purpose computing solution but as the Token production backbone for the intelligent agent era.

The Vera Rubin platform comprises seven chips: the Vera CPU (marking NVIDIA’s entry into the server CPU market), the Rubin GPU (flagship product), NVLink 6 (sixth-gen interconnect switch), ConnectX-9 SuperNIC (network interface card), BlueField-4 DPU (data processing unit that offloads networking, storage, and security tasks), Spectrum-6 (Ethernet switch supporting CPO technology), and Groq 3 LPU (post-integration chip). It succeeds the Blackwell architecture as a cross-generational solution.
Notably, the Rubin GPU delivers five times the inference performance of its predecessor, reduces large MoE model training GPU requirements by 75%, and cuts per-Token inference costs to one-tenth, making Tokens as accessible as utilities like water and electricity.
From a technical standpoint, Vera Rubin leverages TSMC's advanced packaging technology (likely CoWoS-L or a more advanced iteration), integrating eight computing cores and four sets of HBM3e memory into a single package. This configuration offers a total memory capacity of 192GB and bandwidth exceeding 2TB/s, a 25% increase over the H100's 1.6TB/s. The architecture is optimized for the attention mechanism in Transformer models and supports mixed-precision computing in FP8, FP16, and BF16 formats. Single-chip FP8 peak computing power reaches 512 TFLOPS, about 30% above the H100's 395 TFLOPS. The chip also keeps typical power consumption below 380W through dynamic voltage regulation, improving energy efficiency by 20% and aligning with the green data center trend.
For customers, Vera Rubin offers the potential to slash large model training time by over 30% and reduce training costs by approximately 25%, enabling cloud providers to swiftly deploy next-generation AI services. It also preserves NVIDIA's CUDA ecosystem, allowing existing AI frameworks like TensorFlow and PyTorch to adapt seamlessly without extensive modifications, thereby minimizing migration costs for customers.
Unlocking the Trillion-Dollar Market for Physical AI
Beyond AI computing power infrastructure, the realm of physical AI presents a vast and imaginative landscape—where billions of autonomous robotic systems will operate in the physical world, encompassing industrial robots, service robots, autonomous vehicles, drones, and more.
In this quarter's earnings report, NVIDIA not only highlighted the staggering growth of "AI factories" but also formally moved "physical AI" to center stage through a revamped reporting framework. With the independent disclosure of the "edge computing" segment and a series of high-profile technical releases targeting autonomous driving and embodied intelligence, NVIDIA is bridging the "last mile" for AI to penetrate the physical world.
NVIDIA explicitly mentioned in the earnings report the launch of the NVIDIA Alpamayo 1.5 open model and NVIDIA Omniverse NuRec technology to bolster the development of large-scale autonomous driving systems. This move directly addresses the industry's problem that real-world testing data is non-interactive and difficult to reuse. Through neural rendering technology, NVIDIA is transforming static data into interactive "live scenes," providing robust simulation and closed-loop data capabilities for the mass production of Level 4 autonomous vehicles.
On the autonomous driving front, global tech giants and automakers are accelerating their breakthroughs, forming a tight industrial symbiosis with NVIDIA's underlying computing power. At the GTC 2026 conference, NVIDIA announced collaborations with leading automakers, including BYD, Geely, Nissan, and Isuzu, to jointly develop L4 autonomous vehicles based on the NVIDIA DRIVE Hyperion platform. It also introduced NVIDIA Halos OS, a unified safety architecture for AI-driven vehicles.
Simultaneously, NVIDIA expanded its partnership with Uber to launch an autonomous fleet equipped with full-stack NVIDIA DRIVE AV software. Autonomous driving companies such as Pony.ai, WeRide, and Mogo Auto continue to roll out robotaxi and robobus (autonomous bus) services around the world.

For embodied AI, NVIDIA unveiled the NVIDIA Isaac GR00T N model and a new Isaac simulation framework this quarter, aiming to provide a robust physical AI computing foundation. GR00T N adopts a dual-system architecture of "fast thinking + slow thinking," akin to human cognition, separating high-level reasoning from low-level motion control. The slow-thinking reasoning layer, based on vision-language models (VLMs) like Cosmos-Reason-2B, processes image and language instructions and performs scene understanding, task decomposition, and multi-step planning (e.g., breaking down "assemble parts" into "grab," "align," and "join"). The fast-thinking action layer, based on diffusion Transformer (DiT) or Flow Matching architectures, receives high-level tokens from the reasoning layer together with the robot's body state, and generates smooth, precise continuous action vectors through a denoising process to directly control robot joints.
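The dual-system split can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not GR00T's actual code or API. The lookup table `PLANS` stands in for the vision-language reasoning layer, `SUBTASK_TARGETS` holds made-up joint offsets, and the loop in `fast_act` replaces a learned denoising network with a hand-written rule that moves a noisy action toward its target over a fixed number of steps.

```python
import random
from typing import Dict, List

# Hypothetical lookup standing in for the VLM-based reasoning layer.
PLANS: Dict[str, List[str]] = {"assemble parts": ["grab", "align", "join"]}

# Made-up joint offsets per subtask (radians); a real system learns these.
SUBTASK_TARGETS: Dict[str, List[float]] = {
    "grab": [0.5, -0.2, 0.1],
    "align": [0.0, 0.3, -0.1],
    "join": [-0.2, 0.0, 0.4],
}


def slow_plan(instruction: str) -> List[str]:
    """Slow system: decompose an instruction into ordered subtasks."""
    return PLANS.get(instruction, [])


def fast_act(subtask: str, joint_state: List[float],
             steps: int = 10, seed: int = 0) -> List[float]:
    """Fast system: start from noise and refine it into an action vector."""
    rng = random.Random(seed)
    target = [s + d for s, d in zip(joint_state, SUBTASK_TARGETS[subtask])]
    action = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise at the start
    for i in range(steps):
        remaining = steps - i
        # Stand-in for the learned denoiser: close 1/remaining of the gap,
        # so the final step lands on the target.
        action = [a + (t - a) / remaining for a, t in zip(action, target)]
    return action


def run(instruction: str, joint_state: List[float]) -> List[float]:
    """Plan once with the slow system, then act per subtask with the fast one."""
    for subtask in slow_plan(instruction):
        # Simplification: the action is treated as the new joint state.
        joint_state = fast_act(subtask, joint_state)
        print(subtask, [round(j, 3) for j in joint_state])
    return joint_state


final_state = run("assemble parts", [0.0, 0.0, 0.0])
```

The point of the split is that planning runs once per instruction while the action loop runs many times per subtask, so the two layers can operate at very different rates.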

With the full-scale launch of the NVIDIA IGX Thor platform, NVIDIA is expediting the development of robots with enhanced environmental perception and interaction capabilities, propelling embodied AI from laboratories to real industrial and commercial settings. NVIDIA IGX Thor is specifically designed for deploying real-time physical AI directly at the edge. The platform integrates high-speed sensor processing, enterprise-grade reliability, and functional safety into a compact, desktop-sized module, enabling developers to construct intelligent systems that perceive, reason, and act swiftly, safely, and intelligently.
Notably, NVIDIA disclosed "edge computing" as an independent segment for the first time. The earnings report explicitly stated that this segment encompasses data processing devices for agentic AI and physical AI, including PCs, gaming consoles, workstations, AI-RAN base stations, robots, and automobiles, directly affirming that AI technology is accelerating its transition from the virtual digital world to the real physical world.
For investors and industry practitioners, clinging to the outdated GPU narrative no longer aligns with the current industrial pace. Rather than fixating on NVIDIA's current lofty valuation, it is wiser to focus on its latest advancements in areas such as CPUs, inference, and intelligent agents. After all, as the door to the most certain and vast trillion-dollar market slowly creaks open, earning $650 million a day may seem astonishing, but in the grand scheme of the AI revolution, this is merely the prologue.