Competing in CPUs, Vying for PCs! Jensen Huang Rocks Taipei, Intel and AMD Should Be Worried

06/05 2026

NVIDIA's release of the personal computer superchip RTX Spark deals a major blow to the PC market.

Just now, at NVIDIA's GTC Taipei 2026, Jensen Huang made his iconic entrance in his signature leather jacket once again.

His opening statement set the tone: 'Two years ago when I was here, I started talking to you about the next wave of AI. Today, I can tell you that Agentic AI has arrived. That useful AI has arrived.'

At NVIDIA's GTC Taipei 2026, Jensen Huang highlighted six key points:

First, Token economics: Tokens are now the unit of profitability. Cheap chips don't necessarily mean profit, and expensive chips don't necessarily mean loss.

Second, the five core components of Agent architecture: Model, Harness, Tools, Skills, and Runtime.

Third, Vera Rubin is now in full production, with shipments starting in autumn.

Fourth, the release of the Vera CPU for the Agent era; compared to x86 CPUs, it completes tasks 1.8 times faster.

Fifth, the release of the personal computer superchip RTX Spark. Huang stated, 'All the essence of what we've learned in 30 years is condensed into this single chip.'

Sixth, chip design has entered the Agent era, collaborating with Cadence, Siemens, Synopsys, and others to build autonomous AI engineers.

Token has now become the hottest term among all tech professionals in Silicon Valley, Taiwan, and Shenzhen. Huang said, 'Tokens are now the unit of profitability. Every token is revenue. AI companies want to build more tokens, build more AI factories.'

A 1-gigawatt AI factory project starts at $20-30 billion. It will soon reach $60 billion, $80 billion. $10 billion per gigawatt. Global tech giants are frantically building AI infrastructure, and Taiwan's computer manufacturers are extremely busy lately. Huang addressed the supply chain on-site, saying, 'You're all so busy. Taiwanese companies are doing exceptionally well.' Behind this statement lies a celebration of the entire semiconductor supply chain.

This is Token economics. In the traditional IT era, buying servers was a cost, and computation was a consumption. In the AI era, buying GPUs is an investment, and computation is revenue. Huang drew a clear line: Cheap chips don't necessarily mean profit, and expensive chips don't necessarily mean loss. The cost of choosing the wrong architecture has never been higher. If your AI factory's throughput per watt is not high enough, the more you buy, the more you lose. If the throughput per watt is high enough, the more you buy, the more you earn.
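The throughput-per-watt argument can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the token price, the electricity price, the amortized hardware costs, and the tokens-per-joule figures are all hypothetical values chosen for the example, not numbers from the keynote. It assumes the factory is limited by its power budget, so throughput per watt (tokens per second per watt, which equals tokens per joule) determines how many tokens it can sell.

```python
def annual_profit_usd(power_watts, tokens_per_joule, usd_per_million_tokens,
                      capex_usd_per_year, usd_per_kwh=0.08):
    """Annual profit of a power-limited AI factory (all inputs hypothetical)."""
    seconds_per_year = 365 * 24 * 3600
    # Tokens per second = watts * tokens per joule, because 1 W = 1 J/s.
    tokens_per_year = power_watts * tokens_per_joule * seconds_per_year
    revenue = tokens_per_year / 1e6 * usd_per_million_tokens
    # Energy bill for running at full power all year.
    energy_cost = power_watts / 1000 * 365 * 24 * usd_per_kwh
    return revenue - energy_cost - capex_usd_per_year


ONE_GIGAWATT = 1e9

# Hypothetical "cheap" architecture: lower yearly hardware cost, low throughput per watt.
cheap = annual_profit_usd(ONE_GIGAWATT, tokens_per_joule=0.5,
                          usd_per_million_tokens=0.5, capex_usd_per_year=8e9)

# Hypothetical "expensive" architecture: higher yearly hardware cost, 4x throughput per watt.
expensive = annual_profit_usd(ONE_GIGAWATT, tokens_per_joule=2.0,
                              usd_per_million_tokens=0.5, capex_usd_per_year=12e9)

print(f"cheap chips:     {cheap / 1e9:+.2f} billion USD per year")
print(f"expensive chips: {expensive / 1e9:+.2f} billion USD per year")
```

With these made-up inputs the cheaper hardware loses money while the more expensive hardware earns it. Because every term scales with the size of the build-out, buying more of either one simply multiplies the loss or the profit.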

Two years ago, Huang said the next wave would be Agentic AI. Today, he stated, 'Agentic AI has arrived. Useful AI has arrived.'

Huang presented a set of data: GitHub submissions surged from 300 million in 2023 to 500 million in 2026, roughly a 1.7-fold increase in three years. The world's 30 million software developers earn $3 trillion in salaries and create $9 trillion in productivity.

Huang refuted claims that AI would cause unemployment: 'Some say AI will make programmers unemployed. That's nonsense. The number of engineers is increasing. Since each engineer can produce three times as much, companies naturally want to hire more.' The value of AI lies not in replacement but in amplification. It multiplies the output capacity of every developer and every enterprise. When each software engineer can create three times the value, companies have no reason to reduce hiring; instead, they will expand. This is the future Huang sees: A productivity revolution is underway, and it's happening faster than anyone expected.

For the past forty years, the way computers work has never changed: Launch an application, click to input, and wait for results. The Agent era is completely different. Users only need to describe their intent, and AI automatically generates code or uses tools to produce the necessary output.

In traditional computing, software is a binary package running inside an operating system, subject to the OS's scheduling and constraints. Agent computing is heterogeneously distributed—models, harnesses, tools, skills, and runtimes are distributed across different locations in the data center, coordinated uniformly by the CPU.

Huang broke down the five core components of an Agent: 'This agent consists of model, harness, tools and skills, and a runtime.'

Model: Acts as the 'brain,' responsible for understanding, observing, reasoning, and planning. Large language models, integrating synchronous transformation capabilities, now excel at thinking tasks.

Harness: The 'operating system' that connects everything. During each context processing, it precisely routes information, understands what's happening, and coordinates all components to work together. The distinction between working memory and long-term memory becomes crucial here.

Tools: These can be spreadsheets, web browsers, data processing engines, database engines, C compilers, Python interpreters, JavaScript engines, or even accelerated computing libraries. Whenever an Agent uses a tool, the CPU is called to process these requests.

Skills: This is the breakthrough Huang particularly emphasized. Skills are essentially the user manuals for tools, which AI reads and says, 'This is how it's used.' All of NVIDIA's CUDA X libraries will now be equipped with AI-learnable skills. An Agent's ability to use these libraries will far exceed that of human programmers.

Runtime: The execution environment that coordinates all components. Security controls run on CPU and DPU security processors, monitoring the entire process. Memory management is the most challenging part: working memory, similar to KV caches, needs to handle compressed, retrieved, structured, and unstructured data.
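The five components above can be sketched as a toy program. This is a minimal illustration of how the pieces might relate, not NVIDIA's implementation: every class and function name here (`Tool`, `Skill`, `StubModel`, `Harness`, `Runtime`, `add_numbers`) is hypothetical, and the "model" is a keyword matcher standing in for a large language model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class Tool:
    """Something the agent can call, e.g. an interpreter or a database engine."""
    name: str
    run: Callable[[str], str]


@dataclass
class Skill:
    """The 'user manual' for a tool, written so the model can read it."""
    tool_name: str
    manual: str


class StubModel:
    """Stand-in for an LLM: chooses a tool by matching goal words to manuals."""

    def plan(self, goal: str, skills: List[Skill],
             working_memory: List[str]) -> Optional[Tuple[str, str]]:
        if working_memory:          # a result already exists, so stop
            return None
        words = goal.lower().split()
        for skill in skills:
            if any(word in skill.manual.lower().split() for word in words):
                return skill.tool_name, goal
        return None


@dataclass
class Harness:
    """Routes context between the model, tools, skills, and memory."""
    model: StubModel
    tools: Dict[str, Tool]
    skills: List[Skill]
    working_memory: List[str] = field(default_factory=list)
    long_term_memory: List[str] = field(default_factory=list)

    def step(self, goal: str) -> bool:
        """Run one plan/act cycle; return False when nothing is left to do."""
        action = self.model.plan(goal, self.skills, self.working_memory)
        if action is None:
            return False
        tool_name, tool_input = action
        result = self.tools[tool_name].run(tool_input)
        self.working_memory.append(f"{tool_name} -> {result}")
        return True


class Runtime:
    """Execution environment: enforces a tool allow-list and a step budget."""

    def __init__(self, harness: Harness, allowed_tools: Set[str], max_steps: int = 5):
        blocked = set(harness.tools) - allowed_tools
        if blocked:
            raise PermissionError(f"tools not allowed: {sorted(blocked)}")
        self.harness = harness
        self.max_steps = max_steps

    def run(self, goal: str) -> str:
        for _ in range(self.max_steps):
            if not self.harness.step(goal):
                break
        # Persist this task's working memory into long-term memory.
        self.harness.long_term_memory.extend(self.harness.working_memory)
        memory = self.harness.working_memory
        return memory[-1] if memory else ""


def add_numbers(text: str) -> str:
    """Toy tool: sums every whole number found in the text."""
    return str(sum(int(tok) for tok in text.split() if tok.isdigit()))


calculator = Tool("calculator", add_numbers)
harness = Harness(StubModel(), {"calculator": calculator},
                  [Skill("calculator", "Use this tool to add whole numbers.")])
result = Runtime(harness, allowed_tools={"calculator"}).run("add 2 and 3")
print(result)
```

The split mirrors the article's description: the model only plans, the harness moves information and tracks working versus long-term memory, and the runtime decides what is allowed to execute and for how long.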

Agent computing is distributed and heterogeneous. This poses significant technical challenges: When computation is decomposed, bandwidth between CPU cores, between CPU and storage devices, and between CPU and GPUs becomes a bottleneck. Data moving on and off the chip cannot incur chiplet penalties or cross chip boundaries. Cross-chip communication latency must be extremely low.

New Agent applications fundamentally differ from past applications in how they run. Past applications were constrained by the operating system, while Agents are constrained by the architecture itself—the distributed computing nature dictates that it must operate efficiently in heterogeneous environments.

It was this heterogeneous computing challenge that prompted NVIDIA to develop Vera Rubin.

Today, Huang announced that Vera Rubin is accelerating toward full production, with product shipments starting this autumn.

Vera Rubin is NVIDIA's largest POD-level platform to date—five dedicated racks form a massive AI supercomputer designed specifically for agent workloads. The platform integrates the Vera Rubin NVL72 system, Vera CPU, Groq 3 LPX, Vera BlueField-4 STX storage, and Spectrum-6 SPX Ethernet racks into a fully integrated system. Compared to the previous-generation NVIDIA Grace Blackwell platform, Vera Rubin's large-scale agent throughput has increased by 10 times.

Huang said, 'Vera Rubin was born for this moment—it's an AI factory engine capable of delivering intelligence at scale and possesses the performance, efficiency, and security needed to drive the next industrial revolution.'

Assembling a Grace Blackwell rack used to take two hours; with Vera Rubin it takes just five minutes. No cables, no hoses, no fans, just a single PCB connecting both sides. When Huang demonstrated this comparison, his tone couldn't hide his pride: 'Last time I showed you this, it took how long? We were surrounded by cables. But now there's a single PCB connecting both sides. What used to take two hours now takes just five minutes.'

This represents not just higher productivity but a qualitative leap in the speed of AI factory deployment. More importantly, reliability has improved—without cables, there's no risk of cable failures. Huang said, 'Rubin's reliability and resilience will be off the charts.'

Top system integrators, infrastructure software, and storage partners are in full production of Vera Rubin products, including Dell Technologies, HPE, Lenovo, and Supermicro, as well as Taiwanese manufacturing giants like AIC, Compal, Foxconn, Gigabyte, Inventec, Pegatron, Quanta Cloud Technology (QCT), Wistron, and Wiwynn.

The Vera Rubin platform introduces NVIDIA Spectrum-X Ethernet photonics technology, the world's first switch based on co-packaged optics (CPO) with 200Gb/s SerDes, now in production.

Meanwhile, the Vera Rubin platform adopts full-stack NVIDIA confidential computing technology to create a rack-level trusted execution environment. Vera Rubin NVL72 integrates Vera CPU, Rubin GPU, NVIDIA NVLink networking, and security features into a unified platform, encrypting data through high-speed interconnects. This provides hardware-level authentication to ensure system tamper-proofing.

The NVIDIA DSX platform provides a complete design and operational foundation for the Vera Rubin AI factory—unifying reference designs, simulations, infrastructure software, facilities, and ecosystem technologies to help build and operate energy-efficient AI factories, achieving the lowest Token costs.

Huang took time to thank Microsoft, Dell, and CoreWeave for already setting up Vera Rubin engineering racks. This means manufacturing partners are no longer just producing components; they're helping NVIDIA validate the entire system. Chips, cooling, networking, and storage are all interconnected. This is truly a one-stop delivery.

Another announcement in this speech was NVIDIA's first processor designed specifically for the AI Agent era: Vera CPU.

Huang posed a profound question: All past CPUs were designed for humans, who live in a world measured in seconds. Humans can wait, click to close pop-ups, and adapt to inconveniences. But Agents are different. Agents lack patience. They don't live in a world of seconds; they live in a world of nanoseconds. When an Agent uses a tool, it expects the response time to be as fast as possible. When it accesses a database, it must return quickly. Every moment an Agent waits prevents it from moving to the next step.

This is why a completely new CPU architecture is needed. Traditional CPU designs assume users can tolerate certain delays, but Agents have entirely different requirements.

In the Vera Rubin rack, the Vera CPU assumes three critical roles: First, orchestration and management. Vera CPU coordinates and manages GPU tools, manages KV caches, and handles all software running in the rack. In complex Agent workflows, these CPUs serve as the command center for the entire system. Second, security and isolation. Through Vera BlueField, the CPU handles security and isolation functions, ensuring different workloads do not interfere with each other. Third, harness and gateway. Vera CPU is used for AI model tool orchestration and database access.

Huang pointed out that the architectural design of Vera CPU revolves around four key characteristics: First, single-thread performance must be extreme; second, per-core bandwidth must be extreme; third, total bandwidth inside and outside the chip must be extreme; fourth, energy efficiency must be extreme.

Compared to x86 CPUs, the Vera CPU achieves a 1.8x faster task completion speed and can drive a wide range of workloads across industries, including Agentic AI, reinforcement learning, and data processing, thereby generating more data center token revenue. Huang also highlighted several key metrics: on-chip bandwidth of 3.6TB/s, no chiplet penalty, no chip boundary crossing; first to support PCIe 6.0; first to feature LPDDR5X with bandwidth of 1.2TB/s; and 88 Olympus cores.
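To give the per-core bandwidth requirement a rough scale, the quoted totals can be divided across the 88 cores. This is a back-of-the-envelope sketch: the even split is an assumption made here for illustration (the article does not state per-core figures, and a single core on real hardware may draw more or less than an equal share), and the function name is hypothetical.

```python
MEMORY_BANDWIDTH_TBPS = 1.2   # LPDDR5X bandwidth quoted in the article
ONCHIP_BANDWIDTH_TBPS = 3.6   # on-chip bandwidth quoted in the article
CORES = 88                    # Olympus cores quoted in the article


def per_core_gbps(total_tbps: float, cores: int) -> float:
    """Evenly divide a total bandwidth in TB/s across cores, returning GB/s."""
    return total_tbps * 1000 / cores   # 1 TB/s = 1000 GB/s


memory_share = per_core_gbps(MEMORY_BANDWIDTH_TBPS, CORES)
onchip_share = per_core_gbps(ONCHIP_BANDWIDTH_TBPS, CORES)

print(f"memory bandwidth per core:  {memory_share:.1f} GB/s")
print(f"on-chip bandwidth per core: {onchip_share:.1f} GB/s")
```

Under that even-split assumption, each core's share works out to about 13.6 GB/s of memory bandwidth and about 40.9 GB/s of on-chip bandwidth.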

Huang said, 'This is the first CPU in a very long time that truly pushes the limits.' Currently, cloud service providers such as ByteDance, CoreWeave, Lambda, Nebius, Nscale, and Oracle Cloud Infrastructure (OCI) all plan to deploy the Vera CPU. The Vera system will be available through system builders and cloud partners starting this fall.

Huang pointed out a fundamental trend: 'In the past, we built CPUs for humans. This marks the beginning of a new market, an unprecedented one. It won’t disrupt the old market; this is a new market: the CPU for agents. This market will undoubtedly be larger than the last. The reason is that the number of agents will far exceed the human population.'

The most significant announcement of the day, and the product with the strongest consumer electronics appeal, was RTX Spark.

Huang opened with a historical perspective: 'Forty years ago, Windows launched the PC era. Forty years later, Microsoft and NVIDIA will reshape the PC.'

Over the past forty years, the way PCs work has never changed—users launch applications, click the mouse, and type text. Now, an agent that understands you and assists you will directly take over your computer. You can talk to it, it can see you, and you can ask it to resubmit documents or conduct research on your behalf. The new operating system is the old operating system plus a large language model. In many ways, this is the modern-day equivalent of DirectX. It has input and output capabilities, understands prompts, and possesses computer vision understanding.

Huang said, 'The essence of everything we’ve learned over the past 30 years is encapsulated in this single chip.'

Key specifications of the RTX Spark: 6,144 CUDA cores; 1 petaflop of AI performance; connected to a high-performance 20-core Grace CPU via NVLink-C2C inter-chip interconnect technology; 128GB of unified memory; TSMC 3nm process; 70 billion transistors. NVIDIA collaborated with MediaTek to develop a custom CPU design, achieving best-in-class energy efficiency, performance, and connectivity.
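One way to read the 128GB unified memory figure is to estimate which model weights could fit in it. The sketch below is a rough estimate under stated assumptions: it counts weights only, the 20% reserve for the operating system, caches, and other applications is a hypothetical value, and the 70-billion-parameter model is an arbitrary example size. The 550-billion-parameter figure is the Nemotron 3 Ultra size given later in this article. Nothing here describes what RTX Spark is actually specified to run.

```python
def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    # 1 billion parameters at 8 bits each is 1 GB, so scale by bits / 8.
    return params_billion * bits_per_param / 8


def fits(params_billion: float, bits_per_param: int,
         memory_gb: float = 128.0, reserve_fraction: float = 0.2) -> bool:
    """True if the weights fit after holding back a (hypothetical) reserve."""
    usable_gb = memory_gb * (1 - reserve_fraction)
    return weights_gb(params_billion, bits_per_param) <= usable_gb


for params, bits in [(70, 16), (70, 8), (70, 4), (550, 4)]:
    size = weights_gb(params, bits)
    verdict = "fits" if fits(params, bits) else "does not fit"
    print(f"{params}B parameters at {bits}-bit: {size:.0f} GB of weights, {verdict}")
```

The point of the estimate is that precision matters as much as parameter count: under these assumptions the same 70-billion-parameter model overflows the memory at 16-bit precision but fits comfortably at 8-bit or 4-bit.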

The RTX Spark laptop features a full-size premium design, measuring just 14 millimeters in thickness and weighing only 3 pounds, available in 14- to 16-inch sizes. The precision-machined aluminum body offers durability and a sleek, modern aesthetic. Equipped with dual color-accurate OLED displays and NVIDIA G-SYNC technology, it delivers stunning visuals for creative work and immersive gaming.

Currently, major hardware manufacturers are joining the RTX Spark lineup, with leading makers including ASUS, Dell, HP, Lenovo, Microsoft Surface, and MSI launching products this fall, followed by models from Acer and Gigabyte. Huang excitedly announced, 'This is the first comprehensive overhaul of the PC product lineup in 40 years. I feel incredibly honored that 100% of the global PC industry has joined us to reshape the PC.'

Huang showcased the new roadmap. For each generation of architecture, NVIDIA will offer a desktop, a laptop, and a workstation. Huang said, 'We have a roadmap; this is an entirely new product series for us.'

Huang announced that Cadence and NVIDIA are collaborating to develop a chip design agent.

But this time, it’s not just a collaboration—it’s a real production system. Cadence uses NVIDIA OpenShell to secure its ChipStack AI super-agent—a fully autonomous AI engineer capable of performing chip design and verification. NVIDIA is the first customer to use ChipStack for autonomous verification of its chip designs.

Every chip begins with a series of architectural specifications, which are then translated into RTL (the language of chip design). RTL must be verified in simulation, and a single bug can delay the chip by months. At NVIDIA, thousands of engineers spend billions of compute hours and run millions of tests annually to write, run, and debug RTL, a cycle that takes teams weeks to complete.

Now, this process is being disrupted by agents. Companies like Cadence, Dassault Systèmes, Siemens, Synopsys, Flexcompute, Luminary, Neural Concept, nTop, P-1 AI, PhysicsX, and Synera are leading the way in using NVIDIA NemoClaw to build autonomous AI engineers. By delegating these tasks to always-on autonomous AI engineers, businesses can compress engineering cycles that once took weeks into just hours.

Siemens is integrating NVIDIA NemoClaw and OpenShell into Fuse EDA AI Agent, a purpose-built autonomous agent for planning and coordinating multi-tool workflows in semiconductor, 3D IC, and printed circuit board system design. Synopsys is collaborating with NVIDIA to build always-on autonomous AI engineers for chip design, with a focus on achieving full workflow autonomy.

At the model level, Huang unveiled Nemotron 3 Ultra, NVIDIA’s latest open model series.

This is a 550-billion-parameter Mixture of Experts model that provides cutting-edge intelligence for long-running agents in coding, research, and enterprise workflows. Compared to similar open frontier models, Ultra delivers up to 5x faster inference and up to 30% lower costs, enabling agents to complete tasks faster and at a lower cost.

This is the world’s first model based on a hybrid architecture of SSM (State Space Models) and Mixture of Experts. What does this architecture mean? Huang said, 'We move fast so you can think fast when you need to think agilely. Same cost, deeper thinking.'

More importantly, NVIDIA isn’t just providing the model—it’s also offering complete training data, training scripts, and long-running tools. This is what makes it a truly open model—not just giving you a black box, but the entire training pipeline so you can reproduce and fine-tune it.

Nemotron 3 Ultra has undergone post-training and is usable with leading agent platforms and tools, including Hermes Agent, LangChain Deep Agents, OpenClaw, OpenHands, and OpenCode. CrowdStrike is using NVIDIA Nemotron models in its dedicated agents to continuously identify, prioritize, and fix vulnerabilities and policy configuration errors. Palantir is integrating NVIDIA Nemotron models into its AI FDE (Forward Deployed Engineer) platform to autonomously execute complex tasks.

Huang announced a full commitment to Nemotron 3 production and said NVIDIA is already developing Nemotron 4.

Some say NVIDIA is now the 'shovel seller' of the AI era. As long as AI continues to grow, it can’t do without NVIDIA’s chips. That’s only half right. NVIDIA is indeed selling shovels, but Huang is clearly not satisfied with that. He wants to sell not just shovels, but the entire mining operation: from GPUs to CPUs, from networking to storage, from the software stack to AI models. He wants to fit the entire AI era into his own box.

Looking back at the entire announcement, today’s Vera CPU and RTX Spark will significantly disrupt the PC market.

The Vera CPU is precisely positioned: it’s not meant to replace the x86 processor in your desktop but is tailored for AI factory scenarios. NVIDIA is very clear about its boundaries: they won’t compete in the consumer CPU market because there’s no point. The value of the Vera CPU lies in its being an indispensable part of the entire Vera Rubin system. That’s why Huang has been emphasizing that this is a 'brand-new market.'

Now, about RTX Spark. This is a product on an entirely different scale because it directly enters the consumer market. For 40 years, the core architecture of PCs has remained fundamentally unchanged: an x86 processor paired with the Windows operating system. But RTX Spark laptops change that formula: NVIDIA RTX Spark plus Windows plus Agent. For the first time, NVIDIA has fully defined the PC architecture with its own chips, achieving vertical integration from the bottom layer to the application layer.

The impact on the market isn’t just about 'having another chip option.' It means the criteria for evaluating PCs have been redefined. In the past, you looked at a PC’s clock speed, core count, and memory size; now, you look at AI compute power, unified memory capacity, and local agent execution capabilities.

In other words, this is what NVIDIA is doing: replacing traditional chip vendors’ positions in the PC market with its own chips.

The formidable aspect of this self-disruption is that NVIDIA is already the absolute king in the GPU market and can afford the cost of transformation. When it decided to enter the CPU market, it brought not just chips but also the CUDA ecosystem, developer community, and a full suite of software optimizations. These are advantages no new entrant possesses.
