04/17 2026
On April 15, 2026, Jensen Huang faced a sharp line of questioning on Dwarkesh Patel’s podcast—something he hadn’t experienced in a long time. During the hour-long conversation, the phrase he repeatedly used to define NVIDIA was: “Something must turn electrons into tokens.” He described his company as the architecture that produces the most tokens per watt globally—not the most computing power per watt, but the most tokens per watt.
This is a GPU-selling company redefining itself.
Why redefine? Because he saw earlier than anyone else that power in the AI era is being split into two distinct forces: computing power, held by the U.S., and tokens, held by China. These two powers exist on different planes, neither able to consume the other. NVIDIA, stuck in between, once thought it was the biggest winner but now finds itself without a comfortable position in either structure.
That’s the real reason for his anxiety.
The landlords of computing power, the tenants of tokens
Let’s first look at the U.S. side.
On April 14, EpochAI released new data showing that Amazon, Google, Meta, Microsoft, and Oracle—five U.S. hyperscalers—collectively hold 67% of global AI computing power, measured in H100 equivalents. In Q1 2024, that figure was 60%. In roughly two years, global AI computing-power concentration has risen by seven percentage points, with no sign of slowing.
These five companies share one trait: none of them build cutting-edge models.
Google is somewhat of an exception, with its in-house TPU and Gemini model, but the other four are essentially “landlords”—Microsoft leases computing power to OpenAI via Azure, Amazon ties Anthropic to AWS and its Trainium chips, Oracle cuts into OpenAI’s infrastructure through Project Stargate, and Meta builds chips primarily for internal use, having just signed a 1GW order with Broadcom on April 13.
So who’s using all this computing power? Frontier AI labs are their tenants. Anthropic recently announced an expanded TPU collaboration with Google, aiming to access multi-gigawatt-scale computing power in the coming years. OpenAI’s backbone computing power comes from Microsoft Azure and Oracle Stargate—OpenAI just closed a $110 billion funding round, with Amazon investing $50 billion, SoftBank $30 billion, and NVIDIA an additional $30 billion (down from its original $100 billion commitment). xAI builds its own clusters but still buys chips from NVIDIA.
The real power structure of the U.S. AI industry looks like this: five computing-power landlords + three frontier-model tenants + one NVIDIA. Outsiders assume OpenAI, Anthropic, and xAI are the main players, but in reality, they’re the ones paying rent. Every funding round they raise, the majority goes to the five landlords—who are also their investors, suppliers, and negotiating counterparts.
NVIDIA, in this structure, appears to be the biggest winner. It secured roughly $115 billion in data center revenue in FY2026, with a market cap exceeding $3 trillion. But the fragility of this position is something Jensen Huang sees most clearly.
At Morgan Stanley’s March conference, he announced that NVIDIA’s $30 billion investment in OpenAI and $10 billion investment in Anthropic would likely be its last. Two months earlier, he had just bet $10 billion on Anthropic, which promptly deepened its TPU collaboration—NVIDIA’s investment money flowed back to help competitors build non-NVIDIA computing power.
This is a business decision: Anthropic cannot rely on a single chip supplier and must diversify. NVIDIA’s customer base is already highly concentrated—globally, the real heavy users are those five U.S. companies. Add in the Chinese market, and you have the entirety of its core data center revenue pool from the past two years. Since the H20 ban took effect in April 2025, China’s contribution has largely evaporated. The remaining five customers each have a stronger incentive than NVIDIA to move upstream—Meta with its in-house chips, Amazon with Trainium, Google with TPUs, and Microsoft and Oracle betting on OpenAI’s in-house Titan.
A shovel seller facing customers who are building their own shovels—that’s NVIDIA’s real predicament in the U.S. structure.
The token factories: no landlords, no tenants
Now, let’s look at the Chinese side.
China’s top four cloud providers—Alibaba Cloud, ByteDance’s Volcengine, Tencent Cloud, and Baidu Intelligent Cloud—are all doing two things simultaneously: selling computing power and building their own large models. Alibaba Cloud runs Qwen, Volcengine runs Doubao and Seedance, Tencent Cloud runs Hunyuan, and Baidu Intelligent Cloud runs Wenxin. Combined with their complex investments in AI “new forces,” their industrial layouts are complete.
Chinese cloud providers must act as both landlords and tenants—building their own computing power, training their own models, and consuming their own tokens, ultimately selling tokens as commodities to downstream enterprises. With computing power, models, and applications all consolidated under one company, only one thing remains as a tradable commodity across the value chain: tokens.
Alibaba exemplifies this chain. Its now-familiar ATH initiative has a core mission, as CEO Eddie Wu stated in an internal memo: “Create tokens, deliver tokens, apply tokens.” In the month after ATH’s launch, Alibaba’s activity density reached its highest point in two years.
On March 30, it released Qwen3.5-Omni—Qwen’s first shift from open-source to closed-source in recent years. Soon after, Qwen 3.6 Plus topped OpenRouter’s global weekly invocation rankings at 4.6 trillion tokens per week, with the Preview version taking third place. Today, ATH unveiled Happy Oyster, its latest open-world model product—a display of formidable R&D velocity.
On a March 19 earnings call, Eddie Wu made a statement more significant than ATH’s launch itself: “Enterprises no longer treat token consumption as an IT budget item but as a production input.” IT budgets are costs to be optimized; production inputs are assets to be expanded. The CEO of a trillion-dollar Chinese company told Wall Street that tokens have shifted from “expenses” to “investments,” from “IT line items” to “economic activity itself.” His concrete target: Alibaba Cloud and AI commercialization revenue to hit $100 billion annually within five years. Under the new token-consumption accounting, this figure could even include AI revenue from e-commerce operations.
Before these corporate moves, regulators had already set the tone for this tech feast. On March 23, Liu Liehong, director of the National Data Administration, formally defined “token” as “词元” (cíyuán) at the China Development Forum, calling it the “value anchor of the intelligent era” and the “settlement unit connecting technological supply with commercial demand.” The next day, at a State Council Information Office briefing, he released official figures: China’s daily token invocation volume exceeded 140 trillion, a more than thousandfold increase from 100 billion in early 2024.
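The growth multiple behind those official figures is simple arithmetic; a quick sanity check, using only the two numbers quoted above:

```python
# Sanity check on the National Data Administration figures quoted above
# (both inputs come from the text, in tokens per day).
early_2024_daily = 100e9   # 100 billion tokens/day, early 2024
march_2026_daily = 140e12  # 140 trillion tokens/day, March 2026 briefing

multiple = march_2026_daily / early_2024_daily
print(f"growth multiple: {multiple:,.0f}x")  # → growth multiple: 1,400x
```

So the jump is in fact well beyond a clean thousandfold.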
This elevates tokens from a technical concept to an economic statistic. Just as “traffic” defined the internet era, “词元” defines the AI era. The OpenRouter platform provides the hardest data for this narrative: from February 9–15, Chinese large models’ token invocation volume surpassed the U.S. for the first time, 4.12 trillion vs. 2.94 trillion. By early April, Chinese models reached 12.96 trillion weekly invocations vs. 3.03 trillion for U.S. models, with China leading for five consecutive weeks. The top six globally invoked models were all Chinese. MiniMax M2.5 surpassed 3.07 trillion invocations in seven days post-launch. Kimi K2.5 generated more revenue in under 20 days than Moonshot AI did in all of 2025. Zhipu’s GLM-5 went viral during the Lunar New Year, publicly “seeking computing-power partners”—not a marketing gimmick, but genuine orders overwhelming capacity.
Calculate the structural differences. On the U.S. side, computing power is the core asset, and tokens are its byproduct. On the Chinese side, it’s the opposite—tokens are the core commodity, and computing power is the raw material for producing them. Pricing logic, business models, company valuations, and regulatory metrics all revolve around these two distinct forces.
The far side of the Pacific is flooded with computing power, but the tokens keep flowing from this side.
Why Jensen Huang is anxious
Back to Dwarkesh’s podcast.
Huang repeatedly framed the AI industry as a “five-layer cake”: energy, chips, systems, models, and applications. He said the U.S. must lead in all five layers to avoid defeat. He called China the world’s largest contributor to open-source software, repeating the word “fact” twice for emphasis. He noted China accounts for half of global AI researchers, manufactures over 60% of mainstream chips, and operates vast underutilized “ghost data centers.” He stated that most AI progress comes from algorithmic advances, not hardware itself. Then he dropped a line: if DeepSeek’s next version debuts on Huawei chips, it would be a disaster for the U.S.

Reading these comments together, the message is clear. His anxiety doesn’t stem from macro narratives like “China catching up” but from a specific, awkward predicament: on the U.S. side, he’s losing bargaining power; on the Chinese side, he’s losing market access.
On the U.S. side: Five hyperscalers control 67% of global computing power—his largest customers and strongest competitors. Each is developing in-house chips: Meta just signed a 1GW Broadcom order, Amazon’s Trainium secured Anthropic, and Google’s TPU is eating NVIDIA’s market share. When your customers are also your rivals, and your market consolidates to five buyers, your pricing power erodes. Anthropic taking NVIDIA’s $10 billion investment and then deepening TPU collaboration two months later is a trend, not a fluke.
On the Chinese side: The token ecosystem is bypassing NVIDIA. DeepSeek V4, poised to launch on Huawei’s Ascend 950PR for training and inference, avoids NVIDIA. Alibaba, ByteDance, and Tencent placed bulk orders for hundreds of thousands of Huawei chips. Models like Qwen, Kimi, and MiniMax generate trillions of tokens weekly—every additional day of operation increases Huawei chip adoption. Worse, these models’ users are expanding globally: Chinese models have led U.S. invocations on OpenRouter for five consecutive weeks. MiniMax promotes “$1 for one hour of AI digital workers,” and Kimi’s overseas paying users now surpass domestic ones. When developers worldwide start defaulting from NVIDIA to Huawei stacks, NVIDIA has no role to play.
Facing these two dilemmas, NVIDIA has taken two actions in the past six months that can be reinterpreted under this framework.
First, it ventured into products it shouldn’t touch. At GTC in March, NVIDIA released Cosmos 3 (world foundation model), Isaac GR00T N1.7 (robotics), Alpamayo 1.5 (autonomous driving), and Nemotron (open-source large-model family)—a full stack covering physical AI and frontier models. On World Quantum Day (April 14), it launched Ising, an open-source quantum AI model family, which Jensen Huang personally defined as “AI becoming the control plane, the operating system for quantum machines.” These launches collectively signal one thing: NVIDIA is no longer satisfied with selling chips. It wants to build models, operating systems, physical-world simulators, and quantum control layers. The shovel seller is now mining itself—not out of greed, but because both structures are squeezing its original position.
Second, it shifted the company’s core metric from “computing power” to “output.” The “electrons to tokens” metaphor appeared over five times in the Dwarkesh interview. He described NVIDIA as the architecture producing the most tokens per watt globally—not the most computing power per watt, but the most tokens per watt. A company changing its core metric from “computing power” to “tokens” has already admitted where the era’s hard currency lies.
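Mechanically, “tokens per watt” is just sustained token throughput divided by power draw. A minimal sketch with entirely hypothetical numbers—the rack figures below are invented for illustration and do not come from the interview:

```python
def tokens_per_watt(tokens_per_second: float, power_watts: float) -> float:
    """Token throughput normalized by power draw."""
    return tokens_per_second / power_watts

# Hypothetical rack: 50,000 tokens/s served at a 120 kW total draw.
print(f"{tokens_per_watt(50_000, 120_000):.3f} tokens/s per watt")  # → 0.417 tokens/s per watt
```

The point of the metric is the denominator: it prices the architecture by useful output per unit of electricity, not by raw FLOPS.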
But admission doesn’t equal resolution. NVIDIA’s GPUs remain fundamentally machines for producing computing power, not tokens. Its actions in both structures—expanding upstream in the U.S., locking developers into its tech stack—don’t change the fundamental issue: computing power and tokens exist on different planes. A company that only produces computing power cannot occupy both power bases, no matter how it redefines itself.
Merely selling shovels is no longer enough to keep NVIDIA safe.
Epilogue
The power structure in the AI era is now split in two.
On the American side, it’s real estate—five hyperscalers sit atop 67% of global computing power, with cutting-edge labs as their tenants. In this structure, computing power is the hard asset, and the valuation logic is “land area × rent.” On the Chinese side, it’s factories—four cloud providers compress computing power, models, and applications into a single assembly line, with tokens as the sole output. Here, tokens are the hard currency, and the valuation logic is “output volume × unit price.”
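The two valuation logics above reduce to two different multiplications. A toy sketch, with every input invented purely to contrast the shapes of the formulas:

```python
def landlord_value(capacity_h100_eq: float, annual_rent_per_unit: float) -> float:
    """US-style logic: hard asset x rent ('land area x rent')."""
    return capacity_h100_eq * annual_rent_per_unit

def factory_value(tokens_per_year: float, price_per_token: float) -> float:
    """China-style logic: output volume x unit price."""
    return tokens_per_year * price_per_token

# Invented inputs, for shape only.
print(landlord_value(1_000_000, 20_000))  # 1M H100-equivalents x $20k/yr rent
print(factory_value(5e16, 2e-7))          # 50 quadrillion tokens/yr x a fraction of a cent each
```

One model appreciates by adding capacity; the other by raising throughput or price per token—which is why the two structures respond to completely different levers.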
The two structures cannot consume each other. American computing-power landlords cannot directly enter China’s token factories—regulations, ecosystems, and developer habits all stand in the way. Likewise, China’s token factories cannot directly replace America’s computing-power landlords—cutting-edge training still requires the most advanced hardware, and no matter how fast Huawei’s Ascend chips catch up, they cannot bridge the generational gap with top-tier NVIDIA chips in the short term. The two powers operate independently, expand separately, and seek their own exits.
Jensen Huang is caught in the middle. He once thought he was the biggest winner of this era—computing power is the upstream, models are the downstream, and he supplies both, making him the sole bottleneck. But the bottleneck is loosening in both directions. On the upstream side, the five landlords are building their own chips; on the downstream side, China has built its own token ecosystem using domestic chips. Expanding leftward (investing in OpenAI, Anthropic) failed to secure the downstream, while expanding rightward has yet to prove it can defend the tech stack. Alone, he faces growth dilemmas on both fronts.
In the podcast, he said: “There must be something that turns electrons into tokens.” He was defending NVIDIA’s future, but the statement itself is an admission—the hard currency of this era is not electrons but tokens. And the place where tokens are produced most densely is not America, at least for now.