Old Huang might be starting to feel anxious.

April 17, 2026

On April 15, 2026, Jensen Huang faced a sharp line of questioning on Dwarkesh Patel’s podcast—something he hadn’t experienced in a long time. During the hour-long conversation, the phrase he repeatedly used to define NVIDIA was: “Something must turn electrons into tokens.” He described his company as the architecture that produces “the most tokens per watt globally”—not the most computing power per watt, but the most tokens per watt.

This is a GPU company redefining itself.

Why the redefinition? Because he saw earlier than anyone else that power in the AI era is being split into two distinct things. One is computing power, held by the United States; the other is tokens, held by China. These two powers are not on the same plane—neither can eliminate the other. NVIDIA is caught in between. It once thought it was the biggest winner but now finds itself without a comfortable position in either structure.

That’s the real reason for his anxiety.

The landlord of computing power, the tenant of tokens

Let’s first look at the U.S. side.

On April 14, EpochAI released its latest data—Amazon, Google, Meta, Microsoft, and Oracle, the five U.S. hyperscalers, collectively hold 67% of global AI computing power, calculated on an H100-equivalent basis. In Q1 2024, this figure was 60%. In roughly two years, the concentration of global AI computing power has increased by seven percentage points, with no sign of slowing down.

These five companies have one thing in common: none of them build cutting-edge models.

Google is somewhat of an exception, with its in-house TPU development and Gemini model, but the other four are essentially “landlords”—Microsoft leases computing power to OpenAI via Azure, Amazon ties Anthropic to itself through AWS and its Trainium chips, Oracle cut into OpenAI’s infrastructure landscape via the Stargate project, and Meta builds chips primarily for internal use, having just signed a 1GW order for in-house chips with Broadcom on April 13.

So who is using all this computing power? Cutting-edge AI labs are their tenants. Anthropic just announced an expanded TPU collaboration with Google, aiming to access multi-gigawatt-scale computing power in the coming years. OpenAI’s backbone computing power comes from Microsoft Azure and Oracle Stargate—OpenAI just completed a $110 billion funding round in February, with Amazon investing $50 billion, SoftBank $30 billion, and NVIDIA an additional $30 billion (down from its original $100 billion commitment). xAI builds its own clusters but still buys chips from NVIDIA.

The real power structure of the U.S. AI industry is this: five computing power landlords + three cutting-edge model tenants + one NVIDIA. The outside world thinks OpenAI, Anthropic, and xAI are the main players, but in reality, they are the ones paying rent. Every time they raise money, they have to hand over the majority to those five landlords—who are also their investors, their suppliers, and their negotiating counterparts.

NVIDIA, in this structure, appears to be the biggest winner. It secured approximately $115 billion in data center revenue for fiscal year 2026, with a market cap exceeding $3 trillion. But Huang himself sees the fragility of this position most clearly.

At Morgan Stanley’s March conference, he announced that NVIDIA’s $30 billion investment in OpenAI and $10 billion investment in Anthropic would likely be its last. Just two months earlier, he had bet $10 billion on Anthropic, which then turned around and deepened its TPU collaboration—NVIDIA’s investment money flowed back to help a competitor build non-NVIDIA computing power.

This is a commercial choice: Anthropic cannot rely on a single chip supplier; it must diversify. NVIDIA’s customer base is already highly concentrated—in the global computing power landscape, the real heavyweights are those five U.S. companies. Add in the Chinese market, and that’s essentially the entire core revenue pool for its data center business over the past two years. Since April 2025, the H20 ban has largely evaporated the China market. Each of the remaining five customers is more eager than NVIDIA would like to move upstream—Meta with its in-house chips, Amazon with Trainium, Google with TPU, Microsoft and Oracle betting on OpenAI’s in-house Titan.

A shovel seller facing customers who are all building their own shovels—that’s NVIDIA’s real position in the U.S. structure.

The token factory: no landlords, no tenants

Now, let’s look at the Chinese side.

China’s top four cloud providers—Alibaba Cloud, ByteDance’s Volcano Engine, Tencent Cloud, and Baidu Intelligent Cloud—are all doing two things simultaneously: selling computing power while building their own large models. Alibaba Cloud has Qwen, Volcano Engine has Doubao and Seedance, Tencent Cloud has Hunyuan, and Baidu Intelligent Cloud has Wenxin. Combined with their complex investments in AI “new forces,” their industrial layouts are complete.

Chinese cloud providers must act as both landlords and tenants—building their own computing power, training their own models, consuming their own tokens, and finally selling tokens as a commodity to downstream enterprises. The three layers—computing power, models, and applications—are all consumed within the same company. Only one thing remains as a tradable commodity across the entire value chain: tokens.

Alibaba is representative of this chain. ATH (Alibaba Token Hub) is the best-known example: CEO Wu Yongming defined its core mission in an internal memo as “creating tokens, delivering tokens, applying tokens.” In the month after ATH’s establishment, Alibaba’s pace of releases was the fastest it had been in two years.

On March 30, it released Qwen3.5-Omni—the first time in years the Qwen series shifted from open-source to closed-source. Shortly after, Qwen 3.6 Plus topped global large model weekly invocation charts on OpenRouter with 4.6 trillion calls per week, while the Preview version took third place. Today, ATH unveiled its latest open-world model product, Happy Oyster, showcasing staggering R&D efficiency.

On the March 19 earnings call, Wu Yongming said something more significant than ATH’s establishment itself. He stated: “Enterprises no longer treat token consumption as an IT budget item but as a production input.” IT budgets are costs to be optimized; production inputs are items to be expanded. The CEO of a trillion-dollar Chinese company told Wall Street on an earnings call that tokens have shifted from “expenses” to “investments,” from “IT line items” to “economic activity itself.” He provided a specific figure: Alibaba Cloud and AI commercialization annual revenue should reach $100 billion within five years. Under the new token-based accounting, this $100 billion could even include AI revenue generated by e-commerce operations.

Even before these corporate moves, regulators had set the tone for this tech feast. On March 23, Liu Liehong, director of the National Data Administration, officially gave tokens the Chinese name “cí yuán” (word elements) at the China Development Forum, calling them “the value anchor of the intelligent era” and “the settlement unit connecting technological supply with commercial demand.” The next day at a State Council Information Office briefing, he announced official figures—China’s daily word element invocations exceeded 140 trillion, a more than thousandfold increase from 100 billion in early 2024.

This elevates tokens from a technical concept to an economic statistic. Just as “traffic” defined the internet era, “word elements” define the AI era. The OpenRouter platform provides the hardest data support for this narrative—during the week of February 9–15, Chinese large models’ token invocations surpassed the U.S. for the first time, 4.12 trillion vs. 2.94 trillion. By early April, Chinese models reached 12.96 trillion weekly invocations, compared to 3.03 trillion for U.S. models, with China leading for five consecutive weeks. The global top six models by invocation volume are all Chinese. MiniMax M2.5 saw over 3.07 trillion invocations in seven days after release. Kimi K2.5 generated more revenue in its first 20 days than its developer Moonshot AI’s total 2025 revenue. Zhipu GLM-5 went viral during the Lunar New Year, publicly “seeking computing power partners”—not a marketing gimmick, but real orders overwhelming capacity.

Now compare the two structures. On the U.S. side, computing power is the core asset, and tokens are a byproduct. On the Chinese side, it’s the opposite—tokens are the core commodity, and computing power is the raw material for producing them. Pricing logic, business models, company valuations, and regulatory metrics all revolve around these two different things.

Across the ocean, computing power piles up; on this side, tokens pour out in an unending stream.

Why Huang is anxious

Back to Dwarkesh’s podcast.

Huang offered a framework he repeatedly uses—the “five-layer cake” of the AI industry: energy, chips, systems, models, and applications. He said the U.S. must lead in all five layers; otherwise, it will lose. He called China the world’s largest contributor to open-source software, saying “Fact, fact” twice. He noted China accounts for half of global AI researchers, manufactures over 60% of mainstream chips, and has vast underutilized “ghost data centers.” He said most AI progress comes from algorithmic advances, not hardware itself. Then he dropped a line—if DeepSeek’s next version debuts on Huawei chips, it would be a disaster for the U.S.

Reading these comments together, the message is clear. His anxiety doesn’t come from a macro narrative like “China catching up” but from a specific dilemma—on the U.S. side, he’s losing pricing power; on the Chinese side, he’s losing entry tickets.

On the U.S. side: Five hyperscalers control 67% of global computing power—his biggest customers and strongest rivals. Each is developing in-house chips—Meta just signed a 1GW Broadcom order, Amazon’s Trainium secured Anthropic, and Google’s TPU is eating NVIDIA’s share. When your customers are also your competitors, and your market concentrates to just five buyers, your pricing power erodes. Anthropic took NVIDIA’s $10 billion investment and two months later deepened its TPU collaboration—this is a trend, not a coincidence.

On the Chinese side: The token ecosystem is bypassing NVIDIA. DeepSeek V4 is poised to run training and inference on Huawei Ascend 950PR, skipping NVIDIA. Alibaba, ByteDance, and Tencent placed bulk orders for hundreds of thousands of Huawei chips. Models like Qwen, Kimi, and MiniMax produce trillions of tokens weekly—every additional day of operation increases Huawei chip adoption. Worse, these models’ users are expanding globally—Chinese models have led U.S. models on OpenRouter for five consecutive weeks. MiniMax promotes “$1 for one hour of AI digital workers,” and Kimi’s overseas paying users now surpass domestic ones. When the default tech stack for global developers shifts from NVIDIA to Huawei, NVIDIA has no role to play.

Facing these two dilemmas, NVIDIA has done two things in the past six months that can be reinterpreted under this framework.

First, it built products that a pure chip supplier arguably shouldn’t. At March’s GTC, NVIDIA released Cosmos 3 (world foundation model), Isaac GR00T N1.7 (robotics), Alpamayo 1.5 (autonomous driving), and Nemotron (open-source large model family)—a full stack covering physical AI and cutting-edge models. On April 14 (World Quantum Day), it unveiled Ising—an open-source quantum AI model family, which Huang personally defined as “AI becoming the control plane, the operating system for quantum machines.” These launches collectively signal one thing: NVIDIA is no longer satisfied with just making chips. It wants to build models, operating systems, physical world simulators, and quantum computing control layers. The shovel seller is now mining itself—not out of greed, but because both structures are squeezing its original position.

Second, it shifted the company’s core metric from “computing power” to “output.” The “electrons to tokens” metaphor appeared at least five times in the Dwarkesh interview. Huang described NVIDIA as the architecture producing “the most tokens per watt globally”—not the most computing power per watt, but the most tokens per watt. A company changing its core metric from “computing power” to “tokens” has already admitted where the real hard currency of this era lies.

But admission doesn’t equal resolution. His GPUs are still fundamentally machines for producing computing power, not tokens. He’s doing the right things in both structures—expanding upstream in the U.S., locking developers into his tech stack—but none of these moves change the fundamental issue: computing power and tokens are not on the same plane. A company that only produces computing power cannot occupy both types of power, no matter how it redefines itself.

Being just a shovel seller is no longer enough to keep NVIDIA safe.

Epilogue

The power structure in the AI era is now split in two.

The American half is real estate—five hyperscalers sit atop 67% of global computing power, with cutting-edge labs as their tenants. In this structure, computing power is the hard asset, and the valuation logic is “land area × rent.” The Chinese half is factories—four cloud providers compress computing power, models, and applications into a single assembly line, with tokens as the sole output. Here, tokens are the hard currency, and the valuation logic is “output volume × unit price.”

These two structures cannot consume each other. American computing-power landlords cannot directly enter China’s token factories—regulations, ecosystems, and developer habits all stand in the way. Conversely, China’s token factories cannot directly replace America’s computing-power landlords—cutting-edge training still requires the most advanced hardware, and no matter how fast Huawei’s Ascend chips catch up, they cannot bridge the generational gap with top-tier NVIDIA chips in the short term. The two powers operate independently, expand separately, and seek their own exits.

Jensen Huang is caught in the middle. He once thought he was the biggest winner of this era—computing power is the upstream, models are the downstream, and he supplies both, making him the sole bottleneck. But the bottleneck is loosening in both directions. On the upstream side, the five landlords are building their own chips; on the downstream side, China has built its own token ecosystem with domestic chips. Investing in the labs (OpenAI, Anthropic) failed to secure the downstream, while building models and systems itself has yet to prove it can defend the tech stack. Alone, he faces growth dilemmas on both fronts.

On the podcast, he said, “There must be something that turns electrons into tokens.” He was defending NVIDIA’s future, but the statement itself admits—the hard currency of this era is not electrons, but tokens. And the place where tokens are most densely produced is not in America, at least for now.

Caught between the two structures, NVIDIA is losing its most comfortable position.
