AI Computing Power Challenge: Power Shortage is Just the Surface, Manufacturing is the Real Bottleneck

03/17 2026

While everyone is discussing the power anxiety of AI data centers, Dylan Patel, a semiconductor industry expert and founder of SemiAnalysis, presented a contrarian viewpoint in his latest podcast: the core bottleneck of AI computing power expansion has never been electricity, but the manufacturing capacity of advanced semiconductors. From the capacity constraints of EUV lithography machines, to the resource competition for HBM memory, to the moat Nvidia has built through its supply chain strategy, competition in the AI industry is shifting from algorithmic innovation in the cloud to hard manufacturing capabilities in the physical world. Nvidia's chip strategy shift at the 2026 GTC conference, with CPUs returning to center stage and a $20 billion bet on Groq's inference chip technology, further confirms this trend: both moves are precise responses to manufacturing bottlenecks. Ultimately, the future of AI will be constrained by the production speed of silicon wafers.

Is Power Anxiety a False Proposition? Manufacturing is the Insurmountable Hard Constraint

In past discussions about AI computing power, 'power shortage' has always been a frequent topic. The power demands of AI data centers, often reaching tens of thousands of kilowatts, have led people to assume that power supply is the primary factor restricting computing power expansion. However, Dylan Patel stated bluntly in the podcast that this is a fundamental misjudgment of the bottlenecks in the AI industry: Power is a cost issue, while manufacturing is an availability issue, and the two are fundamentally different.

Power shortages are not insurmountable. Driven by high AI returns, capital is willing to pay steep power costs for computing power. Leading tech companies have already moved beyond the limits of the public grid, adopting distributed generation, gas-turbine-plus-storage backup systems, and even temporary high-cost supply arrangements to keep data centers running. The returns behind Nvidia's 75% year-over-year data center revenue growth are enough to justify paying a premium for electricity: as long as companies can obtain chips, there is always a solution for power. A lack of advanced semiconductor manufacturing capacity, by contrast, is a gap that money cannot fill in the short term.

Dylan Patel emphasized that the underlying constraint on AI computing power ultimately lies in critical equipment like EUV lithography machines. A state-of-the-art ASML High-NA EUV machine costs over $300 million, has a production cycle exceeding 18 months, and global annual output numbers only in the dozens. Manufacturing such equipment spans dozens of cutting-edge fields, including precision optics, high-end materials, and ultra-precision machinery, and cannot be scaled up rapidly through capital investment alone. More critically, the EUV supply chain is highly concentrated, with core components monopolized by a handful of companies. Even by 2030, this production capacity gap is expected to remain a major bottleneck for the semiconductor industry.

In addition to lithography machines, production capacity for advanced wafer manufacturing and packaging technologies is equally tight. The CPU market in 2026 is facing what The Futurum Group calls a 'silent supply crisis,' with AMD and Intel issuing supply shortage warnings to Chinese customers, CPU delivery lead times extending to six months, and price increases exceeding 10%. Chip analyst Ben Bajarin summed it up: 'Wafers don't grow on trees; we can't magically harvest 10% more silicon. The entire industry is facing capacity constraints.' This manufacturing capacity shortage is not a problem for any single company but a systemic bottleneck for the global semiconductor industry.

The development of the AI industry must ultimately return to the fundamental laws of the physical world. From logic chips to memory chips, from advanced processes to packaging and testing, the manufacturing capacity at every stage determines the speed of computing power expansion. When the pace of algorithmic innovation far exceeds that of hardware manufacturing, manufacturing capabilities naturally become the true ceiling for AI computing power.

The Zero-Sum Game for Global Manufacturing Resources Triggered by HBM

If EUV lithography machines represent the 'foundational issue' of AI manufacturing bottlenecks, then high-bandwidth memory (HBM) is the most pressing 'load-bearing wall crisis.' Dylan Patel predicted in the podcast that over the next 1-2 years, the world will face a 'massive memory crisis,' the essence of which is the siphoning effect of AI on manufacturing resources, triggering a zero-sum game.

HBM is a core component of AI chips, with its high bandwidth and low power consumption perfectly suited to the training and inference needs of large AI models. With the rise of agentic AI, demand for data processing in AI is growing exponentially, and so is the demand for HBM. However, HBM is extremely difficult to manufacture, with low yield rates, and its production capacity expansion cannot keep pace with demand growth. Currently, global HBM production capacity is concentrated among Samsung, SK Hynix, and Micron, and leading AI companies have begun locking in HBM production capacity to secure computing power supplies.
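As a rough illustration of why HBM bandwidth, rather than raw compute, often caps inference throughput: during autoregressive decoding, every model weight must be streamed from memory once per generated token, so single-stream throughput is bounded by bandwidth divided by model size. The sketch below uses hypothetical round numbers, not the specifications of any particular chip or model.

```python
def decode_tokens_per_sec(hbm_bandwidth_gb_s: float,
                          params_billion: float,
                          bytes_per_param: int = 2) -> float:
    """Upper bound on single-stream decode throughput: every weight is
    read from HBM once per token, so throughput is capped at
    bandwidth / model-size, regardless of available FLOPs."""
    model_size_gb = params_billion * bytes_per_param
    return hbm_bandwidth_gb_s / model_size_gb

# Hypothetical figures: a 70B-parameter model in FP16 (140 GB of weights)
# on an accelerator with 3,000 GB/s of HBM bandwidth.
print(round(decode_tokens_per_sec(3000, 70), 1))  # ~21.4 tokens/s
```

The point of the model is that doubling compute without doubling memory bandwidth does nothing for this bound, which is why AI demand concentrates so heavily on HBM capacity.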

Nvidia is a typical winner in this capacity competition. To match the computing power demands of its GPUs, Nvidia signed long-term HBM supply agreements with memory manufacturers in advance, securing a large share of production capacity. This early-lock-in strategy creates a self-reinforcing cycle: AI companies lock in capacity → memory manufacturers allocate more resources to HBM production → HBM capacity becomes further concentrated among leading players. Behind this cycle, however, is a squeeze on the consumer electronics industry: production capacity for the ordinary DRAM and NAND flash needed by smartphones, PCs, and other consumer devices is being steadily crowded out.

Dylan Patel pointed out that AI growth is essentially achieved by seizing manufacturing resources from other industries. Driven by profit, memory manufacturers will prioritize allocating capacity to higher-priced, higher-margin HBM rather than consumer electronics memory. Since 2025, the consumer electronics industry has already shown signs of memory supply tightness, with some mid-range and low-end smartphones and PCs facing delayed launches due to memory shortages. This reallocation of resources is reshaping the global semiconductor supply chain: the prosperity of the AI industry is coming at the cost of production capacity for consumer electronics.

More alarmingly, this siphoning effect is spreading throughout the semiconductor manufacturing chain. Beyond HBM, the advanced-process wafers and specialty packaging materials needed for AI chips are all competing with other industries for limited manufacturing resources. As AI becomes the primary source of demand in the semiconductor industry, the logic of capacity allocation across the entire industry will be rewritten, and that rewriting will inevitably bring conflicts of interest and resource contention between industries.

It's Not About Technology, But About Securing Manufacturing Capacity in Advance

When it comes to AI chips, Nvidia is unavoidable. This chip giant, with a market capitalization of $4.4 trillion, dominates the global AI GPU market. Its success is usually attributed to technological advantages: the barriers of the CUDA ecosystem and the parallel computing capabilities of its GPUs. In the podcast, however, Dylan Patel identified what he sees as Nvidia's true moat: not technological leadership, but securing manufacturing capacity early.

Competition in the semiconductor industry follows a 'first-come, first-served' underlying principle. Nvidia understands this well and prioritizes cash flow investment in supply chain security over pure R&D. As early as 2023, Nvidia disclosed in its earnings report that it had signed long-term capacity agreements with TSMC for advanced processes, locking in a large portion of TSMC's 3nm and 4nm process capacities to secure production for its GPUs and CPUs. This 'early bird effect' allows Nvidia to maintain a stable chip supply amid capacity shortages. In 2026, Nvidia's data center business achieved single-quarter revenue exceeding $62 billion, up 75% year-over-year, with stable capacity supply being the core support for its growth.

In contrast, tech giants like Google and Amazon, despite having the technical capability to develop their own chips, cannot escape the squeeze on capacity. Google's TPU performs well in AI training and inference, but its production is constrained by TSMC's wafer supply and cannot expand rapidly. Amazon's Trainium chips face similar capacity shortages, making it difficult to meet the computing power demands of its cloud business. As Dylan Patel put it, in an era where manufacturing capacity is the bottleneck, technological advantages without capacity to back them are ultimately empty talk.

Nvidia's capacity strategy extends beyond GPUs to the entire AI computing power ecosystem. In 2021, Nvidia released its first data center CPU, Grace; by 2026, its second-generation CPU, Vera, had entered mass production, and Nvidia had secured a multi-year agreement with Meta for large-scale independent deployment of Grace CPUs, with plans to land Vera CPUs in Meta's data centers by 2027. To secure CPU capacity, Nvidia likewise locked in the relevant manufacturing resources in advance; its robust supply chain allows it to achieve zero delivery delays even amid CPU market shortages.

Beyond locking in capacity for its own chips, Nvidia also pulls additional manufacturing resources into its orbit through an open ecosystem. In 2025, Nvidia opened third-party licensing for its NVLink interconnect technology, reaching agreements with Intel, Qualcomm, Arm, and others to let third-party CPUs integrate more tightly with Nvidia GPUs. Nvidia also supports the open RISC-V instruction set, reaching an agreement with SiFive to let RISC-V chips connect to Nvidia GPUs via NVLink. This 'platform-agnostic' strategy is, in essence, a way to maximize the use of the world's limited manufacturing resources through ecosystem integration, further consolidating Nvidia's computing power advantage.

Before Christmas 2025, Nvidia's $20 billion acquisition of chip technology licensing from Groq, along with the hiring of its CEO Jonathan Ross (the developer of Google's first-generation TPU), marked a key step in addressing manufacturing bottlenecks and completing its computing power ecosystem. Groq's LPU (Language Processing Unit) is designed specifically for AI inference and uses on-chip SRAM, giving it far lower memory latency than traditional GPUs; it can complement Nvidia's GPUs. The acquisition is not merely a technology upgrade: it aims to improve the utilization efficiency of existing computing power by integrating specialized chip technology, achieving 'endogenous growth' of computing power under limited manufacturing capacity.

As Jensen Huang said on an earnings call, Nvidia's Groq deal follows the same logic as its acquisition of Mellanox six years ago: integrate core technology from a niche field, extend the architecture, and maximize the value of manufacturing resources. In the fourth quarter of fiscal year 2026, Nvidia's networking business booked $11 billion in single-quarter revenue, roughly equal to AMD's total revenue, a result of the Mellanox integration. Groq's technology will likewise position Nvidia more favorably in the AI inference market, improving computing power efficiency through integration under limited manufacturing capacity.

CPU Resurgence: A New Manifestation of Manufacturing Bottlenecks in the AI Computing Power Ecosystem

A major highlight of Nvidia's 2026 GTC conference was the resurgence of CPUs. After years of GPU dominance in the AI computing power market, Nvidia announced that it would launch CPUs optimized for agentic AI and even planned to showcase pure CPU racks. This strategic shift, while seemingly a refinement of the computing power ecosystem, is actually another important manifestation of AI manufacturing bottlenecks.

The rise of agentic AI has fundamentally changed the demand structure for AI computing power. Unlike traditional question-and-answer chatbots, agentic AI consists of task-oriented agents that must coordinate multiple agents working together, move large amounts of data, and perform complex logical scheduling. This process requires not only the parallel computing of GPUs but also the general-purpose and serial processing capabilities of CPUs: CPUs have become the 'scheduling centers' of AI workflows, responsible for coordinating GPUs, accelerators, and other hardware to keep the entire computing power system running efficiently.

Dion Harris, Nvidia's AI infrastructure chief, stated bluntly: 'In the expansion of AI and agentic workflows, CPUs are becoming the new bottleneck.' And the supply shortage of CPUs is a direct manifestation of insufficient manufacturing capabilities. Currently, the global data center CPU market is dominated by Intel (60%) and AMD (24.3%), with Nvidia accounting for only 6.2%. However, with the surge in demand for agentic AI, the CPU market size is expanding rapidly. Bank of America predicts that from 2025 to 2030, the CPU market size will grow from $27 billion to $60 billion, more than doubling.
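The growth implied by the Bank of America forecast cited above can be checked with a short calculation: growing $27 billion into $60 billion over the five years from 2025 to 2030 implies a compound annual growth rate of roughly 17% per year.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start/end forecast."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures from the article: $27B (2025) growing to $60B (2030).
print(round(implied_cagr(27, 60, 5) * 100, 1))  # ~17.3% per year
```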

Faced with CPU supply bottlenecks, Nvidia's response strategy remains 'customization + capacity locking.' Unlike Intel's and AMD's general-purpose CPUs, Nvidia's Grace and Vera CPUs are designed specifically for AI workflows, abandoning the competition for core counts (Grace CPU has 72 cores, far fewer than Intel's and AMD's 128 cores) and instead improving single-thread performance to ensure that 'expensive GPU resources do not sit idle waiting.' This customized design allows Nvidia's CPUs to better match the computing power demands of GPUs, improving the efficiency of the entire computing power system and maximizing computing power utilization efficiency under limited manufacturing capacity.
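The logic of trading core count for single-thread speed can be seen in a toy serial-dispatch model. This is a deliberate simplification with hypothetical latencies, not a description of Nvidia's actual scheduler: if the CPU must do some serial scheduling work before each GPU step, the GPU sits idle for that interval, so shrinking CPU dispatch time directly recovers GPU utilization.

```python
def gpu_utilization(cpu_dispatch_ms: float, gpu_step_ms: float) -> float:
    """Toy model: the CPU spends `cpu_dispatch_ms` of serial work before
    each GPU step of `gpu_step_ms`, so the GPU is busy for only
    gpu_step / (dispatch + gpu_step) of wall-clock time."""
    return gpu_step_ms / (cpu_dispatch_ms + gpu_step_ms)

# Halving CPU dispatch latency (better single-thread performance)
# recovers idle GPU time without adding any GPU hardware.
print(round(gpu_utilization(2.0, 8.0), 2))  # 0.8
print(round(gpu_utilization(1.0, 8.0), 2))  # 0.89
```

In this model, adding more CPU cores does not help if the dispatch work is serial; only faster single-thread execution shortens the gap, which is the trade-off the article attributes to Grace.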

At the same time, Nvidia's CPUs are based on the Arm architecture rather than Intel's and AMD's x86 architecture. The low-power characteristics of the Arm architecture are more suitable for data center scenarios, and its manufacturing processes are more flexible, allowing better utilization of existing manufacturing resources. Nvidia's choice of the Arm architecture is also an adaptation to manufacturing bottlenecks—under tight x86 architecture capacity, it achieves rapid CPU capacity expansion through the Arm architecture.

The resurgence of CPUs also reflects the growing complexity of the AI computing power ecosystem. In the past, GPUs were the core of AI computing power; now it requires the collaborative work of multiple hardware components, including GPUs, CPUs, accelerators, and networking chips, and the manufacturing capacity of any one of them can become the bottleneck for expansion. Nvidia's CPU strategy aims to connect every link in the computing power ecosystem, eliminate supply chain weak points, and build an efficient, complete computing power ecosystem against the backdrop of limited manufacturing capacity.

Full-Chain Competition: From Algorithmic Innovation to Manufacturing Capabilities

Dylan Patel summarized in a podcast that the AI industry is transitioning from virtual concepts to physical manufacturing. Ultimately, it is the capacity constraints of the physical world, rather than algorithmic innovation in the digital world, that will determine the industry's upper limits. This judgment not only reveals the current bottleneck in AI computing power but also points to the future direction of competition in the AI industry—shifting from a singular focus on technological competition to a comprehensive competition encompassing manufacturing, supply chains, and ecosystems.

In this competition, manufacturing capability will become the decisive advantage. Companies that master core manufacturing technologies such as EUV lithography, advanced wafer fabrication, and high-end packaging will occupy the commanding heights of the industry chain. Those that can secure manufacturing capacity in advance and build stable supply chains will take the initiative in the computing power race. As Nvidia's success demonstrates, in an era where manufacturing capacity is the bottleneck, the importance of supply chain strategy can even surpass technological innovation itself.

At the same time, the resource battle in the AI industry will continue to escalate. The squeeze on consumer electronics memory by HBM is just the beginning. In the future, AI will also compete with industries such as automotive, industrial, and aerospace for semiconductor manufacturing resources, continuously reshaping the global semiconductor supply chain landscape. In this zero-sum game, only those industries that can bring higher profits and greater demand to the semiconductor industry will secure more manufacturing resources. And AI is undoubtedly the most competitive player at present.

Although the power issue is no longer a core bottleneck, it will remain an important consideration for the AI industry. As computing power continues to expand, the power demand of AI data centers will keep growing, and capital-driven unconventional power supply models will become mainstream. However, unlike manufacturing capabilities, solving the power issue relies more on a company's financial strength and operational capabilities rather than technological barriers. With manufacturing capabilities becoming the core bottleneck, the power issue will serve as a 'threshold' for screening companies rather than a 'ceiling' that determines the industry's upper limits.

For China's AI industry, the challenges posed by this manufacturing bottleneck are even more severe. Currently, China still lags significantly behind international advanced levels in core areas such as EUV lithography machines, advanced-node wafer fabrication, and high-end HBM memory. The shortage of manufacturing capabilities has become the biggest constraint on the expansion of China's AI computing power. To break through this bottleneck, it is necessary to start from foundational technologies, increase R&D investment in semiconductor manufacturing equipment, materials, and processes, and build an autonomous and controllable semiconductor manufacturing supply chain. At the same time, it is also essential to learn from Nvidia's experience by maximizing the utilization of existing manufacturing resources and improving computing power efficiency through ecosystem integration and capacity layout.

The AI industry in 2026 stands at a critical turning point. The dividends of algorithmic innovation are still being released, but the constraints of manufacturing capabilities have already emerged. When people stop discussing 'what the next big model will be' and start focusing on 'when the next silicon wafer can be produced,' the AI industry will have truly matured. After all, no matter how advanced the algorithms are, they ultimately have to run on tangible chips; no matter how grand the AI vision is, it will ultimately be limited by the manufacturing capabilities of the physical world.

The future of AI does not lack electricity but the ability to manufacture chips. And this race for manufacturing capabilities has only just begun.
