Commercialization of ASIC: The Turning Point Has Arrived

06/29 2026

In the second quarter of 2026, news flow in the ASIC chip sector reached an unprecedented density. Two major events unfolded within a single week: on June 18, Amazon confirmed discussions to sell its Trainium chips to external data centers, and on June 24, OpenAI and Broadcom jointly released OpenAI's first self-developed inference chip, Jalapeño. Looking back further, Google partnered with Blackstone in May to establish a $5 billion joint venture to commercialize TPUs, Microsoft is in talks with Anthropic to rent out Maia 200 compute, and Meta has been left on the back foot by the failed internal integration that followed its acquisition of Rivos.

These events point to a single conclusion: ASICs are moving from the edge to the center of the AI compute landscape. The simultaneous push by hyperscale cloud providers and leading AI labs is driven not only by a commercial instinct for cost control but also by the struggle for control over the next generation of AI infrastructure. Goldman Sachs predicts that AI-driven demand for ASICs will rival that for GPUs by 2027. In 2026, the combined AI capital expenditures of the four major cloud providers are expected to reach $700 billion to $775 billion, a nearly 78% year-over-year increase, a volume of capital so vast that no single chip supplier can capture it alone.

01 Cloud Providers: From Internal Use to the Forefront

For a long time, customized AI chips have been seen as 'internal toys' for cloud computing giants, used to meet their massive internal computing power demands. However, multiple strategic shifts in 2026 indicate that this boundary is being broken. The core driver behind this is the structural shift in AI workloads—from 'training-dominated' to 'inference-dominated.' Studies by SemiAnalysis and Bernstein estimate that in large-scale inference deployments, ASICs offer a 40% to 65% advantage in total cost of ownership (TCO) over general-purpose GPUs; AI image generation platform Midjourney saw its monthly computing costs drop from $2.1 million to $700,000 after migrating to Google's seventh-generation TPUs. This economic advantage is directly reflected in cloud service pricing: according to Artificial Analysis, Google's TPU-based Gemini 3.1 Pro has a blended price of approximately $1.74 per million tokens, nearly 60% cheaper than comparable Opus 4.7 ($4.10) and GPT-5.5 ($4.35).
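The savings figures quoted above can be reproduced with simple arithmetic. The following is a minimal sketch using only the numbers cited in this article; the helper name `percent_saving` is introduced here for illustration and does not come from any of the cited sources.

```python
def percent_saving(baseline: float, new: float) -> float:
    """Return how much cheaper `new` is than `baseline`, in percent."""
    return (baseline - new) / baseline * 100.0


# Midjourney's monthly compute bill before and after migrating to TPUs (USD).
midjourney_saving = percent_saving(2_100_000, 700_000)

# Blended price per million tokens (USD), as cited from Artificial Analysis.
gemini_price = 1.74
saving_vs_opus = percent_saving(4.10, gemini_price)
saving_vs_gpt = percent_saving(4.35, gemini_price)

print(f"Midjourney compute cost reduction: {midjourney_saving:.1f}%")
print(f"Gemini 3.1 Pro vs Opus 4.7:        {saving_vs_opus:.1f}%")
print(f"Gemini 3.1 Pro vs GPT-5.5:         {saving_vs_gpt:.1f}%")
```

On these inputs, the Midjourney reduction works out to roughly two-thirds, and the token price gap to between about 58% and 60%, which is consistent with the "nearly 60%" description. Note that a drop in the monthly compute bill is not the same measure as total cost of ownership, so the Midjourney figure is not directly comparable to the 40% to 65% TCO estimates.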

Amazon's moves are particularly noteworthy. On June 18, 2026, Peter DeSantis, head of Amazon's AI business, confirmed in an interview that AWS is in talks to sell its custom Trainium chips to other companies' data centers. As early as April of this year, Amazon CEO Andy Jassy hinted in a shareholder letter that if the chip business operated as a standalone entity, selling this year's production to both AWS and external companies would generate annual revenues of approximately $50 billion. Currently, Amazon's internal chip division has an annual revenue run rate exceeding $20 billion, with its latest Trainium3 chip quickly reaching a 'nearly sold out' status after release and securing over $225 billion in revenue commitments. Its core customers include OpenAI, Anthropic, and Uber, with Anthropic committing to deploy over 1 million Trainium chips and contracting for 5GW of chip capacity.

Google's strategic moves are even more aggressive. In May 2026, Google announced the establishment of a joint venture with Blackstone, 'TPU Cloud,' with Blackstone initially committing $5 billion (up to $25 billion with leverage). The project aims to launch an AI data center with approximately 500MW capacity by 2027, built entirely on Google-provided TPU hardware, software, and services. This marks the first large-scale commercial sale of TPUs outside the Google Cloud ecosystem in their ten-year history. Google also provided up to $3.2 billion in financial guarantees for the 'Lake Mariner' AI data center project in western New York State, which will provide thousands of TPU-based compute nodes for Anthropic. On the supply chain front, Google has placed orders with Intel for over 3 million TPUs, and MediaTek is entering the design supply chain for the next-generation TPU v10, breaking Broadcom's long-standing monopoly.

The self-developed chips of the three major cloud providers have each formed clear product positioning and customer networks: Amazon's Trainium/Inferentia series serves Anthropic, OpenAI, and Uber; Google's TPU v7 Ironwood and v8 series have attracted Anthropic, Meta, and Midjourney; Microsoft's Maia 200 is vying for Anthropic and OpenAI.

Microsoft is also advancing. Its second-generation AI accelerator, Maia 200, built on TSMC's 3nm process, has been deployed in data centers, featuring 216GB of HBM3e memory and delivering over 10 petaflops of FP4 peak performance. In May of this year, Anthropic was in talks with Microsoft to lease Azure servers based on Maia 200. If the deal is realized, Anthropic would become the first major external customer for Microsoft's self-developed chip.

02 OpenAI's Nine-Month Chip Development

Notably, in-house chip development is no longer limited to cloud computing giants. On June 24, 2026, OpenAI and Broadcom jointly released OpenAI's first custom inference chip, Jalapeño, marking the entry of the world's largest AI model company into the chip arena.

Jalapeño is defined as an 'intelligent processor,' designed specifically for large language model inference scenarios. OpenAI is responsible for the underlying architecture design, Broadcom for silicon implementation and network hardware, Celestica for board and rack system integration, and TSMC for manufacturing. OpenAI President Greg Brockman revealed that, leveraging the company's self-developed large models for auxiliary optimization, the chip took only nine months from top-level design to tape-out. Early tests show that Jalapeño significantly outperforms existing solutions in performance-per-watt metrics.

OpenAI's chip development logic differs from that of cloud providers. As one of the world's largest GPU purchasers, OpenAI faces the core issue of computing power supply consistently lagging behind business expansion. Brockman admitted at the launch event, 'We have a deep understanding of our workloads and have been seeking specific tasks that are inefficiently served by existing hardware, considering how to build hardware that specifically accelerates them.' Broadcom CEO Hock Tan also stated that the computing power demands of its six major core customers are nearly limitless, 'the computing power shortage will not only persist through 2026 and 2027 but is expected to continue climbing in 2028.'

Physical samples of Jalapeño were delivered to OpenAI on June 24, with plans for small-scale initial deployment by the end of 2026, rapid ramp-up in 2027, and full-scale mass production in the first half of 2028. Long-term planning calls for a maximum total power consumption of up to 10GW. This means OpenAI is building a complete vertical technology stack from models, products, and data centers to chips. As its official statement declares, 'OpenAI is not just developing cutting-edge models or building products; it is designing the infrastructure beneath them—chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and product experiences.'

03 Growing Pains and Barriers: Meta's Lessons

However, developing chips in-house is far from smooth sailing. Meta's setbacks in this area serve as a wake-up call for the entire industry.

In September 2025, Meta spent over $2 billion to acquire RISC-V chip startup Rivos, intending to accelerate its self-developed AI chip (MTIA) project. However, within just six months, the marriage turned sour. Reports indicate that the integration process was hindered by severe conflicts over compensation and strategic direction between Meta's existing employees and the Rivos team, with political struggles erupting over whether future chips should rely on Meta's existing IP or Rivos's technology, directly causing multiple project delays. Sources even revealed that the Rivos project has been substantially canceled within Meta.

This exposes the deep-seated contradictions of internet companies venturing into semiconductors: software code can be iterated and updated at any time, but once a chip is taped out, any minor architectural misstep means hundreds of millions of dollars and months of time lost. Microsoft's Maia project also experienced production delays, pushed back from 2025 to 2026. These cases illustrate that even with hundreds of billions of dollars in R&D budgets, building a mature chip ecosystem from scratch requires sustained long-term investment.

In contrast, OpenAI chose a more pragmatic path: instead of building its own chip team, it deepened its collaboration with Broadcom, leveraging the latter's mature silicon implementation capabilities and supply chain resources to translate its understanding of model workloads into chip architecture design, completing the entire process from design to tape-out within nine months. This division of labor model—'model company defines architecture + semiconductor company implements manufacturing'—may represent a more efficient paradigm for industrial collaboration.

04 Industrial Chain Restructuring and Market Outlook

The surge in self-developed chips by cloud giants and AI labs is reshaping the value distribution across the semiconductor industrial chain. Broadcom and Marvell control approximately 95% of the global custom AI ASIC co-design market. Broadcom's AI semiconductor revenue reached $10.8 billion in the second quarter of fiscal 2026, a 143% year-over-year increase, with CEO Hock Tan projecting AI chip revenue to exceed $100 billion by 2027. Qualcomm, drawing on its accumulated expertise in low-power architectures, has secured orders for millions of AI chips from ByteDance. The CEO of Taiwan's ASIC design company Alchip predicts that AI ASIC revenue will grow from approximately $13 billion in 2024 to over $150 billion by 2030, a compound annual growth rate of roughly 50%.
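The growth rate implied by Alchip's two endpoints can be checked with the standard compound annual growth rate formula. This is a minimal sketch that assumes the forecast spans the six years from 2024 to 2030 and uses the rounded endpoint values quoted above; the function name `cagr` is illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# AI ASIC revenue endpoints in billions of USD, as cited in the article.
start_year, end_year = 2024, 2030
implied_rate = cagr(13.0, 150.0, end_year - start_year)

print(f"Implied CAGR {start_year}-{end_year}: {implied_rate * 100:.1f}%")
```

With these rounded inputs the implied rate comes out at about 50% per year, in line with the figure attributed to Alchip's CEO. Because the end value is stated as "over $150 billion", the true implied rate could be somewhat higher.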

According to TrendForce's latest forecast, custom AI chip shipments will grow by 44.6% in 2026, while commercial GPU shipments will increase by 16.1% in the same period. This marks the first time since the AI era began that custom chip shipment growth has significantly outpaced that of general-purpose GPUs. Bloomberg Intelligence expects custom chip demand to grow by an average of 27% annually through 2033. Goldman Sachs Global Institute estimates that approximately $7.6 trillion in capital investment will be required globally in the AI sector between 2026 and 2031.

From the perspective of semiconductor industry evolution, the divide between 'general-purpose' and 'specialized' is becoming clearer. General-purpose GPUs, with software ecosystems like CUDA, will maintain their dominance in model training and multi-purpose AI development. However, in the commercial deployment phase, where inference scales exponentially, custom ASICs with their strong cost advantages are rapidly carving out market share. TrendForce data shows that ASIC AI server shipments are expected to reach 27.8% of the total AI server market by 2026.

For leading model companies, a 'multi-chip strategy' is becoming standard. Anthropic currently operates simultaneously on four hardware platforms: AWS Trainium, Google TPU, Azure GPU, and the Microsoft Maia under discussion; OpenAI is deploying its self-developed Jalapeño while using Amazon Trainium, AMD, and Cerebras chips. Adam Fisher, a partner at Bessemer Venture Partners, publicly stated, 'Some emerging cloud companies cannot afford to rely solely on purchasing a single full-stack hardware solution due to fears of quota reductions'—but as computing power shortages intensify, more companies are breaking free from these constraints.

The concentrated emergence of non-GPU AI chips in 2026 is merely the prologue to the restructuring of AI infrastructure. As model companies begin to define chip architectures and cloud providers start selling silicon externally, the traditional division of labor in the semiconductor industry is being rewritten.
