07/01 2026

When you look at R&D, technology, governance, economics, science, medicine, and more together, it’s clear that the AI industry in 2025 can no longer be summed up by 'model progress' alone.
The first half was about who could train a stronger model; the second half is about who can turn models into stable, reliable, regulatable, commercializable, and sustainable productivity.
Translated by | Dou Dou
Produced by | Industrial Intelligence
Over the past year, the most significant change in the AI industry has been how we judge AI progress. For a long time, outside observers gauged AI by parameter counts, leaderboard rankings, funding amounts, and the cadence of product releases. In 2025, that narrative has started to look insufficient. Models continue to grow stronger, computing power keeps expanding, capital keeps pouring in, and AI is moving deeper into industrial scenarios such as science, healthcare, education, enterprise services, and autonomous driving. Yet at the same time, the performance gap between top models is narrowing, the transparency of cutting-edge systems is declining, computing power and chip supply chains are becoming increasingly concentrated, and governance, energy, employment, and equity issues are moving from behind the scenes to the forefront.
This means that AI competition is shifting from a single-point technological contest into a more complex systemic competition. What truly deserves scrutiny is no longer just 'whose model is stronger,' but 'who can turn AI into stable, trustworthy, and scalable productivity.'
It is at this juncture that the AI Index project under Stanford University’s Institute for Human-Centered Artificial Intelligence released the 2026 AI Index Report. An annual report long cited by global policymakers, academics, industry, and media, it is more than a technology leaderboard: it uses data spanning R&D, technical performance, responsible AI, economics, science, medicine, education, and other dimensions to redraw the true coordinates of the AI industry.
The signals from this report are clear: In the first half of AI, the competition was about model capabilities and technological breakthroughs; in the second half, it will be about infrastructure, real-world scenarios, commercial efficiency, and social trust. In other words, AI is no longer just a story for tech companies—it is becoming a new infrastructure that redistributes industrial resources, talent structures, and global competitive advantages.
Drawing on this report, we attempt to sort out the key changes in the AI industry since 2025 across seven dimensions. A growing consensus is that as model capabilities gradually converge, what will truly determine the next industrial landscape is the ability to embed technology deep into industries.
Below is a translated and summarized version of the report:
Key Takeaways:
1. AI is far from peaking. It is accelerating and reaching the general public with unprecedented breadth.
2. The performance gap between Chinese and U.S. AI models has now been virtually closed.
3. AI can win gold at the International Mathematical Olympiad yet fail to read the time on an analog clock, leaving it stuck on an extremely uneven 'jagged frontier.'
4. While robots excel in controlled environments, they remain helpless with most household tasks.
5. The pace of responsible AI development lags behind AI capability gains, with safety benchmarks trailing and related incidents surging.
6. AI adoption is setting speed records, with consumers deriving significant value from these often freely available tools.
7. AI is transforming clinical healthcare, yet rigorous evidence remains limited.
I. R&D Enters the Era of Giants: AI Grows Stronger—and More Opaque
AI R&D in 2025 presents a stark contradiction: On one hand, resources supporting AI continue to grow, with computing power, open-source projects, papers, and patents all expanding. On the other, model systems at the true frontier are becoming increasingly concentrated, and transparency is declining.
The most direct change is that industry has become the dominant force in AI model development. In 2025, industry produced over 90% of well-known AI models, further marginalizing the role of academic institutions in frontier models. The reason is simple: the computing power, data, engineering teams, and capital required to train a top-tier model are beyond what most universities or research institutions can shoulder on their own. AI R&D is shifting from a relatively open scientific competition to an infrastructure contest among a few giants.

But problems have followed. The most capable models tend to be the least transparent. Resource-intensive systems from OpenAI, Anthropic, Google, and others no longer fully disclose training code, parameter counts, dataset sizes, or training durations. This makes it hard for outsiders to judge whether a model's capabilities stem from algorithmic breakthroughs, data quality, post-training optimization, or simply more compute. The more critical AI becomes, the more society needs to understand it; yet the closer a system sits to the frontier, the harder it is for outsiders to see clearly.

Meanwhile, the global AI R&D landscape is shifting. China leads in paper output, citations, and patent grants, with its share of the top 100 most-cited AI papers steadily rising. The U.S. maintains its lead in developing well-known models, with 59 in 2025 versus China’s 35. In other words, China has greater scale advantages in research output and knowledge accumulation, while the U.S. still holds more frontier models and high-impact patents.
Computing power is the hardest foundation of this competition. Since 2022, global AI computing capacity has grown roughly 3.3x annually, reaching 17.1 million H100-equivalent chips by 2025. NVIDIA accounts for over 60% of total compute, with Google and Amazon supplying most of the rest. Huawei’s share remains small but is growing. Behind computing expansion lies surging demand for hyperscale data center construction and frontier model training/inference.
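As a rough sanity check on these figures, the compounding can be run backward. The sketch below is illustrative only: it assumes a uniform 3.3x factor applied over exactly three annual steps (2022 to 2025), and the function name `implied_baseline` is hypothetical. The 2022 baseline it produces is a derived estimate, not a number taken from the report.

```python
def implied_baseline(final_value: float, annual_factor: float, years: int) -> float:
    """Back out the starting value implied by constant annual compounding."""
    return final_value / annual_factor ** years


# Figures from the text: 17.1 million H100-equivalents in 2025, ~3.3x per year.
# Assumption (not from the report): exactly three compounding steps, 2022 -> 2025.
FINAL_MILLIONS = 17.1
ANNUAL_FACTOR = 3.3
YEARS = 3

cumulative_factor = ANNUAL_FACTOR ** YEARS
baseline_2022 = implied_baseline(FINAL_MILLIONS, ANNUAL_FACTOR, YEARS)

print(f"Cumulative growth over {YEARS} years: {cumulative_factor:.1f}x")
print(f"Implied 2022 baseline: {baseline_2022:.2f} million H100-equivalents")
```

Under these assumptions, three years of 3.3x growth compound to roughly 36x, which gives a sense of how quickly the installed base has expanded.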
But the deeper you look, the more concentrated risks become. The U.S. has the most data centers globally—ten times more than any other country—while TSMC manufactures nearly all mainstream AI chips, making the global AI hardware supply chain highly dependent on one Taiwanese foundry. What appears as a cloud-based software revolution is underpinned by highly concentrated semiconductor manufacturing, energy supply, and data center deployment capabilities.
This expansion also carries environmental costs. In 2025, training Grok 4 emitted an estimated 72,816 metric tons of CO2 equivalent; AI data center power capacity rose to 29.6 GW, nearing New York State’s peak demand; and GPT-4o’s annual inference water use alone may exceed the drinking water needs of over 1.2 million people.
The stronger AI gets, the more it becomes not just a technological issue but one of energy, supply chains, and public resources.
II. After Model Capabilities Converge, Competition Shifts from 'Benchmarks' to 'Usefulness'
AI technical performance continues to rise rapidly, but the most significant signal in 2025 is not 'models are getting stronger again' but 'it’s increasingly hard to differentiate between strong models.'
On benchmarks for language, reasoning, coding, and math, frontier models’ scores keep climbing, in some cases surpassing human levels. Frontier models improved by 30 percentage points in one year on the high-difficulty benchmark Humanity's Last Exam, and many tests once expected to challenge models for years were cracked in months. AI progress has outpaced updates to the assessment system: the old yardsticks are no longer long enough.

When benchmarks are quickly saturated, the meaning of model rankings changes. By March 2026, Anthropic, xAI, Google, OpenAI, Alibaba, and DeepSeek were all in the top tier of Arena Elo scores, with many firms’ models separated by tiny margins. The performance gap between top Chinese and U.S. models has also nearly closed. Since early 2025, Chinese and U.S. models have repeatedly swapped the top spot on performance leaderboards; DeepSeek-R1 briefly matched U.S. top models, and as of March 2026, U.S. top models led by just 2.7%.

The industrial implications are clear: when 'capability leadership' no longer suffices for dominance, competitive pressure shifts to cost, reliability, domain-specific performance, and real-world usability. Whether a model can be invoked cheaply, complete tasks stably, and deliver results in tax, legal, financial, customer service, coding, healthcare, and other professional scenarios matters more than raw benchmark scores.
The open-source vs. closed-source landscape has also shifted. In 2024, open-source models dramatically closed the gap with closed-source ones, but by 2025, the gap widened again. As of March 2026, top closed-source models led top open-source ones by 3.3%, with six of the top ten Arena leaderboard models being closed-source. This suggests that while open source remains vital for ecosystem diffusion and industrial innovation, closed-source giants retain leads in cutting-edge capabilities through computing power, data, and engineering advantages.
Meanwhile, AI capabilities exhibit 'jagged intelligence.' A model can win gold at the International Mathematical Olympiad yet still cannot reliably read an analog clock. Gemini Deep Think scored 35 points for a gold medal at IMO 2025, but top models read analog clocks correctly only 50.6% of the time on ClockBench (humans: 90.1%). This is a reminder that AI does not get smarter linearly: it surges ahead on some tasks while remaining fragile on others.
More notably, AI is moving from digital tasks to the physical world. Video generation models now capture object motion laws beyond just realistic imagery. Google DeepMind’s Veo 3 demonstrated abilities like simulating buoyancy and solving mazes across over 18,000 generated video tests—without specialized training. Agents are also evolving from answering questions to completing tasks, with accuracy in OSWorld tests rising from ~12% to 66.3%, narrowing the gap with humans to under 6 percentage points.

But entering the physical world is no easy feat. Robots achieve 89.4% success in simulated environments but only 12% in real household tasks.
By contrast, autonomous driving is a rare exception with large-scale deployment. Waymo completes ~450,000 weekly trips across five U.S. cities, while Baidu Apollo Go has finished 11 million fully driverless trips.
AI is nearing the physical world, but truly stable understanding and transformation of it remain far off.
III. Responsible AI Starts Playing Catch-Up: Governance Lags Deployment, and Risks Are Already Real
As AI capabilities expand, governance issues are thrust to the fore. A core contradiction in 2025 is that responsible AI infrastructure is being built—but far slower than AI deployment.
Safety benchmarks are increasing, more organizations are crafting responsible AI policies, and government-backed AI safety institutes are expanding to more countries. But these moves feel more like catching up than leading proactively. Nearly all leading model developers publish results for capability benchmarks like MMLU and SWE-bench, yet reports on responsible AI benchmarks remain rare. In other words, firms eagerly showcase how strong their models are but are less willing to fully disclose how safe, fair, or transparent they are.
Risks are already accumulating in reality. The AI Incident Database recorded 362 AI incidents in 2025, up from 233 in 2024. Model hallucinations remain a prominent issue.

In one accuracy benchmark, hallucination rates among 26 mainstream models ranged from 22% to 94%. More subtly, models struggle to distinguish 'knowledge' from 'belief.' When false statements are framed as others’ opinions, models handle them relatively well; but when framed as the user’s own views, performance drops sharply. This suggests models don’t just fabricate information—they may also be influenced by questioning styles and user stances.
Corporate governance awareness is indeed rising. In 2025, AI-specific governance roles grew by 17%, and the share of firms without responsible AI policies fell from 24% to 11%.
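Figures like these mix two kinds of change that are easy to confuse: a shift in a share, measured in percentage points, and a relative change, measured as a percentage of the starting value. The short sketch below is illustrative only and uses hypothetical helper names (`point_change`, `relative_change`). It applies both measures to the policy-share figures above and to the incident counts cited earlier in this section.

```python
def point_change(old_pct: float, new_pct: float) -> float:
    """Absolute change between two shares, in percentage points."""
    return new_pct - old_pct


def relative_change(old: float, new: float) -> float:
    """Relative change, as a percentage of the starting value."""
    return (new - old) / old * 100


# Share of firms without a responsible AI policy: 24% -> 11% (from the text).
no_policy_points = point_change(24, 11)
no_policy_relative = relative_change(24, 11)

# AI Incident Database entries: 233 (2024) -> 362 (2025) (from the text).
incidents_relative = relative_change(233, 362)

print(f"No-policy share: {no_policy_points:+.0f} points, {no_policy_relative:+.1f}% relative")
print(f"Recorded incidents: {incidents_relative:+.1f}% year over year")
```

Read this way, the share of firms without a policy dropped by 13 percentage points, which is a relative decline of more than half, while recorded incidents rose by more than half over the same period.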
But implementation remains constrained by real-world conditions, with major obstacles including knowledge gaps, budget limits, and regulatory uncertainty. Many firms aren’t unwilling to govern—they just don’t know how, lack resources, or are unsure how regulations will evolve.
Regulatory frameworks are also shifting toward AI-specific systems. While GDPR remains the most cited regulatory influence, its share is declining. Meanwhile, tools like the ISO/IEC 42001 AI management system standard and the NIST AI Risk Management Framework are coming onto firms’ radar. Regulation is moving beyond privacy and data compliance toward model development, deployment, monitoring, and risk management.
But a deeper challenge is that AI governance isn’t about optimizing a single metric. Safety, fairness, privacy, and explainability may conflict. Recent research finds that training techniques improving one responsible AI dimension can persistently harm others. For example, privacy enhancements may weaken fairness, and safety optimizations may reduce accuracy. The industry still lacks mature frameworks to handle these trade-offs.
Declining transparency makes governance even harder. The Foundation Model Transparency Index rose from 37 in 2023 to 58 in 2024 but fell back to 40 in 2025. Major gaps remain in disclosures about training data, compute resources, and post-deployment impacts.

The more models are used in real-world industries, the more the external world needs to understand their limitations; however, the fiercer the competition among leading companies, the more inclined they are to conceal key details. This tension will become one of the biggest institutional challenges for AI in its next phase.
IV. Capital Continues to Pour into AI, but the Distribution of Dividends is Uneven
Beyond technological competition, the AI economy is expanding at an unprecedented pace. In 2025, global corporate AI investment more than doubled, with private investment growing fastest at 127.5% and accounting for 60% of total investment. Generative AI is at the core of this growth, with investment increasing by over 200% and accounting for nearly half of total private AI financing. The number of newly funded AI companies rose by 71%, and the number of billion-dollar financing deals nearly doubled.

However, funding is not evenly distributed globally. The United States continues to lead in private AI investment, with 23 times more funding than China. In generative AI, U.S. investment significantly surpasses the combined total of China and Europe. This gap indicates that the U.S. still dominates the global AI capital market. Nevertheless, private investment data may underestimate China's AI spending, as Chinese government-guided funds have invested heavily in AI companies over the past two decades. The U.S. excels in market capital and cloud infrastructure, while China demonstrates a combination of industrial policies, manufacturing capabilities, and application scenarios.
AI companies are also achieving record revenue growth. Leading firms have generated substantial revenues in a short period, but computing costs and infrastructure expenditures have also surged. Cloud service providers have accelerated capital spending, with Google disclosing annual capital expenditures exceeding $150 billion in 2025. This shows that AI commercialization is not an asset-light story but a capital-intensive competition. The faster model revenues grow, the greater the investment in chips, servers, energy, and data centers behind them.
On the consumer side, AI has proven its real value. By early 2026, the annual consumer surplus that U.S. consumers derive from generative AI was estimated at $172 billion, up significantly from $112 billion a year earlier, with the median value per user tripling. More critically, most of these tools remain free or nearly free. AI is following a path similar to search engines and social networks, first achieving widespread adoption through low-cost, high-frequency use before gradually reshaping business models.
Corporate adoption rates are also rising. In 2025, 88% of surveyed companies had adopted AI, with 70% applying generative AI in at least one business function. China and Europe saw the highest year-over-year growth. However, agent applications are still in their early stages, with deployment rates remaining in the single digits across nearly all business functions. Companies have embraced AI tools but have not yet entrusted automated processes and critical business loops to AI at scale.

Productivity gains are not universal. AI has shown the most significant impact in structured, quantifiable tasks with easily monitorable outputs, such as improving customer support efficiency by 14-15%, software development efficiency by 26%, and marketing results by 50%. However, gains have been smaller in tasks requiring deep reasoning, complex judgment, and long-term experience. AI is creating value, but it is first transforming processes with clear workflows, accessible data, and explicit feedback—not all jobs.
V. Medical AI Moves Beyond Flashy Demonstrations to the Battle for Clinical Evidence
Healthcare is one of the most anticipated yet most cautious fields for AI. In 2025, medical AI made notable progress in molecular biology, clinical reasoning, clinical documentation, diagnostic assistance, and health search, but a core issue has become increasingly prominent: high model scores do not equate to real-world clinical effectiveness.
In molecular biology, smaller models are challenging the "larger model obsession." MSAPairformer, with only 111 million parameters, outperforms previous leading methods in the ProteinGym benchmark test; GPN-Star, a genomics model with 200 million parameters, surpasses a model with 40 billion parameters. This suggests that the healthcare and life sciences fields do not always require larger general-purpose models—smaller, more specialized models trained on domain-specific data may be more effective.

Virtual cell models have emerged as a new frontier. Arc Institute's Evo 2, STATE, and DeepMind's AlphaGenome all aim to predict cellular responses to drugs and genetic perturbations without wet-lab experiments. If this approach matures, the cost structure of drug discovery and biological experiments will be rewritten. However, at this stage, these systems still require experimental validation—AI cannot yet replace real biological evidence.
In clinical applications, the first tools to gain traction are not the flashiest diagnostic models but those that integrate into doctors' workflows. By 2025, AI tools that automatically generate clinical notes from patient visits had been widely adopted. Across multiple healthcare institutions, doctors report up to an 83% reduction in documentation time and significantly lower burnout, and some institutions have achieved a 112% return on investment.
At the regulatory level, the number of AI medical devices is growing rapidly. In 2025, the U.S. FDA approved 258 AI medical devices, but most were cleared through channels that do not require new clinical trials. The vast majority entered the market via device modification pathways, relying on existing safety and efficacy evidence rather than new randomized trials. Only 2.4% of clinical research devices were supported by randomized trial data. This means the commercialization speed of medical AI has significantly outpaced the accumulation of clinical evidence.
Diagnostic capabilities are also improving. Microsoft's AI Diagnostic Orchestrator (MAI-DxO), paired with OpenAI's o3 model, scored 85.5% on complex medical case studies, compared to just 20% for doctors without assistive tools. Multi-agent frameworks have improved diagnostic accuracy by 7-60% over single-agent baseline models. However, these results must be interpreted cautiously, as the tests often rely on difficult cases from the medical literature and do not fully represent real-world clinical workflows.
Meanwhile, patients are encountering AI-generated health information earlier. Today, 84-92% of health-related Google search results display AI-generated summaries at the top. Symptoms and common health issues are most likely to trigger AI overviews. This means many patients form preliminary understandings of diseases, treatments, and risks through AI before seeing a doctor. The problem is that this information often bypasses formal medical device regulation yet may influence patient decisions.
Therefore, the keywords for medical AI's next phase are not "how powerful the model is" but evidence, governance, and ethics. It requires randomized trials, real-world data, clear clinical accountability boundaries, and more robust ethical discussions. AI's impact on healthcare is already visible, but for true large-scale clinical adoption, it must pass the test of medical evidence systems—not just rely on demonstrations and leaderboards.
Epilogue:
When viewing R&D, technology, governance, economics, science, medicine, and education together, the AI industry in 2025 can no longer be summarized by "model progress" alone.
It is certainly still growing stronger. Model capabilities are rapidly improving, the U.S.-China gap is narrowing, video models are beginning to understand physical laws, agents are completing complex tasks, and AI is entering high-value fields like science, healthcare, and education. However, AI has also become more expensive, more concentrated, less transparent, more infrastructure-dependent, and has brought more governance, energy, employment, and equity challenges.
This marks AI's entry into its second half. The first half was about who could train the most powerful model; the second half will be about who can turn models into stable, reliable, regulatable, commercializable, and sustainable productivity.
True industrial value will not belong solely to models with the most parameters or highest leaderboard rankings but to systems that can complete closed loops in real-world scenarios. They must integrate into corporate workflows, withstand medical evidence scrutiny, pass safety reviews, explain cost-benefit tradeoffs, and create new capabilities in education and the labor market—rather than simply replacing old jobs.
The AI story is accelerating, but it is no longer just a story for tech companies. It is a story about the computing supply chain, global capital flows, changes in scientific paradigms, and the collective participation of doctors, teachers, students, engineers, and everyday users.
The biggest open question for AI's next phase is not whether it can continue to grow stronger but whether society can truly absorb it into a trustworthy, controllable, and equitably distributable form of productivity. Whoever can answer this question will stand at the center of AI's next industrial wave.