NVIDIA-Backed AI Unicorn Secures $1.5 Billion Funding: Revenue Soars 2000%

06/25 2026

AI infrastructure provider Baseten has recently secured an additional $1.5 billion (approximately RMB 10.1 billion) in funding, propelling its valuation to $13 billion (approximately RMB 88.2 billion).

Baseten does not train large AI models on its own; rather, it assists enterprises in running various AI models in a stable, cost-effective, and efficient manner.

In just 18 months, Baseten has successfully completed four rounds of funding. The company reports that its revenue has increased twentyfold over the past year. Several media outlets have reported that its annualized revenue reached $600 million in the first quarter of this year.

- 01 -

Valuation Soars 2.5 Times in Six Months

Founded in 2019 and headquartered in San Francisco, Baseten initially focused on machine learning applications for fraud detection, abuse identification, and user-generated content processing. However, for the first three years, its revenue was "negligible." By the end of 2022, following the release of ChatGPT, Baseten decided to shift its focus to helping clients simplify the deployment of large language models.

Companies developing real AI applications do not always require the most powerful and expensive closed-source models. What they need is to run their products reliably, at a controlled cost, and at a stable speed, using the appropriate model for each scenario.

For instance, an AI programming tool might utilize a cutting-edge model for complex code generation while employing open-source or self-developed models for simpler tasks such as completion, retrieval, classification, and context organization. Similarly, an AI sales tool might distribute tasks among various models.
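The split described above can be sketched as a simple router. This is only an illustrative sketch: the tier names, the `Request` type, and the `route` function are hypothetical and do not come from Baseten or any of the products mentioned, and the fallback to the stronger model for unrecognized tasks is an assumption.

```python
from dataclasses import dataclass

# Hypothetical tier labels, not real model identifiers.
FRONTIER = "frontier-model"
LIGHTWEIGHT = "open-source-model"

# Simpler task types named in the paragraph above.
SIMPLE_TASKS = {"completion", "retrieval", "classification", "context_organization"}


@dataclass(frozen=True)
class Request:
    task: str
    prompt: str


def route(request: Request) -> str:
    """Pick a model tier for a request based only on its task type."""
    if request.task in SIMPLE_TASKS:
        return LIGHTWEIGHT
    # Assumption: complex or unrecognized tasks go to the stronger model.
    return FRONTIER


if __name__ == "__main__":
    for task in ("code_generation", "completion", "classification"):
        print(task, "->", route(Request(task=task, prompt="...")))
```

A real product would likely route on more than the task label (prompt length, latency budget, customer tier), but the principle is the same: reserve the expensive model for the requests that need it.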

The capabilities of open-source models are continuously improving. Models like Llama, Qwen, and DeepSeek offer enterprises more options. However, open-source models are not yet ready-to-use products capable of serving millions of users. Enterprises must still address numerous challenges: sourcing GPUs, deploying models, optimizing throughput, reducing latency, managing traffic fluctuations, monitoring failures, controlling costs, and protecting data.

Baseten provides the system to "bring models into production." It procures computing power from multiple cloud providers and offers it to clients through its proprietary software stack, which handles scheduling, optimization, and delivery. Clients do not need to scramble for GPUs or rebuild an entire inference platform.

Many of Baseten's clients are rapidly growing AI application companies, including Abridge, Clay, Cursor, Lovable, Mercor, and OpenEvidence. These companies operate in healthcare, sales, programming, recruitment, enterprise software, and other fields, but they all have one thing in common: their products heavily rely on model calls.

This business model is reminiscent of the early days of cloud computing.

When internet companies experienced rapid growth, Amazon Web Services did not sell individual websites but rather the servers, storage, databases, and elastic computing capabilities behind them. When the mobile internet exploded, cloud providers reaped the rewards of the entire app ecosystem. Today, the more AI applications there are and the more frequently models are called, the greater the demand for inference infrastructure.

Baseten aims to become the infrastructure layer of the AI inference era, and that positioning is the fundamental reason for its rapidly rising valuation.

In January, Baseten announced a $300 million funding round, valuing the company at $5 billion. A few months later, a new round pushed its valuation to $13 billion, or 2.6 times the January figure, in roughly six months.

- 02 -

Three Major Risks

Baseten's rise is occurring against a broader backdrop: AI companies are beginning to take cost calculations seriously.

Inference is not a one-time investment but an ongoing expense. The more popular an AI product becomes, the more model calls it generates, and the higher its costs may rise. If every call relies on the most expensive closed-source model, scaling the product could lead to greater losses.

This compels AI application companies to adopt more flexible model strategies—not every task requires the most powerful model. Whether an AI application can be profitable largely depends on whether inference costs can be reduced.
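The cost argument in the two paragraphs above can be made concrete with a rough calculation. Every number below (call volume, tokens per call, prices per million tokens, the 20% share sent to the expensive model) is a made-up assumption for illustration, not a figure from Baseten or any model provider.

```python
def monthly_inference_cost(calls: float, tokens_per_call: float,
                           price_per_million_tokens: float) -> float:
    """Cost in dollars of serving `calls` requests at a flat per-token price."""
    return calls * tokens_per_call * price_per_million_tokens / 1_000_000


def blended_cost(calls: float, tokens_per_call: float, frontier_share: float,
                 frontier_price: float, cheap_price: float) -> float:
    """Cost when only `frontier_share` of calls go to the expensive model."""
    frontier_calls = calls * frontier_share
    cheap_calls = calls - frontier_calls
    return (monthly_inference_cost(frontier_calls, tokens_per_call, frontier_price)
            + monthly_inference_cost(cheap_calls, tokens_per_call, cheap_price))


if __name__ == "__main__":
    # Hypothetical workload: 10 million calls a month, 2,000 tokens each.
    calls, tokens = 10_000_000, 2_000
    # Hypothetical prices in dollars per million tokens.
    frontier_price, cheap_price = 10.0, 0.5

    all_frontier = monthly_inference_cost(calls, tokens, frontier_price)
    mixed = blended_cost(calls, tokens, 0.2, frontier_price, cheap_price)
    print(f"all calls on the expensive model: ${all_frontier:,.0f}")
    print(f"20% expensive, 80% cheap:         ${mixed:,.0f}")
```

Because cost scales linearly with call volume, doubling usage doubles the bill in either case; what the mixed strategy changes is the slope of that line.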

As AI integrates into real workflows, the demand for inference becomes enormous. Optimizing inference is not a problem that model companies can solve alone, nor is it something that ordinary cloud providers naturally excel at. It requires a layer of system software between models, GPUs, cloud providers, and applications. The more complex this layer becomes, the more likely new infrastructure companies will emerge.

Baseten is seizing this opportunity. It enables AI products to become viable businesses.

However, inference infrastructure may be one of the most attractive—and ruthless—sectors in the AI industry.

The first risk is competition.

Baseten is not the only company targeting the inference opportunity. Fireworks AI, Together AI, Modal, Replicate, Groq, Cerebras, and major cloud providers are all vying for this market. For example, Groq was reported to be seeking up to $650 million in funding in May 2026; Together AI planned to raise $1 billion, valuing it at $7.5 billion.

The second risk is gross margins.

Inference infrastructure companies may appear to be software companies, but they are not entirely software-based. They require substantial GPU and cloud resources. Computing costs, procurement prices, utilization rates, and client pricing all affect profits.

If GPUs remain scarce and procurement costs stay high, while clients continue to demand price cuts, inference platforms' gross margins will come under pressure. Especially as competition intensifies and everyone uses low prices to attract clients, this business could become a capital-intensive service with "high revenue but low profits."

This is why inference optimization capabilities are so crucial.

Only efficient software lets the same GPU serve more requests, sit idle less, deliver higher throughput, and respond with lower latency. Inference infrastructure is not just about reselling computing power but about squeezing out profits through software efficiency.
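A back-of-the-envelope sketch shows why utilization matters so much for margins. The GPU price, peak throughput, and per-request price below are hypothetical assumptions chosen to make the arithmetic easy to follow; they are not Baseten's figures.

```python
def cost_per_request(gpu_hour_cost: float, peak_requests_per_hour: float,
                     utilization: float) -> float:
    """GPU cost attributed to each request at a given utilization (0 < u <= 1)."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    served_per_hour = peak_requests_per_hour * utilization
    return gpu_hour_cost / served_per_hour


def gross_margin(price_per_request: float, gpu_hour_cost: float,
                 peak_requests_per_hour: float, utilization: float) -> float:
    """Fraction of revenue left after GPU cost; other costs are ignored."""
    cost = cost_per_request(gpu_hour_cost, peak_requests_per_hour, utilization)
    return (price_per_request - cost) / price_per_request


if __name__ == "__main__":
    # Hypothetical: a $2/hour GPU that can serve 10,000 requests/hour at peak,
    # with each request billed at $0.0005.
    for utilization in (0.4, 0.6, 0.8):
        margin = gross_margin(0.0005, 2.0, 10_000, utilization)
        print(f"utilization {utilization:.0%}: gross margin {margin:.0%}")
```

Under these assumptions the platform only breaks even at 40% utilization and keeps half its revenue at 80%, with the same hardware and the same price. Better scheduling and batching raise either utilization or peak throughput, which is where software efficiency turns into margin.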

The third risk is rapid technological change.

AI models, chips, compilers, inference frameworks, and open-source ecosystems are all evolving rapidly. Today's optimal deployment method may become outdated in a few months.

This presents both an opportunity and a challenge for Baseten.

The opportunity lies in the fact that the faster technology changes, the more reluctant clients become to maintain complex infrastructure themselves, preferring to outsource to specialized companies. The challenge is that Baseten must always stay ahead. If its technology stack falls behind new models, chips, or demands, clients may quickly switch to other platforms.

- 03 -

NVIDIA Invests

Baseten has an important shareholder: NVIDIA. In January 2026, NVIDIA became a major investor in Baseten's Series E round, reportedly putting in $150 million.

This investment appears to have been prompted by DeepSeek.

In early 2025, DeepSeek suddenly gained popularity. Its impact on the U.S. tech market was not just that "China can also build powerful models" but something more significant: powerful models could be made incredibly affordable.

This struck a chord with NVIDIA. For years, NVIDIA's core narrative had been: as AI grows stronger, it needs more and more GPUs.

But after DeepSeek emerged, the market began to question: If model training costs can drop significantly, if fewer chips can produce nearly top-tier models, can NVIDIA continue its rapid growth?

At one point, this question sent NVIDIA's stock price plummeting.

However, NVIDIA's subsequent investment in Baseten suggests it sees another direction: cheaper models may not reduce demand for computing power but could amplify it.

The reason is straightforward.

When AI models were expensive, only a few large companies could afford to use them. But when models like DeepSeek drive prices down and open-source models grow stronger, more startups, SMEs, and vertical industries will adopt AI.

Baseten provides AI inference infrastructure, helping enterprises deploy open-source, self-developed, and custom models into production environments while maximizing speed and minimizing costs.

At this point, the industry's real bottleneck shifts: it's no longer just "who can train the most powerful model" but "who can run thousands of models stably, affordably, and quickly."

Baseten assists enterprises in deploying and scheduling GPUs, controlling costs, reducing latency, and handling traffic spikes. Many companies will use different models; many models will run in different scenarios; many inference tasks will continuously consume GPUs.

At this stage, NVIDIA needs not just big clients like OpenAI and Microsoft but also infrastructure companies like Baseten that bring more open-source, specialized, and enterprise models onto NVIDIA GPUs.

NVIDIA's investment in Baseten is essentially laying the groundwork for its next phase of growth.

This article does not constitute any investment advice.
