06/22 2026

According to recent reports, the AI data unicorn Databricks is raising funds once again, this time at a target valuation of up to $175 billion (approximately RMB 1.26 trillion). With SpaceX going public and OpenAI and Anthropic confidentially filing their IPO prospectuses, Databricks may be the last mega AI unicorn still active in the private market. Databricks is an AI company that helps enterprises manage their data. It raised $5 billion in February and is currently valued at $134 billion (approximately RMB 964.8 billion).
- 01 - Making Enterprise Private Data 'Valuable'
Databricks' story began with a piece of code in a Berkeley lab. In 2013, several researchers from the UC Berkeley AMPLab founded Databricks. Their core technological asset was Apache Spark, a software engine capable of processing massive amounts of data across hundreds or thousands of servers simultaneously. 
Relationship Between Apache Spark and Databricks | Source: Public Information
If data is the ore, Spark is the mining machine, and Databricks has built an entire modern mine around it.
Internet companies, banks, retailers, and automotive companies generate vast amounts of data daily. User clicks, transaction records, inventory changes, sensor signals, log files, customer profiles, and advertising results all pile up in systems. The problem is, the more data there is, the harder it is to process.
Databricks helps enterprises unlock the value of their data. After the AI boom, this position has suddenly become extremely important because large models themselves do not understand a company's specific business.
They do not know in which region a retailer is running low on inventory today, which of a bank's transactions are abnormal, or which batch of an automaker's battery test data has problems.
For models to truly work for enterprises, they must access internal corporate data. However, internal corporate data is often chaotic. Some of it sits in the cloud, some on local servers. Some is in data warehouses, some in business systems. Some is structured in tables, while the rest takes the form of customer service recordings, contract texts, images, and logs.
More troubling still, not all data can be freely used by AI. The financial, healthcare, manufacturing, and retail industries all have strict requirements for permissions, security, and compliance.
This is precisely Databricks' opportunity. It can tell enterprises: You don't need to move all your data again or build AI infrastructure from scratch. You can manage data, train models, deploy AI applications, and establish governance rules on a unified platform, allowing AI to truly utilize your company's own data. In the AI era, the most valuable thing may not be the model itself but the connection layer between the model and real-world business. Databricks is building this connection layer.
- 02 - $5.4 Billion in Annual Revenue
Databricks makes money somewhat differently from traditional software companies. Traditional software is more like selling licenses: an enterprise purchases a system and pays annually so that its employees can use it. Databricks is more like a cloud computing company. Customers do not simply buy a software account; they process data, train models, run AI applications, and consume computing resources on its platform. The more they use, the higher the bill.
Databricks Data Intelligence Platform | Source: Databricks Official Website
This is also what makes Databricks most attractive. An enterprise may initially use it just for data analysis, such as placing sales, inventory, order, and user behavior data on the platform to generate reports, identify trends, and predict demand. Later, the enterprise may start training machine learning models. Then, with the advent of the AI era, the enterprise may want to develop AI assistants, AI agents, intelligent customer service, and risk control systems based on its internal data. Each additional scenario increases Databricks' usage.
Therefore, Databricks does not sell one-time software but a set of 'enterprise data and AI infrastructure.' Its revenue growth comes from two sources.
First, new customers. More and more large enterprises are organizing their data and building AI capabilities, and they purchase Databricks to do so.
Second, existing customers use it more and more. This is even more critical. Databricks discloses a net revenue retention rate exceeding 140%, meaning the same group of existing customers who spent $100 last year may spend over $140 this year.
For investors, this is a very impressive metric because it indicates that customers are not just trying it out and stopping but are using it more deeply and extensively.
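The arithmetic behind net revenue retention can be sketched in a few lines. This is a minimal illustration, assuming the common definition of the metric: this year's revenue from last year's customer cohort, divided by that cohort's revenue last year, with new customers excluded and churned customers counted as zero. The function name, customer names, and spending figures are all hypothetical and chosen only to reproduce the $100 to $140 example above; they are not Databricks data.

```python
def net_revenue_retention(last_year: dict[str, float],
                          this_year: dict[str, float]) -> float:
    """Return net revenue retention (in percent) for last year's customer cohort.

    Only customers present in `last_year` are counted. A customer missing
    from `this_year` is treated as churned (spend of 0). Customers that
    appear only in `this_year` are new and are ignored.
    """
    base = sum(last_year.values())
    if base <= 0:
        raise ValueError("last year's cohort revenue must be positive")
    retained = sum(this_year.get(name, 0.0) for name in last_year)
    return retained * 100.0 / base


# Hypothetical cohort that spent $100 in total last year.
last_year_spend = {"retailer": 40.0, "bank": 35.0, "automaker": 25.0}

# The same customers this year, plus one new customer that must not count.
this_year_spend = {
    "retailer": 65.0,
    "bank": 50.0,
    "automaker": 25.0,
    "new_customer": 30.0,
}

nrr = net_revenue_retention(last_year_spend, this_year_spend)
print(f"Net revenue retention: {nrr:.0f}%")
```

Because the new customer is left out, a figure above 100% reflects expansion within the existing base only, which is why the metric is read as a sign of deepening usage rather than of sales reach.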
There is strong business logic behind this. Once an enterprise's data is integrated into Databricks, it is not just about storing a few tables but about building data pipelines, permission management, analytical models, and AI application development processes on top of it. The sales department uses it, the finance department uses it, the customer service department uses it, and the R&D department uses it. As data and AI applications increase, migration costs also rise, creating strong customer loyalty.
One important reason investors continue to give Databricks high valuations is that it has proven it can not only tell AI stories but also truly make money. The company discloses that its current annual revenue exceeds $5.4 billion. More critically, many customers spend more and more after their initial purchase. Once an enterprise's data, AI models, and business systems are integrated into Databricks, new usage scenarios are continuously added.
For example, a retail enterprise may initially use it just for sales data analysis. Later, it starts training AI models, deploying intelligent customer service, and developing AI assistants. Each new function adds to its bill.
This means Databricks does not rely on constantly finding new customers to make money; instead, existing customers themselves continuously increase their spending. Currently, Databricks has over 800 customers spending more than $1 million annually and over 70 customers spending more than $10 million annually.
For an enterprise software company, this indicates that it has penetrated the core systems of many large companies rather than being just a minor, dispensable tool. This is also the business model investors love most: customers cannot do without it, revenue keeps growing, and as AI becomes more widespread, there is even more room for growth.
- 03 - Becoming the Enterprise's AI Overseer
In the past, enterprises purchased Databricks primarily for data processing. For example, a retail company might want to know which stores perform well, which products are overstocked, and which customers may churn. It could place sales, inventory, membership, and logistics data into Databricks and have the data team perform analysis. This was still the business of traditional data platforms. However, with the advent of AI, Databricks' goals have changed. It not only wants to help enterprises 'understand data' but also 'use AI to mobilize data.'
'Understanding data' is primarily used by data analysts, engineers, and business leaders. It solves reporting, forecasting, and analytical problems. 'Using AI to mobilize data' means that every ordinary employee can directly interact with company data. Salespeople can ask: What has this customer purchased in the past? Customer service can ask: Why is this user's order delayed? Finance can ask: Which expenses are abnormal? Supply chain personnel can ask: Which warehouse may run out of stock? 
Databricks Official Website: Your data. Your AI. Your future. | Source: Databricks Official Website
In the past, these questions required the data team to write SQL, generate reports, and build dashboards. In the future, Databricks hopes AI agents will handle these tasks directly. This is why it has launched products like Genie One and Agent Bricks.
Databricks is not aiming to create an ordinary chatbot but an AI assistant capable of accessing an enterprise's real data, understanding business contexts, and helping employees make decisions. In other words, while OpenAI and Anthropic develop general-purpose large models, Databricks wants to create 'business-savvy AI' within enterprises.
No matter how powerful a large model is, without access to an enterprise's internal data, it remains just an external tool. No matter how advanced an AI agent is, without permission management, data governance, cost control, and security systems, it is difficult to integrate into core business operations.
Databricks aims to cover all of these, becoming the unified operating layer for enterprise AI. Upward, it can provide AI assistants that let employees interact directly with company data. Downward, it can provide databases that connect business systems and AI systems. Horizontally, it can expand into scenarios such as marketing, security, customer service, and developer tools.
It can also manage AI costs. As enterprises make extensive use of AI agents, bills become increasingly difficult to predict. An employee, an agent, or an automated process may call models continuously in the background and run up significant costs.
By launching AI spending control tools, Databricks is essentially aiming to become the 'master valve' for enterprise AI budgets.
This is similar to the early days of cloud computing. Initially, enterprises simply moved their servers to the cloud. Later, cloud vendors sold not just servers but also databases, data warehouses, AI services, security services, development tools, and cost management tools. The more customers used, the harder it was for them to leave. Databricks wants to follow the same path.
This article does not constitute any investment advice.