The Surge of AI Paid Services Fuels the Boom of Computing Power Leasing

05/11 2026

What is driving this unprecedented surge in demand for computing power? The answer can be summed up simply: an exponential explosion in token usage.

According to the latest estimates from OpenRouter, as of early April 2026, the total global usage of tokens in AI large models has reached 27 trillion, representing an 18.9% month-over-month increase. Meanwhile, China's token usage in AI large models has surpassed that of the United States for five consecutive weeks. More tokens consumed means more servers at work, more GPUs heating up, and more electricity being burned—and behind all these 'mores' lies the fundamental growth logic of the computing power leasing business.

The concept of 'leasing instead of buying' is rapidly gaining popularity among SMEs. A single NVIDIA A100 graphics card costs approximately RMB 100,000 to 150,000, while the H100 can go as high as RMB 200,000 to 300,000. For a startup, the hardware procurement costs alone can be crippling. Through leasing, companies can significantly reduce their initial investment by over 70% and can upgrade to the latest hardware as needed. In China, some computer equipment leasing companies have even seen high-performance models sell out 'the moment they hit the shelves.' This phenomenon proves that the core value of computing power leasing is no longer just about 'saving money' but providing a viable path for SMEs to keep up with technological advancements at lower costs and risks.
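As a rough illustration of the "leasing instead of buying" arithmetic, the sketch below compares the upfront outlay for a small 8-card A100 node. The card price is the midpoint of the range quoted above; the monthly rent and deposit terms are hypothetical placeholders for illustration, not quoted figures.

```python
# Back-of-envelope comparison of upfront cost: buying vs. leasing a small
# 8-card A100 node. Card price is from the range cited above; the rental
# terms are assumptions, not vendor pricing.

CARD_PRICE_RMB = 120_000          # assumed mid-range A100 price (RMB 100k-150k)
CARDS = 8
buy_upfront = CARD_PRICE_RMB * CARDS   # hardware only; ignores servers, power, cooling

# Leasing: assume a hypothetical monthly rate plus a 3-month deposit upfront.
MONTHLY_RENT_PER_CARD = 7_000     # hypothetical RMB per card per month
lease_upfront = MONTHLY_RENT_PER_CARD * CARDS * 3

savings = 1 - lease_upfront / buy_upfront
print(f"buy upfront:   RMB {buy_upfront:,}")
print(f"lease upfront: RMB {lease_upfront:,}")
print(f"initial-outlay reduction: {savings:.0%}")  # well over 70% under these assumptions
```

Under these placeholder numbers the initial outlay falls by more than 70%, consistent with the figure above; the exact percentage depends entirely on the rental rate and deposit terms a lessor actually offers.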

01

Domestic and International Players Are Stepping Up Their Efforts

If booming demand paints a broad prospect for computing power leasing, then the recent spate of high-profile deals tells the market directly: this sector is far deeper and more diverse in its participants than imagined, with much heavier hitters involved.

On May 4, ByteDance's AI application Doubao added a paid version service statement to the App Store, introducing three tiers of value-added services on top of its free version: Standard at RMB 68/month on a recurring basis, Enhanced at RMB 200/month on a recurring basis, and Professional at RMB 500/month on a recurring basis. The paid features primarily focus on high-complexity productivity scenarios such as PPT generation, data analysis, and film and television production.

Why is Doubao choosing to charge fees now? Three key figures explain it: Doubao's average daily token consumption has reached 120 trillion, a 1,000-fold increase since its initial launch in May 2024, and it doubled in just the first three months of 2026. Of ByteDance's planned approximately RMB 160 billion in capital expenditures for 2026, nearly half is expected to go toward AI chip procurement. According to industry estimates, demand for inference computing power is already 10-15 times that of the training phase. In other words, the more paying users there are, the higher the usage frequency, and the more complex the tasks, the faster computing power is consumed. The downstream transmission effect of this closed loop is that computing power leasing firms are swamped with orders.
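Those growth figures imply steep compounding. A quick back-of-envelope check, assuming roughly 24 months between the May 2024 launch and the present:

```python
# Implied monthly growth rates behind the Doubao token figures quoted above:
# "1,000-fold since May 2024" (taken as ~24 months) and "doubled in the
# first three months of 2026". Illustrative arithmetic only.

months_since_launch = 24                              # May 2024 -> May 2026, approx.
avg_monthly_growth = 1000 ** (1 / months_since_launch) - 1
print(f"avg monthly growth since launch: {avg_monthly_growth:.0%}")   # ~33%/month

recent_monthly_growth = 2 ** (1 / 3) - 1              # doubling over three months
print(f"recent monthly growth: {recent_monthly_growth:.0%}")          # ~26%/month
```

Even the "slower" recent pace of roughly 26% per month would, if sustained, multiply token consumption about 16-fold in a year, which is the compounding pressure the paragraph above describes.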

If Doubao's move to charge fees reflects the demand-driven pressure from AI users and commercialization, then Anthropic's major computing power leasing deal with SpaceX represents a tangible reshuffling of the computing power supply landscape. On May 6, AI unicorn Anthropic officially announced that it had leased the entire computing power capacity of SpaceX's Colossus 1 data center in Memphis, Tennessee, USA.

What does this mean? Colossus 1 is equipped with over 220,000 NVIDIA GPUs, including H100, H200, and even next-generation GB200 accelerators, with a total power consumption of 300 megawatts—equivalent to the electricity consumption of a medium-sized city with about 200,000 to 300,000 households. The data center was planned and completed in just 122 days, fully demonstrating SpaceX's astonishing coordination capabilities in engineering construction and chip procurement. For Anthropic, this agreement means acquiring over 300 megawatts of additional computing power within just one month, bringing its total computing power from less than 100,000 H100 equivalents to a level on par with or even surpassing major competitors like OpenAI and Google DeepMind. For SpaceX, this is a multifaceted and ingenious commercial move: Colossus 1, originally built for xAI to train Grok, had been left idle after xAI was dissolved and merged into SpaceX, which renamed it SpaceXAI. Leasing it to Anthropic not only directly monetizes the idle top-tier computing power but also provides a highly persuasive cash flow story for its upcoming IPO. The two companies are even discussing collaborating on developing 'multi-gigawatt orbital AI computing power,' accelerating the engineering steps to deploy data centers in low-Earth orbit.

Notably, Anthropic is not putting all its eggs in SpaceX's basket when it comes to leasing computing power. Just a month earlier, Anthropic had already signed computing power supply agreements totaling approximately 5 gigawatts with Amazon and Google/Broadcom, respectively. The agreement with Google alone is estimated to involve an investment of nearly USD 200 billion over five years. Combined with its previous USD 30 billion computing power contract with Microsoft Azure, Anthropic's total commitments in computing power now amount to hundreds of billions of dollars. These staggering figures reveal a reality: the competition among top AI models is not just a race of algorithms and data but a pure war of computing power consumption. And in this war, the most formidable players are not the model developers themselves but the 'computing power kings' who possess massive GPU resources and have superior supply chain integration capabilities.

Shifting the focus back to China, the player landscape in the computing power leasing market is also rapidly taking shape. Core players fall into three broad categories. The first consists of IDC/AIDC vendors with deep ties to core customers, such as Sinnet Technology, Ranzhi Technology, and Aofei Data, which have established themselves through node reserves and expansion capabilities. The second includes cloud service providers with global scheduling, management, and self-build capabilities: Alibaba, Baidu, and Tencent have all announced multiple rounds of price increases for their AI computing power products, with hikes ranging from 5% to 34%. The third comprises computing power leasing firms and companies pivoting in from other sectors, which bring differentiated resource-integration capabilities and are now entering a phase of accelerating earnings realization. Xiechuang Data is one example: in the first quarter of 2026, it posted a net profit attributable to shareholders of RMB 750 million, up 343% year-over-year. Litong Electronics reported a net profit attributable to shareholders of RMB 270 million for the same period, up 821% year-over-year. On May 5, 2026, Dongyangguang announced that its controlled subsidiary had signed a framework contract for computing power service procurement with a total order value of RMB 16 billion to 19 billion. Overseas, emerging cloud provider CoreWeave has raised its capital expenditure plan from USD 10.3 billion to USD 30-35 billion, with its order backlog nearing USD 96 billion, and Oracle's 4.5 GW computing power leasing agreement with OpenAI has pushed its 2026 capital expenditures past USD 50 billion.

02

From 'Selling Computing Power' to 'Selling Tokens': The Business Model Has Changed

The rapid ascent of computing power leasing from a 'niche business' to a 'core asset' hinges on a generational shift in its business logic. Five years ago, such deals were akin to simple 'server subletting': platforms procured a batch of GPUs, listed them online at an hourly rate, and clients used them on demand, much like a vending-machine-style commodity transaction. Today, however, the industry is undergoing a profound transformation of its business model.

One of the most visible changes comes from the pricing side. According to industry monitoring data, the price of a one-year lease contract for an H100 GPU surged from a low of USD 1.70 per card per hour in October 2025 to USD 2.35 in March 2026, a nearly 40% increase. Meanwhile, rental prices for high-end GPUs like NVIDIA's H200 and H100 have generally risen by 15% to 30% month-over-month—the H200 now rents for RMB 7.5 to 8.0 per card per hour, or RMB 60,000 to 66,000 per month, a 25% to 30% increase; the H100's monthly rent has risen to RMB 55,000 to 60,000, a 15% to 20% increase. More tellingly, delivery lead times have extended significantly to the second quarter of 2027 (for H200) and the first quarter of 2027 (for H100). This lengthening of the timeline is itself the truest signal of market tightness.

Behind these price hikes lies a more fundamental shift in demand structure—computing power leasing is transitioning from 'selling computing power' to 'selling tokens.' This is a seemingly simple yet far-reaching transformation. In the past, users purchased 'GPU runtime,' and enterprises paid by the hour, with actual utilization entirely dependent on the users themselves. Today, as AI applications fully enter the inference-driven phase, a large number of developers, enterprises, and individuals are essentially 'consuming tokens' when calling large models—every API call, every intelligent agent interaction, and every complex task inference consumes a massive volume of tokens. Demand-side players no longer care how many GPU hours they use; they only care how many tokens the model can generate and how many tasks it can complete. Correspondingly, the service delivery model of computing power leasing firms has also upgraded from 'bare computing power rental' to 'model service as computing power' and even to 'token revenue-sharing models.'
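To make the shift from "selling computing power" to "selling tokens" concrete, here is a minimal sketch of converting an hourly card rate into an effective per-token cost. The USD 2.35 hourly rate echoes the market figure cited earlier; the throughput number and the `GpuLease` abstraction are hypothetical illustrations, not measured values or any vendor's actual pricing model.

```python
# Minimal sketch of "selling tokens": translating an hourly GPU rate into
# an effective cost per million generated tokens. Throughput is a
# hypothetical placeholder for a mid-size model served on one card.
from dataclasses import dataclass

@dataclass
class GpuLease:
    hourly_rate_usd: float        # what the lessor charges per card-hour
    tokens_per_second: float      # assumed sustained inference throughput per card

    def cost_per_million_tokens(self, utilization: float = 1.0) -> float:
        """Effective cost of generating 1M tokens at a given utilization."""
        tokens_per_hour = self.tokens_per_second * 3600 * utilization
        return self.hourly_rate_usd / tokens_per_hour * 1_000_000

# Hypothetical H100 lease at the hourly rate quoted above.
lease = GpuLease(hourly_rate_usd=2.35, tokens_per_second=2_500)
print(f"at 100% util: ${lease.cost_per_million_tokens(1.0):.3f} per M tokens")
print(f"at 40% util:  ${lease.cost_per_million_tokens(0.4):.3f} per M tokens")
```

Under these assumptions, full utilization works out to roughly USD 0.26 per million tokens, and every drop in occupancy raises the effective token cost proportionally. This is why token-based billing pushes lessors to obsess over cluster utilization rather than just rack counts.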

However, behind the glossy facade of this business logic lie non-negligible concerns. Some have used the current H100 rental rate of USD 2.35 per hour and an optimistic assumption of 100% occupancy to spin tales of 'recouping costs in four months.' For startups and fringe players, such calculations often harbor traps. Industry insiders point out that the more realistic accounting is this: GPU assets fully recoup their costs in about 2.5 years. A 2.5-year payback period is quite respectable in capital-intensive industries, but it comes with caveats: you must secure orders for today's scarcest high-end GPUs, your data center must have sufficient power capacity and cooling, and your customer relationships must rest on long-term lock-in contracts of 3 to 5 years rather than one-off transactions. Against the backdrop of the four major tech companies (Meta, Alphabet, Microsoft, Amazon) planning a combined USD 725 billion in AI capital expenditures for 2026, small and medium-sized computing power leasing participants are inherently at a disadvantage in supply chain negotiations. It is foreseeable that resources and orders in the computing power leasing sector will further concentrate among a few top players, with a 'winner-takes-all' landscape quietly emerging.
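The gap between the optimistic pitch and the multi-year reality is largely a matter of assumptions. The sketch below varies occupancy and operating-cost share; only the USD 2.35 hourly rate comes from the text above, while the all-in capex figure and the opex share are assumptions for illustration.

```python
# Payback-period arithmetic for a leased-out GPU. Capex and opex shares are
# illustrative assumptions; only the hourly rate is from the market data above.

def payback_months(capex_usd: float, hourly_rate: float,
                   utilization: float, opex_share: float) -> float:
    """Months to recoup capex from rental income net of operating costs."""
    hours_per_month = 24 * 30
    gross = hourly_rate * hours_per_month * utilization
    net = gross * (1 - opex_share)            # power, cooling, staff, bandwidth
    return capex_usd / net

H100_CAPEX = 30_000   # assumed all-in cost per deployed H100 (card + server share)
RATE = 2.35           # USD per card-hour, from the figure cited above

# Rosy case: full occupancy, operating costs ignored entirely.
print(f"rosy case:    {payback_months(H100_CAPEX, RATE, 1.0, 0.0):.1f} months")

# More conservative: 60% occupancy, 40% of revenue consumed by opex.
print(f"conservative: {payback_months(H100_CAPEX, RATE, 0.6, 0.4):.1f} months")
```

Even before touching the hourly rate, moving from the rosy case to the conservative one roughly triples the payback period, and the assumed capex per card shifts the answer further still, which is why published payback claims in this market vary so widely.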

03

Breaking Through in Computing Power Requires Answering Three Must-Ask Questions

If the previous sections discussed the 'present' of computing power leasing, then what truly determines the future trajectory of this industry are three deeper, longer-term structural questions.

The first question is the acceleration of technological generational competition. NVIDIA is updating its product roadmap at a breathtaking pace: After the Blackwell platform, the Vera Rubin platform is slated to ship in the second half of this year, promising a 10-fold improvement in performance per watt over the previous generation; Rubin Ultra is expected in 2027, and Feynman is planned for 2028. Each leap in GPU computing power means the book value of older-generation servers can quickly depreciate, posing non-negligible depreciation and asset impairment risks for computing power leasing firms that have invested hundreds of billions in fixed assets. But precisely because of this, the leasing model demonstrates stronger commercial resilience—only through large-scale, rapidly iterating cluster operations can the costs of hardware upgrades be smoothly spread across a large customer base, turning the negative impact of technological iteration into flexibility and scale advantages. Mercedes-Benz's F1 team doesn't sit out this year's championship just because it will switch to a new engine next year; similarly, AI companies won't forgo leasing today's available GPUs just because next year's computing power hardware will be stronger. In a world of accelerating technology, computing power leasing can turn a profit rather than incur losses—tenants don't have to worry about asset depreciation, while lessors can absorb iteration costs through economies of scale.

The second question is the tug-of-war between the 'computing power divide' and localization. A rather darkly humorous reality is unfolding at some smart computing centers in China: cabinets equipped with NVIDIA GPUs run at over 90% occupancy, while those with domestic GPUs remain largely idle. The core bottleneck is not the hardware itself but the software ecosystem. CUDA has accumulated more than a decade of developer toolchains and ecosystem advantages, while domestic GPUs still lag noticeably in framework adaptation and compilation optimization. Huatai Securities calls 2026 the 'Year Zero of domestic super-nodes,' an attempt to bridge the single-chip gap through system-level reconstruction of super-node architectures and to cut communication overhead so that theoretical computing power converts into usable throughput. But this path will undoubtedly take time. While the computing power leasing industry enjoys the dividends of explosive demand for large models, it must also confront the structural dilemma of 'chip shortage and soul deficiency.' The Ministry of Industry and Information Technology has already issued the 'Notice on Launching a Special Initiative for Inclusive Computing Power to Empower the Development of Small and Medium-Sized Enterprises,' deploying innovative offerings such as 'computing power banks' and 'computing power supermarkets' that let SMEs deposit idle computing power resources. This policy guidance aims to break down the heavy-asset barriers of computing power, but if the foundational computing power base remains dominated by overseas high-end chips, so-called 'inclusive computing power' will still be exposed to external supply chain uncertainties at a deep level.

The third question is the synergy between computing power leasing and the AI transformation of real-world industries. The current market frenzy is primarily driven by demand from internet-based large model companies, with exponential token growth concentrated in cloud applications and intelligent agents. But the true golden age of computing power has yet to arrive: if smart manufacturing, medical imaging, autonomous driving, digital twins, and other real-world industries fully embrace AI, the ensuing surge in computing power demand will multiply several times over. Against this backdrop, the computing power leasing industry faces a choice. Should it continue down the current path of chasing top-tier clients, acting as a mere 'arms dealer' of computing power among internet giants? Or should it proactively go deeper, collaborating with industry after industry to make computing power a true public infrastructure for the national economy? The outcomes differ greatly: the former offers 'quick money,' while the latter promises 'big money,' and the path to big money requires patience, industry knowledge, and cross-domain collaboration capabilities.

The era of free AI is accelerating toward its close, and computing power, as the most in-demand and scarcest production factor in the AI world, is turning its lessors into the most certain beneficiaries of this industrial wave. Not everyone can cash in on the algorithmic breakthroughs of large models, nor can every company afford to build multi-billion-dollar computing power clusters in-house. But nearly every enterprise that wants to deploy AI applications will pass through the computing power leasing path of 'monthly payments, token-based billing, and on-demand scaling.' As the wave of paid AI services propagates from consumer applications to enterprise infrastructure, computing power leasing is no longer a niche business in an obscure sub-sector but an indispensable 'utility' in the commercial closed loop of the entire AI industry.
