12/20 2024
Advanced storage has emerged as the cornerstone infrastructure driving digital and intelligent transformation.
With the rapid advancement of artificial intelligence and the ongoing digital and intelligent transformation, the significance of storage has become increasingly evident.
As data volumes soar and the need for real-time analysis in large model training and business operations intensifies, traditional data centers are proving inadequate. Storage systems must now evolve into advanced data infrastructures offering higher throughput, lower latency, and more efficient data management.
IDC, in its whitepaper titled "Building Advanced Storage Centers for the Age of Intelligence," underscores the necessity of "moderately advancing the construction of advanced storage centers."
01
In the Era of AI, Storage is a First-Class Citizen
IDC predicts that China will generate a total of 39.5ZB of data in 2024, a figure that will rise to 97.1ZB by 2028.
If that scale is hard to picture, consider this: to store 1ZB of data, you would need 1 billion mobile phones, each with 1TB of storage. Data is experiencing explosive growth. From the internet to the mobile internet, and on to the Internet of Things and artificial intelligence, the volume of data generated each day keeps mounting, encompassing not just structured data but also vast amounts of unstructured and semi-structured data.
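The phone comparison and the IDC figures can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, where 1TB is 10^12 bytes and 1ZB is 10^21 bytes; the function names are illustrative only and do not come from the IDC whitepaper.

```python
# Assumption: decimal (SI) units, as is usual for storage capacity figures.
TB = 10**12  # bytes in one terabyte
ZB = 10**21  # bytes in one zettabyte


def phones_needed(total_bytes: int, phone_bytes: int = TB) -> int:
    """Number of phones of the given capacity needed to hold total_bytes."""
    return -(-total_bytes // phone_bytes)  # ceiling division on integers


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    return (end / start) ** (1 / years) - 1


phones_for_one_zb = phones_needed(ZB)
# 39.5ZB (2024) and 97.1ZB (2028) are the IDC figures quoted in the article.
growth_2024_2028 = implied_annual_growth(39.5, 97.1, 2028 - 2024)

print(f"Phones (1TB each) per ZB: {phones_for_one_zb:,}")
print(f"Implied annual growth, 2024 to 2028: {growth_2024_2028:.1%}")
```

Under these assumptions, the two IDC data points imply compound growth of roughly 25% a year.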
This massive data influx places significant demands on storage systems. Analyzing government and enterprise tenders and procurements, Digital Frontier found hundreds of storage-related projects in the first eight months of this year, primarily in sectors such as finance, manufacturing, energy, telecom operators, and transportation. The storage market is thriving amid the popularity of large models.
Furthermore, much of this stored data sits unused, leaving its potential value untapped. Current AI and business workloads, by contrast, require high-frequency, high-speed, large-bandwidth reads and writes in real time. Storage systems must therefore deliver high bandwidth, low latency, and high concurrency to support rapid read/write operations and real-time data analysis.
"The rise of AI rediscovers the value of data," notes Guo Zhaobin, Vice President of Sugon Storage. Previously, storage was a passive response to upper-layer demands, but in the era of digital intelligence, data's value has soared. "People once viewed data as static, but now, through iterative training, it generates intelligence and new data, garnering attention," he adds.
Historically, there was a tendency to "emphasize computing power over storage capacity," leading to the construction of numerous GPU-based intelligent computing infrastructures while neglecting advanced storage. Early last year, NVIDIA released the budget allocation for an AI data center, with storage accounting for merely 20%.
Yet a growing number of people recognize that storage performance significantly affects computing power.
Particularly during large model training, addressing computing power and data issues alone is insufficient. Many enterprises struggle to operate at full GPU capacity, often encountering network and storage bottlenecks, causing delays and waste, which hampers overall model training efficiency.
While traditional businesses typically require storage performance in the hundreds of gigabytes, large model training demands have surged into the terabytes. Whether loading massive training data, resuming training from petabyte-scale checkpoints, or conducting high-concurrency inference and question-answering, storage performance directly influences GPU utilization throughout the training and inference process. In clusters with tens of thousands of GPUs, poor storage performance significantly increases GPU idle time, leading to immense resource waste.
A report suggests that, given the same GPU computing power, variations in storage performance can result in severalfold differences in model training cycles.
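The link between storage throughput and GPU idle time can be illustrated with a toy model. The sketch below is an illustrative simplification, not a method taken from the report: every number in it (batch size, compute time per step, storage bandwidths) is hypothetical, and it assumes each training step must read one batch from storage, either overlapped with compute (prefetching) or not.

```python
GB = 10**9  # bytes, decimal units


def step_load_seconds(batch_bytes: float, storage_bytes_per_s: float) -> float:
    """Time to read one training batch at a given storage throughput."""
    return batch_bytes / storage_bytes_per_s


def gpu_utilization(compute_s: float, load_s: float, overlapped: bool = True) -> float:
    """Fraction of each step the GPU spends computing.

    overlapped=True models prefetching: loading hides behind compute,
    so the step takes max(compute, load). Otherwise the two add up.
    """
    step_s = max(compute_s, load_s) if overlapped else compute_s + load_s
    return compute_s / step_s


# Hypothetical workload: 20 GB of data per step, 0.5 s of GPU compute per step.
batch_bytes = 20 * GB
compute_s = 0.5

for bandwidth in (10 * GB, 40 * GB, 160 * GB):  # hypothetical storage throughputs
    load_s = step_load_seconds(batch_bytes, bandwidth)
    util = gpu_utilization(compute_s, load_s)
    step_s = max(compute_s, load_s)
    print(f"{bandwidth // GB:>4} GB/s: step {step_s:.3f}s, GPU utilization {util:.0%}")
```

In this toy setup the slowest storage tier stretches each step from 0.5s to 2s, so the same GPUs would need four times as long to finish the same training run, which is the kind of severalfold gap the report describes.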
Moreover, compared to traditional AI demands, large model scenarios involve vast data volumes, large parameter scales, and extended training periods. These scenarios demand higher storage capacity and emphasize extreme performance attributes such as high throughput, high IOPS, high bandwidth, and low latency.
Storage systems are evolving into data platforms offering higher throughput, lower latency, and more efficient data management. Traditional data centers are undergoing upgrades, with one emerging form being the construction of advanced storage centers that harmonize storage and computing.
Guo Zhaobin emphasizes that, in the era of digital intelligence, storage is no longer a passive component but develops in tandem with computing power.
Qian Depei, an academician of the Chinese Academy of Sciences, asserts bluntly, "Storage is also a first-class citizen." In the AI era, without advanced storage capacity, computing power cannot be fully utilized; the two are now mutually supportive.
More people are recognizing storage's value. This year, the China Computer Federation (CCF) held its inaugural storage conference. In October 2024, the China Electronics Standardization Institute established a Data Storage Professional Committee, with Sugon as its chairing organization. Some in the industry even posit that storage is the fourth pillar of AI infrastructure, alongside algorithms, computing power, and data.
However, Guo Zhaobin notes that storage standards still have a long way to go. For instance, there are currently no unified standards for storage evaluation and testing methods. Additionally, storage protocols are relatively outdated, causing numerous application-side inconveniences.
02
Advanced Storage: Unlocking China's Storage Industry
Building advanced storage centers has become imperative for addressing data challenges in the intelligent era.
IDC's whitepaper mentions that advanced storage centers' technological breakthroughs and applications are evident in their capabilities, including smooth EB-level scalability, multi-protocol support, advanced software architecture, multiple protection mechanisms, intelligent management platforms, liquid cooling, and other advanced technologies. These collectively endow them with five characteristics: efficient integration, quality and efficiency enhancement, ubiquitous data flow, security and reliability, and green and low carbon.
These characteristics address current storage system challenges.
For example, as data sources and formats become increasingly complex, data silos emerge between different storage systems, hindering data management and utilization. Moreover, many traditional storage systems lack scalability, making it difficult to meet business development needs.
Advanced storage centers offer flexibility and scalability. They are compatible with diverse architectural technology stacks, allowing users to select technical solutions based on their needs. Facing high-concurrency and large-dataset storage demands, they can scale smoothly.
In the intelligent era, high capacity, speed, and low latency have become crucial storage system performance indicators. However, current storage devices' capacity and read/write performance pose significant bottlenecks for quality and efficiency improvements. Advanced storage centers demand higher performance, achieving massive storage space, ultra-high throughput, and IOPS capabilities through increased NVMe all-flash storage and multi-level data acceleration technologies.
The essence of data flow is the movement of data elements. Enabling it requires breakthroughs in key technologies such as combined management of cross-domain storage clusters, awareness of hot, warm, and cold data classification, intelligent cross-domain network data flow, and seamless cross-domain access, all in support of optimal storage resource allocation.
Advanced storage centers must support ubiquitous data flow, encompassing cross-platform flow between centralized and distributed storage, flow between cloud and localized data, and flow across hot, warm, and cold data forms.
Additionally, green and low power consumption are crucial for advanced storage centers.
IDC data reveals that storage accounts for approximately 35% of data center energy consumption. Zhou Zhengang, Vice President of IDC China, notes that energy consumption used to be a concern mainly for computing, since storage drew far less power than GPUs. However, as large model training drives a surge in storage I/O throughput, storage power consumption has risen, increasing demand for green technologies like liquid cooling.
As an advocate and pioneer of advanced storage centers, Sugon Storage rapidly adapts to the AI era's storage market needs.
In June 2024, Sugon Storage unveiled FlashNexus, the world's first centralized all-flash storage exceeding 100 million IOPS, offering "epoch-making performance innovation" and becoming the industry's only centralized storage product with petabyte-level scalability. It's primarily used in core business systems like finance, telecom operators, and healthcare.
Currently, the development of all-flash media is an industry consensus. Compared to traditional HDDs, all-flash media supports high IOPS and low latency, making it ideal for random read/write scenarios during AI large model training.
While launching its first centralized all-flash product, Sugon also upgraded its distributed storage product, ParaStor all-flash storage, targeting AI applications. Leveraging NVMe all-flash technology optimization, it achieves a maximum single-node bandwidth of 150GB/s and 3.2 million IOPS.
ParaStor all-flash storage employs the industry's first five-level acceleration solution. For instance, the BurstBuffer acceleration layer stores key data only on the compute node's local NVMe disk, avoiding massive network data transmission and remote storage access. This suits workloads that store and rapidly access huge numbers of small files, improving read performance by several times to as much as tenfold. Another example is XDS dual-stack compatibility, which lets GPUs access storage directly, reducing CPU overhead, shortening the I/O path, and lowering latency.
Leveraging distribution's scalability, Sugon's ParaStor distributed all-flash storage is widely implemented in sectors like science and education, finance, telecom operators, bioinformatics, and cutting-edge AI applications like autonomous driving. For example, Zhiyuan Robotics' rapid product iteration over the past year is supported by Sugon's ParaStor distributed all-flash storage.
Today, Sugon Storage boasts two major product lines: FlashNexus centralized storage and ParaStor distributed storage. Sugon refers to them as strong storage and intelligent storage, respectively. As the names suggest, strong storage, represented by centralized storage, targets core business scenarios in sectors like finance and telecom operators, which demand high performance and reliability. Intelligent storage, represented by distributed storage, caters to agile business needs such as AI.
Bridging these two architectural product lines, Sugon has introduced a universal storage solution that enables seamless data flow, one-click disaster recovery across platforms, seamless flow of data across hot, warm, and cold tiers, and a comprehensive cross-domain view of the resource pool. This significantly improves storage resource utilization and better supports scenarios in which data from eastern China is stored, rendered, or used for training in the west ("East Data, West Storage," "East Data, West Rendering," and "East Data, West Training").
Through the combination of strong storage, intelligent storage, and universal storage products and solutions, Sugon's new data infrastructure for the AI era is gaining recognition from an increasing number of users. According to the latest IDC data for the first half of the year, Sugon's market growth rate reached 19.2%, significantly surpassing the market average.
03
Advanced Customers Showcase Best Practices
The Advanced Data Center in Western Science City, Chongqing, is a demonstration project for the "East Data, West Computing" initiative and a key node in the Chengdu-Chongqing hub. Through a combination of high-density liquid-cooled and air-cooled racks, its core computing equipment achieves a PUE of 1.04, with energy consumption well below the industry average.
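For readers unfamiliar with the metric, PUE (power usage effectiveness) is total facility energy divided by the energy consumed by IT equipment, so a PUE of 1.04 means only about 4% extra energy goes to cooling, power distribution, and other overhead. The sketch below works through that arithmetic; the 1,000 kWh IT load and the comparison PUE of 1.5 are hypothetical values chosen only for illustration.

```python
def total_facility_energy(it_energy_kwh: float, pue: float) -> float:
    """Total facility energy implied by an IT load and a PUE value."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0 by definition")
    return it_energy_kwh * pue


def overhead_energy(it_energy_kwh: float, pue: float) -> float:
    """Energy spent on cooling, power distribution, and other non-IT overhead."""
    return total_facility_energy(it_energy_kwh, pue) - it_energy_kwh


it_load_kwh = 1000.0   # hypothetical IT energy use
reported_pue = 1.04    # figure quoted in the article
comparison_pue = 1.5   # hypothetical less efficient facility

print(f"Overhead at PUE {reported_pue}: {overhead_energy(it_load_kwh, reported_pue):.0f} kWh")
print(f"Overhead at PUE {comparison_pue}: {overhead_energy(it_load_kwh, comparison_pue):.0f} kWh")
```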
As a representative of advanced storage centers, it boasts an impressive storage capacity of up to hundreds of petabytes. It supports seamless cross-regional scheduling across cold, warm, and hot tiered storage, and can flexibly adapt to front-end applications by providing support for various protocols, including file, block, and big data, as needed.
IDC believes that when constructing regional advanced storage centers, governments must consider multiple factors comprehensively, such as infrastructure supply, construction and operating costs, the overall storage and computing performance ratio of the center, data security and privacy protection, ecological openness, and green and low-carbon development requirements.
To meet the diverse needs of customers for advanced storage, Sugon has explored three deployment models.
Apart from constructing regional advanced storage centers for governments, there is ample potential for advanced storage centers to play a pivotal role in the industrial sector. For instance, high-end computing is already prevalent in fields like meteorology, environment, and oceans, which have stringent demands for data processing capabilities. Consequently, constructing industry/industrial chain advanced storage centers has become crucial.
The China Meteorological Administration has collaborated with Sugon to build a cross-regional storage platform, establishing a unified national and provincial data environment. Leveraging NVMe all-flash storage, it delivers ultra-high IOPS performance. According to a report by Euromonitor International, Sugon ranked first by revenue in China's meteorological high-end computing services market in 2023, with a 52% share.
A Sugon insider revealed that for industry-specific advanced storage centers, Sugon prepares for cross-domain circulation based on industry data aggregation demands. For example, China Mobile has partnered with Sugon Storage to create the industry's first intelligent storage scheduling platform. Its core capabilities include data tiering and policy management, enabling tiered management based on cold, warm, and hot data; unified observation, supporting unified management of heterogeneous storage with clear data and storage distribution; and cross-regional data migration, facilitating seamless business access through free data migration between different resource pools.
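Tiered management of hot, warm, and cold data can be sketched as a simple age-based policy. The example below is a generic illustration and not the scheduling platform's actual logic: the 7-day and 90-day thresholds, the `DataObject` type, and the function names are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical policy thresholds; a real platform would make these configurable.
HOT_WINDOW = timedelta(days=7)
WARM_WINDOW = timedelta(days=90)


@dataclass
class DataObject:
    name: str
    last_access: datetime
    current_tier: str  # "hot", "warm", or "cold"


def classify_tier(obj: DataObject, now: datetime) -> str:
    """Pick a target tier from how recently the object was accessed."""
    age = now - obj.last_access
    if age <= HOT_WINDOW:
        return "hot"
    if age <= WARM_WINDOW:
        return "warm"
    return "cold"


def plan_migrations(objects: list, now: datetime) -> list:
    """Return (name, from_tier, to_tier) for every object on the wrong tier."""
    plan = []
    for obj in objects:
        target = classify_tier(obj, now)
        if target != obj.current_tier:
            plan.append((obj.name, obj.current_tier, target))
    return plan


now = datetime(2024, 12, 20)  # fixed timestamp so the example is deterministic
objects = [
    DataObject("train-shard-001", now - timedelta(days=1), "warm"),
    DataObject("logs-2024-q3", now - timedelta(days=30), "hot"),
    DataObject("archive-2022", now - timedelta(days=400), "cold"),
]
plan = plan_migrations(objects, now)
for name, src, dst in plan:
    print(f"migrate {name}: {src} -> {dst}")
```

Separating classification from migration planning mirrors the split the article describes between policy management and cross-regional data migration, though how the real platform divides these responsibilities is not stated in the source.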
Enterprises also have the need to establish their own advanced storage centers. Massive amounts of data form the foundation for analysis and large model training. Enterprise-level advanced storage centers not only offer efficient and secure data storage solutions but also support rapid data access and processing capabilities, which are crucial for accelerating AI model training and inference processes.
For instance, an artificial intelligence company faced challenges with massive training data and high read/write speed demands during inference. By partnering with Sugon, the company built an advanced storage center whose streamlined system architecture, efficient data flow, and microsecond-level metadata access give its business solid support.
In addition to AI vendors, autonomous driving companies also have strong demands for advanced storage centers. Sugon Storage supports the model iteration of an autonomous driving company with over 100PB of storage capacity.
Behind these advanced customers lie the trends and demands of their respective industries and sectors. By leveraging technological innovation and resource integration capabilities, Sugon Storage has bridged the upstream and downstream of the storage industry, continuously promoted the implementation of the three innovative deployment models, and achieved phased results.
This success is underpinned by a long-term commitment to industrial practice and technological accumulation. Sugon has been a pioneer in the storage field for two decades, firmly choosing the path of independent research and development as early as 2004, dedicated to solving the underlying technical challenges of China's storage industry. In 2009, Sugon's self-developed storage system, ParaStor, was officially launched. In November 2022, Sugon's ParaStor Distributed Unified Storage System topped the IO500 Global Storage Performance Benchmark.
In 2023, as a pioneer and explorer in the storage field, Sugon Storage took the lead in proposing the concept of 'Advanced Storage Power' and completed the construction of some advanced storage power centers in the first batch of pilot projects. Having undergone several industrial upgrades, Sugon Storage has not only helped users achieve updates and iterations of their data infrastructure but has also established best practices in serving users, achieving a remarkable transformation.