07/22/2024
Preface:
This year, especially in the domestic market, the deployment of large models will focus mainly on the to-B service sector.
As the cost of deploying large model inference continues to fall, exploration of to-C applications will gradually increase. This is expected to bring more innovative super applications to market and drive further development across the industry.
Author | Fang Wensan
Image Source | Network
SiliconFlow Becomes the "Shovel Seller" of AI Applications
Recently, SiliconFlow successfully completed an angel+ round of funding totaling close to RMB 100 million.
The round was led by a well-known industry player, with follow-on investment from leading enterprises and institutions including Zhipu AI, 360, and Shuimu Tsinghua Alumni Fund. Existing shareholder Yaotu Capital also participated again, investing beyond its pro rata share.
Looking ahead, SiliconFlow will focus on technology and product innovation and on global commercialization.
The company will continuously optimize its self-developed SiliconLLM and OneDiff inference engines, dedicated to improving model inference efficiency and user experience.
Furthermore, SiliconFlow will upgrade the SiliconCloud platform to continuously launch high-performance, low-cost AI model cloud services.
SiliconFlow's founder, Yuan Jinhui, previously led OneFlow and served as a lead researcher at Microsoft Research Asia, where his LightLDA system received a special award from the director of Microsoft Research Asia.
Founded in August 2023, SiliconFlow aims to build a large-scale, standardized, and high-performance generative AI computing infrastructure platform.
The company offers various products, including the model cloud service platform SiliconCloud, the large language model inference engine SiliconLLM, and the high-performance text-to-image/video acceleration library OneDiff, to help enterprises and individual users efficiently deploy AI models.
Since 2016, the OneFlow team led by Yuan Jinhui has been the only startup team in the world focused on developing an industrial-grade, general-purpose deep learning framework, and it successfully launched a high-performance distributed deep learning framework.
With the rise of technological trends represented by large models such as GPT, the large model training techniques and insights accumulated by the OneFlow team have been fully validated.
In 2023, at the height of the large model boom, the OneFlow team was acquired by Guangnian Zhiwai, founded by Meituan co-founder Wang Huiwen.
Guangnian Zhiwai was subsequently acquired by Meituan, after which Yuan Jinhui led his team to found a new company, SiliconFlow.
Compared with large corporations, SiliconFlow has two core advantages.
① The company has deep technical accumulation and a record of innovation in large models, with a top-tier AI Infra team and a proven body of work. Its original technical team developed the open-source training framework OneFlow.
② As a startup, SiliconFlow can quickly detect changes in industry demand and adapt flexibly.
To date, SiliconFlow has completed two rounds of funding. In January of this year, the company closed its previous round, an RMB 50 million angel round led by Innovation Works, with follow-on investment from Yaotu Capital, Qijitang, Meituan co-founder Wang Huiwen, and others. The post-money valuation reached hundreds of millions of RMB.
Core Product Lineup Has Begun to Take Shape
SiliconFlow's self-developed large model inference engine, SiliconLLM, has achieved industry-leading inference efficiency through in-depth optimization of kernels, the framework, mechanisms, and models, with speeds more than ten times those of comparable open-source products.
When tackling complex scenarios such as MoE architecture, ultra-long context processing, and ultra-low latency, SiliconFlow's products demonstrate industry-leading capabilities.
SiliconFlow recently launched SiliconCloud, a one-stop cloud service platform that provides high-performance, low-cost model-as-a-service (MaaS) offerings across multiple categories of AI models.
SiliconCloud integrates the latest open-source models from around the world and significantly reduces the cost of large model inference through its self-developed inference engine suite (SiliconLLM & OneDiff), giving users strong performance.
This enables developers to focus on product innovation without worrying about the high computing costs associated with large-scale deployment.
SiliconCloud brings together numerous mainstream large models, including Alibaba's Tongyi Qwen2, Zhipu's GLM-4, and High-Flyer Quant's DeepSeek V2 series of open-source models, as well as text-to-image models such as SDXL, SDXL Lightning, PhotoMaker, and InstantID.
Drawing on SiliconFlow's deep experience in AI Infra, the large models on the SiliconCloud platform respond faster and cost less to run, which significantly improves AI application development efficiency and reduces deployment costs.
For example, invoking the text-to-image model Stable Diffusion through SiliconCloud can generate an image in about one second.
This is thanks to SiliconCloud's integrated image and video generation inference engine, OneDiff, which can accelerate text-to-image models such as SDXL by up to 3 times.
When invoking the large model DeepSeek V2, response speed can reach 50 tokens/s.
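To put the quoted figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that 50 tokens/s is a steady decoding rate and that the one-second image time is the accelerated result at the full 3x speedup. Neither assumption is stated by SiliconFlow, and the function names are purely illustrative.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time needed to stream num_tokens at a constant decoding rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def baseline_seconds(accelerated_seconds: float, speedup: float) -> float:
    """Implied latency before acceleration, given the accelerated latency
    and the speedup factor (assumes the full speedup was realized)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return accelerated_seconds * speedup


if __name__ == "__main__":
    # A hypothetical 500-token reply at the quoted 50 tokens/s.
    print(generation_seconds(500, 50.0))
    # Implied unaccelerated image time if 1 s reflects a 3x speedup.
    print(baseline_seconds(1.0, 3.0))
```

Under these assumptions, a 500-token answer would take about 10 seconds to stream in full, and the same image would have taken about 3 seconds without acceleration.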
The Growing Importance of AI Infra Benefits the Sector
AI Infra (Artificial Intelligence Infrastructure) refers to the underlying software infrastructure, beyond raw computing power, that supports the training and deployment of large models within the large model ecosystem.
These facilities provide developers with a convenient and efficient environment for designing or using models, eliminating the need to focus excessively on the allocation of underlying computing resources.
As the intermediate link connecting the AI application layer and the computing chip layer, the AI Infra layer plays a core role similar to that of an "operating system" in the current era of large models.
The AI Infra layer is responsible for solving several critical problems: optimizing the efficiency of large model training and inference, fully exploiting the potential of the underlying hardware, and lowering the barriers and costs of generative AI application development.
With the popularity of technologies such as ChatGPT, large models and related applications continue to emerge. As the middleware infrastructure connecting computing power and applications, AI Infra has highly anticipated technical and commercial prospects.
Currently, the development of large models is still in its infancy, and rapidly building and fine-tuning models has become the focus of industry attention.
However, as the industry gradually matures and the application layer flourishes, the supporting role of infrastructure will become increasingly prominent.
AI Infra not only bridges the gap between application developers, hardware, and models, enhancing development efficiency and innovation capabilities, but it also effectively meets the urgent market demand for high-performance, low-cost AI solutions.
In China, innovative enterprises in the AI Infra field include Wuwen Xinqiong and Qingcheng Jizhi, both of which are backed by Tsinghua University and supported by investors such as Zhipu AI.
Wuwen Xinqiong was initiated by Wang Yu, head of the Department of Electronic Engineering at Tsinghua University, and its founder, Xia Lixue, was his student. The founder of Qingcheng Jizhi comes from Tsinghua's Computer Science Department.
Internationally, companies such as NVIDIA, Amazon, Lepton AI, and OctoAI, as well as the vLLM project developed at UC Berkeley, are also competing in this field.
Compared with application-layer large model products such as Wenxin Yiyan and Tongyi Qianwen, the AI Infra track on which SiliconFlow focuses places more emphasis on the middleware infrastructure connecting computing power and applications, spanning data preparation, model training, model deployment, and application integration.
According to CICC forecasts, the AI Infra industry is in the early stages of rapid growth, and its various sub-tracks are expected to maintain growth rates above 30% over the next 3-5 years.
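To see what sustained growth at that pace implies, here is a minimal Python sketch of the compounding arithmetic. It assumes the 30% figure is an annual rate held constant over the period, which is not spelled out in the forecast; the function name is illustrative.

```python
def growth_multiple(annual_rate: float, years: int) -> float:
    """Cumulative size multiple after compounding annual_rate for `years` years."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return (1.0 + annual_rate) ** years


if __name__ == "__main__":
    for years in (3, 5):
        multiple = growth_multiple(0.30, years)
        print(f"{years} years at 30% per year: {multiple:.2f}x")
```

Under that assumption, a sub-track growing at exactly 30% per year would roughly double in three years (about 2.2x) and reach about 3.7x its current size in five.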
Conclusion:
Looking ahead, as models continue to improve, architectures are optimized, and cost-saving, efficiency-enhancing measures such as customized chips are rolled out, the profitability of AI applications is expected to improve significantly, gradually bringing out the value of the AI application layer.
In this process, the AI Infra ecosystem closely related to developers will demonstrate significant advantages.
It is also worth noting that the number of parameters in future AI models will continue to grow.
As model sizes expand, existing deep learning frameworks may fail to meet the actual needs of developers, necessitating the reconstruction of underlying AI frameworks.
This is not only an inevitable consequence of technological progress but also presents new development opportunities for startups.