05/21 2025
The crux lies in "transcending the speaker's role".
Can the smart speaker industry, amidst four consecutive years of declining sales, solely pin its hopes on AI large models?
According to Runto's "Monthly Tracking of China's Smart Speaker Retail Market" report, in 2024, China's smart speaker sales amounted to 15.7 million units, a year-on-year decrease of 25.6%, with sales revenue totaling 4.2 billion yuan, a year-on-year drop of 29.4%. From 2021 to 2024, China's smart speaker market has experienced four consecutive years of sales decline.
(Image source: Runto)
In the first quarter of this year, China's smart speaker sales stood at 3.699 million units, a year-on-year decrease of 5.6%, marking the smallest quarterly sales decline over the past three years. National subsidies played a pivotal role in the market's recovery. Runto statistics reveal that following the introduction of national subsidies in the fourth quarter of last year, the monthly sales decline of smart speakers narrowed to within 20%.
However, Runto does not expect national subsidies to reverse the decline, forecasting that China's smart speaker sales will fall to 13.5 million units in 2025, a year-on-year decrease of 14%. The smart speaker industry appears to have hit a wall, and the booming AI sector might offer related companies a lifeline.
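As a quick sanity check, the forecast decline can be recomputed from the two unit figures quoted from Runto. This is a minimal sketch; the helper name `yoy_change` and the variable names are illustrative and do not come from the report.

```python
def yoy_change(previous: float, current: float) -> float:
    """Return the year-on-year change as a fraction (negative means a decline)."""
    if previous <= 0:
        raise ValueError("previous must be positive")
    return (current - previous) / previous


# Figures quoted from Runto, in millions of units.
sales_2024_m = 15.7
forecast_2025_m = 13.5

change = yoy_change(sales_2024_m, forecast_2025_m)
print(f"{change:.1%}")  # prints -14.0%
```

The result rounds to the 14% decrease cited in the forecast.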
Smart Speakers Await a New Dawn
Before 2021, the smart speaker industry enjoyed a golden era of surging sales. In 2019, sales jumped 125% year-on-year. But having reached its peak, the industry stalled quickly: year-on-year growth shrank to 3.3% in 2020, followed by four consecutive years of decline.
Often dubbed the "control center of smart homes," smart speakers let users control smart home devices by voice or remotely via a mobile phone. Products are highly homogenized across brands, and even models at different price tiers within the same brand offer similar functions, with sound quality the main differentiator.
Music playback on smart speakers also has several limitations. For instance, after Xiao Lei linked his Xiaomi AI Speaker Pro to NetEase Cloud Music, the speaker could not display the playlists he had saved or created. In addition, because NetEase Cloud Music and Xiaomi have no arrangement covering Black Vinyl VIP membership, certain tracks cannot be played on the speaker directly and can only be streamed from a mobile phone over Bluetooth. The sound quality advantages of high-end smart speakers therefore go largely unused.
As a result, there is little meaningful differentiation between high-end and low-end smart speakers. Most sales fall in the price range around 300 yuan, so consumers have little incentive to upgrade, which in turn slows product update cycles. Xiao Lei bought the Xiaomi AI Speaker Pro in 2021 and, despite replacing his mobile phones and computers several times since, has never considered replacing his smart speaker.
(Image source: Runto)
During the peak of smart speaker market growth, numerous enterprises entered the fray. To swiftly capture market share, Tmall Genie, Xiaomi, and Baidu DuerOS all slashed their smart speaker prices below 100 yuan. At that time, Xiao Lei also acquired the Xiaomi AI Speaker Play for 79 yuan. Affordable prices indeed enticed a significant number of consumers to purchase smart speakers but also prematurely exhausted market potential, leading to early market saturation.
Market research firm Virtue Market Research noted in its report that the global smart speaker market size was approximately $10.2 billion in 2023 and is projected to grow at a compound annual growth rate (CAGR) of 17.5% from 2024 to 2030, reaching $31.5 billion by 2030. The global smart speaker industry's development trajectory starkly contrasts with that of the Chinese market.
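The projection above can be reproduced with the standard compound-growth formula. The sketch below assumes the 17.5% rate compounds annually from the 2023 base for seven years (2023 to 2030); the report's exact compounding convention is not stated, and the function name `project` is illustrative.

```python
def project(value: float, rate: float, years: int) -> float:
    """Compound `value` at annual growth `rate` for `years` years."""
    return value * (1 + rate) ** years


base_2023_bn = 10.2  # global market size in 2023, USD billions (as quoted)
cagr = 0.175         # compound annual growth rate (as quoted)

# Assumption: seven compounding periods, from the 2023 base to 2030.
projected_2030_bn = project(base_2023_bn, cagr, 2030 - 2023)
print(f"{projected_2030_bn:.1f}")  # prints 31.5
```

Under that assumption the quoted figures are internally consistent: $10.2 billion growing at 17.5% a year reaches about $31.5 billion in 2030.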
Besides homogenized functions, weak consumer willingness to upgrade also reflects the fact that other smart devices are gradually taking over what smart speakers do. As more household appliances gain network connectivity, users no longer need a smart speaker to control the devices in their homes; a mobile phone or tablet suffices. Moreover, certain TVs, washing machines, and refrigerators can themselves serve as the "control center of smart homes," accepting voice commands to control other appliances.
(Image source: Xiaomi)
Smart speakers may be truly indispensable only for older appliances that rely on infrared control, such as the old air conditioner in Xiao Lei's home. Because it has no network connectivity, operating it with a remote control or a mobile phone's infrared function is cumbersome, whereas controlling it through a smart speaker is more convenient.
Furthermore, smart speakers restrict, to some extent, the brands consumers can choose when buying household appliances. For example, because of the fierce rivalry between Gree and Xiaomi, Gree appliances cannot be added to the Mijia App. Before choosing an appliance, users must carefully check whether it can connect to their smart speaker's ecosystem. Yet Chinese consumers have few smart speaker brands to choose from.
According to Runto statistics, China's smart speaker industry has become a three-way contest among Xiaomi, Baidu (DuerOS), and Tmall Genie, whose combined market share has consistently exceeded 90%. In the first quarter of this year, their combined share stood at 96.5%, and the number of brands with trackable sales fell to 11, eight fewer than in the first quarter of last year.
(Image source: Runto)
The once thriving smart speaker market attracted China's leading internet and home appliance manufacturers. However, due to the lack of innovative functionalities in smart speakers, consumers lacked the motivation to upgrade. Coupled with enterprises prematurely exhausting market potential and the impact of tablets, mobile phones, smart TVs, and other devices on smart speakers' positioning, China's smart speaker market has continued to shrink, with small brands exiting and market concentration increasing.
The smart speaker industry needs to change.
Is AI the Lifeline for Smart Speakers?
Smart speakers operate by controlling smart home devices, answering questions, or simply communicating based on users' voice commands, making them an ideal fit for AI large models.
Indeed, mainstream Chinese smart speaker manufacturers are working to integrate AI. For instance, Xiaomi has rolled out the large-model-based XiaoAi Classmate on multiple speaker devices, with some devices receiving the update gradually during October.
Baidu and Alibaba, with their respective large models ERNIE Bot and Tongyi, are naturally not sitting out the AI wave. For example, Baidu's DuerOS Smart Speaker MatePro, built on the ERNIE Bot large model and the DuerOS system, supports AI Q&A, companion chat, and even dialect recognition.
(Image source: Baidu)
Runto noted in its report that newly launched products now integrate AI large model technology across the board, and that the market penetration rate of devices supporting AI large models surpassed 20% in the first quarter of 2025. The problem is that integrating AI large models has not changed the smart speaker industry's predicament.
The reason is that AI large models have not resolved the fundamental issue of smart speakers, namely the ecosystem. The core function of a smart speaker is to control the smart home. Adding an AI large model makes the speaker smarter and more precise in understanding user commands, but it does nothing to enrich the smart home ecosystem.
Moreover, smart speakers are designed around voice interaction and lack visual display and information integration capabilities. When using AI large models on a mobile phone, users can upload photos, videos, audio, text, and other types of input to generate corresponding content, which is difficult to do on a smart speaker. Even if smart speakers add these functions, every model without a touchscreen still needs a mobile phone, tablet, PC, or other device to upload that input. If that's the case, why not run the AI large model directly on the mobile phone, tablet, or PC?
(Image source: Xiaomi)
Furthermore, mobile phones, tablets, and PCs are progressively integrating NPUs to run AI large models on the device itself. Because of cost constraints, smart speakers run AI large models in the cloud, which leaves them at a disadvantage in response speed and privacy protection. Compared with mobile phones, tablets, and the emerging "buddy machines," smart speakers trail in functional richness and interaction logic, with price their only advantage.
Mobile phones are almost a necessity, while tablets and PCs are suitable for entertainment and work scenarios, respectively, enjoying high popularity. IDC data indicates that China's tablet sales reached 29.85 million units in 2024, far surpassing those of smart speakers.
AI can enhance smart speakers' capabilities but cannot turn them into a must-have product. Simply integrating AI adds some value to smart speakers but is not enough to reverse the sales decline. For the smart speaker industry to recover, more profound changes are needed.
The Path Forward for Smart Speakers: Transcending the Speaker's Role
At events such as AWE and CES held this year, the LeiTech reporting team observed numerous AI toys developed by Chinese manufacturers, some focusing on companionship functions and connecting to large models like DeepSeek and Tongyi Qianwen, enabling continuous dialogue with users.
In terms of functionality, AI toys available for sale in China resemble smart speakers with AI capabilities. The difference lies in AI toys emphasizing communication and serving as companion robots or educational aids for children. Some products include screens and cameras, whereas smart speakers prioritize smart home control.
(Image source: JD screenshot)
Xiaomi, Baidu, and Tmall Genie have also launched smart speakers with screens and cameras, targeting companionship for elderly city dwellers and children's learning and care. Although the market penetration growth of such devices has stagnated, smart speaker manufacturers can combine AI large models to optimize product design and integrate more voice communication and educational aid functionalities for children.
Smart speakers can also explore adding external keyboards, incorporating learning machine and tablet functionalities, enhancing the product's visualization and information integration capabilities, and facilitating online learning for children of all ages.
The best-selling smart speakers with screens currently feature 8-inch displays. By increasing screen size and resolution, manufacturers can try to compete with devices such as tablets and "buddy machines." Large-screen smart speakers focused on smart home control, audiovisual experience, and children's education have the potential to attract more consumers. Upgraded configurations mean higher costs, but manufacturers can also use these upgrades to make another push into the high-end market.
(Image source: Baidu)
Beyond repositioning the product, smart speakers need to prioritize upgrading their software and hardware ecosystems. At the hardware level, this means collaborating with more home appliance manufacturers to increase the number of appliances that can connect to the ecosystem. At the software level, it means improving the AI experience, such as recognizing and remembering family members, generating personalized content recommendations, and efficiently executing complex tasks, as well as enabling cross-device access to applications and functions on mobile phones, tablets, and PCs.
AI is a bonus for smart speakers, not a decisive factor. Enterprises must integrate AI functionalities into smart speakers, but products cannot solely rely on AI as a selling point. Brands like Xiaomi, Tmall Genie, and Baidu should enhance smart speakers' companionship and educational aid roles by adding configurations like high-definition large screens and high-resolution cameras, expanding the smart ecosystem's coverage, and enriching smart speakers' functionalities.
Source: LeiTech