November 21, 2024
"Those who truly care about software should manufacture their own hardware."
Fifty years later, the hardware industry still lives by this line from Alan Kay, as paraphrased by Steve Jobs. Jobs, who aspired to build a graphical-user-interface operating system, went on to lead first the PC era and then the mobile era. Now, in the AI era, countless AI devices are kicking off a "creativity race."
One of the hottest sectors at present is undoubtedly glasses equipped with large AI models.
Ray-Ban Meta in particular has validated market demand, with expected sales of 3 million units by the end of this year. Baidu and Rokid have since released their own AI glasses, and OPPO, vivo, Huawei, Tencent, and ByteDance are also evaluating AI glasses projects. Apple is rumored to be assembling a team to research the AI glasses market.
Driven by this trend, many practitioners from the internet and traditional manufacturing industries have also entered the race across various AI hardware categories.
"The reason for the failure of the previous generation of AI hardware was inadequate underlying AI capabilities. Now that the infrastructure is complete, AI can provide real value to users." Li Yong, former head of the Tmall Genie smart speaker and now CEO of YueRan Innovation, has lived through two generations of AI technology, and he believes there is a fundamental difference between the two generations of AI hardware.
Regarding the improvements brought about by AI infrastructure, Rokid CEO Zhu Mingming also remarked, "Six months ago, it took four or five seconds to get a response to an AI query, but now it only takes one or two seconds. In face-to-face translation scenarios, this is a qualitative difference."
The changes in this round of AI applications show that, as large language models evolve toward multimodality, AI hardware can complete more and more tasks for users. AI is no longer limited to question answering and summarization; AI hardware is transforming from smart hardware into intelligent agents.
However, the many participants in AI glasses are just the tip of the iceberg in the entire AI hardware industry. Driven by large AI models, the era of AI hardware for everything is approaching.
Has the explosion of AI glasses arrived?
With the entry of numerous manufacturers, 2025 may be the first year of the explosion of AI glasses.
Among them, Hive Tech fired the first shot for Chinese AI glasses: in August this year it released the JieHuan AI glasses, which support mainstream domestic large models. On November 12, at Baidu World, DuerOS unveiled AI glasses equipped with the ERNIE Bot large model, expected to go on sale in the first half of 2025. Then, on November 18, Rokid, a domestic AR veteran, unveiled Rokid Glasses equipped with the Tongyi large model, expected to be available in the second quarter of 2025. Xiaomi and Respect the Unknown are also set to release products in 2025.
The three products announced so far show that domestic AI glasses makers have, to some extent, followed the path that worked for Meta's AI glasses: essentially combining conventional glasses with large AI models. Take Rokid, whose product is currently the most complete. Its product concept can be understood as an iteration on Meta's AI glasses.
In terms of appearance, Rokid Glasses are co-produced with Bolon. With black frames, thick temples, and a weight of 49g, they look very much like ordinary glasses, so much so that even Zhu Mingming often loses track of whether he is wearing AI glasses or not. In terms of product design, Rokid Glasses are also equipped with cameras, speakers, and large AI models, enabling functions such as voice announcements, photography, and voice question answering. The addition of a display module (single-optic waveguide) lets Rokid interact with users in the most information-dense way.
Looking at the AI glasses already on the market, the most basic functions are voice announcements and large model conversational abilities. For example, the AI capabilities of Huawei's smart glasses can be considered as basically "externally connected" to XiaoE Assistant. The core function of Hive Tech's JieHuan AI glasses revolves around voice announcements, with AI automatically summarizing mobile phone messages for users.
On the Rokid side, through deep integration with Alibaba's Tongyi large model and functional adaptation with other partners, Rokid Glasses are no longer a "bare shell" with only simple AI functions. "This is the most important aspect that sets it apart from other products," Zhu Mingming said.
One of the most interesting features is the payment scenario in collaboration with ZhiXiaoBao. With voiceprint payment technology, users can scan and pay directly through the glasses' camera without taking out their phones.
Other scenarios include face-to-face translation, with the translated content projected directly onto the lenses so that users can review and check it. The AR navigation function displays an image similar to a car's HUD (heads-up display), and the text enlargement function automatically enlarges text for visually impaired users reading newspapers.
Rokid's feature set shows that, by relying on the cognitive abilities of large models, AI glasses are transforming from smart hardware into intelligent agents. Through sight and sound, AI must not only "explain" the outside world to users but also "substitute" for them in operations that used to require manual intervention. If future AI glasses integrate ZhiXiaoBao with Ele.me, ordering coffee through AI glasses would reach a higher "automation level" than doing so through a phone-based intelligent agent.
This is the consensus reached by the entire AI hardware industry, and this route is also the core logic for the implementation of large AI models in various intelligent hardware.
How can AI hardware become a rigid demand?
On the question of how AI hardware can meet user needs, the product thinking of Li Yong and Zhu Mingming is a useful reference.
"When developing Tmall Genie, we found that most users were children. So, we wanted to try intelligent hardware targeted at children. By combining AI large models with plush toys, we provide companionship for children," Li Yong, the founder of Tmall Genie, told Guangzhui Intelligence, explaining his move from smart speakers to AI toys.
Zhu Mingming believes that the core lies in price, experience, and core functions. In terms of price, Rokid Glasses are comparable to Meta's AI glasses at a little over 2,000 yuan, which is roughly the average price for similar AI glasses. At the same time, to win acceptance from more sales channels and consumers, Rokid also plans to let consumers buy the glasses at any offline optical store.
Summarizing the product perceptions of these two CEOs, some commonalities in creating AI hardware have emerged—inheriting existing demands while relying on AI functions to create sufficient differentiation.
"The core of Meta Ray-Ban AI glasses' success is still creating a good pair of sunglasses. But the brilliance of Meta lies in the fact that the AI version is only a few dozen dollars more expensive than the regular version."
So concluded an investor after researching the market. Meta's AI glasses do have problems, such as weak on-device AI recognition and blurry camera quality. But only when the hardware itself is up to standard will consumers consider paying an AI premium, which is what turns AI into a rigid demand.
From this perspective, observing the emerging AI hardware in this round, the goal of "replacing" existing hardware still seems a long way off.
On the AI glasses front, the market is expected to be initially saturated in 2025. However, in terms of price, they still cannot practically compete with ordinary glasses.
Currently, the JieHuan AI glasses come closest to ordinary glasses in pricing: they start at 699 yuan, weigh 30.9g, and are positioned directly against ordinary glasses. The other players that have announced prices do include display modules, but at up to 2,000 yuan they remain somewhat expensive.
A Huaqiang North electronic accessory supplier revealed to Guangzhui Intelligence, "With an audio-only configuration, it can be made for a few dozen yuan." In other words, the display module alone carries a consumer-end premium of around 1,500 yuan. Given this price difference, it's no wonder Lei Jun expected sales of only 300,000 units for Xiaomi AI glasses.
Regarding the higher pricing of AI hardware, some entrepreneurs believe it is a result of market pricing. As domestic acceptance of consumer-grade AI hardware is not yet high, and the market has not yet reached saturation, pricing models tend to be more in line with international standards.
An entrepreneur in the AI toy sector said, "Our products sell better overseas. Chinese families tend to view AI toys more as educational tools, and parents are less receptive to the companionship functions highlighted by AI large models." Guangzhui Intelligence also found that a "dialogue box" that gives a plush toy a voice is priced in the hundreds of yuan, even though a child may play with it for only about a year.
Faced with such difficulties, Li Yong also revealed that starting a business has been challenging in the past two years, with financing not easy to obtain. "To impress investors, it's not enough to just have a demo; you also need to present a product and demonstrate PMF (Product-Market Fit)."
However, in the long run, the pricing of AI hardware will eventually decline. In the era of large models, AI infrastructure has come a long way.
The most notable change is that the software functions of AI hardware can be realized by connecting to the APIs of large AI models from major companies. AI hardware manufacturers can freely choose responses from large models such as Doubao, Kouzi, Tongyi Qianwen, and ERNIE Bot based on the content type of interaction. Considering the possibility of long-term price wars among cloud vendors providing basic large models, AI hardware products are expected to see their overall prices fall back to a range slightly higher than that of the previous generation of hardware, as key components are mass-produced and competition within the industry intensifies.
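The routing idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the content-type labels, the pairing of each type with a particular model, and the stub clients that stand in for real vendor API calls. It does not reflect how any manufacturer named here actually integrates these models.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A "client" is any function that takes a prompt and returns a reply.
# In a real device this would wrap a vendor's cloud API; here it is a stub.
ModelClient = Callable[[str], str]


def make_stub_client(model_name: str) -> ModelClient:
    """Return a stand-in for a vendor API call (no network access)."""
    def client(prompt: str) -> str:
        return f"[{model_name}] reply to: {prompt}"
    return client


@dataclass
class ModelRouter:
    """Pick a large model based on the content type of the interaction."""
    routes: Dict[str, ModelClient]
    default: ModelClient

    def ask(self, content_type: str, prompt: str) -> str:
        # Fall back to the default model for unrecognized content types.
        client = self.routes.get(content_type, self.default)
        return client(prompt)


# Hypothetical pairing of content types with models, for illustration only.
router = ModelRouter(
    routes={
        "translation": make_stub_client("Tongyi Qianwen"),
        "chat": make_stub_client("Doubao"),
        "search": make_stub_client("ERNIE Bot"),
    },
    default=make_stub_client("Doubao"),
)

print(router.ask("translation", "Where is the station?"))
```

Because each model sits behind the same simple interface, a manufacturer could in principle swap vendors as prices change without touching the rest of the device software, which is one reason a price war among cloud vendors would feed through to hardware pricing.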
Consider how smartphones displaced feature phones: once AI hardware manufacturers are able to replace "old hardware," continued competition within the industry may give rise to rigid demands based on AI.
Before that, however, entrepreneurs in AI hardware still have numerous details to resolve.
AI hardware still needs more trial and error
Most entrepreneurs in this round of AI hardware carry the label of technologist. To some extent, however, that background can work against product design.
Numerous past cases have shown that technically-oriented entrepreneurs sometimes "extol" technology, resulting in the launch of overly "idealistic" products.
Take Rabbit, founded by Lv Cheng, former general manager of Baidu's smart home hardware business. Asked in an interview about the development pace of the Rabbit R1, Lv Cheng said the entire process was very smooth, with virtually no hesitation. The product's appearance was settled with partners in just 10 minutes, and it took only two months to go from sketch to product launch.
This smooth development cycle, however, resulted in a "bug-ridden" product. In terms of business models, as Rabbit R1 relies on calling large AI models in the cloud for interaction, Rabbit has to pay a fee to the large model vendor for each user interaction. Moreover, Rabbit does not want to charge users a subscription fee.
This raises a question: how long can Rabbit's hardware profits cover the cost of user usage?
Lv Cheng has explained, "In great innovations, you must first focus on the innovative aspects and then consider profitability." From a common-sense business perspective, however, a product that keeps generating losses for its company has effectively reached the end of its service life. With cloud-based large models expected to keep iterating, Rabbit has set the service life of its AI hardware at just one and a half years.
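The arithmetic behind that question can be made concrete with a back-of-the-envelope sketch. The function name and every number below are hypothetical; Rabbit's actual margins, usage levels, and model fees are not given in this article.

```python
def months_until_margin_exhausted(
    hardware_margin: float,
    interactions_per_month: float,
    cost_per_interaction: float,
) -> float:
    """Months a one-time hardware margin can cover per-interaction cloud fees.

    All inputs share one currency unit. Assumes no subscription revenue,
    constant usage, and a constant fee per interaction.
    """
    monthly_cost = interactions_per_month * cost_per_interaction
    if monthly_cost <= 0:
        raise ValueError("monthly cloud cost must be positive")
    return hardware_margin / monthly_cost


# Hypothetical figures: a margin of 300 per device, 1,000 interactions
# per month, and a fee of 0.02 per interaction.
months = months_until_margin_exhausted(300, 1000, 0.02)
print(f"Margin lasts about {months:.0f} months")
```

Under these made-up inputs the margin covers roughly 15 months of usage. The point is only that, without recurring revenue, heavier usage or higher model fees shorten the period during which each device sold remains profitable.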
Through Rabbit's experience, it can be seen that designing a successful AI hardware product requires considerable refinement.
At the hardware design level, Zhu Mingming believes that the primary focus is balancing cost and product form. "For example, if a dual-optic module is used for display, the effect is indeed better, enabling 3D display. However, in that case, the optics would need to be placed in the middle of the glasses, making the product form different from ordinary glasses."
To create AI glasses that better align with consumer perceptions, many AI glasses manufacturers have also chosen to collaborate with traditional glasses manufacturers. They hope to quickly gain industry experience and improve product completion through this approach. For example, Hive Tech has collaborated with optical chain stores such as Besta and BMW Eyewear; Rokid has partnered with Bolon; and Sanag has chosen to work with LOHO, a Hong Kong fast-fashion eyewear brand.
After the hardware form "passes," designing core functions and selling points is also quite complex.
"Teams with a purely technical background may lack scenario insight." In Li Yong's view, implementing AI hardware cannot rely solely on AI capabilities. "In the marketing process, due to the diverse capabilities of large models, consumers may not necessarily like the functions recommended by manufacturers during actual sales but will instead explore new functions on their own."
To solve the problem of function design, Zhu Mingming believes that building an ecosystem is key. At Rokid's launch event, the list of partners, including DingTalk, Alipay, iQIYI, Bilibili, Taobao, and Zhixiang Weilai, also shows that AI glasses need support from developers across the industry. Just as Apple does with the App Store, the functional design of AI hardware will follow the "80-20 rule."
"What we can do is to first complete the core functions of photography, video, translation, and live streaming, and leave the rest to developers. We will refine 20% of the functions, while the remaining 80% will be handled by more professional people. For example, applications like enlarging text on newspapers are the creativity of developers. Only in this way can we maintain brand competitiveness in what is expected to be a fiercely competitive market. Just like mobile phone companies, the true giants are those who refine the details."
Regardless, the outlook for the AI hardware industry as a whole is quite optimistic. After all, AI hardware matures in stages. The toy sector may mature first, while glasses, AR, and VR may lag slightly.
As Li Yong put it, "As an application company of large models, we hope to bring joy to children, who will face a world with even more AI in the future." The same applies to us.