With the development of artificial intelligence technology, today's AIoT opens up far more possibilities than the IoT era that preceded it. As we all know, generative AI has been injecting new momentum into industries across the board, and the smart hardware industry, which gives AI a physical carrier, is no exception. More capable multimodal large models have endowed smart hardware with new possibilities, and a new kind of human-machine interaction experience and intelligent transformation is taking shape.
At the recent RTE2024 IoT session, industry leaders including Wu Changru, head of the IoT industry at Agora; Tan Guohao, co-founder of Hipapa; Du Chao, head of Xiaomi Vela Open Source; Yang Wang, president of the software department at Lianou Technology; Shi Zehong, general manager of the value-added business department at Megvii; and Xu Weien, technical director at Zuo Zhen, gathered to share, from different perspectives, how GenAI drives innovation in smart hardware, as well as the new technological trends and scenario implementations of the AIoT era.
In the AIoT era, how will human-machine interaction change?
In the 1960s, the command-line interface (CLI) emerged, allowing people to interact with computers by typing commands on a keyboard. In the 1980s, the graphical user interface (GUI) appeared, making graphical elements the mainstream of computer interaction. In 2007, the birth of the iPhone ushered in the era of touch interaction, with tapping, dragging, and gestures making human-machine interaction more natural. After 2020, voice, multimodal, and conversational (LUI/MUI/CUI) interaction methods gradually matured. This year, the release of the OpenAI Realtime API marked a significant advance in real-time interaction, making communication between humans and AI as natural as communication between humans.
With the advent of multimodal and conversational interaction and large models, the AI Agent, a system that simulates human intelligent behavior with a large language model (LLM) as its core engine, has also taken off. Its strength lies in its ability to perceive the environment, make decisions, and execute tasks to achieve specific goals. Wu Changru, head of the IoT industry at Agora, stated that with the rapid development of AI technology, AI Agent hardware products are flourishing and bringing new intelligent experiences to a wide range of fields. These products can not only achieve a high degree of automation and personalization through artificial intelligence but also interact naturally with users through hardware devices, and the combination of smart hardware and AI Agents is bringing a genuine upgrade to real-world scenarios.
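To make the perceive-decide-execute loop concrete, here is a minimal sketch of a single agent step in Python. It is purely illustrative: read_sensors, call_llm, and actuate are hypothetical placeholders, not part of any vendor SDK mentioned in this article.

```python
# Minimal sketch of the perceive-decide-act loop behind an LLM-driven AI Agent.
# All three helpers are hypothetical stubs standing in for device sensors,
# a large-model API, and hardware control respectively.
import json
import time

def read_sensors() -> dict:
    """Perceive: collect the latest state from device sensors (stubbed here)."""
    return {"temperature_c": 26.5, "presence": True, "time": time.strftime("%H:%M")}

def call_llm(prompt: str) -> str:
    """Decide: ask a large model which action to take (stubbed here).
    A real system would call a hosted or on-device model."""
    return json.dumps({"action": "turn_on_fan", "reason": "room is warm and occupied"})

def actuate(action: str) -> None:
    """Execute: map the model's decision onto a concrete device command."""
    print(f"executing device action: {action}")

def agent_step(goal: str) -> None:
    state = read_sensors()                                  # 1. perceive the environment
    prompt = f"Goal: {goal}\nState: {json.dumps(state)}\nReply with JSON {{action, reason}}."
    decision = json.loads(call_llm(prompt))                 # 2. decide with the LLM
    actuate(decision["action"])                             # 3. act toward the goal

if __name__ == "__main__":
    agent_step("keep the room comfortable for the user")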
Judging from the current market, smart hardware + AI Agent has two main application directions: one as a productivity tool, the other for emotional companionship. Wu Changru believes that smart hardware + AI Agent will drive a shift in the IoT field from one-off hardware sales to long-term service provision, with manufacturers able to secure continuous revenue streams through subscriptions and value-added services in the future.
He also introduced that, to better address the real-time interaction challenges raised by AI Agents, Agora previously launched the Agora AI Agent x IoT smart hardware solution. The solution enables rapid access to large models on low-power, low-compute chips, offers low-latency real-time interaction and low-cost, flexible adaptation, and uses its rich functionality to build authentic, natural AI voice interaction experiences in smart hardware scenarios. The solution currently provides capabilities including large-scale real-time network transmission, audio processing, speech recognition, text processing, and video processing, supporting application scenarios such as smart butlers, security assistants, virtual companions, life assistants, and real-time translation.
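Conceptually, a voice-interaction pipeline of this kind chains real-time audio transport, speech recognition, large-model reasoning, and speech synthesis. The sketch below outlines that flow in Python; it is not Agora's SDK, and every function in it (capture_audio, speech_to_text, query_llm, text_to_speech, stream_to_device) is a hypothetical stand-in.

```python
# Conceptual voice-agent turn for a low-power smart device:
# capture -> ASR -> LLM -> TTS -> playback, with real-time transport at both ends.
# All functions are hypothetical stubs, not an actual vendor SDK.

def capture_audio() -> bytes:
    """Receive a short audio frame uploaded from the device over the real-time network."""
    return b"\x00" * 320  # placeholder PCM frame

def speech_to_text(audio: bytes) -> str:
    """Speech recognition: convert the user's audio into text."""
    return "what's the weather like today"

def query_llm(text: str) -> str:
    """Text processing: let a large model produce the assistant's reply."""
    return "It's sunny and 24 degrees; a good day for a walk."

def text_to_speech(reply: str) -> bytes:
    """Synthesize the reply into audio for playback on the device."""
    return reply.encode("utf-8")  # placeholder for synthesized audio

def stream_to_device(audio: bytes) -> None:
    """Send the synthesized audio back to the device with low latency."""
    print(f"streaming {len(audio)} bytes back to the device")

def handle_turn() -> None:
    audio_in = capture_audio()
    user_text = speech_to_text(audio_in)
    reply_text = query_llm(user_text)
    stream_to_device(text_to_speech(reply_text))

if __name__ == "__main__":
    handle_turn()
```

In a real deployment the heavy steps (recognition, reasoning, synthesis) typically run in the cloud, which is why the low-latency transport on both ends of the pipeline matters so much for the perceived experience.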
Xiaomi's exploration and practice in the IoT operating system field
When discussing the development of the IoT field, Xiaomi is impossible to overlook. Data shows that Xiaomi IoT currently has 822 million connected devices, with 96.9 million monthly active users on the Mijia app and 16.1 million users owning five or more devices. Whether in terms of scale or activity, Xiaomi IoT can be considered a globally leading consumer-grade IoT platform. So, from Xiaomi's perspective, what new evolution will AI + IoT bring?
Du Chao, head of Xiaomi Vela Open Source at Xiaomi, believes that fragmentation is the core pain point of the IoT field. He explained that to solve this fragmentation, Xiaomi began developing its self-developed Vela operating system in 2017, aiming to bridge fragmented IoT applications and provide unified software services across diverse hardware platforms, building the infrastructure for a thriving IoT ecosystem.
After several years of development, Xiaomi Vela is now deployed on more than 50 million devices in the smart wearables and smart home fields. Its five advantages (flexible deployment, cross-terminal interconnection, edge AI, security assurance, and a developer ecosystem) enable device manufacturers to build high-quality smart products with minimal R&D investment and the shortest possible development cycle.
Du Chao stated that the combination of AI and IoT will bring everyone an intelligent experience that was previously hard to imagine. Today's IoT devices are no longer limited to individual use; the broader family scenario requires services to reach a wider audience and permeate the entire human-vehicle-home ecosystem. AI is highly valuable in this trend: it can not only be used to gain insight into various life scenarios and accurately capture and analyze user intent, but also enable seamless collaboration across multiple devices through intelligent orchestration of multiple applications.
In the future, the innovative AI + quick app model is expected to drive a fundamental shift from traditional apps driven by user instructions to proactive services driven by user intent, completely reshaping the cross-terminal experience.
Cloud-edge integration injects intelligent genes into hardware products
As an AI company focused on IoT scenarios, Megvii has deep insight into and extensive practice in the future development of AI + IoT. At the RTE2024 IoT Forum, Shi Zehong, general manager of Megvii's value-added business department, delivered a talk titled "Cloud-edge Integration Injects Intelligent Genes into Hardware Products."
According to Shi Zehong, as a pragmatic leader in the AI industry, Megvii has consistently provided industry users with end-to-end, large-model-based solutions, leveraging its full-stack capabilities across algorithms, systems, and hardware and fully embracing the new wave of AI. On the hardware side, Megvii focuses on sensor-type and robotic hardware products and continues to invest in chip-level sensors and robotics. On the systems side, its self-developed AI productivity platform Brain++ makes large-model deployment more efficient and cost-effective. On the algorithms side, Megvii has continued to build expertise in both general-purpose and industry-specific large models, launching the Megvii Taiyi large model and the algorithm production platform AIS.
For application implementation, one Brain++ algorithm support system, two platforms (the Megvii AIoT platform and the Megvii AI algorithm service platform), and a series of embedded hardware modules together constitute Megvii's "1+2+N" cloud-edge integrated solution.
On site, Shi Zehong also highlighted the Megvii AIoT platform, which enables rapid product development. The platform provides centralized management of devices, data, local algorithm applications, cloud algorithm applications, and large-model applications, offers private-cloud integration for customers, and quickly completes one-stop connection of apps, mini-programs, and devices. Relying on Agora's capabilities, the platform already delivers lower-latency video and algorithm-delivery experiences.
AI drives new consumption upgrades in hardware: baby monitoring, spatial gesture interaction, smart glasses
From changes in interaction paradigms and the reshaping of cross-terminal experiences to cloud-edge integration, we have seen the infrastructure building and technological shifts of the AIoT era. At the level of practical applications, several practitioners also showcased different examples of this evolution.
Starting with Hipapa: for companies working on edge devices, how to better integrate large models and deliver better product experiences is especially important in the AIoT era. Take Hipapa's baby monitor as an example. Supported by AI technology, it already offers functions such as face-covering alerts, crying detection, and sleep monitoring. In the future, AI could turn such products not only into family parenting assistants but also into companions that customize educational content to each child's specific situation, making them more intelligent and more humanized.
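On the device or edge side, a feature like crying detection generally comes down to running a small audio classifier over incoming frames and raising an alert when the score stays high for several consecutive frames. The sketch below illustrates that loop in Python under those assumptions; it is not Hipapa's implementation, and next_audio_frame, crying_probability, and notify_parent are hypothetical stand-ins.

```python
# Minimal sketch of an edge-side monitoring loop with a hypothetical
# on-device audio classifier; all helpers are stubs for illustration only.
import random
import time

def next_audio_frame() -> bytes:
    """Grab the next short audio frame from the microphone (stubbed)."""
    return bytes(random.getrandbits(8) for _ in range(160))

def crying_probability(frame: bytes) -> float:
    """Run a small on-device classifier over the frame (stubbed with a random score)."""
    return random.random()

def notify_parent(message: str) -> None:
    """Push an alert to the companion app (stubbed)."""
    print(f"ALERT: {message}")

def monitor(threshold: float = 0.9, frames: int = 50) -> None:
    consecutive = 0
    for _ in range(frames):
        score = crying_probability(next_audio_frame())
        # Require a few consecutive high-score frames to avoid false alarms.
        consecutive = consecutive + 1 if score > threshold else 0
        if consecutive >= 3:
            notify_parent("possible crying detected, please check the nursery")
            consecutive = 0
        time.sleep(0.01)  # simulate the frame interval

if __name__ == "__main__":
    monitor()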
In the view of Tan Guohao, co-founder of Hipapa, AI's future empowerment of consumer hardware will be comprehensive. For personalized interaction, AIGC technology can analyze user data and behavior to generate content that meets individual needs, making devices more humanized; for example, AI in smart hardware can automatically generate music, stories, or interactive games to proactively soothe children or help them learn. For emotion and need recognition, AI + smart hardware can generate content suited to a specific emotion or need and produce personalized voice dialogue based on a child's mood, proactively improving the user's emotional state.
Tan Guohao believes that AIGC is leading the transformation of smart hardware. AI no longer just passively monitors and responds; it will interact with users by generating meaningful content, driving smart hardware's shift from tool to service.
Live streaming is one of the hottest industries today. Can AI + smart hardware help it break language barriers and innovate the way people interact? An excellent example is the L-Ring 2 demonstrated by Yang Wang, president of the software department at Lianou Technology, at the RTE2024 IoT Forum.
According to the introduction, the latest spatial ring, L-Ring 2, driven by gesture algorithms and empowered by AI capabilities, is Lianou Technology's exploration of integrating AR with hardware. L-Ring 2 not only provides accurate speech recognition and real-time translation for live streaming scenarios but will also integrate technologies such as voiceprint-based speech synthesis, emotion simulation, and lip-sync video synthesis in the future, making live streaming translation more natural and fluid. For interaction, L-Ring 2 lets users interact easily through gestures, eliminating the need for touchscreens and controllers and enhancing the user experience.
It is undeniable that presentations and lectures, large-space interaction, real-time control, intelligent driving, and live streaming interaction are all becoming application scenarios for real-time spatial gesture interaction. Yang Wang also stated that future gesture recognition algorithms combined with AI + RTE will provide an integrated solution for enhanced real-time interaction on live streaming platforms. As a company deeply engaged in large-space technology, Lianou Technology will continue to combine RTE, spatial algorithms, gesture recognition algorithms, AI large models, and other technologies, striving for seamless integration between the real and virtual worlds and a more immersive interactive experience for users.
In addition to products like baby monitors and spatial gesture interaction devices, XR technology, represented by VR/AR, has also seen explosive growth in recent years and is being widely applied across industries. What changes will occur when XR, AI, and the Internet of Things come together?
On site, Xu Weien, technical director at Zuo Zhen, highlighted the AR glasses launched by Zuo Zhen. He pointed out that the glasses can break down barriers of time and space, strengthening connections between people, objects, spaces, and digital content, and have already been applied in scenarios such as smart healthcare, education, and smart buildings. Xu Weien also introduced Zuo Zhen's remote multi-user collaboration solutions, including XR exhibition halls and exhibitions, immersive simulation, remote collaboration, and 5G live streaming. Across these scenarios, experts can use real-time interaction technology to help on-site personnel resolve issues online, truly realizing real-time remote collaboration.
In Xu Weien's view, XR is not just a technology but a transformation. It pushes the potential of GenAI to new heights, allowing people to explore the possibilities between the virtual and real worlds more intelligently. In the future, the combination of the two will bring innovation and improvement to a wide range of fields and industries, continuously unleashing their potential.
It has to be acknowledged that, under the wave of GenAI, the AIoT era is fast approaching. In this new era, the forms of human-machine interaction, the commercial value of products, and profit models will all change. What remains unchanged is Agora's original aspiration: to welcome the new era and embrace new technologies together with practitioners from every scenario and field.