Group intelligence accelerates the "science fictionization" of reality

September 27, 2024

Imagine a future world where artificial intelligence (AI) can read, communicate, and even learn and grow autonomously in varied environments, just like humans. This is no longer the exclusive realm of science fiction; it is precisely what RockAI's edge-side large models are doing.

Drones, robots, PCs, and other terminals equipped with the Yan1.3 multimodal large model can recognize environments in real-time, accurately understand users' vague instructions and intentions, think like humans, and control their mechanical bodies to efficiently complete various complex tasks accordingly.

Amid the global wave of digital transformation, edge-side large models are demonstrating enormous application potential and market demand with their efficiency, security, and personalization.

The Most Powerful "True Edge-Side" Brain

On September 26, RockAI unveiled the Yan1.3 multimodal large model, also known as the Group Intelligence Unit Large Model, capable of lossless operation on devices with varying computing power levels.

Compared to the Yan1.0 large model first released in January this year, Yan1.3 boasts powerful multimodal capabilities, efficiently processing text, images, voice, and other modalities. Furthermore, Yan1.3 optimizes the underlying neural network architecture and achieves modal partition activation through a selection algorithm based on a biomimetic neuron-driven mechanism, enabling offline, lossless deployment of the large model on a wider range of devices, even ordinary CPUs.

In terms of intelligence, the Yan1.3 model outperforms the 8B-parameter Llama3 with only 3B parameters, while consuming less computing power and training and inferring more efficiently.

RockAI also demonstrated the exceptional capabilities of the Yan1.3 multimodal large model across drones, robots, PCs, and other terminals.

Take the Flying Dragon drone as an example. Unlike most drones that rely on cloud-edge collaboration, the Flying Dragon's intelligent brain, Yan1.3, is directly deployed on the device end, enabling instant judgments and responses to critical information and emergencies. Its multimodal processing capabilities allow it to "listen, speak, and see" like humans, supporting intelligent inspections in various environments for applications such as power inspections, security monitoring, and environmental monitoring in urban governance and industrial settings.

RockAI explains that a drone project with a partner vendor addresses the high transmission costs of 5G-A. Typically, drones capture high-quality images and send them back to a flight control center over 5G-A for rapid response, which is expensive. With Yan1.3 deployed on the drone itself, the device can decide which information is worth reporting and which can be processed locally, cutting transmission costs and opening the way to a broader ecosystem.
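The local-versus-upload decision described above can be sketched as a simple on-device triage loop: score each frame locally and transmit only what crosses a threshold. The names, threshold, and scoring function below are illustrative assumptions, not RockAI's actual API.

```python
# Hypothetical sketch: the drone scores each captured frame on-device and
# uploads only frames whose anomaly score justifies the 5G-A transmission cost.
UPLOAD_THRESHOLD = 0.8  # illustrative cutoff, not a real system parameter

def score_frame(frame):
    """Stand-in for an on-device model's anomaly score in [0, 1]."""
    return frame.get("anomaly", 0.0)

def triage(frames):
    to_upload, handled_locally = [], []
    for frame in frames:
        if score_frame(frame) >= UPLOAD_THRESHOLD:
            to_upload.append(frame)        # worth reporting to the control center
        else:
            handled_locally.append(frame)  # processed and discarded on-device
    return to_upload, handled_locally

frames = [{"id": 1, "anomaly": 0.95}, {"id": 2, "anomaly": 0.10}, {"id": 3, "anomaly": 0.85}]
up, local = triage(frames)
print([f["id"] for f in up])  # frames 1 and 3 are uploaded; frame 2 stays local
```

The same pattern generalizes: the cheaper the on-device scoring model, the more aggressively transmission can be pruned.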

The Flying Dragon drone, designed for individual users, is also highly versatile. It can be widely used in outdoor scenarios like AI photography, travel, and mountain biking, not only exploring environments and planning itineraries but also freeing up hands to act as a "photo partner," automatically capturing the best angles and selecting the best photos.

Group intelligence stems from research on the collective behavior of social insects like ants and bees, referring to the collective wisdom and decision-making capabilities formed through collaboration and interaction among multiple individuals. The group is decentralized and self-organizing, making decisions or completing tasks through information sharing and collective action among individuals. The higher the intelligence level of individuals, the stronger the group intelligence performance, which in turn enhances the intelligence of each individual in the group.

Drawing inspiration from biological group intelligence, RockAI has developed a technical approach to create a Group Intelligence Unit Large Model based on the Yan architecture, injecting unique intelligent genes into each device.

The Rise of Edge-Side AI

The emergence of edge-side AI is driven by both technological advancement and market demand. The increasing complexity of cloud-based large models, high computing costs, concerns over data privacy and security, and the urgent need for personalized services have together fueled its development.

Edge-side AI large models refer to large-scale AI models operating on device ends, typically deployed on local devices such as smartphones, IoT devices, PCs, and robots. Compared to traditional cloud-based AI large models, edge-side AI models require fewer parameters, reducing network dependency, protecting user privacy, and minimizing latency in data transmission and processing.

Over the past few years, edge-side AI has transitioned from theoretical exploration to practical application, with rapid global development. In the first half of last year, Google launched "Gecko," a lightweight version of PaLM 2 capable of running offline on mobile devices, an early milestone for edge-side large models.

Since then, French startup Mistral AI has released the Mixtral 8x7B model, Microsoft its cost-effective Phi-3 series of small language models, and Google the Gemma family to compete with Meta's Llama 2, while Apple actively advances edge-side AI development.

Domestic players like Mianbi Intelligence's MiniCPM-Llama3-V2.5 and SenseTime's SenseChat-Lite have also demonstrated strong competitiveness in the field of edge-side AI.

Despite their multiple advantages, deploying edge-side large models faces numerous challenges, particularly performance loss and limited learning capabilities due to model quantization, compression, and pruning, which can lead to operational instability.

For instance, popular AIPC solutions deploy Transformer-based models onto personal computers through quantization and compression, yet still require custom PC chips to handle roughly 7 billion parameters. Enhancing model efficiency while preserving accuracy thus remains a significant obstacle to the practical application of edge-side AI.
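The "performance loss" the preceding paragraphs attribute to quantization can be made concrete with a minimal sketch: rounding float32 weights to int8 shrinks storage roughly fourfold but introduces a nonzero reconstruction error. This is generic symmetric per-tensor quantization, purely illustrative, not any vendor's pipeline.

```python
# Minimal post-training quantization sketch showing the accuracy trade-off.
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization of float32 weights to int8."""
    scale = max(np.abs(w).max() / 127.0, 1e-8)
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)   # toy weight tensor
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).mean()
print(f"mean abs reconstruction error: {err:.4f}")  # small but nonzero: the lossy part
```

Stacked across every layer of a multi-billion-parameter model, these small per-tensor errors are what degrade accuracy, which is why "lossless" edge deployment is presented as a differentiator.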

As technology advances and market demands diversify, edge-side AI no longer solely pursues lightness and low power consumption but moves towards efficiency, personalization, and secure controllability.

Amid this revolutionary wave, RockAI's unique innovative technology and practices pave new paths for edge-side AI's broader application scenarios.

Facing the Challenges of Edge-Side AI, RockAI's Core Strategy Lies in Revolutionizing Traditional Architectures

In January this year, RockAI released the Yan1.0 model, China's first general-purpose large model without an attention mechanism. Abandoning the Transformer, the Yan architecture demands far less computation, achieving breakthroughs in model efficiency and cost control while significantly improving inference throughput.

After iterations, RockAI's Yan1.2 large model, launched in July, can run "natively and losslessly" on a Raspberry Pi with only one-eighth of the computing power of an ordinary PC, at a speed of 6+ tokens/s.

Successful operation on low-end devices like the Raspberry Pi proves that RockAI's edge-side models can maintain high performance while adapting to a wider range of hardware environments, truly "born for devices."

Crucially, RockAI's edge-side large models elevate user privacy protection and personalized services to new levels. With AI technology increasingly prevalent, users are more concerned than ever about data privacy. RockAI's edge-side deployment strategy processes data locally, significantly enhancing user data security.

In recent years, as computing power and AI technology have advanced, edge-side AI has begun to play a vital role in smartphones, smart homes, autonomous driving, and other fields. RockAI is accelerating the commercialization of edge-side AI and exploring the adaptability and applications of edge-side large models.

Cross-Domain Integration for Deep Empowerment

RockAI targets device-side deployment to probe the lower limits of its models, demonstrate that they run on mid- to low-end hardware, and make large models commercially viable and accessible. Initially, Yan models were deployed on Raspberry Pis with minimal computing power; subsequent device-adaptation requests came largely from partners and clients.

For instance, Panghu robots, equipped with powerful multimodal cognitive abilities, can accurately understand vague instructions offline and efficiently complete complex tasks under Yan1.3's "brain" control, demonstrating feats such as composing a poem within seven steps and performing Wing Chun kung fu. Running on the fifth-generation Raspberry Pi chip, notable for its modest computing power, these robots nonetheless achieve remarkable multimodal capabilities. Currently, RockAI's guide robots for entertainment and culture enterprises offer basic functions such as introductions, scene descriptions, and free Q&A interactions.

RockAI's on-site demonstration of the XunTu Smart Assistant (AIPC) showcased its offline capabilities to understand human speech, recognize images, and quickly search for related content, accurately fulfilling vague instructions like "record and organize meeting minutes" and "delete all pictures of orange cats," safeguarding user privacy while precisely grasping user intent.

At WAIC 2024 in July, RockAI's intelligent robot "Xiaozhi" also demonstrated impressive multimodal interaction abilities. For instance, it recognized and responded to vague instructions like "move aside, I need to put something down" and even tackled complex tasks requiring coordination between the brain and body, such as "compose a poem on maple leaves in four steps," with remarkable success.

Behind these capabilities, RockAI's lossless deployment of large models onto edge devices involved disruptive innovations in large model architectures and various groundbreaking techniques.

RockAI pioneered the concept of "synchronous learning," allowing models to learn and update in real time during inference, without separate retraining or offline pre-training, so that large models can build their own knowledge systems much as humans do.

This breakthrough promises to endow edge-side AI large models with "autonomous learning" capabilities, enabling each device to optimize itself in real-time based on user needs and environmental changes, fostering truly personalized intelligence.
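The "learn during inference" idea above can be sketched with generic online gradient descent: the model serves a prediction and, whenever feedback is available, folds it into its weights in place. This is a textbook online-learning sketch under my own assumptions, not RockAI's synchronous-learning algorithm.

```python
# Hedged sketch of learning-while-inferring: each request both produces a
# prediction and (when feedback exists) updates the weights, with no separate
# retraining phase. Generic online SGD on a toy linear model.
import numpy as np

class OnlineLinearModel:
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.lr = lr

    def infer_and_learn(self, x, y_true=None):
        y_pred = self.w @ x                               # inference step
        if y_true is not None:                            # feedback available:
            self.w += self.lr * (y_true - y_pred) * x     # one in-place SGD step
        return y_pred

model = OnlineLinearModel(dim=2)
for _ in range(200):                 # the device adapts as it serves requests
    model.infer_and_learn(np.array([1.0, 2.0]), y_true=5.0)
print(round(float(model.infer_and_learn(np.array([1.0, 2.0]))), 2))  # → 5.0
```

The appeal on-device is that each unit drifts toward its own user's data, which is exactly the personalization the paragraph describes.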

To achieve lossless adaptation on devices like Raspberry Pis, RockAI further optimized the underlying backpropagation algorithm of artificial neural networks based on its proprietary Yan architecture, aiming to reduce costs and increase efficiency.

Inspired by the partitioned activation of human brain neurons, RockAI adopted a selection algorithm driven by biomimetic neurons, realizing brain-like partitioned activation. This mechanism activates model partitions based on learning types and knowledge scopes, reducing data training volume and harnessing multimodal potential. Consequently, by iteration to version 1.2, the model can now run losslessly on PCs, smartphones, Raspberry Pis, robots, and other devices.
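The brain-like partitioned activation described above resembles conditional computation: a router selects a small subset of "partitions" per input, so only a fraction of the network runs. The top-k routing rule and all names below are my assumptions for illustration, not the Yan architecture's actual mechanism.

```python
# Illustrative sketch of partitioned activation via top-k routing: of
# N_PARTITIONS sub-networks, only TOP_K are computed for a given input.
import numpy as np

rng = np.random.default_rng(0)
N_PARTITIONS, DIM, TOP_K = 8, 16, 2

router_w = rng.normal(size=(DIM, N_PARTITIONS))                  # routing weights
partitions = [rng.normal(size=(DIM, DIM)) for _ in range(N_PARTITIONS)]

def forward(x):
    scores = x @ router_w                       # one routing score per partition
    active = np.argsort(scores)[-TOP_K:]        # activate only the top-k partitions
    out = np.zeros(DIM)
    for i in active:
        out += scores[i] * (x @ partitions[i])  # weighted sum of active outputs
    return out, active

x = rng.normal(size=DIM)
out, active = forward(x)
print(f"{len(active)} of {N_PARTITIONS} partitions activated")  # 2 of 8
```

Because only TOP_K of the N_PARTITIONS matrices are ever multiplied, per-input compute stays roughly constant as partitions are added, which is the property that makes such schemes attractive on low-power hardware.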

Envisioning future application scenarios, we can foresee edge-side AI integrating into smart homes, learning and adapting device operation modes to family members' habits and needs for intelligent home control. In education, edge-side AI can become students' learning partners, offering personalized learning content and guidance tailored to their progress and characteristics. In healthcare, edge-side AI can assist doctors in diagnosis and treatment, providing health management and rehabilitation guidance through the analysis of vast medical data.

It is foreseeable that RockAI's edge-side AI robots and the Yan series of large models behind them are empowering smart devices with "brains," preparing for the in-depth empowerment of various industries through "AI+".

Accelerating the Emergence of Group Intelligence

Traditionally, group intelligence focused on simply combining individual intelligence, yielding group intelligence that scales linearly with the number of agents. For instance, assembling multiple drones lifts a heavier load than any single drone can.

However, RockAI pursues a more synergistic form of group intelligence, emphasizing the complementary strengths of different individuals; combined, these complementary abilities can grow group intelligence exponentially. RockAI thus aims to foster group intelligence rather than a "god-like" superintelligence of the kind OpenAI pursues.

Liu Fanping, CEO of RockAI, stated, "Group intelligence is the key path to general AI. By building a Group Intelligence Unit Large Model with the Yan architecture, RockAI seeks not just to enhance individual devices but to inject a new, fundamental intelligent gene into machines, endowing them with robust environmental adaptability and autonomous learning capabilities. When deployed, each Yan-equipped device becomes an intelligent unit, fostering the emergence of group intelligence through continuous collaboration and interaction."

As group intelligence emerges, human-machine collaboration will transform, with machines serving humans more effectively while negotiating task completion among themselves. This future of group intelligence holds more potential than we can imagine.

Behind a series of technological innovations and application explorations, RockAI has not only achieved technological breakthroughs but also redefined the core value of large models in the intelligent era for commercial applications and ecosystem building.

Currently, RockAI has implemented B-end business, expanding the Yan architecture's large model in medical, financial, energy, and telecommunications sectors for in-depth empowerment. Meanwhile, Yan model commercialization is gradually shifting from B-end to C-end, leveraging hardware-software integration to seize opportunities in the untapped C-end market.

Leveraging the core advantage of Yan architecture's "lossless adaptability" to a full range of terminals, RockAI has initiated cooperation with chip and terminal manufacturers to break hardware limitations and create more Yan-based intelligent units, enabling more consumers to equally enjoy "true edge-side intelligence."

Regarding Yan's future, RockAI aims to build a universal AI operating system based on the Yan architecture, positioning Yan as an essential platform akin to Windows, Android, or iOS, enabling all developers to build applications upon it.

When deployed on smartphones, robots, and other diverse devices, Yan models will become personal companions, learning and serving individuals based on their habits, increasingly offering personalized value. This will empower smart homes, including smartphones, computers, TVs, and speakers, with greater adaptability and highly personalized interaction capabilities, fostering an interactive, diverse intelligent ecosystem like group intelligence.

Simultaneously, RockAI faces challenges in deepening synchronous learning, integrating all modalities, and ensuring data security and privacy protection while maintaining personalization. These challenges drive AI evolution, pushing RockAI and the entire industry forward towards a smarter, more personalized AI era.

Disclaimer: the copyright of this article belongs to the original author. It is reprinted solely to share information more widely. If the author's information is marked incorrectly, please contact us promptly to amend or delete it. Thank you.