Trend | Offline + Memory: Offline Large Models Poised to Shape the Next Era of General AI

08/14/2025

Preface:

In the digital age, large model technology is evolving at breakneck speed, emerging as a pivotal force driving transformation across industries.

In 2025, the large model field underwent a fresh wave of explosive growth and structural refinement. From technological advances and commercial applications to market competition and industrial ecosystem development, remarkable new trends emerged. Notably, offline large models drew significant attention at this year's World Artificial Intelligence Conference (WAIC).

Author | Fang Wensan

Image Source | Network

Multimodal Fusion: Expanding the Horizons of Perception

In the first half of 2025, large models achieved significant breakthroughs in multimodal technology, transitioning from text-only interaction to full-modal fusion encompassing text, image, audio, and video.

This advancement enables large models to comprehend and process information more comprehensively and accurately, offering users a richer and more intuitive interactive experience.

Gartner forecasts that by 2027, 40% of generative AI solutions will adopt multimodal technology, a substantial increase from 1% in 2023.

In China, Volcano Engine unveiled new models such as Doubao Large Model 1.6 and video generation model Seedance 1.0 pro in June 2025. Notably, the Doubao 1.6 series supports multimodal understanding and graphical interface operation, allowing users to interact with the model via images and speech, significantly broadening application scenarios.

SenseTime's SenseNova large model has iterated continuously, strengthening its multimodal processing capabilities: from the natively fused multimodal version launched in January to the V6 upgrade in April, it achieved breakthroughs in multimodal reasoning.

Kuaishou's Kling AI has built a multimodal creative productivity platform. Since its launch over a year ago, it has generated 168 million videos and 344 million images, injecting new energy into the content creation landscape.

Offline Large Models: Independent Operation and Memory Innovation

Amidst the burgeoning development of large model technology, offline large models have emerged as a new industry focus due to their unique advantages.

Their standout feature is the ability to operate independently without network connectivity, effectively mitigating service disruptions caused by unstable or interrupted networks.

In wilderness exploration, remote-area operations, and regions with poor network coverage, devices running offline large models can provide intelligent services reliably.

From a technical standpoint, offline large models facilitate localized model deployment, with data processing and computation occurring on local devices, significantly bolstering privacy and security.

Industries with stringent data confidentiality requirements, such as healthcare and finance, can leverage offline large models to ensure sensitive data remains local, reducing the risk of data breaches.

Showcasing Innovations at WAIC 2025

RockAI introduced the latest version of its non-Transformer architecture large model, Yan 2.0 Preview, which excels in offline and memory functionalities.

It breaks with convention by extending the deployment reach of offline large models down to thousand-yuan-class devices, enabling low-configuration hardware to perform real-time AI computation offline.

The newly incorporated "memory module" represents a significant breakthrough, akin to the hippocampus in the human brain, capable of storing crucial information during the learning process and swiftly retrieving it in new scenarios.
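To make the idea concrete, below is a toy sketch of such a store-then-recall memory mechanism. It is purely illustrative and is not RockAI's implementation: the MemoryModule class, the bag-of-words similarity, and the stored examples are all invented for this article.

```python
# Illustrative sketch only: a toy "memory module" that stores learned experiences
# and recalls the most relevant one in a new situation. This is NOT Yan 2.0's
# actual mechanism, just a minimal analogy for the hippocampus-like idea above.
from collections import Counter
import math


def _bow(text: str) -> Counter:
    """Bag-of-words vector for a piece of text (toy stand-in for an embedding)."""
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class MemoryModule:
    """Stores (situation, learned response) pairs and recalls the closest match."""

    def __init__(self):
        self.memories = []  # list of (vector, situation, response)

    def store(self, situation: str, response: str) -> None:
        self.memories.append((_bow(situation), situation, response))

    def recall(self, new_situation: str):
        query = _bow(new_situation)
        best = max(self.memories, key=lambda m: _cosine(query, m[0]), default=None)
        return best[2] if best else None


memory = MemoryModule()
memory.store("owner says shake hands", "raise front-left paw for two seconds")
print(memory.recall("please shake hands with me"))  # -> the learned action is recalled
```

A real system would replace the bag-of-words vectors with learned embeddings and integrate the memory into the model's forward pass, but the store-then-recall pattern is the same.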

In a live demonstration, a robot dog equipped with this model executed a newly learned custom action acquired through the memory module, without any preset program. This capacity for continuous evolution and autonomous decision-making is difficult to achieve with traditional large models.

Google DeepMind's Gemini Robotics On-Device model also garnered considerable attention. As a VLA base model tailored for dual-arm robots, it can directly interpret natural language instructions and drive the robot to perform corresponding actions.

Its core strength lies in its ability to operate offline locally on the robot while concurrently handling visual recognition, language understanding, and action execution tasks. In scenarios demanding high real-time performance and stability, such as medical operations, disaster relief, and factory automation, it effectively circumvents delays and potential risks associated with cloud transmission.
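As a rough illustration of what running the full perception-understanding-action loop on the robot means, the sketch below shows a minimal on-device control loop. Every function name in it (capture_frame, local_vla_model, send_joint_commands) is a hypothetical stand-in invented for this article; none of them belongs to the Gemini Robotics On-Device API.

```python
# Hypothetical sketch of an on-device vision-language-action (VLA) control loop.
# All functions below are placeholders invented for illustration; they are not
# part of Google's Gemini Robotics On-Device interface.
import time


def capture_frame():
    """Placeholder: read an image from the robot's onboard cameras."""
    return "frame"


def local_vla_model(frame, instruction):
    """Placeholder: the on-device VLA model maps (image, instruction) -> joint targets."""
    return {"left_arm": [0.0] * 7, "right_arm": [0.0] * 7}


def send_joint_commands(action):
    """Placeholder: push joint targets to the dual-arm controller."""
    print("executing", action)


def control_loop(instruction: str, steps: int = 3, hz: float = 10.0) -> None:
    """Perception -> reasoning -> action, all on the robot, with no cloud round trip."""
    for _ in range(steps):
        frame = capture_frame()
        action = local_vla_model(frame, instruction)
        send_joint_commands(action)
        time.sleep(1.0 / hz)  # hold a fixed control rate


control_loop("fold the towel on the table")
```

The point of the sketch is the latency argument: because every step runs locally, the loop's cycle time is bounded by on-device compute rather than by network round trips.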

Furthermore, Google's open-source AI Edge Gallery allows users to run large models locally on their phones, completely offline and free of charge. It supports downloading various large models from Hugging Face, enabling functions like chatting, image recognition, code generation, and text reasoning without an internet connection, catering to users' needs for AI in terms of privacy protection, local computing power utilization, and usage in weak network environments.
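The Gallery itself is an Android app, but the same "download once, then run fully offline" workflow can be sketched on a PC with the Hugging Face transformers library. The snippet below is a minimal example under the assumption that the chosen model was already downloaded into the local cache during an earlier online run; the model name is only an example, and any locally cached text-generation model would do.

```python
# Minimal sketch of offline local inference with Hugging Face transformers.
# Assumption: the model weights were downloaded (while online) on a previous run
# and now sit in the local cache; with TRANSFORMERS_OFFLINE=1 set, later runs
# never touch the network. Requires: pip install transformers torch
import os

os.environ["TRANSFORMERS_OFFLINE"] = "1"  # use only the local cache, no downloads

from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",  # example small model; swap in any cached model
)

result = generator("Explain what an offline large model is.", max_new_tokens=64)
print(result[0]["generated_text"])
```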

Intel showcased a locally deployed large language model with a three-in-one architecture, capable of operating offline. For Chinese-language tasks it uses the ChatGLM2-6B model, with 6.2 billion parameters, 32K context support, and pre-training on more than 1.4 trillion Chinese and English tokens, giving it robust performance and reasoning ability.

It also locally deploys the StarCoder-15.5B model, which specializes in programming languages, enabling code generation and understanding for programmers in offline environments.

These companies' achievements not only demonstrate the technological breakthroughs of offline large models but also indicate their vast application potential in diverse fields, including future smart devices, industrial production, and personal privacy protection.

Conclusion: Are We Prepared for the "Final Frontier" of Intelligence?

When an autonomous mobile intelligent agent can operate offline in our physical world, the imaginative possibilities it unleashes are boundless. From elderly care companions at home, to surgical assistants capable of delicate operations, to rescue team members venturing into hazardous environments, the application boundaries of robots have expanded as never before.

However, this also presents new considerations. When a machine's decision-making process is entirely local and becomes less transparent and controllable, how do we ensure the safety and reliability of its actions? When a robot can autonomously learn and act without external supervision, how do we define the boundaries of responsibility?

Google DeepMind's stride is undoubtedly a significant step towards general AI, opening the door to intelligence in the physical world. Yet, behind this door lie both unprecedented opportunities and challenges that we must navigate with prudence. This is not merely an issue for engineers and scientists but a future that each of us must start contemplating.

Content Source:

36Kr: Offline + Memory, a Turning Point in the Evolution of Large Models

Ga Yi Long Technology Teahouse: Google's Offline Robot Model, Transitioning from "Cloud Sage" to "Ground Practitioner"

Medical Device Notebook: Network Disconnection No Barrier, Google Unveils Embodied Intelligence Offline Model, Supporting Localized Deployment

Statement: The copyright of this article belongs to the original author. It is reprinted solely to share more information. If the author's information is marked incorrectly, please contact us promptly so that we can correct or delete it. Thank you.