AI Smartphones Haven't Gone Mainstream Yet, But Agent Smartphones Are Already Taking Off? Breaking Down the AI Strategies of Eight Smartphone Giants

06/22 2026

AI smartphones are evolving into Agent smartphones.

A recent social media clash in the tech community has shattered the illusion surrounding major manufacturers in the AI era.

A Xiaomi engineer openly complained on Weibo: "Now, when it comes to large models, some companies only compete in terms of volume and sentiment, engaging in bundling tactics." Onlookers speculate that this jab is aimed at Huawei's Pangu large model, which had just made a high-profile announcement at HDC 2026, with Yu Chengdong reassuming leadership and vowing to take the industry by storm.

Image Source (Lei Technology)

To outsiders, this may seem like everyday bickering, but for Lei Technology (ID: leitech), these complaints strike at the heart of the collective anxiety in the 2026 smartphone AI battleground.

Everyone realizes that simply piling on cloud-based parameters and chasing benchmark scores is no longer effective. For smartphone AI to survive today, it has to answer two questions:

One is whether on-device AI can withstand the computational demands locally;

The other is whether system-level agents can truly assist users by breaking down the barriers between apps.

Following these two threads and combining nearly six months of hands-on testing and industry observations, Lei Technology (ID: leitech) has uncovered the underlying strategies of major manufacturers to see what cards each holds in this new AI battle.

At HDC 2026, Huawei's approach demonstrated not just breakthroughs in individual technologies but a rare "full-stack, all-scenario" systematic strategy within the industry.

From foundational computing chips to top-level AI applications, Huawei is building an ecosystem barrier on a technology chain it controls end to end.

Leveraging the Ascend computing architecture and a comprehensive cloud-based computing infrastructure, Huawei provides a steady stream of data throughput to support the continuous evolution of large models.

On the device side, Huawei deeply integrates "Kirin chip affinity" technology and follows the "Tao's Law" of energy efficiency and computing power synergy, fitting a native 30B on-device model (with 2B active parameters) into regular RAM. Through quantization, pruning, and expert prediction algorithms, the phone achieves low-latency responses for frequent, small tasks running locally while avoiding overheating and excessive power consumption.
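To see why a 30B model with 2B active parameters is plausible in phone RAM, a back-of-the-envelope estimate helps. The sketch below is only arithmetic on the two parameter counts quoted above. The bit widths are assumptions (the article does not say which precision Huawei uses), and the estimate covers weights only, ignoring the KV cache, activations, and runtime overhead.

```python
def weight_footprint_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate storage for model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

TOTAL_B = 30   # total parameters in billions, from the article
ACTIVE_B = 2   # parameters active per token in billions, from the article

# Assumed bit widths for illustration; the article does not state one.
for bits in (16, 8, 4):
    total = weight_footprint_gib(TOTAL_B, bits)
    active = weight_footprint_gib(ACTIVE_B, bits)
    print(f"{bits:>2}-bit: all weights ~{total:.1f} GiB, active per token ~{active:.1f} GiB")
```

Under these assumptions, the full weight set only approaches phone-sized memory at low precision, while the weights touched per token stay small at any precision. That is consistent with the article's point that quantization and expert prediction are what make local operation practical.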

As the intellectual core of the entire system, the openPangu 2.0 large model once again showcases its technical depth this year. It supports long contexts of up to 512K and plans to gradually open-source seven core components, enabling on-device and cloud collaboration.

At the system level, HarmonyOS stands as the only fully self-developed system in China. The newly released HarmonyOS 7 places the "Agent-affinitive system architecture" at its core, directly reconstructing the relationship between applications and the system. It can disassemble and reorganize traditional applications into readily callable Skills and agents, enabling one-step service access.

Xiao Yi, the system agent at the forefront, now sees 3 billion daily activations, boasting over 2,100 system-level capabilities and 500+ partner-selected Skills. With HarmonyOS 7, Xiao Yi transforms into a super scheduler with spatial awareness, capable of executing complex cross-device tasks.

Huawei's greatest advantage lies in its ability to finally connect chips, systems, ecosystems, and AI into a closed loop, providing users with multiple devices a highly cohesive Agent experience and further solidifying the barriers of the Harmony ecosystem.

Apple's Apple Intelligence and the new Siri AI unveiled at WWDC 2026, while still a mirage for Chinese users, showcase a high level of system-layer integration. Apple's core large model, AFM, is essentially a modified and privatized version of Google's Gemini recipe.

Apple has delved deeply into interaction and system permissions. Siri AI now features an independent interaction history app and multimodal screen perception capabilities. It can calculate shared expenses directly from on-screen bills, and it can search across apps and personal emails to automatically plan a three-day trip.

At the application layer, Safari and Shortcuts introduce Vibe Coding. The Photos app can even use on-device models to re-render 2D photos into spatial composition wallpapers with Z-axis depth information.

With a stringent Private Cloud Compute (PCC) architecture, Apple maintains its privacy standards. By leveraging Google's brainpower to boost its own intelligence while firmly controlling system scheduling, Apple remains as calculating as ever.

Combined with its software ecosystem influence, Siri AI may become the fastest-growing Agent.

Xiaomi is the manufacturer diving deepest into on-device computing power in 2026. This year, the company invested over 16 billion yuan in AI and just released the Xiaomi MiMo-V2.5-Pro flagship base model, with on-device active parameters soaring to 42B.

To adapt such a massive model to more devices, Xiaomi specializes in FP4 (4-bit floating-point) quantization technology. It preserves as much native inference accuracy as possible while drastically compressing model size, with a specially tuned version achieving generation speeds of up to 1,000 tokens/s on general-purpose GPUs.
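As a rough illustration of what 4-bit floating-point quantization does to a weight, here is a minimal sketch. It assumes the common E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit) and a single shared scale per block of weights. The article does not describe Xiaomi's actual format or scaling scheme, so the function names and the sample block are illustrative only.

```python
# Magnitudes representable by a 4-bit E2M1 float (assumed layout, not confirmed by the article).
FP4_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)

def quantize_block(weights):
    """Map one block of floats to FP4 grid values plus a shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Scale so the largest weight lands on the largest representable magnitude.
    scale = max_abs / FP4_GRID[-1] if max_abs > 0 else 1.0
    codes = []
    for w in weights:
        target = abs(w) / scale
        nearest = min(FP4_GRID, key=lambda g: abs(g - target))
        codes.append(-nearest if w < 0 else nearest)
    return codes, scale

def dequantize_block(codes, scale):
    """Recover approximate weights from grid values and the block scale."""
    return [c * scale for c in codes]

# Hypothetical block of weights, chosen only to exercise the grid.
block = [0.12, -0.40, 0.03, 0.90, -0.75, 0.00, 0.33, -0.06]
codes, scale = quantize_block(block)
restored = dequantize_block(codes, scale)
max_err = max(abs(a - b) for a, b in zip(block, restored))
print("codes:", codes)
print("scale:", scale, "max abs error:", max_err)
```

Each weight shrinks from 16 or 32 bits to 4 bits plus a small share of the block scale. The price is rounding error, which is why the accuracy-preserving part of such a scheme matters as much as the compression.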

With computing power as its foundation, Xiaomi previously initiated a small-scale closed beta for Xiaomi miclaw, a native mobile agent. It operates at the system's core, capable of invoking over 50 system-level tools. For instance, upon receiving a ticket purchase SMS, it can automatically read the message, create a calendar event, set an alarm, check the weather, and even pre-open a transit code—all in seven steps, requiring only your final confirmation.
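The ticket example above is essentially a planned chain of tool calls gated by one user confirmation. The sketch below models that shape only. The tool names, the `Plan` and `Step` types, and the `confirm` callback are all hypothetical and are not miclaw's real interface. It covers the five actions the article names rather than all seven steps.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Step:
    tool: str          # hypothetical tool identifier
    description: str   # what the step does, shown to the user

@dataclass
class Plan:
    trigger: str
    steps: list[Step] = field(default_factory=list)

def plan_from_ticket_sms(sms: str) -> Plan:
    """Build the chain of actions the article describes for a ticket SMS."""
    return Plan(trigger=sms, steps=[
        Step("sms.read", "Read the ticket message"),
        Step("calendar.create_event", "Create a calendar event for the trip"),
        Step("alarm.set", "Set a departure alarm"),
        Step("weather.lookup", "Check the weather"),
        Step("transit.prepare_code", "Pre-open the transit code"),
    ])

def run(plan: Plan, confirm: Callable[[Plan], bool]) -> list[str]:
    """Run the plan only if the user approves it; return the tools that ran."""
    if not confirm(plan):
        return []
    return [step.tool for step in plan.steps]

plan = plan_from_ticket_sms("Your train ticket is confirmed.")
print(run(plan, confirm=lambda p: True))
```

Keeping the confirmation as a single callback at the end mirrors the article's description: the agent does the assembling, and the user only approves the result.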

Even more formidable, it fully integrates with the Mi Home IoT ecosystem, accessing and scheduling over 1 billion smart devices.

Instead of competing solely on software applications, Xiaomi leverages Agents to activate its vast Mi Home hardware ecosystem—this is its unique moat.

OPPO, which two years ago boldly declared "all in on AI smartphones," finally reconstructed its system with a cohesive AI strategy at ODC25 and ColorOS 16. Instead of chasing parameter counts, OPPO introduced three technological pillars:

On-Device Compute achieves peak theoretical performance of 300 tokens/s locally and supports 128K long contexts;

PersonaX memory symbiosis engine builds multimodal "lifelong memories" for users;

Agent Matrix intelligent agent ecosystem framework empowers Xiaobu with cross-device task execution capabilities.

At the functional level, activating "One-Tap Flash Note" while watching Bilibili allows AI to generate outlines and mind maps nearly in real-time. Clicking on the outline timeline instantly jumps back to the corresponding video segment. One-tap bookkeeping and meal code recording via image recognition are also practical, complete with exclusive dynamic icons.

For ordinary users, these "small joys" that save daily hassles and grow more intuitive with use are far more perceptible than model computing power and technical details.

As one of the earliest domestic manufacturers to dive deep into self-developed large model matrices, vivo has been sprinting toward lightweight on-device AI since releasing the Blue Heart large model in 2023.

Vivo understands user pain points: Does AI become useless without an internet connection? Through Xiao V Memory 2.0, vivo constructs a fully offline knowledge graph directly on the phone. Even without an internet connection and with absolute privacy protection, Blue Heart Xiao V can still accurately retrieve information from vast photos and complex files.

As mentioned earlier in Lei Technology's (ID: leitech) hands-on testing, while budget phones struggle with large models, the flagship vivo X300 Pro can handle complex image recognition in just 32 seconds. This deep expertise in computing power scheduling generates high expectations for the upcoming on-device AI foldable vivo Fold 6.

In 2026, Honor largely avoids discussing its large model parameters, instead focusing on clever strategies to reconstruct underlying interactions and hardware forms.

In terms of hardware, the Robot Phone showcased at MWC 2026 features a miniature three-axis mechanical stabilization gimbal on its back, allowing the lens to turn like a neck to track subjects automatically or even move to the rhythm of music. It offers a physical approach to interaction in a field of increasingly similar imaging flagships.

On the system side, the YOYO agent, based on the AHI (Personal + Global + Edge Collaboration) strategy, can automatically execute over 3,000 scenarios and was the first to integrate with WeChat's A2A protocol.

Honor's strategy of avoiding direct competition and using system-level scheduling to connect third-party vertical large models has proven surprisingly smooth in breaking down app silos.

As the parent of the Android ecosystem, Google's ambitions in the AI era extend far beyond being just an app—it aims to fully control the system foundation.

In on-device deployment, Google introduced the Gemma 4 model, designed for completely offline operation, and tested the Mobile Actions feature in Google AI Edge Gallery, attempting to convert natural language instructions directly into system-level operations.

While Lei Technology's (ID: leitech) earlier hands-on testing revealed poor performance on budget phones, this is actually Google setting industry standards: it uses system-level software requirements to pressure chip manufacturers like Qualcomm and MediaTek to bring NPU computing power down to mid-to-low-end chips faster.

Google's most formidable trump card lies in its ecological dominance. With the Google suite above and the Android ecosystem below, coupled with the strong capabilities of Gemini itself—when Apple needs deep integration with Gemini and when Android flagships use it as the core brain for all-scenario Agents—Google has already secured a dominant position.

Google is not just setting system-level scheduling standards for on-device AI; it's reissuing tickets for the mobile ecosystem of the next decade.

In the race for on-device large models and Agents, Samsung takes a highly pragmatic approach: it started late, so why not outsource?

In overseas markets, Samsung has tied itself closely to Google, using the Gemini large model as the foundation for the Galaxy S26 series. During MWC demonstrations, its Agent could scan family group chats in the background, detect discussions about ordering pizza, and automatically open food delivery apps to add items to the cart, pausing only for user confirmation before checkout.

In the Chinese market, to comply with regulations, Samsung flexibly integrates AI services from domestic giants like Baidu's ERNIE Bot and Meitu.

While this may seem like piecing together capabilities from various sources, Samsung's expertise in refining the user experience shines through.

Whether it's circle-to-search, real-time call translation, or intelligent photo editing, Samsung seamlessly integrates these seemingly patchwork features. As long as they work smoothly with its top-tier hardware to keep consumers satisfied, the underlying engine doesn't matter.

Running large models directly on-device sounds ideal—no network dependency, zero latency, and absolute privacy protection.

But the reality is that flagship phones enjoy the benefits of on-device AI, while mid-to-low-end devices suffer.

In April, Google released the Gemma 4 model for completely offline mobile operation, receiving widespread acclaim in tests run on high-end flagship phones.

However, when Lei Technology (ID: leitech) tested it on the vivo Y500 Pro, powered by a mid-range Dimensity 7400 chip with NPU 655, the results were eye-opening.

Recommendations became a minefield of irrelevant information: When asked to suggest movies for a long high-speed rail ride, it churned out 500 words locally, taking 2.8 minutes and ending with unnecessary reminders to bring headphones.

Logic puzzles left it stumped: Solving a seating arrangement problem, it calculated on-screen for 3.3 minutes (without allowing background tasks) and still produced an incorrect answer.

Image recognition crashed the system: Given a photo of a large mall, it failed to recognize the prominent Apple Store sign. Presented with a green plant image, it got stuck loading for 5 minutes before crashing.

In contrast, the same model on the flagship vivo X300 Pro solved the logic puzzle in 1.6 minutes and recognized images in just 32 seconds.
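The timings above can be turned into rough rates to make the gap concrete. This is simple arithmetic on the figures reported in the test. The 500 "words" are taken at face value (they may be characters in the original Chinese output), so the result is best read as an order of magnitude.

```python
def words_per_second(words: int, minutes: float) -> float:
    """Average generation rate over the whole response."""
    return words / (minutes * 60)

def speedup(slow_minutes: float, fast_minutes: float) -> float:
    """How many times faster the second device finished the same task."""
    return slow_minutes / fast_minutes

# Figures reported in the hands-on test above.
mid_range_rate = words_per_second(500, 2.8)
logic_speedup = speedup(3.3, 1.6)

print(f"Mid-range generation rate: {mid_range_rate:.2f} words/s")
print(f"Flagship speedup on the logic puzzle: {logic_speedup:.2f}x")
```

At roughly three words per second, a mid-range phone generates text about as fast as a person reads it, which leaves no headroom for longer answers or multi-step reasoning.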

This is the harsh reality of the industry: Without powerful hardware computing power, on-device large models are nothing more than marketing gimmicks that torture users.

To address the bottleneck of local RAM and bandwidth overload, manufacturers are modifying algorithms at the foundational level.

For instance, Xiaomi specializes in FP4 (4-bit floating-point) quantization technology, maximizing native inference accuracy while drastically compressing model size, achieving generation speeds of up to 1,000 tokens/s on general-purpose GPUs.

Infinix takes a practical approach, compressing offline models into phones to enable real-time translation of complex dialects in regions with poor network connectivity like Africa and the Middle East—effectively eliminating the digital divide through on-device AI.

AI smartphones in 2026 are essentially vying for operating system entry points.

Manufacturers collectively suffer from entry point anxiety, cramming AI into power buttons, negative screens, and sidebars—some even testing physical AI buttons.

But the more entry points pile up, the more confused users become. Truly practical Agents should enable phones to complete operations autonomously, reducing user steps.

Historically, the industry's biggest headache has been app silos. Phone assistants trying to send a WeChat message had to rely on brute-force screen reading and simulated clicks, often getting stuck on risk controls—as seen with last year's Doubao phone.

Recently, WeChat finally opened a crack in its door, launching the A2A (Agent-to-Agent) protocol with Huawei, Honor, Xiaomi, and other major manufacturers. Large models no longer pretend to misunderstand—phone assistants now send work orders directly to WeChat's Agent, which executes and returns results.

Lei Technology's (ID: leitech) hands-on testing with the Honor Magic8 RS revealed that after activating YOYO and saying, "Tell Sandwich to launch Genshin Impact," the system could carry out the command across ecosystem barriers from that single sentence.

Without A2A integration, phone assistants could only go as far as opening WeChat before being blocked by system pop-ups.
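The difference between simulated clicks and a work order is easiest to see as a message exchange. The sketch below is a toy model of that hand-off and not the real A2A protocol: the field names, the `send_message` intent, and both agent functions are assumptions made for illustration, since the article does not publish the protocol's schema.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class WorkOrder:
    sender: str
    recipient: str
    intent: str      # hypothetical intent name
    params: dict

@dataclass
class WorkResult:
    order_intent: str
    status: str
    detail: str

def messaging_agent(raw: str) -> str:
    """Stand-in for the receiving app's agent: parse the order, act, reply."""
    order = WorkOrder(**json.loads(raw))
    if order.intent != "send_message":
        result = WorkResult(order.intent, "rejected", "unsupported intent")
    else:
        detail = f"sent to {order.params['contact']}: {order.params['text']}"
        result = WorkResult(order.intent, "done", detail)
    return json.dumps(asdict(result))

def assistant_delegate(contact: str, text: str) -> WorkResult:
    """Phone assistant side: package the request and hand it over as data."""
    order = WorkOrder("phone_assistant", "messaging_agent", "send_message",
                      {"contact": contact, "text": text})
    reply = messaging_agent(json.dumps(asdict(order)))
    return WorkResult(**json.loads(reply))

result = assistant_delegate("Sandwich", "launch Genshin Impact")
print(result.status, "-", result.detail)
```

The assistant never touches the other app's interface here. It passes structured data and gets a structured result back, which is why this route avoids the risk controls that block screen-reading approaches.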

WeChat's openness provides the industry with a blueprint for efficient collaboration between large manufacturers' agents without competing for traffic.

At MWC 2026, we also saw excellent examples, such as the Nubia M153, which uses on-device Nebula-GUI to run virtual machines in the background, simulating human finger movements to operate apps without APIs—enabling one-sentence cross-platform price comparisons and bookings.

After examining the true strategies of major manufacturers in 2026, it's clear that despite executive rhetoric, everyone is converging toward the same direction at the implementation level:

Models must compress toward on-device deployment; otherwise, low-end computing power will only torture users;

AI must evolve into multimodal Agents—from AI smartphones to Agent smartphones, future devices must grow "eyes and hands" to break down app barriers;

Privacy and security must remain an unshakable foundation.

The real war for AI smartphones isn't about who has the longer model name—it's about how much time they save ordinary people each day. After all, phones aren't thesis defense arenas; in the end, what matters isn't the parameter sheet but those few minutes spent holding the device daily.

Source: Leikeji

The images in this article are from 123RF royalty-free image library.