02/07 2026
By Bai Jiajia
In the fierce AI competition among smartphone brands, vivo is emerging as the 'pawn crossing the river.'
In Chinese chess, the pawn is known for its steadfastness, initially limited to moving forward. However, once it crosses the river, it gains the ability to engage in combat on both sides, becoming, in the eyes of chess masters, as formidable as the most powerful rook.
Late last year, Jiemian News reported that hardware manufacturers, including vivo, Tecno, and Lenovo, were advancing their AI smartphone collaboration plans with ByteDance. Several vivo employees confirmed that the company had indeed established cooperation intentions with ByteDance, though specific plans were still under discussion. However, when Jiemian News sought confirmation from vivo, no response was received.
What makes this news particularly noteworthy is vivo's involvement.
Judging from the previous experience with the 'Doubao smartphone,' the AI smartphone route it represents clearly pushes some sensitive boundaries within the smartphone ecosystem. While the motivations of hardware manufacturers like Nubia and Lenovo are understandable—they may hope to disrupt the market and secure a foothold—vivo's presence seems somewhat out of place.
vivo is firmly entrenched among the top five smartphone brands in the Chinese market and even claimed the top spot in domestic shipments in the third quarter of 2025, before news of the collaboration emerged.
Clearly, it doesn't need a dramatic comeback.
So, what drives vivo to abandon its past 'low-key, duty-bound' approach and, like an unstoppable pawn crossing the river, stand at the forefront of the AI transformation?
Perhaps its corporate DNA destines vivo to lead the charge in the AI era's transformation. Once it overcomes current obstacles, it will occupy a more pivotal ecological niche in the industry chain.
1. Vivo's Transformation into a 'Pawn Crossing the River' is Inevitable
The root cause of vivo's transformation lies in its 'user-driven' corporate DNA.
At first glance, this may sound like corporate jargon. After all, which company doesn't claim to be 'user-centric'? The key lies in 'driven.'
Contrast Xiaomi's and vivo's AI strategies. Xiaomi is committed to building a comprehensive ecosystem encompassing humans, vehicles, and homes, so its AI strategy serves this overarching goal. While vivo also extends beyond smartphones, all its efforts revolve around users' wearable devices. In other words, vivo has a stronger drive to understand users and excel in interaction.
This value was exemplified at vivo's 30th-anniversary celebration last September.
At the time, vivo's founder, president, and CEO, Shen Wei, reminded his colleagues of 'what duty means.' He referred to duty as the 'underlying logic of all vivo's strategies' and summarized 15 principles akin to a 'manifesto.'
One of these principles states: Duty is a user-oriented approach that ensures technology always serves people.

The question that follows is: why does a user-driven approach make vivo the 'pawn crossing the river'?
To understand this, we must return to the two paths of AI application in smartphones: cloud-based and on-device. Cloud computing offers greater computational power, while on-device AI is closer to users and provides better confidentiality.
Regarding their practical applications, Zhou Wei, vivo's vice president and head of the vivo AI Global Research Institute, once cited a classic example.
The gist is this: when a user asks an AI about the weather, even the most powerful cloud-based large model needs several seconds, sometimes more than ten, to upload the request and return a result. An on-device model that simply opens the weather app is much faster.
This example reveals that the quality of service experience provided by AI depends not solely on the model's scale and computational power but also on a deep understanding of users and scenarios.
In some scenarios, especially fragmented and privacy-sensitive needs arising from daily user-smartphone interactions, such as searching for information in photo albums or memos, on-device models have a distinct advantage.
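The trade-off between the two paths can be sketched as a simple routing rule. This is only an illustration under stated assumptions: the `Request` fields, the `LOCAL_INTENTS` set, and the `choose_route` function are hypothetical names invented here, not anything vivo has published.

```python
from dataclasses import dataclass

# Hypothetical intent names. The article cites weather lookups and searches
# over photo albums or memos as tasks suited to on-device handling.
LOCAL_INTENTS = {"weather", "photo_search", "memo_search"}


@dataclass
class Request:
    intent: str                 # e.g. "weather" or "essay_writing"
    touches_private_data: bool  # does the task read personal content?
    online: bool                # is a network connection available?


def choose_route(req: Request) -> str:
    """Return "on_device" or "cloud" for a request (illustrative rule only)."""
    # Without a connection, only the on-device model can respond.
    if not req.online:
        return "on_device"
    # Privacy-sensitive or simple local tasks stay on the device.
    if req.touches_private_data or req.intent in LOCAL_INTENTS:
        return "on_device"
    # Everything else is sent to the larger cloud model.
    return "cloud"
```

A real system would likely weigh latency, battery, and model capability as well; the point of the sketch is only that the routing decision depends on the scenario, not on model size alone.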

Clearly, a user-driven approach leads vivo toward on-device models, a path that also favors establishing its differentiation.
Vivo's demand for the cloud is relatively weaker; on-device capabilities will determine its future competitiveness. The simplest logic is that, without an internet connection, on-device AI determines a hardware device's capabilities, thereby affecting the user experience.
It's important to note that all smartphone brands are simultaneously deploying cloud-on-device collaboration. Here, the 'strength' of demand merely corresponds to the relative importance of cloud and on-device AI in each brand's AI strategy.
Starting in early 2025, vivo focused its R&D efforts on on-device models and, at the vivo Developer Conference in October of that year, unveiled its latest achievements.
In addition to releasing an operating system, it introduced the world's first 3B (3 billion-parameter) model specifically designed for on-device Agents and the industry's first on-device model training engine.

Looking back at vivo's collaboration with ByteDance, it becomes evident that the Doubao smartphone's approach of executing cross-app operations through Agents aligns closely with vivo's AI strategy.
In other words, vivo's collaboration with ByteDance merely provides observers with a window into vivo's true identity as a 'pawn crossing the river.' Regardless of whether the collaboration occurs, a user-driven vivo will move in the same direction.
2. How Does Vivo 'Cross the River'?
If vivo is a 'pawn crossing the river,' how exactly does it cross? That is, how does it use on-device AI to achieve a better user experience?
The on-device models and on-device model training engine unveiled at the vivo Developer Conference in October last year provide the answer. Vivo aims to use them to reconstruct human-computer interaction and thereby occupy a unique ecological niche in the overall AI landscape.
Breaking it down:
Agents are the focus of vivo's newly released on-device model. To this end, the vivo team built training data specifically for smartphone UI operations, enabling the on-device model to understand phone interfaces and execute cross-app operations.
For example, a user can have the phone assistant order a cup of coffee, summarize key information from screenshots or pages, or help with photo editing.
Essentially, integrating Agents into smartphones inserts an assistant into the traditional human-computer interaction loop, one tasked with mobilizing the information and software on the phone to complete tasks on the user's behalf. This change manifests primarily in two ways:
First, it lowers the barrier to product usage. Users can shift from manually opening apps one by one to issuing tasks to the phone through natural language. Even elderly individuals who struggle to use smartphones can perform complex operations through Agents.

Second, it reduces the user's burden. For instance, users no longer need to remember where a specific file or photo is stored on their phone; the Agent can locate it. Simple but tedious tasks like price comparisons can also be delegated to the Agent.

Objectively speaking, the idea of reconstructing human-computer interaction through Agents is not unique to vivo. Many smartphone AI assistants also possess the ability to understand phone interfaces and execute cross-app operations.
Vivo's distinctiveness primarily lies in its introduction of the first on-device model training engine.
As the name suggests, the on-device model training engine is primarily used to train on-device models, enabling them to acquire more powerful functions.
In the past, AI learning and evolution could only occur in massive cloud-based computing centers. Improvements arrived on the brand's iteration schedule and were not personalized to individual users.
With the integration of an on-device model training engine into smartphones, each generation of phones gains the ability to grow based on user operation habits. For example, users can upload some photos they have edited, and the phone can learn their unique editing habits, automatically applying the same edits next time.
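The photo-editing example can be pictured with a deliberately simple stand-in. The sketch below is not a training engine; it only averages a user's past adjustments locally to suggest the next edit. The class name, the parameter names, and the averaging rule are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict


class EditingHabitModel:
    """Toy stand-in for on-device personalization: averages past edits."""

    def __init__(self) -> None:
        self._sums: Dict[str, float] = defaultdict(float)
        self._count = 0

    def learn(self, edit: Dict[str, float]) -> None:
        # Each edit maps a hypothetical parameter name (e.g. "brightness")
        # to the adjustment the user applied to one photo.
        for name, value in edit.items():
            self._sums[name] += value
        self._count += 1

    def suggest(self) -> Dict[str, float]:
        # With no history there is nothing to suggest.
        if self._count == 0:
            return {}
        # A parameter missing from an edit counts as zero for that photo.
        return {name: total / self._count for name, total in self._sums.items()}


model = EditingHabitModel()
model.learn({"brightness": 0.25, "contrast": 0.5})
model.learn({"brightness": 0.75, "contrast": 0.0})
suggestion = model.suggest()
```

The data never leaves the object that holds it, which mirrors the article's point that learning happens on the phone rather than in a cloud computing center.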


Taken together, the Agent is like 'hiring a butler' for the phone, while the training engine endows the butler with 'a growing brain.'
From a long-term perspective, the emergence of this 'personal intelligence with growth attributes' and of training engines will have three impacts.
First, it enhances user loyalty to vivo. After all, who would arbitrarily 'fire' an assistant they have grown accustomed to after a long period of use?
Second, users become 'optimizers' for vivo's on-device AI. As users form AI usage patterns based on their genuine needs, vivo can discover more points of differentiation on which to build competitiveness and accelerate its technological flywheel.
Third, and most crucially, vivo's ecological niche within the industry chain will also change.
Take smartphones as an example.
In the past smartphone ecosystem, apps were like buildings erected on land owned by Google and Apple. Which land to enter and which building to visit had nothing to do with smartphone manufacturers.
However, once users form the habit of executing tasks through Agents, they effectively delegate decision-making power regarding 'which app to open, where to make purchases, and even where to view ads' to the Agent to a certain extent.
This role as the gateway between users and apps is an ecological niche that transcends any single piece of hardware, and it serves as a deterministic anchor for vivo's future development.
As Zhou Wei said, 'In the future, with the realization and maturation of artificial intelligence and brain-computer interfaces, we believe smartphones will be replaced by more diverse forms of mobile devices.'
Vivo, which still primarily relies on smartphones, needs this certainty brought by on-device Agents.
3. The 'Pawn Crossing the River' Awaits Greater Consensus
During the post-conference interview at last year's vivo Developer Conference, Zhou Wei discussed with the media the 'information silo' problem faced by on-device AI as it evolves toward greater intelligence.
'Especially when smartphone agents execute tasks, we can only operate functions and applications developed by the manufacturer itself. For example, adjusting smoothness, adjusting brightness, or connecting to Wi-Fi are fine. However, when you want to cross applications, the app developer's security authorization standards and the smartphone manufacturer's have to go through a process of discussion,' Zhou said.
This is a problem faced by all smartphone manufacturers, but for vivo, which focuses on 'on-device models and personal intelligence,' the need to break this impasse may be even stronger. Otherwise, even if the Agent's capabilities are exceptional, its inability to coordinate with apps may make it difficult to deliver a stunning user experience.
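The constraint Zhou describes can be pictured as a permission gate in front of the Agent. The sketch below is a hypothetical illustration: the action names, the `authorized_apps` set, and the function itself are invented here and do not reflect any published vivo or app-developer interface.

```python
from typing import Set

# Hypothetical first-party functions built by the phone maker itself.
FIRST_PARTY_ACTIONS = {"adjust_brightness", "connect_wifi"}


def agent_may_run(action: str, target_app: str, authorized_apps: Set[str]) -> bool:
    """Decide whether the agent may perform an action (illustrative only)."""
    # Functions developed by the manufacturer need no outside approval.
    if action in FIRST_PARTY_ACTIONS:
        return True
    # Cross-app actions require that the third-party app has granted access.
    return target_app in authorized_apps
```

Under this picture, breaking down 'information silos' amounts to growing the set of apps that grant access, which is why partners with large ecosystems matter.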
Evidence of this need can be seen in the fact that news of a vivo collaboration with Tencent surfaced at the same time as news of its cooperation with ByteDance.
Although the collaboration with Tencent did not materialize, some useful information can be gleaned.
For instance, like ByteDance, Tencent possesses a vast ecosystem, with businesses covering almost every aspect of consumers' lives—meaning it can significantly expand the Agent's capability boundaries—and wields considerable influence, helping to drive more enterprises to participate in breaking down 'information silos.'
However, vivo's proactive moves do not mean the company is anxious.
Zhou Wei also stated in the interview that vivo has the patience and confidence to wait for a moment of 'perfect alignment' between smartphone manufacturers and the internet industry. 'In the coming years as AI technology matures, vivo will actively promote the establishment of industry standards.'

This patience may stem from vivo's shared values with major internet firms regarding protecting user data security.
When elaborating on what constitutes a 'true AI smartphone,' Zhou Wei proposed 'embracing Agents,' followed closely by 'protecting data and privacy security.' He believes, 'When AI stops demanding data, users will truly trust it.' This once again returns to the logic of user-driven behavior.
Overall, vivo is embracing the AI era more proactively.
While some of vivo's actions may appear bold on the surface, digging deeper reveals that they are all driven by users and adhere to the bottom line of information and privacy security, forging ahead like a 'pawn crossing the river' at its own pace and rhythm.
Throughout human history, technological progress has brought short-term change and upheaval but long-term development and empowerment. Amid the complex interplay between the short and long term, there will always be 'pawns crossing the river.' This is the force of gravity in technology and the inevitability of history.