Input Methods: The Rising Star in AI! WeChat, Doubao, and Qianwen Lead the Charge into the Voice Input Era

06/25 2026

The application that aligns most closely with user intent.

Last week, "Dujia" exclusively revealed that Alibaba's Qianwen team is set to launch a mobile input method. Unlike the previous desktop input method component integrated into the Qianwen PC version, this new Qianwen input method will be a standalone mobile application.

Now, all three tech giants—BAT (ByteDance, Alibaba, and Tencent)—are in the fray.

WeChat's input method began integrating AI two years ago and has undergone multiple significant updates centered around AI voice input this year. Doubao input method, which prioritizes voice input, made its mobile debut at the end of last year and introduced a desktop version six months later.

As for Qianwen, after launching an AI voice input method for the Qianwen PC version last month, it's now logical to release a mobile version. Unsurprisingly, the product will also focus on "voice input."

Image source: Qianwen

The significance that internet giants—or rather, AI giants—place on input methods, particularly "voice input," is no secret.

Many may wonder why input methods are worth revisiting when they are already a mature category, covering all essentials—9-key, 26-key, dual spelling, Wubi, handwriting—and veterans like Sogou, iFlytek, and Baidu have been in the game for years. Do big players really need to start from scratch?

The answer is a resounding yes.

Two years ago, Leitech published an article titled "All Accessing Large Models, Input Methods Tell New AI Stories." Back then, major input methods were integrating large models, experimenting with various AI features, and cramming in many AI capabilities already found in chatbots, albeit in a somewhat crude manner.

To be honest, most of these features were hastily added out of FOMO (fear of missing out). At least judging from feedback from Leitech editors and friends, users didn't want an input method overloaded with features, whether on desktop or mobile.

Image source: Leitech

But this wave of "voice input" represents a return to basics. Instead of focusing on flashy features, it refocuses on the "input" method and experience, using AI to revolutionize input methods once again.

AI voice input surges ahead—the era of typing with your voice has arrived

I must admit that a year ago, I rarely used voice input.

It's not that I didn't want to—after all, speaking is a far more relaxed input method than typing, especially on a phone. The root issue was recognition accuracy. A sentence would often have several wrong words; with non-standard Mandarin, proper nouns, or mixed Chinese-English, results would frequently go astray.

The result? I'd try to save a few keystrokes but end up staring at the screen, checking sentence by sentence, and moving the cursor to correct errors. The effort saved by speaking was offset by manual corrections. This experience was particularly discouraging.

In short, if voice input made too many mistakes, users would revert to the keyboard. Typing might be slower, but at least the results were relatively predictable.

But things have changed now. On one hand, AI technology—or more precisely, advances in speech recognition and language models—has made a significant difference. Many Doubao app users would have noticed long ago that voice input can now fully meet the input requirements for AI interactions.

Not just Doubao input method. In fact, my go-to mobile/desktop input method is still WeChat's input method (more on that later). The key point is that since the 3.0.0 update for iOS/Android at the end of last year, WeChat input method has been optimizing and iterating around "voice input"—upgrading the voice input large model, improving recognition capabilities, and refining the voice input experience.

In the latest update, the WeChat input method's all-platform version once again upgraded its voice input large model and added features like automatically removing filler words, smart punctuation/paragraphing, etc.

Image source: WeChat

Leitech readers may have seen our previous article, "Voice Input Method Showdown: Doubao/Qianwen/Sogou/Typeless—Who is the Ultimate 'Voice Substitute'?" We compared four desktop AI voice input tools: Doubao input method, Qianwen, Sogou input method, and Typeless, so we won't delve into details here.

In testing, Doubao used real-time transcription, essentially outputting text as you spoke, with initially misrecognized content corrected as the speaker continued. Qianwen was slower, often taking 3-4 seconds for short texts and 5-6 seconds for longer ones, but its accuracy, natural sentence breaks, and colloquial refinement were solid.

Regardless, overall voice input accuracy has improved significantly across the board, covering both desktop and mobile.

Image source: Leitech

The results are clear. Over the past six months, I've often used voice input outdoors and at home. From my experience, even with my non-standard Mandarin, most content is accurately recognized. Occasional errors still need correction, but the frequency is low enough not to disrupt my flow.

To sum up, large models have filled a crucial gap. Older voice input was like a transcriber, aiming to convert sound to text. Now, AI input methods begin to understand whole sentences. They correct homophones based on context, auto-add punctuation/paragraphs, remove filler words like "um," "ah," "well," and handle repetitions and self-corrections. What users say is a raw idea with quirks; what appears on screen is polished text ready to send.
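To make the "raw idea in, polished text out" step concrete, here is a deliberately naive, rule-based sketch in Python. It is not how WeChat, Doubao, or Qianwen work (those products rely on large models and context); the filler list, the function name `clean_transcript`, and the English sample sentence are all assumptions made for illustration.

```python
# Hypothetical filler list for illustration only; ambiguous words such as
# "well" are left out on purpose, because rules cannot tell filler from meaning.
FILLERS = {"um", "uh", "ah", "er"}


def clean_transcript(raw: str) -> str:
    """Drop filler tokens and immediate word repetitions, then tidy the sentence."""
    kept = []
    for token in raw.split():
        bare = token.strip(",.?!").lower()
        if bare in FILLERS:
            continue  # skip fillers such as "um" or "uh"
        if kept and kept[-1].strip(",.?!").lower() == bare:
            continue  # skip a stutter-style repeat such as "I I"
        kept.append(token)

    text = " ".join(kept)
    if text and text[-1] not in ".?!":
        text += "."  # add terminal punctuation if the speaker "left it out"
    return text[:1].upper() + text[1:]


print(clean_transcript("um so I I think we should uh meet tomorrow"))
```

Rules like these break quickly: they would wrongly collapse a legitimate "that that", and they cannot fix homophones or handle a speaker who corrects themselves mid-sentence. That is the gap the context-aware models described above are filling.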

The gap is now clear.

On the other hand, with AI advancements, voice input's advantages are simply too compelling—most notably, reducing input burden.

Think about our daily typing: we need to watch the keyboard, select characters, fix typos, and translate thoughts through fingers onto the screen. Speaking, by contrast, is closer to daily conversation—you think, and you speak. This gap widens rapidly when walking outdoors. I can keep inputting while watching the road, without constantly looking at the screen, and with far less mental stress.

Image source: Leitech

Efficiency advantages have long been validated by research. In 2016, teams from Stanford University and Baidu compared mobile voice and keyboard input under lab conditions. Mandarin voice input reached about 123 words per minute, while Pinyin keyboard input reached about 43 words per minute, which makes voice roughly 2.9 times as fast (123 / 43 ≈ 2.86). Of course, short lab texts don't directly represent real environments like subways, streets, and offices, but they at least indicate voice input's natural ceiling: as long as recognition is accurate enough, speaking is usually much faster than typing on a phone.
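For readers who want to check the arithmetic, the comparison can be worked through in a few lines of Python. The two rates are the figures quoted above; the 100-word message length and the helper names `speedup` and `seconds_to_enter` are assumptions chosen for illustration.

```python
# Rates quoted in the article for the 2016 lab study (words per minute).
VOICE_WPM = 123.0
KEYBOARD_WPM = 43.0


def speedup(fast_wpm: float, slow_wpm: float) -> float:
    """How many times faster the first rate is than the second."""
    return fast_wpm / slow_wpm


def seconds_to_enter(words: float, wpm: float) -> float:
    """Time in seconds needed to enter `words` words at `wpm` words per minute."""
    return words / wpm * 60.0


ratio = speedup(VOICE_WPM, KEYBOARD_WPM)
print(f"voice vs. keyboard: {ratio:.2f}x")

# Hypothetical 100-word message, ignoring any time spent correcting errors.
for label, wpm in (("voice", VOICE_WPM), ("keyboard", KEYBOARD_WPM)):
    print(f"{label}: {seconds_to_enter(100, wpm):.0f} s for 100 words")
```

The sketch ignores correction time, which is exactly the cost that used to cancel out the advantage when recognition was poor.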

Moreover, text produced by voice input is "friendlier" to recipients than a pure voice message. That holds not only for WeChat contacts but also for memos you review later, text fields in apps and websites, and prompts to AI assistants. Notably, relatively complex input often requires us to pause, think, and continue based on context. As an interaction form, pure voice is far less efficient and user-friendly for this than voice-to-text input.

Furthermore, as mentioned earlier, voice input still makes some errors despite the significant improvements, just as typing does, but in both cases the resulting text can be corrected relatively easily.

Capabilities can be many, but interaction must be light

While Doubao is often teased for offering little more than emotional value, its strength in Chinese voice input and output is undeniable. Among the BAT trio, Doubao was also the first to translate this advantage into "voice input" for input methods, launching the Doubao input method in November last year.

However, when it comes to the actual product experience of a "mobile input method," Doubao input method still has plenty of room for improvement as a newcomer.

The gap is especially visible next to a mature product like WeChat input method. Doubao input method has similar basic functions and settings (word association, clipboard, verification code filling, and even dual spelling support) and ships in both desktop and mobile versions, but it lacks cross-device sync (including personal word banks and images) and a device-switching assistant.

Keep in mind that Doubao input method currently has neither a matching code mechanism like WeChat input method's nor an account system like those of other mainstream input methods.

Coupled with WeChat input method's excellent voice input performance, I still primarily use WeChat input method.

AI is not and should not be the only factor in our product choices. By the same token, I hope the upcoming Qianwen input method app will offer a great product experience. On the one hand, Qianwen has already proven its strong voice input capabilities on the desktop. If these capabilities carry over to mobile, Qianwen input method will at least have a solid entry ticket.

But a mobile input method called hundreds of times a day can't rely solely on model prowess. Whether word banks, common phrases, and clipboards sync across phone and computer, how fast voice activation is, whether real-time transcription is supported, how much control users have over long-text organization to avoid AI over-rewriting intent, and whether it works stably in weak or offline conditions—all these greatly affect today's input method experience. As Steve Jobs said, "You've got to start with the customer experience and work backward to the technology." Technology can provide possibilities, but the ultimate user experience still comes down to product design and details.

The AI-ization of input methods in previous years is a case in point. Many products started from technology, easily cramming Q&A, translation, writing, search, and agents into the keyboard. But input methods have a different usage logic than AI chat tools—most users only want to quickly finish a sentence each time they bring up the keyboard. Capabilities can be many, but interaction must be light. Otherwise, "all-in-one" easily becomes bloated.

After all, input methods are fundamental tools, and between merely usable and genuinely good to use lies a lot of design and detail.

Input methods remain the closest entry apps to user intent

Why are ByteDance, Alibaba, and Tencent all getting into input methods?

From a user perspective, it's not hard to understand. For each of us, input methods are simply too close at hand. Whether chatting on WeChat, searching on Taobao, working on DingTalk, browsing, commenting on Xiaohongshu, or writing documents, the input method can appear whenever users need to convey an idea to their phone. It doesn't belong to a single app but spans nearly all of them, and it is one of the most frequently invoked system entry points.

With large models, input methods can work one layer closer to intent: what users want to say, how to say it, and whether the sentence needs translation, refinement, or summarization. Voice as a form of expression gives this entry point even more raw information, such as speaking speed and pauses.

For ByteDance, Doubao input method can bring Doubao's model capabilities beyond the Doubao app into more scenarios like chat, search, and work. For Tencent, WeChat input method connects WeChat's social ecosystem, Sogou input method's accumulated experience, and the Hunyuan model, and it has both an existing user base and the richest Chinese-language communication scenarios. For Alibaba, Qianwen input method has the chance to permeate e-commerce, payments, maps, work, and content creation, transforming Qianwen from an AI assistant that users must actively open into a readily callable underlying capability.

From my own use, voice input has moved past the "occasional emergency" stage, but it still can't replace all keyboard scenarios—it's inconvenient to speak in offices, and passwords and precise edits are still better done manually. When walking, replying quickly, or organizing thoughts, though, I'm increasingly unwilling to "type the old-fashioned way."


Source: Leitech

All images in this article come from: 123RF Royalty-Free Library
