Kimi brings its AI to a browser extension: more convenient than the web version, but limited in function

07/10 2024

Kimi has explored a new path.

Kimi Smart Assistant, from the startup Moonshot AI (whose Chinese name translates as "Dark Side of the Moon"), may be the biggest "dark horse" in China's large-model market. Since around this year's Spring Festival it has risen rapidly, becoming a focus of discussion in both the secondary market and AI circles.

Even in June, Kimi's website traffic still far surpassed that of other AI chatbots and AI search engines in AIGCRank statistics, including Baidu ERNIE Bot and Metaso AI Search, ranking first in China.

More importantly, with its advantages in long-text processing, product-level experimentation, and free usage, Kimi has gained user recognition and favor:

From the 80-episode script of "Empresses in the Palace" to a distilled summary of the 900,000-word original "The Three-Body Problem," from links to files in various formats, Kimi can summarize and answer quickly. More and more users rely on it for work, study, and entertainment, and share their experiences on social networks.

In real usage scenarios, however, the web version has limitations as well as advantages, especially for a front-end application that is used frequently. So Kimi has finally launched its official browser extension:

Kimi Browser Assistant.

Screenshot from Chrome Web Store, Image/Leitech

The Kimi Browser Assistant removes many cumbersome steps. Users no longer need to jump to the Kimi homepage to ask questions, and they can select text directly on the webpage they are browsing and ask Kimi to explain or expand on it.

This is not the first Kimi browser extension. Developers had already built third-party extensions on top of the Kimi web version, such as Kimi Reading Assistant. Kimi itself, however, had not released one, leading some to believe the company was not interested in browser extensions or had abandoned plans to develop one.

After trying the Kimi Browser Assistant, honestly, there weren't many surprises. I never expected it to offer the full capabilities of the web version, but as a browser assistant, even compared to third-party extensions, the Kimi Browser Assistant still has much room for improvement.

Maximizing Kimi, starting with the browser assistant

First, it should be noted that the Kimi Browser Assistant currently supports only browsers based on the Chromium engine, such as Google Chrome. In other words, browsers that use other engines, such as Safari and Firefox, are not supported. (An aside: Chrome is to Chromium what Android is to AOSP.)

But considering that most browsers are developed based on the Chromium engine, most Kimi web version users can still install and use it by finding the "Browser Assistant" in the sidebar on the Kimi homepage and following the installation guide.

Image/Kimi

Additionally, from the introduction page, one can see the core functions of the Kimi Browser Assistant—selecting text for explanation, summarizing articles, and sidebar mode. Let's talk about sidebar mode separately. In fact, many ChatGPT-related plugins have already adopted this interactive design, including Microsoft's Copilot, which can even achieve system-level sidebar interaction.

Because it can run parallel to the webpage being browsed, the sidebar conversation mode has become a standard feature of AI chatbot browser extensions.

However, Kimi may have a different view on sidebar mode. In terms of application scenarios, the Kimi Browser Assistant emphasizes the use of sidebar mode for continuous conversation and search during writing.

Image/Kimi

In other scenarios, Kimi favors another mode.

The plugin settings show that the Kimi Browser Assistant enables "Show Kimi Button After Selecting Text" and "Kimi Floating Button" by default, and that the window display defaults to "Global Floating Window" rather than "Sidebar." This indirectly reveals the design preferences behind the Kimi Browser Assistant.

Kimi Browser Assistant settings interface, Image/Leitech

But how well these functions and interactive designs work in practice is the most crucial aspect.

Does the browser assistant make Kimi better to use?

The Kimi Browser Assistant is simple to use. You can directly use it as a Kimi conversation launcher by pressing a shortcut key or clicking the floating button in the lower-right corner of the browser to bring up the Kimi conversation window.

Kimi Browser Assistant launch interface, Image/Leitech

The conversation window here is very simple, with the core being the "Input Box" and "Summarize Full Text." Additionally, it allows one-click access to the Kimi homepage and shows the shortcut keys for bringing up the conversation window.

Then, you can ask Kimi various questions, such as why Kimi launched the Kimi Browser Assistant plugin, or even invoke various agents launched by Kimi. However, unlike the full conversation window on the web version, you cannot upload various files here and must return to the official website homepage for processing.

Nevertheless, given its product positioning, the Kimi Browser Assistant is more than a "launcher," and it does not need to be "omnipotent."

In fact, the core of the Kimi Browser Assistant lies in its role as a "browsing assistant," based on the webpage the user is browsing. For example, when reading news about the recent widely discussed chaos in tanker transportation, you can bring up the Kimi conversation window to summarize the article content with one click.

Image/Leitech

Of course, users can also select individual words or phrases they don't understand, such as "coal-to-liquids," and click the Kimi button that appears. Kimi will then explain this concept, which is not familiar to the general public, in context.

Image/Leitech

After the explanation, the selected text will be underlined. Simply move the cursor to the underlined part, and the previous conversation with Kimi will pop up.
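
How Kimi gathers that context is not documented. One plausible approach is to send the selected phrase together with a window of the text around it. The TypeScript sketch below illustrates the idea only: the function name `surroundingContext` and the default window of 200 characters are assumptions, not anything taken from the Kimi extension.

```typescript
// Hypothetical helper, not taken from the Kimi Browser Assistant.
// Returns the selected phrase plus up to `radius` characters on each side,
// so a model can explain the phrase in the context of the article.
function surroundingContext(
  pageText: string,
  selected: string,
  radius = 200, // assumed window size
): string {
  if (selected === "") return "";
  const index = pageText.indexOf(selected);
  // If the selection cannot be located in the page text, fall back to it alone.
  if (index === -1) return selected;
  const start = Math.max(0, index - radius);
  const end = Math.min(pageText.length, index + selected.length + radius);
  return pageText.slice(start, end);
}
```

With a small radius, selecting "coal-to-liquids" would return that phrase along with a few neighboring words, which is often enough for a model to work out which sense of a term the article intends.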

As an editor at Leitech, I often need to understand a lot of previously unknown information, which often takes a lot of time, especially when dealing with foreign language materials.

For example, a former AMD employee recently shared his experiences of working at AMD on X (formerly Twitter), mentioning AMD's near-merger with NVIDIA. Once I found the original source, I could first bring up the Kimi conversation window through the plugin and ask Kimi to "summarize the full text," getting a Chinese summary of the dozen or so English tweets:

This directly saves time and improves efficiency by eliminating translation, reading, and sorting steps.

After confirming that he did mention AMD's near-merger with NVIDIA, I can ask Kimi to elaborate on that part for a more detailed understanding.

Image/Leitech

From the answer, Kimi indeed provided a satisfactory response, not only comprehensively covering the information shared by the former AMD employee but also organizing it into six sections, such as "AMD's Acquisition Attempts" and "Market Position and Strategy," making it easier to understand the background of the story.

You can also skip the full-text summary and ask Kimi this kind of question directly.

However, the Kimi Browser Assistant's help is limited to the webpage being browsed. Take the tanker report mentioned earlier: when asked questions not covered in the article, Kimi starts giving irrelevant answers:

Image/Leitech

Not even bothering to make up an answer.

In contrast, if you paste a link into Kimi's web version and ask a question, you at least get a relevant response, whatever its quality. For now, the Kimi Browser Assistant appears to restrict the large model's "information source" to the webpage being browsed.

Image/Leitech

However, this strategy of the Kimi Browser Assistant does not align with users' actual needs. When we encounter questions while browsing webpages, the answers we need are unlikely to be fully covered by a single article and often require tapping into the broader "knowledge base" and "internet connectivity" of large models.

In contrast, if the goal is to efficiently extract information from long PDF files or even sets of papers, restricting the "information source" of large models to uploaded files may be more appropriate.
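
To make the difference concrete, here is a minimal TypeScript sketch of how an assistant could restrict, or not restrict, a model's information source when building a request. It is an illustration only: the function name `buildMessages`, the `SourceMode` values, the instruction wording, and the 8,000-character budget are all assumptions, not Kimi's actual implementation or API.

```typescript
// Hypothetical sketch; none of these names come from Kimi's extension or API.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// "page-only" mirrors the behavior observed in the browser assistant;
// "page-plus-general" mirrors what the web version appears to allow.
type SourceMode = "page-only" | "page-plus-general";

function buildMessages(
  pageText: string,
  question: string,
  mode: SourceMode,
  maxPageChars = 8000, // assumed context budget
): ChatMessage[] {
  // Truncate the page so the request stays within the assumed budget.
  const excerpt = pageText.slice(0, maxPageChars);
  const instruction =
    mode === "page-only"
      ? "Answer using only the page content below. If the page does not cover the question, say so."
      : "Use the page content below as primary context, but draw on general knowledge or web search when the page does not cover the question.";
  return [
    { role: "system", content: `${instruction}\n\n--- PAGE ---\n${excerpt}` },
    { role: "user", content: question },
  ];
}
```

Under these assumptions, the only thing that changes between the two modes is the instruction text. The "page-only" mode suits long PDFs or sets of papers, while the broader mode suits questions that arise while browsing.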

Furthermore, there are still many areas for improvement in the Kimi Browser Assistant.

For example, as mentioned earlier, the conversation window of the Kimi Browser Assistant is less capable than the web version, and I repeatedly wanted to switch to the web version while using it. But once a conversation has started, the initial interface with the shortcut to the Kimi website is no longer visible, and there is no option to open the conversation in the web version.

Additionally, if you request explanations for several selections on the same webpage, each one creates a separate conversation, and these end up scattered throughout the article. Even clicking the "Kimi Floating Button" in the lower-right corner of the browser does not bring up a list of previous conversations for quick review.

In summary, compared to third-party Kimi browser extensions, the official Kimi Browser Assistant has a more differentiated positioning, with its core functions being quick article summarization and simple explanation of unfamiliar concepts and words in the article. It does not encourage users to engage in multi-round conversations to gain in-depth understanding and research on a topic.

Final Thoughts

Regular users of AI tools will have noticed that the product logic has changed in this wave of AI. The "App first" approach of the mobile era has become "Web first" in the AI era, from ChatGPT to Google Gemini, from Baidu ERNIE Bot to Kimi and Alibaba's Tongyi:

None are exceptions.

The reason is not difficult to understand. For AI chatbots, daily office work and study are currently the core usage scenarios, and from this perspective the PC platform is undoubtedly more important than mobile. On the PC, meanwhile, the browser is the most important piece of software, and browsing webpages is one of the main needs of PC users.

Browser extensions have therefore become one of the key ways for AI chatbots to improve the user experience and increase usage frequency. The Kimi Browser Assistant is meant to meet this need, and its core purpose is to further simplify how users process and acquire information while browsing webpages.

Third-party Kimi browser extension, Image/Leitech

Given that ChatGPT has yet to launch an official browser extension, AI chatbot browser extensions are still in their infancy, with neither third-party nor official parties having developed a universally recognized and effective interactive design.

As for Kimi's attempt, although it didn't bring many surprises, the "lightweight usage" product positioning of the Kimi Browser Assistant is still a means for many Kimi users to improve their daily usage experience.


    Source: Leitech
