From Search to Agent: Jing Kun Brings Genspark Back to Microsoft's Ecosystem

05/09 2026 500

After the Manus incident, the most frequently discussed question in Silicon Valley's AI venture capital circle became: Who will be the next 'lucky one'? One answer points to Genspark.

Agent entry products have historically been scarce, and Genspark has been pushed to the forefront because it provides rare hard data in the commercialization dimension of the AI application layer. On May 19 last year, Jing Kun shared on X that the product achieved an ARR of $36 million 45 days after launch, with paid user retention rates stabilizing between 88% and 92% in the first month. This shocked overseas companies, which inquired about the company behind Genspark. In March this year, Genspark revealed that its ARR had reached $200 million.

These figures are outliers among current AI application layer startups. Without heavy subsidies, Genspark can maintain operations without relying entirely on a continuous influx of new users. For any potential acquirer, this 'capital resilience' is more convincing than impressive monthly active user numbers.

Behind Genspark stands a founder pairing that is rare in the current AI industry. Jing Kun's career spans search, voice AI, and hardware ecosystems: he led Bing search R&D in Asia at Microsoft and was one of the early creators of Microsoft Xiaoice; after joining Baidu, he spearheaded the mass production of the XiaoDu smart speaker and served as CEO of XiaoDu Technology. Co-founder Zhu Kaihua was formerly a principal architect at Google and later CTO of XiaoDu. The two worked together at Baidu for 11 years and are now collaborating on their second startup.

The significance of this background combination goes beyond credentials. Jing Kun's core capability during his time at XiaoDu was rapidly packaging model capabilities, while the technology was still immature, into product forms that users were willing to keep paying for, and pushing them into large-scale commercial scenarios. In the Agent track, this capability holds far more direct value than a purely technical background.

Lei Feng Network previously reported that the Genspark team deliberately positioned itself close to Google in business and PR to stimulate Microsoft's acquisition appetite. This operating logic is not complex: leveraging the competitive anxieties of the two giants over AI entry points to anchor its valuation at a higher starting point.

However, judging by the substance of the April 29 cooperation announcement, the relationship between Microsoft and Genspark clearly goes beyond ordinary ecosystem collaboration. Jing Kun's strategic value to his former employer is shifting from a bargaining chip in competitive bidding to deep product-level integration. After two decades away from Microsoft, passing through Baidu and XiaoDu, Jing Kun has ultimately brought his product back into his former employer's ecosystem. But rather than an accidentally formed acquisition story, this appears more like a strategic exit Genspark reserved for itself from the beginning.

A Hard Business in the AI Application Layer

Genspark's appeal comes from its rare combination of commercial data in the AI application layer.

The financing timeline itself reflects market judgment. Genspark completed a $275 million Series B round in November last year at a post-money valuation of $1.25 billion; in March this year, the Series B round expanded to $385 million, with the valuation rising to $1.6 billion; ARR surpassed $200 million in the same month. At OpenAI's developer conference last October, Genspark was included in the 'Trillion Token Club,' meaning it is one of the few third-party clients that consume compute at massive scale on a stable basis and show high-frequency repeat usage.

In the AI application layer, most products face dilemmas: either high traffic but low conversion rates, or few paying users but high ARPU (Average Revenue Per User). Genspark combines high paid retention, strong unit economics, and low capital consumption—all simultaneously—making it an outlier among current AI startups.

Specific product architecture reasons underlie this. Genspark's core architecture is MoA (Mixture of Agents), simultaneously invoking over 70 specialized large models to complete tasks through two key mechanisms: model orchestration and improvement loops via the Agent Engine. CTO Zhu Kaihua summarizes the team's philosophy as 'less control, more tools'—not pursuing centralized intelligence but letting the system find optimal solutions itself.

The commercial significance of this architectural choice lies in scalability. Products with single-model architectures have their capability ceilings determined by the underlying base model, whereas the MoA architecture can continuously incorporate stronger specialized capabilities as new models emerge. At dawn on May 8, OpenAI released GPT-Realtime-2, claiming it brings GPT-5-level reasoning capabilities to voice agents; hours later, Genspark announced its Call for Me Agent had integrated this model, with Genspark Realtime Voice set to upgrade accordingly.

Every time model companies advance specialized capabilities, Genspark can translate these into perceptible functional improvements for users without rewriting its product foundation. For enterprise users, this means their paid value won't rapidly depreciate with model iterations—the technical explanation for Genspark's consistently higher retention rates compared to similar products.

Product positioning is equally clear. Genspark focuses on three high-frequency enterprise scenarios: office automation, data analysis, and document generation. Starting from AI search, it completed its transition to a Super Agent by 2024, maintaining an almost weekly update cadence. Lei Feng Network reported that Genspark earns approximately $0.80 per monthly active user, with a churn rate only one-third that of Manus.

Jing Kun's personal background is a variable that cannot be ignored in this commercial logic. While leading Bing search R&D in Asia at Microsoft, he worked on matching search algorithms to user intent; while spearheading XiaoDu smart speakers, he faced the challenge of bringing a technically immature voice AI system into millions of households while cultivating stable usage habits. The common theme of these two experiences was rapidly packaging capabilities into product forms users would accept, before the technology was fully ready, and driving commercialization.

XiaoDu's later valuation by capital markets, approaching $5 billion, was determined not just by voice interaction itself but also by Jing Kun's ability to compress technology, hardware, content, and channels into a scalable consumer product through the smart speaker, which served as the entry point for that phase.

In contrast, today's XiaoDu is still searching for its next definitive entry point. Products like the 'Bestie Machine' and AI glasses may generate buzz, but the former addresses a demand that is largely presupposed by its scenario, and the latter follows an industry trend. Neither has yet proven it can rewrite user habits the way the smart speaker once did.

The transition from search to Agent reflects Jing Kun's proactive judgment about the next critical battleground in the AI application layer. In an internal sharing session, he said a 50-person company where every employee can make unlimited use of AI could already outperform a 500-person company without such capabilities. This statement describes not just Genspark's product philosophy but also his judgment about the value density of the Agent track: AI's value lies in enabling companies to handle, with fewer people, workloads that previously only larger organizations could absorb.

Genspark's team structure, a hybrid of Chinese core members and Silicon Valley-based personnel, combines the engineering density of China's internet giants with the external narrative capabilities of Silicon Valley product companies. This allows it to ship complex features in short cycles while engaging smoothly with international institutional investors and media. Such dual-track capabilities are uncommon among current AI startups and form one of the foundational conditions for its participation in acquisition negotiations as a 'pure Silicon Valley company.'

Delivery as Moat, but Also Ceiling

Genspark's product value is real, but a quantifiable gap remains between its current delivery quality and its self-positioning as an 'AI employee.'

From practical testing, Genspark's core advantage lies in reducing friction costs from vague demands to structured first drafts. Users propose a complex task—such as compiling an industry competitor report, generating an analysis table with data visualization, or converting a text meeting record into a PPT. Genspark can invoke top models (including Opus 4.7) and various built-in tools to complete the task across steps, ultimately outputting a directly usable document draft. Users no longer need to separately open search engines, data tools, and document software to piece together information.

However, when task complexity exceeds a certain threshold, the system's limitations become apparent. Xinlishang's practical testing revealed that Genspark's response time significantly lengthens when handling multi-variable filtering tasks (e.g., filtering personnel lists by specific conditions and sorting by follower count).

The reason lies in the structural trade-offs of the MoA architecture itself, where multi-model scheduling incurs latency costs: each layer of task decomposition, model selection, and result integration requires additional scheduling time. Multi-model scheduling brings stronger specialized capabilities and better result quality at the cost of perceptible delays—users feel they are 'waiting for AI' rather than 'collaborating with AI.'
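The latency trade-off described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Genspark's actual scheduler: the `Layer` type, the function names, and every number are hypothetical. It assumes that models within one layer run in parallel (so the slowest one sets the pace), while layers run one after another and each adds its own scheduling and integration overhead.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    """One orchestration layer (hypothetical structure, illustrative only)."""
    scheduling_s: float             # assumed overhead: task decomposition + model selection
    model_latencies_s: List[float]  # assumed latencies of models called in parallel
    integration_s: float            # assumed overhead: merging the models' results


def layer_latency(layer: Layer) -> float:
    # Parallel calls finish when the slowest model finishes,
    # then scheduling and integration overhead are added on top.
    return layer.scheduling_s + max(layer.model_latencies_s) + layer.integration_s


def pipeline_latency(layers: List[Layer]) -> float:
    # Layers are assumed to run sequentially, so their latencies add up.
    return sum(layer_latency(layer) for layer in layers)


# A single-model product: one call, no orchestration overhead.
single_model = [Layer(0.0, [4.0], 0.0)]

# A two-layer multi-model pipeline with made-up numbers.
multi_model = [
    Layer(0.5, [4.0, 3.0, 5.0], 0.5),
    Layer(0.5, [2.0, 6.0], 0.5),
]

print(pipeline_latency(single_model))  # 4.0
print(pipeline_latency(multi_model))   # 13.0
```

Under these assumptions, parallelism inside a layer does not remove the cost: every extra layer adds its slowest model plus fixed overhead, which is consistent with the 'waiting for AI' experience reported in testing.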

Formatting and structural issues are also real. Media previously reported that Genspark-generated content tends toward standardized layouts, with fixed title positions and template-following structure, which makes it hard to customize the layout while keeping the AI-generated content. Technically, this problem stems from models' tendency, when generating structured documents, to choose layout templates that occur frequently in training data: these templates are statistically 'safer' but adapt poorly to individual needs.

Thus, the Langhanwei team previously rated Genspark, based on its testing, at the level of a junior intern. It can build the structural framework and output a roughly usable first draft, but final judgment, aesthetic details, and revisions still require manual intervention.

In comparison, ByteDance's AnyGen demonstrated advantages in delivery quality, UI refinement, and single-use depth during Xinlishang's testing, with more generous free credit allocations. This gap reflects divergent design orientations between the two products: AnyGen focuses on meticulous single-delivery quality; Genspark retains users through broader tool coverage and higher-density iteration rhythms, voluntarily trading quality ceilings for scenario coverage density.

Behind this difference in orientation lie two commercial logics. Products focusing on single-delivery quality suit vertical scenarios with small user bases, high willingness to pay, and highly specialized tasks; products prioritizing coverage density suit horizontal expansion across high-frequency, low-intensity tasks, building user stickiness through tool richness and update frequency. Users stay not because every output is perfect but because new models and tools keep arriving, so the product's overall value keeps accumulating and justifies renewal.

Therefore, this reality of 'strength in delivery, weakness in delivery' does not weaken Genspark's acquisition value but rather reinforces its attractiveness as a technology-complementary asset. What Microsoft and Google need is not an AI system capable of replacing top knowledge workers but a tool that helps ordinary knowledge workers reduce friction between 'having an idea' and 'having a first draft'—a demand density interval Genspark's current capability boundaries precisely cover.

For potential acquirers, a product with perfect delivery capabilities would be less attractive precisely because it could exist independently without needing acquisition. The ideal acquisition target is a company whose commercial framework is already clearly established, product value validated by the market, but still has obvious room for refinement and integration potential. Genspark's current position precisely falls into this gap.

Returning to Microsoft: Perhaps an Already-Written Ending

Headquartered in Palo Alto, California, Genspark's founding team consists of senior experts from Silicon Valley tech giants like Microsoft, Google, and Meta. Besides Jing Kun, who worked at Microsoft, Zhu Kaihua was formerly a principal architect at Google. As a company with the purest Silicon Valley DNA, Genspark faces far less regulatory pressure and public resistance than Manus under current geopolitical contexts. But this is merely a prerequisite for acquisition.

Leveraging competitive anxieties between Microsoft and Google over AI entry points to anchor its valuation at a higher starting point involves straightforward operating logic but requires precise judgment of both sides' decision-making psychology. Google's motivation for interest in Genspark centers on search entry defense. Genspark initially entered the AI search space with its 'Sparkpages' concept, attempting to replace traditional blue links with AI-generated structured information pages, accumulating over 5 million users before pivoting to the Agent track.

Microsoft's logic differs entirely. It faces an execution-layer gap. Copilot is essentially an embedded AI assistant that helps users modify documents, summarize content, and generate drafts but lacks the ability to autonomously execute complex tasks across applications. Users still need to manually switch between tools, transporting AI outputs to the next work stage. Genspark's MoA architecture precisely covers this execution layer currently missing from Copilot.

Xinlishang thus argues that Genspark's proximity to Google serves more to provoke competitors' acquisition willingness, with Microsoft as the party needing persuasion. This was a negotiation with clear targets and tactical maneuvers deliberately designed to create competitive tension, not an equal-opportunity bidding war.

On April 29, Genspark officially announced a global strategic partnership with Microsoft, embedding its AI Slides, Sheets, and Docs Agents as native plugins within Microsoft 365, with full infrastructure deployment on Azure and plans to enter Agent 365 and the Microsoft Marketplace.

Judson Althoff, CEO of Microsoft's commercial business, emphasized in the joint statement the importance of 'feeling AI value within workflows' rather than functional demonstrations. Once these connections are established, bidirectional dependencies emerge: Genspark's product iterations begin revolving around Microsoft's enterprise ecosystem, while Microsoft enterprise clients start using Genspark's capabilities in daily workflows. From a technical integration perspective, this already exhibits characteristics common to deep pre-acquisition bindings.

Jing Kun's Microsoft background is another reason this path is easier to navigate. During his time leading Bing search R&D in Asia at Microsoft, he developed a deep understanding of Microsoft's internal decision-making structures and product philosophies. Microsoft is not an unfamiliar negotiation counterpart to Jing Kun but the starting point of his career path. This cognitive symmetry produces effects in every detail of acquisition negotiations.

However, the greater the strategic fit, the harder product conflicts become to avoid. Genspark is an 'AI+' enterprise with AI at its core, redefining task execution starting points: users begin with AI before tools. Microsoft Copilot's current model resembles '+AI': embedding AI auxiliary capabilities within existing Office workflows, where software tools remain the primary user interaction subjects, and AI serves as an attached capability enhancer. These two product philosophies will directly clash in post-acquisition product roadmap decisions.

Genspark's current product capabilities partially rely on its flexibility in configuring calls to top-tier models from providers such as Anthropic and OpenAI. After entering the Microsoft ecosystem, the extent to which Azure OpenAI services and the Copilot technology stack constrain this flexibility will directly determine whether Genspark can maintain its current capability advantages post-acquisition.

For a startup with 'AI employees' as its core narrative, these are inevitable costs brought about by the acquisition itself. Historically, Silicon Valley's acquisitions of AI-native products have often resulted in two outcomes: either being fully retained for independent operation, as seen in the early stages of Slack's acquisition by Salesforce, or being rapidly absorbed into the parent company's product line and losing their original characteristics.

However, precisely because these costs are so clear, the fact that Jing Kun still chose to integrate Genspark into the Microsoft ecosystem suggests that this may not be a passive choice. If Genspark continues to operate independently, it will need to keep bearing model costs, customer acquisition costs, enterprise sales costs, and competitive pressure from tech giants. Joining Microsoft would let Genspark offload the most challenging aspects of enterprise distribution, infrastructure, and trust barriers to its parent company.

This also makes Genspark's acquisition narrative more intriguing. It is not a standard startup story of growing large first and then being acquired by a tech giant. Instead, it resembles a company that is gradually adjusting itself—from product form, customer scenarios, to ecosystem interfaces—to a position where Microsoft can seamlessly integrate it.

The loss of independence, partial loss of original characteristics, and even the compromise from being AI-Native to aligning with Microsoft's workflow post-acquisition may not necessarily be unforeseen outcomes for Jing Kun. On the contrary, they may have been the intended endpoints of this path from the very beginning.

