06/03 2026
Microsoft is also concerned about being overly reliant on OpenAI.
Overnight, Microsoft has solidified its lineup of self-developed AI models.
Early this morning, at the annual Microsoft Build 2026 conference, CEO Satya Nadella unveiled a plethora of new updates, encompassing the entire MAI series of self-developed large models, the Copilot super app, and the Surface Laptop Ultra, which, in collaboration with NVIDIA, heralds the "new era of personal computing."

(Image credit: Leitech)
If, like me, you've been frustrated by the proliferation of Copilot buttons over the past two years, or if you feel that today's AI PCs are merely traditional computers with a fresh coat of paint, then this launch event does show Microsoft trying something genuinely new.
Whether you're a programmer coding away, an office worker handling documents daily, or an ordinary consumer seeking a reliable computer without much technical know-how, there's something for everyone this time around.
However, amidst the applause, I also sensed a hint of tension—a subtle undercurrent of competition.
Although Microsoft didn't explicitly state it, its actions spoke volumes.
Next, Leitech will walk you through this developer conference to uncover the intriguing new developments, and the potential concerns, it holds.
In recent years, whenever Microsoft AI is mentioned, the immediate reaction is often: Oh, the primary financial supporter of OpenAI, the largest distributor of ChatGPT, and the master rebrander of Copilot.
It may sound harsh, but that's the market's perception.

(Image credit: Leitech)
So, at today's developer conference, Microsoft naturally focused on its self-developed MAI series of large models, boasting a total of seven!
I must admit, the names sound very Microsoft—so plain that one might suspect the namer just emerged from a three-hour meeting.
Of course, models like MAI-Transcribe, MAI-Voice, and MAI-Image have already begun integration into Microsoft Foundry, and this Build conference has further enhanced their capabilities in image, voice, reasoning, and code.
The most striking among them is MAI-Thinking-1.

(Image credit: Leitech)
It is Microsoft's first flagship reasoning model: a Mixture of Experts (MoE) design with 35 billion active parameters, approximately 1 trillion total parameters, and a 256K context window.
In simpler terms, each inference pass activates only about 35 billion parameters, even though the full model approaches the trillion-parameter scale.
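To put those two numbers side by side, here is a minimal sketch of the arithmetic. It uses only the figures quoted above and treats "approximately 1 trillion" as exactly 1e12, which is a rounding assumption; the function name is purely illustrative.

```python
# Figures quoted in the article; "approximately 1 trillion" is rounded to 1e12 here.
ACTIVE_PARAMS = 35e9
TOTAL_PARAMS = 1e12


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that are active in one pass."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share per pass: {fraction:.1%}")
```

On these rounded figures, only a few percent of the model does work at any one time, which is the main appeal of an MoE design: capacity grows with total size while compute per pass tracks the active size.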
Microsoft's promotion of it is quite intriguing—it doesn't claim world domination but states it performs comparably to top-tier models in key software engineering tests and can rival Sonnet 4.6 in blind tests.
More importantly, Microsoft emphasizes that it was trained from scratch using only clean, licensed data, without distillation from third-party models.

(Image credit: Leitech)
Hmm... I can't shake the feeling that there's a subtle hint of trying to outdo Claude here.
However, it's not fully open yet; it's currently in private preview on Microsoft Foundry and will later be available for public testing on MAI Playground.
The second key highlight is MAI-Code-1-Flash.
This model boasts only 5 billion parameters and is a lightweight, efficiency-oriented coding model, immediately integrated into GitHub Copilot and VS Code upon release.

(Image credit: Leitech)
Microsoft officially states that it was specifically trained for coding in real-world development environments, not just for benchmarks. It keeps reasoning concise for simple tasks and automatically allocates more reasoning resources for complex tasks, claiming to reduce theoretical token consumption by 60% compared to similarly positioned large models.
Needless to say, for a product like Copilot with a vast user base, saving tokens translates to real cost savings.
The third is MAI-Image-2.5.

(Image credit: Leitech)
This one makes a bold claim—it ranks second on the Arena image editing leaderboard and third in text-to-image generation, even surpassing Google's Nano Banana 2.
Microsoft directly provides developer pricing: $5 per million tokens for standard text input, $8 for image input, and $47 for image output.
The Flash version is cheaper: $1.75 each for text and image input, and $19.50 for image output.
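To see what these prices mean for a real bill, here is a rough cost sketch. It assumes all three rates are quoted per million tokens (the article states the unit only once), and the workload sizes are made-up numbers for illustration, not anything Microsoft published.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per million tokens, as quoted in the article."""
    text_in: float
    image_in: float
    image_out: float


STANDARD = Pricing(text_in=5.00, image_in=8.00, image_out=47.00)
FLASH = Pricing(text_in=1.75, image_in=1.75, image_out=19.50)


def job_cost(p: Pricing, text_in: int, image_in: int, image_out: int) -> float:
    """Total cost in USD for the given token counts."""
    total = text_in * p.text_in + image_in * p.image_in + image_out * p.image_out
    return total / 1_000_000


# Hypothetical monthly workload (assumption, for illustration only).
workload = dict(text_in=2_000_000, image_in=1_000_000, image_out=4_000_000)

standard_cost = job_cost(STANDARD, **workload)
flash_cost = job_cost(FLASH, **workload)
print(f"Standard: ${standard_cost:.2f}, Flash: ${flash_cost:.2f}")
```

Because image output is by far the most expensive line item in both tiers, the gap between the two versions is driven mostly by how many images a workload generates rather than by how much it reads.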
Microsoft says it will integrate this into PowerPoint, OneDrive, design tools, and enterprise content production, transforming it into a cost-effective productivity tool.
There's also MAI-Transcribe-1.5 and MAI-Voice-2.

(Image credit: Leitech)
The former supports 43 languages and can transcribe one hour of audio in 15 seconds, while the latter supports 15 languages and can clone voices using 5 to 60 seconds of reference audio.
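The transcription claim is easier to judge as a speed-up over real time. The sketch below uses only the one-hour and 15-second figures above; the eight-hour backlog is a hypothetical example, and it assumes throughput scales linearly with audio length, which the article does not confirm.

```python
AUDIO_SECONDS = 60 * 60      # one hour of audio, per the article
PROCESSING_SECONDS = 15      # claimed transcription time


def realtime_factor(audio_s: float, processing_s: float) -> float:
    """How many seconds of audio are processed per second of compute."""
    return audio_s / processing_s


factor = realtime_factor(AUDIO_SECONDS, PROCESSING_SECONDS)

# Hypothetical backlog of 8 hours of recordings, assuming linear scaling.
backlog_hours = 8
backlog_processing_s = backlog_hours * 3600 / factor

print(f"Speed-up over real time: {factor:.0f}x")
print(f"Time for {backlog_hours} h of audio: {backlog_processing_s:.0f} s")
```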
So, Microsoft's showcase of self-developed models this time isn't about benchmark comparisons but about demonstrating its ability to control more critical links in the chain.
Previously, its AI narrative was: I have OpenAI, so I have AI.
Now, it wants to change it to: I have OpenAI, and I also have my own models. Even if the external environment changes, my business can still thrive.
OpenAI remains crucial, but Microsoft won't let itself rely on a single source indefinitely.
Have you ever used Microsoft Copilot?
Based on my personal experience and feedback from colleagues, fewer than 10% of the people who tried Copilot kept using it afterward, and fewer still use it today.
The reasons? Aside from network barriers and language differences, there are simply too many versions of Copilot.
Windows has Copilot, Office has Copilot, GitHub has Copilot, Edge has Copilot, and Azure has Copilot. No matter when or where you open a Microsoft product, there always seems to be a Copilot lurking in the top-right corner, just waiting to be accidentally clicked.

(Image credit: Leitech)
The problem is, ordinary users have no idea which one to click.
You might think they're the same thing, but their functions differ. You might think they're different, but they all share the same name: Copilot.
It's like going to a restaurant where every dish on the menu is labeled "signature dish." Ask the waiter what the difference is, and they'll say, "This signature dish is signature, and that signature dish is also signature."
So, Microsoft's launch of Microsoft Scout this time is essentially an attempt to tidy up the previous mess.

(Image credit: Leitech)
Microsoft Scout isn't just another chatbox hidden in an app sidebar; it's more like a personal assistant.
Now, users don't need to know which agent handles emails, which handles files, or which handles meetings.
You just say, "Summarize the key points of today's client meeting, write an email to them, and set up follow-up tasks." Scout will then find the meeting in Teams, write the email in Outlook, locate the files in OneDrive, and present the results to you.
According to The Verge's editor, it can even combine traffic conditions and your schedule to remind you when to leave.

(Image credit: The Verge)
It sounds great—and a bit unsettling.
Because Microsoft Scout is built on the open-source OpenClaw framework, and agents that can "see, click, fill, and execute" naturally raise security concerns.
Microsoft is aware of these fears, so it's placing Scout in a sandbox and deploying enterprise security tools like Agent 365, Defender, and Purview. It even specifically demonstrated a scenario where "Scout fails to delete personal files" on stage, hoping to reassure developers and enterprises.
Currently, Microsoft Scout is integrated with Microsoft 365 and can call on apps including Outlook, OneDrive, and Teams.
Want to try it? It's said to be available for ordinary users in the fall, so I guess we have plenty of time to wait...
If the software upgrades were somewhat expected, the hardware updates are truly surprising.
For the past few years, Microsoft's Surface series has been in an awkward position, neglected by both consumers and critics. The Surface Go series has stagnated for years, while the Surface Pro series always lags in specs yet commands a premium price, to the point of being jokingly described as "buying a system and getting a tablet for free."
This time, Microsoft directly unveiled the Surface Laptop Ultra AI PC, which can be described as a "beast."

(Image credit: Leitech)
This machine's specs are clearly aimed at competing with Apple's MacBook Pro.
It's equipped with NVIDIA RTX Spark, up to 20 Arm CPU cores, 6144 Blackwell GPU cores, 128GB of unified memory, and 1 petaflop of AI computing power, making it ideal for developers, creators, and local agents.
Previously, the biggest issue with local AI was that as soon as models got a bit larger, computers would start to struggle.
Now, Microsoft and NVIDIA want to showcase that 120B-scale models can also run locally, without needing to offload everything to the cloud every time.
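Whether a 120B-parameter model fits in 128GB of unified memory depends heavily on numeric precision. The sketch below is a back-of-the-envelope estimate for the weights alone; the 16-, 8-, and 4-bit precisions are assumptions (the article does not say which one Microsoft and NVIDIA used), and it ignores the KV cache, activations, and the operating system's own memory use.

```python
PARAMS = 120e9        # "120B-scale" model from the article
MEMORY_GB = 128       # unified memory quoted for the Surface Laptop Ultra


def weight_gb(params: float, bits_per_weight: int) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


# Assumed precisions; weights only, no KV cache or runtime overhead.
estimates = {bits: weight_gb(PARAMS, bits) for bits in (16, 8, 4)}

for bits, size in estimates.items():
    verdict = "fits" if size <= MEMORY_GB else "does not fit"
    print(f"{bits:>2}-bit weights: {size:.0f} GB ({verdict} in {MEMORY_GB} GB)")
```

Under these assumptions, 16-bit weights would not fit, 8-bit weights would leave almost no headroom, and 4-bit weights would leave roughly half the memory free for everything else, so running a model of this size locally most likely relies on aggressive quantization.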

(Image credit: Leitech)
The Surface RTX Spark Dev Box is even more direct.
It's an even smaller take on the mini PC, with a 100W thermal design and 128GB of unified memory, and it ships with Windows 11 Pro, VS Code, GitHub Copilot, and a developer environment pre-installed.

(Image credit: Leitech)
This thing isn't for ordinary people.
It's more like a signal from Microsoft to developers: Windows on Arm isn't just for lightweight laptops anymore; it can also handle AI development.
To that end, Microsoft also announced optimized Windows 11 experiences for developers, including Coreutils for Windows, WSL containers, Windows Developer Configurations, Intelligent Terminal, and Windows 365 developer configurations.
In short, it's about making Windows more like a development environment that can simultaneously run Windows, Linux, containers, and AI tools.

(Image credit: Leitech)
Beyond that, Microsoft wants agents to live not only on computers and phones but also in new kinds of devices: on office desks, in work badges, and in healthcare, retail, and frontline work environments.
So, at this conference, they officially announced Project Solara, a new brand for agent hardware.

(Image credit: Leitech)
It's a platform for agent devices, built on the Android-based MDEP framework, designed to complement AI PCs running Windows—yes, you read that right, it's built on Android.
Microsoft showcased two conceptual devices: one resembles a small desktop screen, and the other looks like a work badge.

(Image credit: Leitech)
The small desktop screen can activate an Agent via facial recognition and comes with built-in audio capabilities; the work badge features a camera and fingerprint sensor, enabling recording, transcription, and even allowing the Agent to see what you see.

(Image credit: Leitech)
Essentially, the two devices serve as reference templates for two categories: desktop AI hardware and wearable AI hardware.
Microsoft believes that the next-generation platform will shift from apps to agents. Therefore, neither of these products will have traditional applications; instead, they will directly summon intelligent agents for use through natural language, voice, vision, and context.
Hmm... All I can say is that many so-called Agent hardware products today are "futuristic in appearance but lackluster in reality." The biggest issue with Microsoft's AI work badge is not whether it can record, but who is willing to be recorded by it; the biggest issue with the AI desktop screen is not whether it can activate an Agent, but what makes it more valuable than a smartphone, computer, or smart speaker.
If these two questions cannot be answered satisfactorily, then I believe the future of Microsoft's AI hardware ecosystem is truly uncertain.
After watching the entire launch event, my most intuitive feeling is that the personal computer landscape has completely changed.
In the past, people thought that an AI PC would simply have an additional Copilot key on the keyboard or a built-in internet-connected chat window in the system.
But now, Microsoft has proven with its
But for the average person, isn't this a bit over the top?
After all, in their day-to-day computer usage, people may simply be working on spreadsheets, drafting documents, and perhaps indulging in a bit of procrastination now and then. Do they truly require a high-performance computer, capable of running models with hundreds of billions of parameters, to meet these basic needs?
Furthermore, at present, it remains unclear how many of the grand plans Microsoft has outlined will actually materialize, and how many will simply end up as another bug in a system update.
After all, Microsoft, that wily old fox, has a track record of reneging on its promises.
Source: Leitech