NVIDIA RTX Spark: Powerful, but Is It the Right Fit for the Age of Agents?

06/08 2026

Predicting the Needs of 2026 with 2023’s Tech

This week, Microsoft unveiled the Surface Laptop Ultra, a device that breaks away from the traditional Surface mold. It stands as Microsoft's most formidable Surface Laptop to date and is the first Windows laptop to incorporate NVIDIA's RTX Spark.

Yet, it may not be the ideal computer for the Age of Agents.

According to Microsoft, the Surface Laptop Ultra is aimed at creators, developers, and AI enthusiasts, and can handle large 3D scenes, long compilation jobs, and large local models and datasets. NVIDIA positions the RTX Spark as a superchip for Windows PCs, designed specifically for personal AI Agents:

6144 CUDA cores, fifth-generation Tensor Cores, FP4 precision, connected via NVLink-C2C to a 20-core NVIDIA Grace CPU, supporting up to 128GB of unified memory.

As a result, the RTX Spark can run large models with 120 billion parameters, handle up to 1 million token contexts locally, render 3D scenes exceeding 90GB, edit 12K 4:2:2 video, generate 4K AI video, and run AAA games at over 100 FPS in 1440p.

Image source: NVIDIA

It's almost too powerful for a conventional laptop.
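A quick back-of-the-envelope check shows why FP4 and 128GB of unified memory go together in the 120-billion-parameter claim. The sketch below is only arithmetic on raw weight sizes. It assumes a dense model, counts gigabytes as 10^9 bytes, and ignores quantization scale factors, the KV cache, activations, and whatever the operating system keeps for itself; the function names are illustrative, not part of any NVIDIA tool.

```python
# Rough weight footprint of a dense model at several precisions.
# Only the 120B parameter count, FP4, and the 128 GB ceiling come from the article;
# the helper names and the FP16/FP8 comparison rows are illustrative assumptions.

BITS_PER_PARAM = {"FP16": 16, "FP8": 8, "FP4": 4}


def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Return the size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def fits(num_params: float, bits_per_param: int, memory_gb: float) -> bool:
    """True if the raw weights alone fit in the given memory budget."""
    return weight_footprint_gb(num_params, bits_per_param) <= memory_gb


if __name__ == "__main__":
    params = 120e9            # 120 billion parameters
    unified_memory_gb = 128   # upper limit quoted for the RTX Spark
    for name, bits in BITS_PER_PARAM.items():
        size = weight_footprint_gb(params, bits)
        verdict = "fits" if fits(params, bits, unified_memory_gb) else "does not fit"
        print(f"{name}: {size:.0f} GB of weights, {verdict} in {unified_memory_gb} GB")
```

Under these assumptions, the weights drop from about 240 GB at FP16 to about 60 GB at FP4, which is what leaves room for context and everything else in a 128GB machine.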

The RTX Spark's strength lies primarily on the GPU side: CUDA, the Blackwell GPU, unified memory, and local model inference. If the key question for AI PCs is whether they can run larger models locally, then the RTX Spark is indeed a compelling choice. However, Leitech argues that if the question has shifted to whether Agents can carry out tasks for users over the long term, the answer is far less clear-cut.

Agent work involves task decomposition, application invocation, file search, code execution, opening web pages, sandbox maintenance, permission handling, waiting for external responses, and context switching between tasks. These workloads need powerful cloud-based models, and on the device they depend more on the end-side CPU, I/O, system scheduling, security isolation, application interfaces, and cloud-side state than on the GPU.

At the same time, end-side models still fall well short of the model capabilities that Agents require. Even a "beast" like the RTX Spark looks more like a small AI workstation crammed into a laptop than a personal computer redesigned for the Age of Agents.

In other words, it's attempting to answer 2026's Agent questions with 2023's local large model imagination.


From NPU to CPU: NVIDIA's Strengthened PC Presence

NVIDIA didn't suddenly develop an interest in PC processors.

From a short-term business perspective, the company had little incentive to personally develop a PC SoC. It was already well-established in the PC market: its discrete GPUs have long held a dominant position, with gaming, creation, professional graphics, and CUDA ecosystems firmly under its control. Its data center GPUs are in such high demand that customers are more eager than NVIDIA to acquire them.

However, AI PCs have changed both the workload structure of PCs and NVIDIA's position within it. (Note: "AI PCs" here refers mainly to laptops, not DIY desktops.)

In the past, the core workloads of PCs revolved around CPU and graphics performance. NVIDIA only needed to enhance its GPUs and connect them to PCs via PCIe to maintain its top position in the performance hierarchy.

However, in the AI PC era, more workloads are centered around local inference, multimodal understanding, speech and video processing, semantic search, local knowledge bases, and system-level AI functions. NPUs are more aligned with everyday PC form factors than GPUs.

NPUs offer low power draw, always-on availability, quiet operation, long battery life, and system-level access, which is precisely what AI PCs need most in their early stages. Traditional discrete GPUs, by contrast, must shuttle data back and forth between system memory and VRAM, which brings high power consumption, noisy fans, and poor battery life.

For laptops, which are naturally constrained by battery life, heat dissipation, and form factor, the logic of NPUs is highly attractive.

In 2024, Microsoft set a new standard with the AI+ PC (initially called the Copilot+ PC): the NPU must deliver at least 40 TOPS of compute to run next-generation Windows AI features. Qualcomm, AMD, and Intel took turns showcasing their offerings, and PC manufacturers finally had a fresher selling point than CPU core counts, screen quality, and battery life.

Image source: Microsoft

However, if future PC AI workloads are entirely taken over by NPUs, NVIDIA's role in PCs could shift from a platform definer to a high-end graphics accessory.

That's why NVIDIA developed the RTX Spark, bringing CUDA, RTX, TensorRT, OptiX, DLSS, FP4, Blackwell GPU, and unified memory into Windows laptops. This ensures that PCs don't just run isolated AI functions but also support local AI workflows, avoiding a scenario where the AI PC era bypasses GPUs and CUDA.

Appearing remotely at Microsoft Build, NVIDIA CEO Jensen Huang said that three years ago he had discussed a new kind of personal computer with Microsoft CEO Satya Nadella: one suited to designers and creators as well as artificial intelligence, with both local processing power and deep integration with Windows, creative software, and AI software stacks.

Image source: X

The RTX Spark and Surface Laptop Ultra are the results of that conversation. They do answer a question: if the Windows camp wants to build a high-end laptop capable of running local large models, local creation, and local AI development, how aggressive should the hardware be?

But AI has changed significantly in three years.

Three years ago, the industry was still in the phase of imagining local inference that followed the explosion of ChatGPT. Many believed that the key to AI PCs was running models locally. That way, users would not need to send private data to the cloud or pay token bills for each invocation, and they could enjoy lower latency and a more stable AI experience.

This logic is still partially valid today. However, AI in 2026 is no longer just chatbots. With the move from reasoning models to Agents, the demands on context length, reasoning chains, and KV caches have almost all increased, and the capability gap between local devices and cloud models has widened further in many tasks.

Image source: X
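To make the KV-cache pressure concrete, here is a minimal sketch of the standard size estimate for a transformer's key-value cache: two tensors (keys and values) per layer, each holding one vector per KV head per token. The model dimensions used below (64 layers, 8 KV heads, head size 128, FP16 cache) are hypothetical and do not describe any specific model mentioned in this article.

```python
# KV cache size estimate for a decoder-only transformer.
# All model dimensions here are hypothetical, chosen only to illustrate the scaling.


def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """Bytes of cache added per token: keys and values (factor 2) for every layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def kv_cache_gb(tokens: int, layers: int, kv_heads: int, head_dim: int,
                bytes_per_value: int = 2) -> float:
    """Total cache size in decimal gigabytes (1 GB = 1e9 bytes) for one sequence."""
    per_token = kv_cache_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value)
    return tokens * per_token / 1e9


if __name__ == "__main__":
    # Hypothetical model shape; a 2-byte value corresponds to an FP16 cache.
    shape = dict(layers=64, kv_heads=8, head_dim=128, bytes_per_value=2)
    for tokens in (8_000, 128_000, 1_000_000):
        print(f"{tokens:>9,} tokens -> {kv_cache_gb(tokens, **shape):8.1f} GB of KV cache")
```

The cache grows linearly with context length, so with this hypothetical shape a million-token context at FP16 would need more cache than the 128GB discussed above. Long contexts on a single device therefore depend on fewer KV heads, lower-precision caches, or similar techniques, on top of the space taken by the weights.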

In fact, almost all large models trained for Agents today have hardware requirements that far exceed those of the past. Quantized and compressed models cannot meet the demands for smooth Agent operation and user expectations.

In short, at least for now, end-side devices cannot support a good local Agent experience; cloud-based solutions are inevitable. Therefore, for personal PCs, the CPU becomes even more critical.

Can an "Outdated" ARM CPU Adapt to the Age of Agents?

In the Age of Agents, users don't just want an answer; they want the AI to complete tasks.

However, completing tasks is not the same as generating text locally. An Agent performing tasks often needs to access web pages, invoke software, run code, read files, handle permissions, verify results, and continue running in the background. It acts like an operator, not an offline model. The more complex the workflow, the more it relies on CPU, I/O, system scheduling, browser environment, sandboxes, and cloud services working together.
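The shape of such a workload can be sketched as a simple loop: ask a cloud model for a plan, run tools locally, then ask the model to verify the results. The code below is a toy simulation under loud assumptions: `call_cloud_model` and `run_tool` are hypothetical stand-ins, and the sleeps merely imitate network and I/O latency with made-up durations. It is not how any particular Agent product works.

```python
import asyncio
import time

# Hypothetical stand-ins: a real Agent would call a hosted model API and real tools.
# The sleeps only simulate network and I/O latency; the durations are made up.


async def call_cloud_model(prompt: str) -> str:
    """Pretend to ask a cloud model what to do next."""
    await asyncio.sleep(0.05)
    return f"plan for: {prompt}"


async def run_tool(name: str) -> str:
    """Pretend to run a local tool such as a browser, file search, or sandbox."""
    await asyncio.sleep(0.02)
    return f"{name}: ok"


async def run_agent(task: str, tools: list[str]) -> dict:
    """Plan once in the cloud, run each tool locally, then verify in the cloud."""
    start = time.perf_counter()
    waiting = 0.0

    async def timed(coro):
        # Accumulate the wall-clock time spent awaiting remote or I/O work.
        nonlocal waiting
        t0 = time.perf_counter()
        result = await coro
        waiting += time.perf_counter() - t0
        return result

    plan = await timed(call_cloud_model(task))
    results = []
    for tool in tools:
        results.append(await timed(run_tool(tool)))
    verdict = await timed(call_cloud_model(f"verify {results}"))

    total = time.perf_counter() - start
    return {"plan": plan, "results": results, "verdict": verdict,
            "waiting_s": waiting, "total_s": total}


if __name__ == "__main__":
    outcome = asyncio.run(run_agent("summarize unread mail",
                                    ["browser", "file_search", "sandbox"]))
    share = outcome["waiting_s"] / outcome["total_s"]
    print(f"steps: {outcome['results']}")
    print(f"share of wall-clock time spent waiting on cloud or I/O: {share:.0%}")
```

In this simulation nearly all of the elapsed time is spent waiting on the network or on tools, and the device's own job is orchestration, which is the part of the workload that leans on the CPU, I/O, and scheduler rather than on local GPU throughput.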

In contrast, the RTX Spark focuses almost entirely on the GPU and AI side: 1 petaflop of AI performance, 6144 CUDA cores, Blackwell RTX GPU, 128GB of unified memory, local 120B models, and million-token contexts.

For the CPU, NVIDIA chose MediaTek, with a design that uses 10 Cortex-X925 cores and 10 Cortex-A725 cores. These cores are based on Arm IP released two years ago and have been widely used in flagship and sub-flagship smartphone SoCs over the past two years, including the Dimensity 9400, Dimensity 8400, Xiaomi XRING O1, and Exynos 2500.

Meanwhile, last year's Dimensity 9500 already used Arm's latest flagship architecture, C1-Ultra, and SoCs adopting the next-generation C2 core are expected to launch in the coming months.

Image source: Arm

Of course, the RTX Spark has far more CPU cores than a phone chip, but judging by both how it was planned and how it is pitched, the CPU is probably not the focus of NVIDIA's consumer-grade PC SoC.

This is not surprising, as NVIDIA's strongest moat has always been its GPUs and CUDA. However, the Age of Agents will re-elevate the status of CPUs. NVIDIA's Vera CPU for data centers essentially acknowledges that as AI evolves from chatbots to Agents, code execution, data processing, sandbox environments, and task orchestration will become critical paths, and CPUs will no longer just support GPUs.

However, the RTX Spark devotes too much of its budget, power, chip area, and design attention to the GPU and local inference, without truly addressing the issues most critical for Agents: execution, scheduling, long-lived state, and cross-device collaboration.

Meanwhile, Project Solara, unveiled alongside the Surface Laptop Ultra, is Microsoft's alternative answer.

According to Steve Bathiche, head of Microsoft's Applied Sciences Group, Project Solara is a "chip-to-cloud" platform designed for agent-first experiences and new device form factors. It doesn't just bring intelligence to PCs, browsers, or phones but integrates it into workflows, environments, and task sites. Devices are no longer designed around Apps but around Agents.

By the way, Project Solara runs on Android, not Windows.

More importantly, Project Solara's state is not confined to a single device but spans a set of dedicated devices via Azure. Microsoft showcased portable and desktop form factors and explicitly named Qualcomm and MediaTek as the first chip partners.

Project Solara currently includes two devices. Image source: Microsoft

This approach may seem less impressive than the Surface Laptop Ultra and is still in its early stages, but it better matches what Agents actually need. The value of Agents lies not in placing an entire local AI factory at every entry point, but in showing up at the right time, in the right place, and on the right device, then handing tasks off to cloud-side state and backend intelligence so they can keep progressing.

In other words, the Surface Laptop Ultra "enlarges" the PC, while Project Solara "slims" down devices. The Age of Agents may favor the latter.


Final Thoughts

Local computing power is still important.

Privacy-sensitive data, local files, low-latency interactions, offline scenarios, creative assets, and development environments all require sufficient end-side capabilities. For professional users who need local models, local rendering, local video generation, and local CUDA workflows, the Surface Laptop Ultra could be an excellent machine.

However, the Surface Laptop Ultra may not be the best answer for what personal computers should look like in the Age of Agents.

Agents are naturally better suited to a cloud-centric model, with multiple lightweight devices serving as entry points. Phones, PCs, work badges, desktop screens, earphones, and glasses can all act as Agent touchpoints, but they don't all need to become mini AI workstations.

From this perspective, AI PCs like the Surface Laptop Ultra and RTX Spark do seem to be using 2023’s local inference imagination to answer 2026’s Agent questions. They are powerful and important but not the starting point for next-generation Agent devices.



Source: Leitech
