03/05 2025
More than a year has passed since the AI PC concept was introduced, but so far it has largely been much ado about nothing: the market and consumers do not seem to be buying it. Is the AI PC truly "AI"? What is a true AI PC? Let's look at the answers given by the true AI giants.
01
The Rise of the AI PC Concept
AI PC stands for Artificial Intelligence Personal Computer, first proposed by Intel in September 2023. It quickly gained widespread favor within the industry. Although it has not been around for long, it is widely believed that AI PC will be a turning point for the PC industry. Canalys defines AI PC as desktops and laptops equipped with dedicated AI chipsets or modules (such as NPU) for processing AI workloads.
2024 is recognized as the first year of AI PC applications, with various enterprises launching their own AI computers.
In early March, Apple released the AI PC MacBook Air. On March 18, Honor launched its first AI PC, the MagicBook Pro 16. Shortly after, AMD Chairman and CEO Lisa Su announced that AMD's Ryzen 8040 series AI PC processors had begun shipping. On March 22, Microsoft announced the launch of the Surface AI PC. On April 11, Huawei released its new MateBook X Pro laptop, which applies Huawei's Pangu large model for the first time.
To some extent, the PC industry, strongly tied to the AI concept, has indeed shown improvement. In the fourth quarter of 2024, AI PC shipments reached 15.4 million units, accounting for 23% of total quarterly PC shipments. For the full year of 2024, AI PCs accounted for 17% of total PC shipments. Among them, Apple led with a 54% market share, followed by Lenovo and HP, each with 12%. Due to the replacement wave brought about by the end of Windows 10 support, the market penetration rate of AI PCs is expected to continue to increase in 2025. But how much AI content is there really in them?
02
AI PC: Much Ado About Nothing
On February 23, 2024, Lenovo CEO Yang Yuanqing stated after the latest financial report was released that global PC shipments were expected to increase by about 5% year-on-year in 2024. Despite facing some challenges, he firmly believes that artificial intelligence will be a key factor driving Lenovo's business growth and transformation.
However, Yang Yuanqing also pointed out that the current AI PC market is still in its infancy. Despite the loud thunder of marketing, actual sales and user acceptance remain low. He attributes this mainly to factors such as technology maturity, user education, and market acceptance.
Many people do not recognize the AI PC products that have been released. The core issue is that the "AI" and "PC" (hardware) in these AI PCs are basically separate. Taking Microsoft Copilot, the largest AI use case on PCs currently, as an example, in Intel and Microsoft's joint definition of AI PCs, it is emphasized that they must be equipped with hybrid architecture chips, Copilot, and its corresponding physical buttons. However, the fact is that all PCs upgraded to the latest Windows 11 version can use Copilot, as Copilot only relies on Microsoft Azure cloud computing power and has nothing to do with the PC hardware itself.
As the leader in AI chip technology, NVIDIA simply ignores Microsoft's definition. Who could possibly have more say in AI than NVIDIA? NVIDIA has been laying out its ecosystem in the AI field for a long time. Since its establishment in 1993, it has been a pioneer in the field of accelerated computing, with the most extensive CUDA ecosystem for AI productivity. High-performance PCs with NVIDIA graphics cards are less dependent on OEM adaptation and can not only run lightweight AI tools, such as local large language models and simple Stable Diffusion drawing, but can even run medium-scale AI models. The actual generation speed is much faster than using ordinary integrated graphics for AI.
The current market indifference towards AI PCs is mainly due to the following reasons:
1. Insufficient NPU computing power in current AI PCs
Intel NPU's AI performance peaks at 48 TOPS, while Intel Xe integrated graphics is approximately 28 TOPS. The computing power of AI PCs equipped with integrated graphics is currently in the range of 10-45 TOPS, while devices equipped with GeForce RTX 40 series GPUs, including laptops and desktops, offer product solutions with different levels of performance ranging from 200-1400 TOPS.
The RTX 5090 graphics card released this year adopts NVIDIA's Blackwell architecture, which results in a qualitative leap in performance. According to NVIDIA's official introduction, the RTX 5090's AI computing power reaches 4000 TOPS, which is three times that of the previous-generation Ada Lovelace architecture.
Compared to GPUs, NPU's AI computing power is significantly inferior.
In fact, even a single RTX 4080 or 4090 running locally does not have much computing power to spare for common AI applications. It is easy to see, then, that the NPU's computing power is of little practical use.
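The gap between those TOPS figures can be made concrete with a back-of-envelope calculation. The sketch below, which uses the marketing TOPS numbers cited above and a standard rule of thumb (roughly 2 × N operations per generated token for an N-parameter transformer), estimates the compute-bound ceiling on token generation. In practice memory bandwidth, not compute, usually dominates, so these are loose upper bounds only:

```python
# Back-of-envelope, compute-bound upper limit on LLM decode speed.
# TOPS values are the marketing figures cited in the text; real-world
# throughput is usually limited by memory bandwidth instead.

def max_tokens_per_sec(params_billions: float, tops: float) -> float:
    """Upper bound on tokens/s if compute were the only limit.

    A transformer needs roughly 2 * N operations per generated token
    for an N-parameter model (one multiply-accumulate per weight).
    """
    ops_per_token = 2 * params_billions * 1e9
    return (tops * 1e12) / ops_per_token

model_b = 7  # a typical 7B-parameter local model

for name, tops in [("Intel NPU (48 TOPS)", 48),
                   ("Xe iGPU (~28 TOPS)", 28),
                   ("RTX 4090 (~1300 TOPS)", 1300)]:
    limit = max_tokens_per_sec(model_b, tops)
    print(f"{name}: <= {limit:,.0f} tokens/s (compute-bound)")
```

Even under this generous assumption of full utilization, the GPU's raw-compute headroom is more than an order of magnitude above the NPU's.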
2. NPU lacks DRAM and cannot independently support large model operations
Current AI large models are all "DRAM-hungry" in their hardware requirements. An NPU has no dedicated DRAM of its own and relies on system RAM, which means that running a large model requires pairing the NPU with 64GB or more of additional DRAM. If that much must be invested anyway, why not simply use an APU or GPU? Once extra cost is involved, it makes sense to spend it on whatever works best.
Moreover, APU and GPU running AI large models are open-source and well-adapted, providing an out-of-the-box experience.
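Why large models are so "DRAM-hungry" follows directly from the arithmetic: the weights alone scale with parameter count times bytes per parameter. The sketch below estimates the weight footprint of a 70B-parameter model at common precisions (this counts only weights; KV cache and activations add more on top):

```python
# Estimate the memory needed just to hold a model's weights, in GiB.
# Weights only; KV cache and activation memory come on top of this.

def weight_memory_gib(params_billions: float, bits: int) -> float:
    """params * (bits/8) bytes, converted to GiB."""
    return params_billions * 1e9 * bits / 8 / 2**30

for bits, name in [(16, "FP16"), (8, "INT8"), (4, "INT4")]:
    print(f"70B model @ {name}: {weight_memory_gib(70, bits):.1f} GiB")
```

At FP16 a 70B model needs roughly 130 GiB for weights alone, and even an aggressive INT4 quantization needs about 33 GiB, which is why 64GB-class system RAM is the practical floor for serious local deployment.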
3. Limited applications and narrow application scope for NPU adaptation
Theoretically, an NPU can now run LLM large language models, Stable Diffusion image generation, inference for common CV neural networks (including ResNet and YOLO), and Whisper speech-to-text. In principle, any AI inference workload, which at bottom consists of matrix operations, can be executed at low power consumption on an NPU.
However, in reality, on the Windows laptops currently available to users, the only scenarios that actually invoke the NPU are background blurring and cutout in Windows Studio Effects. That application scope is far too narrow, and the number of local programs with NPU support remains very small.
Overall, the functions that NPU can actually use are mostly superficial. This round of AI's true popularity is due to the fact that people have seen chatbots like ChatGPT solving many problems. Therefore, if NPU is truly to be useful, it needs to be able to run LLM large language models, which the current NPU on AI PCs clearly cannot satisfy.
Whether the silicon is an NPU or a GPU does not matter; what matters is that localized AI is genuinely needed. For now, the question is not whether a machine is branded an AI PC, but whether it is equipped with an NVIDIA GPU.
03
Three Major Manufacturers' "True AI PCs"
Although some manufacturers have previously promoted AI PC products, these were mostly hype: machines equipped with NPU chips but unable to run genuine local large models, whether for training or inference.
The AI PC concept has been promoted most heavily on laptops. Yet no lightweight laptop currently qualifies as a high-computing-power, AI-dedicated device. Instead, it is traditional high-performance gaming laptops and desktops equipped with powerful GPUs that deliver real AI productivity.
True AI PCs still depend on manufacturers capable of developing high-performance GPUs, such as NVIDIA and AMD.
Earlier this year at CES, AMD released the Ryzen AI Max 300 series, codenamed Strix Halo, and Jensen Huang unveiled Project DIGITS. Together with Apple's earlier Mac Pro, these three are powerful tools for local deployment of large models, known as "desktop AI supercomputers."
AMD has released two versions of Strix Halo: the consumer-grade Strix Halo, aimed mainly at high-performance consumer laptops (gaming laptops), and the commercial-grade Strix Halo Pro, aimed mainly at mobile workstations. Leaked 3DMark test data shows that the flagship model, the Ryzen AI MAX+ 395, has 16 Zen 5 CPU cores with 32 threads and 40 RDNA 3.5 GPU cores (the Radeon 8060S integrated graphics), with a maximum power of 120W, three times that of a standard mobile APU. It supports quad-channel LPDDR5X memory, providing up to 256 GB/s of bandwidth. Notably, the Radeon 8060S delivers more than three times the performance of the previous-generation Radeon 890M and even approaches the level of the RTX 4060 discrete graphics card.
NVIDIA calls Project DIGITS the "smallest AI supercomputer" currently available. It uses a custom "GB10" superchip that integrates a Blackwell-architecture GPU in a single package, alongside a Grace CPU developed in collaboration among NVIDIA, MediaTek, and Arm. According to published figures, the Blackwell GPU provides 1 PFLOPS of FP4 computing power, while the Grace CPU includes 10 Cortex-X925 cores and 10 Cortex-A725 cores. The GPU and CPU are linked by the same NVLink-C2C chip-to-chip interconnect used in NVIDIA's large supercomputers.
Project DIGITS is also equipped with an independent NVIDIA ConnectX interconnect chip, which allows the GPU within the "GB10" superchip to be compatible with multiple different interconnect technology standards, including NCCL, RDMA, GPUDirect, etc., enabling this "large integrated graphics card" to be directly accessed by various development software and AI applications.
Apple released the M3 series chip in 2023, equipped with the next-generation GPU, representing the biggest leap in Apple's chip graphics architecture history. It is not only faster and more energy-efficient but also introduces a new technology called "Dynamic Cache." It also brings new rendering features such as hardware-accelerated ray tracing and mesh shading to Mac for the first time. The rendering speed is now 2.5 times faster than the M1 series chip. Notably, the new M3 series chip brings a unified memory architecture of up to 128GB. Apple claims that support for up to 128GB of memory unlocks workflows that were previously impossible on laptops, such as AI developers using larger Transformer models with billions of parameters. Last year, Apple released the M4 Pro chip, which boasts performance surpassing AI PC chips.
All three adopt a unified memory architecture. The benefit of unifying the previously separate system memory and video memory (graphics memory) is that it reduces data copying when the CPU and GPU communicate. It also allows much larger effective video memory, breaking through the limitation of insufficient VRAM when consumer-grade graphics cards run large models. It is worth noting that unified memory did not originate with NVIDIA; Apple's M1 was an earlier prominent example.
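The copy-avoidance benefit is easy to quantify. On a discrete-GPU system, model weights must first be staged from system RAM into VRAM over PCIe; a unified memory architecture skips this transfer entirely. The sketch below uses an assumed real-world PCIe 4.0 x16 throughput of about 32 GB/s to estimate that staging cost:

```python
# Rough cost of staging model weights into discrete-GPU VRAM over PCIe,
# the copy that a unified memory architecture avoids entirely.
# 32 GB/s is an assumed practical PCIe 4.0 x16 throughput.

def staging_time_s(weights_gb: float, pcie_gbps: float = 32.0) -> float:
    """Seconds to copy the weights from system RAM to VRAM once."""
    return weights_gb / pcie_gbps

print(f"{staging_time_s(64):.1f} s to stage 64 GB of weights over PCIe")
```

A one-off two-second load is tolerable, but workloads that repeatedly shuttle tensors between CPU and GPU pay this bandwidth tax continuously, which is where unified memory pays off.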
04
DeepSeek Ignites the Battle for Desktop AI Supercomputers
Recently, DeepSeek's severe shortage of online computing capacity has fueled demand for local deployment of large models, and the three major manufacturers' "true AI PCs" have begun deploying DeepSeek.
DeepSeek, as an MoE model, has high requirements for video memory but relatively low requirements for computing power/memory bandwidth. This has given these desktop AI supercomputers with large video memory through unified memory technology an opportunity.
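This MoE trade-off can be sketched numerically. Memory must hold all experts, but each generated token only reads the active parameters, so the bandwidth requirement stays modest. The figures below (671B total, 37B active parameters) are DeepSeek's published numbers for V3, and 256 GB/s is the Strix Halo bandwidth cited above; KV cache and attention costs are ignored for simplicity:

```python
# MoE deployment arithmetic: capacity is set by TOTAL parameters,
# per-token bandwidth by ACTIVE parameters only.
# 671B total / 37B active are DeepSeek-V3's published figures.

def moe_requirements(total_b: float, active_b: float, bits: int,
                     bandwidth_gbps: float) -> tuple[float, float]:
    """Return (weight footprint in GB, bandwidth-bound tokens/s)."""
    weights_gb = total_b * bits / 8          # all experts must fit in memory
    read_per_token_gb = active_b * bits / 8  # only active experts are read
    return weights_gb, bandwidth_gbps / read_per_token_gb

gb, tps = moe_requirements(671, 37, 8, 256)
print(f"FP8 weights: {gb:.0f} GB; <= {tps:.1f} tokens/s at 256 GB/s")
```

The asymmetry is the point: a dense 671B model would need to stream all 671 GB per token, while the MoE design reads only about 37 GB, which is exactly what makes large-unified-memory machines with modest bandwidth viable hosts.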
Previously, a foreign expert used 8 M4 Pro Mac minis to run DeepSeek V3. Similarly, four Project DIGITS units are expected to be able to deploy DeepSeek V3, and the generation speed should be much faster. According to AMD's own announcement, a Strix Halo APU can deploy a 70B model, running 2.2 times faster while consuming 87% less power than an RTX 4090.
Some netizens have expressed, "I plan to replace my current laptop after the halo notebook is released. Local deployment of large models is indeed interesting. In a few years, maybe we can locally deploy 671B INT8 or FP8 large models. Besides large models, with improved RAM and CPU configuration, other tasks will also be faster."
The AI track may be an opportunity for domestic manufacturers to enter the PC chip field. Currently, many manufacturers are marketing various AI all-in-one products. If domestic manufacturers can offer larger unified memory, such as a 256GB version of a domestic "Project DIGITS," it may prove even more popular.
The AI PC concept is like a little girl who can be dressed up however anyone likes: each company tells the story its own way. OEMs are blooming in all directions, investing money and engineers in localized AI applications. Some software can run locally or in the cloud, and cloud services can tap domestic models for business use, which may become a very profitable market.
Low latency plus privacy protection may be the drivers of localization for AI applications such as GPT-style large language models, Stable Diffusion image generation, voice cloning, AI frame interpolation, cutout, and redrawing.
Sufficient edge computing power, large memory (video memory), and highly optimized software can combine to solve industry pain points and enable large-scale deployment of AI terminals. So the AI PC is not pure hype and speculation. Whether the path is more accessible AI, more energy-efficient AI, more powerful AI, or simpler and more user-friendly AI built on the cloud and network, technological development and market exploration are ongoing.