MCU, Intelligent Awakening

February 24, 2026

According to a report by IoT Analytics, the global IoT MCU market is projected to reach USD 7.3 billion by 2030, with a compound annual growth rate (CAGR) of approximately 6.3%. This growth is driven primarily by pent-up demand for automation upgrades, the rollout of LPWAN projects, the migration of AI to the edge, and rapid growth in Asian markets, particularly China. The deep integration of AI is rewriting the technical logic and market landscape of MCUs from the ground up, giving rise to AI MCUs that combine low power consumption, real-time performance, and on-device inference capability, and that serve as carriers for the intelligent development of the IoT.

AI MCUs achieve local inference at the edge by integrating neural processing units (NPUs) and extending dedicated instruction sets, precisely meeting intelligent devices' demands for low power consumption, strong real-time performance, and cost-effectiveness. This not only drives the upgrading of MCU products but also intensifies market competition: major Chinese and international chip manufacturers are increasing R&D investment, launching AI MCUs with distinctive technical features, and accelerating the deployment of edge-intelligence applications.

01

Arm: Building the Core Technology Foundation for AI MCUs

As a global leader in processor architecture design, Arm offers a core product line covering the full range of scenarios from high-performance computing to low-power embedded applications. Since its founding in 1990, it has developed three major product series, Cortex-A, Cortex-M, and Cortex-R, along with the Neoverse series for servers and infrastructure. Each series is built on architecture versions such as Armv7, Armv8, and Armv9, with deep optimization for specific application scenarios.

Among them, the Cortex-M series is the core processor family for microcontrollers (MCUs) and embedded systems, featuring low power consumption, strong real-time performance, and high cost-effectiveness. It supports single-cycle instruction execution and nanosecond-scale interrupt response, with a minimum core size of only 0.01 mm². Paired with the Thumb-2 instruction set, it operates without an MMU and can run an RTOS. It is the technological cornerstone of most MCU products and has become Arm's primary entry point into AI MCUs.

To give the Cortex-M series stronger AI computing capability, Arm has introduced a series of dedicated technologies and cores. In 2019, Arm announced the M-Profile Vector Extension (MVE) for the Armv8.1-M architecture, branded Arm Helium technology. Through efficient SIMD (single instruction, multiple data) operation, this extension brings significant performance gains for machine learning (ML) and digital signal processing (DSP) applications, delivering up to a 15-fold uplift in machine learning tasks. The CMSIS-DSP library also provides Helium-optimized utility functions for this technology, further improving the efficiency of DSP and ML workloads.

The Cortex-M55 and Cortex-M85 were the first processors to support Helium technology, enabling small, low-power embedded systems to handle workloads such as audio devices, sensor hubs, keyword spotting, and still-image processing, and giving Arm-based MCUs stronger AI processing capabilities. In 2023, Arm followed with the Cortex-M52, a processor designed for AI applications that further improves energy efficiency and offers an upgrade path from mainstream MCU cores such as the Cortex-M3 and Cortex-M33. With Helium, it significantly upgrades DSP and ML processing, bringing AI-driven innovation to more low-power embedded scenarios.

In addition to core upgrades, Arm has also introduced the microNPU for the Cortex-M series—Ethos-U55, designed specifically for area-constrained embedded and IoT devices, significantly accelerating ML inference performance. When deployed in collaboration with the Cortex-M55, machine learning performance can be improved by up to 480 times compared to traditional Cortex-M systems, providing support for low-cost, high-energy-efficiency AI MCU solutions.

In China, Arm China has also completed localized innovation on the Arm architecture. Its independently developed third-generation energy-efficient embedded processor IP, the "Star" STAR-MC3, is built on the Armv8.1-M architecture and is backward compatible with traditional MCU architectures. It integrates Arm Helium technology, significantly improving CPU AI performance while maintaining excellent area and energy efficiency, and has become a core architecture for main-control chips and coprocessors in the AIoT field, helping customers deploy edge-side AI applications efficiently.

02

RISC-V: Another Technological Route for AI MCUs

In addition to the Arm architecture, RISC-V, with its open-source and flexible character, has become another important technology choice for AI MCUs. Its Vector extension (RVV) lets a processor core accelerate single-instruction-stream computation over large datasets, making it well suited to machine learning, image compression, data encryption, audio and video processing, speech recognition, and natural language processing, which are precisely the core computing requirements of AI deployment at the IoT edge. This has attracted many companies to develop AI MCUs on the RISC-V architecture, forming a technology landscape complementary to Arm.

Today, Google has brought the RISC-V-based Coral NPU architecture to market through its partnership with Synaptics. The Coral NPU is based on the RISC-V instruction set architecture and includes a four-stage scalar CPU core, a 32-bit RISC-V vector engine, and a matrix core accelerator specifically optimized for modern Transformer models. Google positions the Coral NPU as "a full-stack, open-source platform designed to address the three core challenges of performance, fragmentation, and privacy, which have limited the deployment of powerful, always-on AI technologies in low-power edge devices and wearables."

Chinese and International Manufacturers: Accelerating the Rollout of AI MCU Products

Building on technologies such as Arm Helium and RISC-V RVV, Chinese and international chip manufacturers have successively launched self-developed AI MCU products, covering scenarios from high-end, high-compute solutions to entry-level ones and driving the commercialization of AI MCUs.

International Giants: Technologically Leading, Covering All Scenarios

Renesas Electronics' RA8 series, the industry's first MCU series based on the Arm Cortex-M85 core, combines the high performance of an MPU with the ease of use, low power consumption, and low BOM cost of an MCU. It supports single- or dual-core designs, reaches a maximum CPU clock of 1 GHz with 0.25 TOPS of NPU compute, and integrates multi-protocol industrial networking including EtherCAT, meeting high-compute demands in AI, industrial Ethernet, robotics, and HMI applications. Building on this, Renesas introduced the RA8P1 product group, which pairs a 1 GHz Cortex-M85 with a 250 MHz Cortex-M33 in a heterogeneous dual-core design and integrates the Arm Ethos-U55 NPU. With the help of Helium technology, it significantly improves DSP and AI/ML performance, setting a new benchmark for MCU performance.

STMicroelectronics' STM32N6 is the first STM32 MCU to embed a self-developed neural processing unit (NPU), the ST Neural-ART accelerator, designed for energy-efficient edge AI applications. It is equipped with an Arm Cortex-M55 core running at up to 800 MHz, with DSP processing significantly enhanced by Arm Helium vector technology. The overall chip clock can reach 1 GHz, with a computing performance of 600 GOPS, providing real-time neural network inference for computer vision and audio applications.

NXP's i.MX RT700 series achieves hierarchical deployment of intelligent computing power through a multi-core architecture. The series features five computing cores: the main computing subsystem includes a 325MHz Arm Cortex-M33 core and a Cadence Tensilica HiFi 4 DSP, while the ultra-low-power sensing subsystem is paired with a second Cortex-M33 core and a Cadence Tensilica HiFi 1 DSP, eliminating the need for an external sensor hub and effectively reducing system design complexity and BOM costs. The series also integrates NXP's eIQ Neutron NPU, which can increase AI workload processing speed by 172 times. It includes 7.5MB of onboard SRAM, providing computing power support for edge intelligent devices such as wearables, medical devices, and smart homes.

Infineon's PSoC Edge E81, E83, and E84 series are all built on the Arm Cortex-M55 core with the Arm Helium DSP instruction set, paired with different AI acceleration units: the E81 carries Infineon's self-developed ultra-low-power NNLite neural network accelerator, while the E83 and E84 incorporate the Arm Ethos-U55 micro-NPU, improving machine learning performance by up to 480 times over traditional Cortex-M systems. All three products integrate rich peripherals, on-chip memory, and hardware security functions, support Wi-Fi 6, Bluetooth/BLE, and Matter, and target machine learning applications in low-power computing.

Texas Instruments leverages its DSP expertise in its AI MCUs. The TMS320F28P550 in its C2000 MCU series carries a 150 MHz C28x 32-bit DSP CPU with an integrated floating-point unit (FPU32) and trigonometric math unit (TMU), plus an NPU for edge AI computing that delivers 600–1200 MOPS. It supports running CNN models locally, meeting the AI demands of industrial control scenarios.

Chinese Enterprises: Keeping Pace with the Trend, Excelling at Localized Adaptation

Chinese chip manufacturers are also making rapid progress in the AI MCU field, optimizing products for local IoT application scenarios while deploying the RISC-V architecture, and forming differentiated competitive advantages.

Nuvoton Technology's NuMicro M55M1 is an entry-level AI MCU solution designed for edge applications such as AI data recognition and intelligent audio. Based on the Arm Cortex-M55 core design, it is equipped with an Arm Ethos-U55 NPU and Helium vector processor, delivering an AI computing power of 110 GOPS. Compared to traditional 1GHz MCUs, AI inference performance has improved by more than 100 times. It also includes 1.5MB of RAM, 2MB of Flash, and supports external memory expansion. Additionally, Nuvoton has launched its self-developed NuML Tool Kit development tool and provides several off-the-shelf AI models, such as face recognition and object recognition, significantly lowering the barrier to AI application development and accelerating product deployment.

C*Core Technology focuses on R&D of AI MCUs based on the RISC-V architecture. Its edge-side AI MCU chip, the CCR4001S, has reached 100,000 units shipped. The chip adopts a RISC-V core clocked at 230 MHz and integrates an NPU delivering 0.3 TOPS @ INT8. It can efficiently run deep learning models such as MobileNet, ResNet, and YOLO, performing complex tasks such as object recognition and target detection at low power, making it particularly suitable for intelligent control in home appliances such as air conditioners, and it allows customers to independently import algorithm models and iterate on them. In addition, C*Core Technology has jointly launched the CCR7002 chip with SiFive, integrating a high-performance SoC and a low-power AI subsystem through multi-chip packaging, likewise providing 0.3 TOPS of NPU compute and achieving deep integration of the RISC-V architecture with AI.

Espressif Systems' ESP32-S3 adds AI computing power support to a general-purpose MCU. This MCU, which integrates 2.4GHz Wi-Fi and Bluetooth 5 (LE), is equipped with an Xtensa 32-bit LX7 dual-core processor running at 240MHz. The newly added vector instructions accelerate neural network computing and signal processing. Developers can implement AI applications such as image recognition and voice wake-up through libraries like ESP-DSP and ESP-NN, making it suitable for lightweight edge intelligence scenarios like smart homes.

HiSilicon has introduced the Hi3066M for intelligent edge applications in home appliances. Its first edge-side eAI chip, this low-compute embedded AI MCU uses HiSilicon's proprietary RISC-V core with an eAI engine, runs at a main frequency of 200 MHz, and pairs 64 KB of SRAM with 512 KB of built-in Flash, suiting AI energy-saving and intelligent-detection scenarios in white goods such as air conditioners, refrigerators, and washing machines. It also reserves sufficient storage headroom to cover product upgrades over the next 5–10 years.

03

Conclusion

The integration of AI and MCUs is gradually moving from technological exploration to commercialization. With the continuous release of demand for edge intelligence and the continuous optimization of chip manufacturers in terms of computing power, power consumption, cost, and development tools, AI MCUs will become the computing power carriers for more intelligent devices, achieving wider applications in smart homes, industrial control, consumer electronics, medical devices, and other fields, driving the IoT industry toward a smarter and more efficient direction.

It is noteworthy that China, as a core promoter and practitioner of the RISC-V ecosystem, is also the most active country globally in the application of RISC-V chips. With Arm-based chips, developers' in-depth understanding is often limited to the peripherals, debug system, bus structure, and instruction set; there is no effective channel to explore the internals of a Cortex-M core, such as its pipeline, which prevents mastery of the key technologies. The open-source nature of RISC-V addresses exactly this pain point. More importantly, it significantly lowers the barrier to studying the core itself, making it possible to cultivate a large number of engineers who deeply understand core technologies.

Today, integrating AI functionality into MCUs has become commonplace, and this trend will continue to drive the in-depth development of the edge intelligence industry.

Disclaimer: the copyright of this article belongs to the original author. It is reprinted solely to share information. If the author attribution is incorrect, please contact us promptly for correction or removal. Thank you.