07/25 2025
With the rapid advancement of AI technology, particularly large language models, the demand for AI computing power has skyrocketed. According to OpenAI's estimates, the compute used in the largest AI training runs has doubled roughly every 3.4 months, a more than 300,000-fold increase since 2012 as of its 2018 analysis. This exponential growth poses unprecedented challenges to data centers' computing power and data transmission capabilities. As the backbone of AI computing power, data centers must process immense amounts of data, which drives rapid growth in internal data traffic. For instance, in large-scale AI training clusters, servers frequently exchange vast amounts of data to support model training and optimization. Traditional data transmission methods are increasingly strained by this traffic.
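To make the doubling-time figure concrete, the short Python sketch below converts a fixed doubling period into a growth factor over a given time span, and into the time needed to reach a given multiple. The function names and the 1,000-fold example are illustrative assumptions, not figures from OpenAI's analysis, and the model assumes the doubling period stays constant.

```python
import math


def growth_factor(months: float, doubling_months: float = 3.4) -> float:
    """Multiplicative growth after `months`, assuming a constant doubling period."""
    return 2 ** (months / doubling_months)


def months_to_reach(fold: float, doubling_months: float = 3.4) -> float:
    """Months needed to grow by a factor of `fold` at a constant doubling period."""
    return math.log2(fold) * doubling_months


if __name__ == "__main__":
    # Growth implied over one year by a 3.4-month doubling time.
    print(f"Growth over 12 months: {growth_factor(12):.1f}x")
    # Hypothetical target multiple, chosen only for illustration.
    print(f"Months to reach 1,000x: {months_to_reach(1000):.1f}")
```

Under these assumptions, a 3.4-month doubling time corresponds to roughly an 11.5-fold increase per year, which is why interconnect bandwidth comes under pressure so quickly.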
Optical modules, pivotal for optical-electrical signal conversion in data centers, directly impact data transmission rates, capacity, and efficiency. In the context of soaring computing power demand, traditional optical modules face numerous challenges.
As data volumes continue to surge, optical modules are required to support ever-higher transmission rates. From early 10G and 25G to today's 100G and 400G, and onward to 1.6T and beyond, traditional optical modules encounter technical bottlenecks as speeds escalate. Additionally, the high power consumption associated with high-speed transmission is becoming increasingly prominent. High power consumption not only raises data centers' operational costs but also places stringent demands on cooling systems, complicating data center construction and maintenance. Furthermore, traditional optical modules struggle to meet data centers' increasingly stringent size and integration requirements for compact layouts. Co-packaged optics (CPO) technology has emerged to address these challenges.
The essence of CPO lies in integrating optical engines and switching chips onto the same substrate through advanced packaging technology. In traditional pluggable optical module solutions, optical modules connect to the switch's printed circuit board (PCB) via a pluggable interface, and electrical signals traverse several centimeters along PCB traces to reach the optical module for opto-electronic conversion. During this process, electrical signals are susceptible to factors such as PCB trace resistance, capacitance, and inductance, leading to signal attenuation, distortion, and increased latency. In contrast, CPO technology tightly integrates the optical engine with the switching chip, drastically reducing the electrical signal transmission distance between the chip and the optical engine, typically to the millimeter level. This short-distance transmission method effectively minimizes signal loss and interference during transmission, enhancing signal integrity and transmission quality.
However, CPO currently has some notable shortcomings.
Firstly, the technology is not sufficiently mature, resulting in high costs. There are numerous hurdles to overcome in the industry chain: TSMC's silicon photonics wafer yield stands at only 65%, and Accelink Technologies' tests reveal that CPO modules' end-to-end coupling loss fluctuation is ±2dB, far inferior to pluggable modules' ±0.5dB. These technical issues directly elevate the production cost of a single module to 3-5 times that of older solutions, with the total cost of a 1.6T port CPO solution reaching $2,800 per port, 2.3 times that of pluggable modules. Even with future production scale increases, it is estimated that narrowing the cost gap will take 3-5 years.
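The figures above can be turned into more intuitive quantities with a little arithmetic. The Python sketch below converts the quoted coupling-loss tolerances from decibels into linear power ratios and backs out the pluggable per-port cost implied by the 2.3x multiple. It assumes the fluctuation is symmetric in dB around a nominal value and that the multiple compares like-for-like ports; the function names are illustrative.

```python
def db_to_power_ratio(db: float) -> float:
    """Convert a decibel value to a linear optical power ratio."""
    return 10 ** (db / 10)


def worst_to_best_ratio(tolerance_db: float) -> float:
    """Power ratio between the best and worst case of a +/- tolerance in dB."""
    return db_to_power_ratio(2 * tolerance_db)


def implied_baseline_cost(cost: float, multiple: float) -> float:
    """Baseline cost implied when `cost` is `multiple` times the baseline."""
    return cost / multiple


if __name__ == "__main__":
    # Tolerances quoted in the text: +/-2 dB for CPO, +/-0.5 dB for pluggable.
    for label, tol in (("CPO", 2.0), ("pluggable", 0.5)):
        ratio = worst_to_best_ratio(tol)
        print(f"{label}: +/-{tol} dB -> best/worst power ratio {ratio:.2f}x")
    # $2,800 per port is quoted as 2.3 times the pluggable cost.
    pluggable = implied_baseline_cost(2800, 2.3)
    print(f"Implied pluggable cost per port: ${pluggable:,.0f}")
```

Read this way, a ±2dB fluctuation means received optical power can vary by about a factor of 2.5 between best and worst case, versus about 1.26 for ±0.5dB, and the quoted cost multiple implies a pluggable port cost of roughly $1,200.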
Secondly, operational and maintenance changes are cumbersome. Since CPO modules cannot be plugged and unplugged like traditional optical modules, they lack the convenience of plug-and-play. Facebook's simulation tests indicate that a CPO architecture necessitates an additional 30% of redundant switching chips to handle failures, increasing overall costs by 18%. Cisco's research also reveals that repairing a broken CPO module takes an average of 72 hours, six times longer than a pluggable module. For financial data centers requiring 99.999% availability, this is a critical issue that could potentially lead to losses amounting to hundreds of thousands of dollars per hour.
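To see why repair time matters so much at "five nines", the Python sketch below computes the annual downtime budget for a given availability target and compares it with a single outage. It assumes a 365-day year and an outage that is not masked by any redundancy, which is a deliberately pessimistic simplification; the 12-hour pluggable figure is derived from the "six times longer" statement rather than quoted directly.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(availability: float) -> float:
    """Allowed downtime per year, in minutes, for an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def budget_years_consumed(outage_hours: float, availability: float) -> float:
    """How many years of downtime budget a single unmasked outage uses up."""
    return outage_hours * 60 / downtime_budget_minutes(availability)


if __name__ == "__main__":
    target = 0.99999  # 99.999% availability
    print(f"Annual budget: {downtime_budget_minutes(target):.2f} minutes")
    # 72 h for CPO is quoted in the text; 12 h for pluggable is 72 / 6.
    for label, hours in (("CPO repair", 72), ("pluggable repair", 12)):
        years = budget_years_consumed(hours, target)
        print(f"{label}: {hours} h consumes about {years:.0f} years of budget")
```

The annual budget at 99.999% is only about 5.3 minutes, so either repair time is tolerable only if failures are absorbed by redundant hardware, which is consistent with the extra redundant switching chips mentioned above.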
Moreover, fragmented industry standards are slowing CPO industrialization. Several competing standards camps currently exist in the CPO field, including COBO, OIF, and OpenEye, with package sizes ranging from 35mm×35mm to 58mm×58mm, power supply specifications from 3.3V to 12V, and liquid cooling ratios in thermal management solutions spanning from 30% to 100%. This fragmentation forces equipment manufacturers to develop compatible interfaces for different solutions, increasing R&D costs by over 40%. For instance, a leading switch vendor spent an additional $20 million on testing alone to be compatible with three mainstream standards, which slows technology implementation.
In the research, development, and application of CPO technology, many industry giants are actively laying out strategies and playing leading roles.
AMD's collaboration with Ranovus began during the early exploration of CPO technology. At OFC 2020, Ranovus first demonstrated the Odin series of silicon photonics engines based on multi-wavelength quantum dot lasers (QDL) and micro-ring resonator technology, significantly reducing power consumption and cost compared to alternatives at the time. In 2021, Ranovus launched its second-generation CPO optical engine, which uses an analog drive method to eliminate the need for retimers, further optimizing cost and power consumption, and integrates the relevant components into a single electro-photonic IC. In 2022, the collaboration showcased a CPO implementation of the Xilinx Versal adaptive compute acceleration platform (ACAP), leveraging Odin 800-Gbps CPO 2.0 technology to simplify circuit board wiring and reduce power consumption. By OFC 2023, the two companies had demonstrated the interoperability of the Versal adaptive SoC with the Odin 800G direct-drive optical engine and third-party modules, with the optical engine built on GlobalFoundries' silicon photonics platform, providing a flexible and low-power solution for AI/ML workloads in hyperscale data centers.
Broadcom actively promotes the penetration of CPO technology from the switch side to the server side. In 2021, Broadcom announced its next-generation series of switching chips equipped with CPO optics, including the 25.6T Humboldt, scheduled for launch at the end of 2022, and the subsequent 51.2T Bailly, and mentioned future plans for co-packaging with CPUs and GPUs. At OFC 2023, Broadcom showcased the 51.2T Bailly CPO prototype system based on Tomahawk 5 and the 25.6T Humboldt CPO system based on Tomahawk 4. Its full-CMOS EIC integrates low-power TIAs and optical MUX/DEMUX, increasing optical engine bandwidth to 6.4T and significantly reducing optical interconnect power. In March 2024, Broadcom delivered the industry's first 51.2T CPO Ethernet switch to customers, integrating 8 x 6.4Tbps silicon photonics optical engines with the Tomahawk 5 switching chip and substantially reducing power consumption compared to pluggable optical module solutions. Broadcom plans to extend CPO technology to computing chips in the future to pursue higher bandwidth.
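The relationship between switch capacity and optical engine count is simple division, sketched below in Python. The helper name is illustrative, and the calculation assumes every engine runs at its full rated bandwidth with no spare capacity.

```python
def engines_needed(switch_gbps: int, engine_gbps: int) -> int:
    """Smallest number of optical engines whose total bandwidth covers the switch."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-switch_gbps // engine_gbps)


if __name__ == "__main__":
    # 51.2T switch with 6.4T optical engines, as in the text.
    count = engines_needed(51_200, 6_400)
    print(f"51.2T switch / 6.4T engines -> {count} engines")
```

For the 51.2T switch described above, this gives the 8 optical engines that Broadcom co-packages with the switching chip.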
Cisco has a clear view of the pace of CPO adoption and holds that the technology rests on three pillars: removing some DSPs to save power, adopting remote light sources, and relying on a production-proven silicon photonics platform. These innovations can reduce the power required for connections by up to 50%, lowering the total power of fixed systems by 25-30%. At OFC 2023, Cisco's CPO demonstration showcased key technical advantages, with the silicon photonics IC implementing the multiplexers/demultiplexers required for 400G FR4, addressing the challenge of optical component miniaturization and supporting multiple types of data center optics. Cisco anticipates that trial deployments of CPO will coincide with the 51.2Tb switching cycle, with larger-scale applications during the 102.4Tb switching cycle.
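The two percentages Cisco quotes imply something about how much of a switch's power budget goes to connectivity. The Python sketch below works this out under the simplifying assumption that CPO changes only the connection power and leaves the rest of the system untouched; the function names are illustrative, and since the 50% figure is stated as an upper bound, the result is a rough lower estimate of the share.

```python
def system_saving(connection_share: float, connection_saving: float) -> float:
    """Fraction of total system power saved when only connection power shrinks."""
    return connection_share * connection_saving


def implied_connection_share(total_saving: float, connection_saving: float) -> float:
    """Share of system power spent on connections implied by the two savings."""
    return total_saving / connection_saving


if __name__ == "__main__":
    # 50% connection saving and 25-30% system saving, as quoted in the text.
    for total in (0.25, 0.30):
        share = implied_connection_share(total, 0.50)
        print(f"{total:.0%} system saving -> connections use about {share:.0%} of power")
```

Under these assumptions, connections would account for roughly half or more of total system power, which helps explain why the electrical path between switch chip and optics is the main optimization target.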
The MOTION project, developed by IBM in collaboration with Finisar, features a 2:1 redundant backup design for its laser light sources and is suitable for LGA and solder-mounted assemblies. The design aims to enhance overall system reliability, which distinguishes it from Broadcom's external pluggable light source solution.
Intel has been deploying CPO since 2020, with optical compute interconnect (OCI) as the ultimate goal. Building on its deep expertise in silicon photonics, including pluggable optical transceivers and micro-ring modulator technology, Intel uses its own silicon photonics process platform to develop CPO systems based on micro-ring modulators.
Marvell's CPO technology platform balances the needs of pluggable and co-packaged solutions. At OFC 2022, its first-generation cloud-optimized CPO platform debuted, integrating 2.5D/3D highly integrated silicon photonics components, including lasers, TIAs, drivers, and PAM4 DSPs, laying the foundation for a 3.2T CPO platform for 51.2T switches. In 2023, Marvell showcased a new 200G per channel silicon photonics optical engine, integrating hundreds of components that can flexibly serve as pluggable optical modules or CPO optical solutions.
As a leader in the GPU field, NVIDIA is deploying CPO on both the switch side and the GPU side. In a talk at OFC 2022, Chief Scientist Bill Dally elaborated on the goal of co-packaged optics using dense wavelength division multiplexing (DWDM) and the concept of using silicon photonics to cross-connect GPU computing engines across racks. NVIDIA, in collaboration with Ayar Labs, is developing high-bandwidth, low-latency, ultra-low-power optical interconnect technology to support the horizontal scaling of AI/ML architectures and meet rapidly growing data volumes.
TSMC has entered the CPO field with its COUPE (Compact Universal Photonic Engine) optics platform. At the 2024 North American Technology Symposium, TSMC presented silicon photonics-based solutions targeting bandwidths of up to 12.8 Tbps. Its COUPE engine employs SoIC-X packaging technology to combine a 65nm electronic integrated circuit (EIC) with a photonic integrated circuit (PIC), achieving efficient power usage through low-impedance SoIC-X interconnects, providing key technical support for future high-bandwidth needs of data centers.
Faced with competition from CPO technology, traditional pluggable modules are also continuously innovating to enhance their performance.
In terms of rate breakthroughs, pluggable modules have steadily evolved towards 1.6T and higher rates through the combination of high-order modulation and integration technology. Modulation methods such as 16QAM increase the data density per wavelength, while silicon photonics integration compresses the optical engine to fit within standard package dimensions, enabling the module to maintain 2km-level cross-rack transmission capability without modifying existing data centers' physical cabling architecture. The core of this technical path lies in achieving rate transitions without disrupting mature packaging standards. This keeps upgrade costs for existing data centers under tight control and avoids the additional investment that architectural reconfiguration would require, which is indispensable for the smooth transition of medium and large data centers.
The logic behind power consumption optimization reflects the pragmatism of technology selection. Unlike CPO, which achieves extreme power savings by shortening the electrical signal path, pluggable modules focus on improving energy efficiency within the existing architecture: through dynamic power management algorithms to align with data traffic fluctuations, new low-thermal resistance packaging materials to reduce thermal dissipation losses, and low-power laser driver circuit design to achieve step-wise reductions in power consumption per unit of bandwidth in medium and short-distance transmission scenarios. While this optimization does not push the physical limits of transmission distance, it strikes a balance between technology maturity and energy efficiency ratio, particularly suited for large-scale deployment scenarios sensitive to renovation costs.
In the future, these two technologies will not exist in an "either-or" substitution relationship but will play distinct roles in the data center's "hierarchical network": CPO will dominate high-density interconnects within racks due to its low latency and high integration advantages; pluggable modules will cover medium to long-distance transmissions across racks and rooms, relying on their flexibility and compatibility. This "layered coexistence" pattern will propel optical interconnect technology towards a more segmented and efficient evolution.
Meanwhile, to balance maintainability and energy efficiency improvements, hybrid architecture solutions are emerging. NVIDIA's "pluggable CPO" architecture adopted in the Grace Hopper supercomputer is instructive: through standardized optoelectronic interfaces, the optical engine and switching chip are physically separate yet directly connected electrically. This design preserves maintainability while improving energy efficiency to 2.1pJ/bit, offering new perspectives for technological transitions. Such hybrid architectures, which to some extent combine the advantages of pluggable modules and CPO, may carve out a niche in the market in the future.
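An energy-per-bit figure translates directly into interconnect power once a data rate is chosen. The Python sketch below does this conversion for the 2.1pJ/bit figure at two data rates that appear elsewhere in this article; pairing that efficiency with those rates is an assumption made only for illustration, and the function name is hypothetical.

```python
def link_power_watts(pj_per_bit: float, tbps: float) -> float:
    """Optical interconnect power in watts for a given energy per bit and data rate."""
    # pJ/bit x Tbit/s = (1e-12 J/bit) x (1e12 bit/s) = J/s = W
    return pj_per_bit * tbps


if __name__ == "__main__":
    # 2.1 pJ/bit is quoted in the text; the data rates are illustrative choices.
    for label, rate in (("1.6T port", 1.6), ("51.2T switch", 51.2)):
        print(f"{label}: {link_power_watts(2.1, rate):.2f} W at 2.1 pJ/bit")
```

At 2.1pJ/bit, a 1.6T port would draw about 3.4 W for the optical link, and a fully loaded 51.2T switch about 108 W.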
In this technological revolution where "quantitative changes" lead to "qualitative changes", co-packaged optics (CPO) technology has undoubtedly garnered significant attention due to its immense potential in transmission rates, power consumption, and integration. However, it is crucial to approach this technology with a realistic perspective, recognizing its limitations and the ongoing evolution of traditional pluggable modules.
As discussed in this article, the path to industrializing CPO faces significant obstacles in technology maturity, cost control, operation and maintenance models, and the unification of industry standards. Specifically, low silicon photonics wafer yield, difficulty in controlling coupling loss, high initial investment, complex fault maintenance procedures, and a lack of unified standards all act as "roadblocks" to the widespread commercialization of CPO. Consequently, full-scale adoption of CPO may require 3-5 years or longer, as it depends on both further technological advances and close collaboration across the industry chain.