From the “Chip Olympics” to the future of AI chips: Interconnect technology bottlenecks emerge, and packaging innovation becomes the next main battlefield

ISSCC 2026, the annual top-tier circuits conference often called the "Chip Olympics" of the semiconductor industry, produced a batch of disclosures with direct market significance: Samsung revealed HBM4 performance data for the first time, the optical interconnect roadmaps of Nvidia and Broadcom came into sharper focus, and AMD, Microsoft, and other major vendors disclosed architectural details of their AI accelerators.

According to semiconductor research firm SemiAnalysis, the HBM4 that Samsung presented at the conference delivers 3.3 TB/s of bandwidth at a maximum pin speed of 13 Gb/s, more than double the per-pin rate in the JEDEC standard, indicating that Samsung is closing the technical gap with SK Hynix. Meanwhile, Nvidia's DWDM optical interconnect proposal aligns closely with the specifications released by the OCI MSA industry alliance, further clarifying the technical direction for next-generation AI data center interconnects.

If Samsung's HBM4 keeps improving in yield and reliability, it will pose a substantial challenge to SK Hynix's market dominance. The convergence of optical interconnect standards, in turn, signals that an investment window is opening for the related supply chain.

ISSCC: The Semiconductor Industry's Annual Technology Barometer

The International Solid-State Circuits Conference (ISSCC) is one of the three top academic conferences in the semiconductor field, alongside IEDM and the VLSI Symposium. Compared with the other two, ISSCC focuses more on circuit integration and implementation. Almost every paper includes schematics and measured data, which makes the conference a crucial window into the practical progress of chip technology across the industry.

This year's ISSCC is particularly noteworthy. According to SemiAnalysis, papers at previous ISSCCs varied in how directly they affected the industry, but 2026 is markedly different: a large number of papers bear directly on current market hotspots, covering HBM4, LPDDR6, GDDR7, NAND flash, co-packaged optics (CPO), advanced chip-to-chip interconnects, and processor architectures from MediaTek, AMD, Nvidia, Microsoft, and other manufacturers.

Samsung HBM4: Performance Breakthrough, but Yield and Cost Remain Concerns

Samsung is the only one of the three major memory manufacturers to publish a paper on HBM4 technology at this year's ISSCC.

The HBM4 device presented is a 12-layer stack with 36 GB of capacity, 2048 I/O pins, and 3.3 TB/s of bandwidth. The DRAM dies use the sixth-generation 10nm-class (1c) process, and the logic base die uses the SF4 advanced logic process.

The most critical architectural change is that the base die now uses a different process from the DRAM dies. HBM4 moves the base die from a DRAM process to the SF4 logic process, which lowers the operating voltage (VDDQ) from HBM3E's 1.1V to 0.75V, a 32% reduction, while also providing higher transistor density and better area efficiency. Combined with adaptive body bias (ABB) control and a fourfold increase in TSV count, this lets Samsung's HBM4 reach an 11 Gb/s pin speed at a core voltage below 1V and a maximum of 13 Gb/s, far above the 6.4 Gb/s of the JEDEC HBM4 standard.
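The headline figures above are related by simple arithmetic, which the following minimal sketch reproduces. It assumes decimal units (1 TB = 1000 GB) and that every one of the 2048 I/O pins carries data at the quoted per-pin rate; the function names are illustrative and do not come from the paper.

```python
def hbm_bandwidth_tb_per_s(io_pins: int, pin_speed_gbps: float) -> float:
    """Peak stack bandwidth in TB/s for a given pin count and per-pin rate."""
    aggregate_gbps = io_pins * pin_speed_gbps  # total Gb/s across all pins
    return aggregate_gbps / 8 / 1000           # Gb/s -> GB/s -> TB/s (decimal units)


def relative_reduction(old: float, new: float) -> float:
    """Fractional drop when going from old to new."""
    return (old - new) / old


# Figures quoted in the article: 2048 pins, 13 Gb/s maximum, 11 Gb/s below 1V.
peak_tb_s = hbm_bandwidth_tb_per_s(io_pins=2048, pin_speed_gbps=13.0)
sub_1v_tb_s = hbm_bandwidth_tb_per_s(io_pins=2048, pin_speed_gbps=11.0)

# VDDQ moves from 1.1V (HBM3E) to 0.75V (HBM4).
vddq_drop = relative_reduction(old=1.1, new=0.75)

print(f"Peak bandwidth at 13 Gb/s per pin: {peak_tb_s:.2f} TB/s")
print(f"Bandwidth at 11 Gb/s per pin:      {sub_1v_tb_s:.2f} TB/s")
print(f"VDDQ reduction:                    {vddq_drop:.0%}")
```

Under these assumptions the 13 Gb/s pin speed and the 3.3 TB/s bandwidth figure describe the same operating point, and the 32% voltage reduction follows directly from the two VDDQ values.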

However, this technical path comes at a clear cost. The SF4 process is more expensive than the TSMC N12 process SK Hynix uses and Micron's in-house CMOS base die. More critically, Samsung's front-end yield on the 1c process was only about 50% last year, and although it is improving, the low yield puts pressure on HBM4 gross margins. SemiAnalysis noted that Samsung's HBM profit margins have historically lagged SK Hynix's, and that closing this gap remains difficult in the HBM4 era.

In terms of reliability and stability, Samsung still trails SK Hynix, but the technological catch-up trend is becoming more apparent.

LPDDR6 and GDDR7: Samsung and SK Hynix Each Have Their Focus

Both Samsung and SK Hynix showcased LPDDR6 chips at this year’s ISSCC. Both products support a maximum data rate of 14.4 Gb/s, about 35% higher than the fastest LPDDR5X.

The two differ in low-voltage performance. Samsung's LPDDR6 reaches 12.8 Gb/s at 0.97V, while SK Hynix's achieves only 10.9 Gb/s at 0.95V, pointing to a Samsung advantage in power efficiency at lower pin speeds. Samsung also showcased an LPDDR6 PHY built on its SF2 process that cuts read power consumption by nearly 50% in efficiency mode.

SK Hynix's highlight is GDDR7. Built on the 1c process, its GDDR7 reaches up to 48 Gb/s at 1.2V, and even at reduced voltages of 1.05V/0.9V it attains 30.3 Gb/s, faster than the 30 Gb/s memory in the RTX 5080. Its bit density is 0.412 Gb/mm², about a third higher than the 0.309 Gb/mm² of Samsung's 1b process.

Notably, SemiAnalysis pointed out that Nvidia’s previously announced Rubin CPX AI processor with 128GB GDDR7 has mostly disappeared from the 2026 roadmap, with Nvidia shifting focus to launching the Groq LPX solution.

Optical Interconnect: Nvidia's DWDM Route and Industry Standards Are Converging

Optical interconnect is another core topic at this ISSCC, directly impacting the networking methods of next-generation AI accelerator clusters.

Nvidia proposed an optical interconnect solution based on DWDM (Dense Wavelength Division Multiplexing) that multiplexes eight wavelengths at 32 Gb/s each and adds a ninth wavelength for clock forwarding, which simplifies SerDes design and improves efficiency. This closely matches the specifications released by the newly formed OCI MSA (Optical Compute Interconnect Multi-Source Agreement) ahead of OFC 2026. The OCI MSA focuses on 200 Gb/s bidirectional links and adopts a 4-wavelength 50G NRZ DWDM solution for scale-up interconnect.
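To make the two wavelength plans easier to compare, the sketch below computes the aggregate data rate of each from the figures in the paragraph above. The `DwdmLink` class and its field names are hypothetical and exist only for this illustration; the sketch also assumes the clock-forwarding wavelength carries no payload data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DwdmLink:
    """Illustrative model of one direction of a DWDM link (not a real API)."""
    name: str
    data_wavelengths: int
    gbps_per_wavelength: float
    clock_wavelengths: int = 0  # assumed to carry no payload data

    @property
    def data_rate_gbps(self) -> float:
        # Payload rate is the sum over data-carrying wavelengths only.
        return self.data_wavelengths * self.gbps_per_wavelength

    @property
    def total_wavelengths(self) -> int:
        return self.data_wavelengths + self.clock_wavelengths


# Nvidia proposal as described: 8 x 32 Gb/s plus one forwarded-clock wavelength.
nvidia = DwdmLink("Nvidia DWDM proposal", 8, 32.0, clock_wavelengths=1)
# OCI MSA as described: 4 x 50G NRZ.
oci = DwdmLink("OCI MSA", 4, 50.0)

for link in (nvidia, oci):
    print(f"{link.name}: {link.total_wavelengths} wavelengths, "
          f"{link.data_rate_gbps:.0f} Gb/s of data")
```

On these numbers the Nvidia proposal carries 256 Gb/s of data over nine wavelengths and the OCI MSA plan carries 200 Gb/s over four, so both use several modest-rate NRZ-class wavelengths rather than a few very fast PAM4 lanes.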

This resolves earlier doubts in the market: Nvidia's COUPE optical engine targets 200G PAM4 DR optics for scale-out switching, while DWDM is used for scale-up interconnect, and the two paths coexist.

Broadcom showed its 6.4T MZM optical engine, consisting of 64 channels of about 100G PAM4, and tested it in the Tomahawk 5 51.2T CPO system. Broadcom said it will shift to the COUPE solution in the future, but current products still use other packaging routes.
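As a quick sanity check on the Broadcom figures, the sketch below works out the engine capacity from the channel count and how many such engines a 51.2T switch would need. It assumes exactly 100 Gb/s per channel (the article says "about 100G") and that the whole switch bandwidth goes through optical engines; the engine count is derived here and is not stated in the source.

```python
def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-numerator // denominator)


CHANNELS_PER_ENGINE = 64
ASSUMED_GBPS_PER_CHANNEL = 100   # assumption: "about 100G" taken as exactly 100
SWITCH_CAPACITY_GBPS = 51_200    # Tomahawk 5, 51.2T

engine_capacity_gbps = CHANNELS_PER_ENGINE * ASSUMED_GBPS_PER_CHANNEL
engines_per_switch = ceil_div(SWITCH_CAPACITY_GBPS, engine_capacity_gbps)

print(f"Engine capacity: {engine_capacity_gbps / 1000:.1f} Tb/s")
print(f"Engines needed for a 51.2T switch: {engines_per_switch}")
```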

Marvell presented its 800G Coherent-Lite transceiver for data center campus scenarios. It consumes just 3.72 pJ/b (excluding silicon photonics), roughly half the energy per bit of conventional coherent transceivers, and keeps latency below 300 ns on links of up to 40 km of fiber.
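The energy-per-bit figure converts to watts once a data rate is fixed, as the sketch below shows for the Marvell numbers. It also estimates the propagation delay of 40 km of fiber, which suggests the sub-300 ns figure most likely refers to latency added by the transceiver rather than end-to-end delay. The group index of 1.468 is an assumed typical value for single-mode silica fiber and does not come from the source.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
ASSUMED_FIBER_GROUP_INDEX = 1.468  # assumption: typical single-mode silica fiber


def link_power_watts(energy_pj_per_bit: float, rate_gbps: float) -> float:
    """Power in watts: pJ/b times Gb/s gives mW, then convert to W."""
    return energy_pj_per_bit * rate_gbps / 1000


def fiber_delay_us(length_km: float, group_index: float) -> float:
    """One-way propagation delay through fiber, in microseconds."""
    seconds = length_km * group_index / SPEED_OF_LIGHT_KM_PER_S
    return seconds * 1e6


# 3.72 pJ/b at 800 Gb/s, excluding silicon photonics as in the article.
transceiver_watts = link_power_watts(3.72, 800)
propagation_us = fiber_delay_us(40, ASSUMED_FIBER_GROUP_INDEX)

print(f"Transceiver power at 800G: {transceiver_watts:.2f} W")
print(f"Propagation delay over 40 km: {propagation_us:.0f} us")
```

Propagation over 40 km takes on the order of 200 microseconds under this assumption, several hundred times longer than 300 ns.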

Advanced Packaging and Chip-to-Chip Interconnect: Multiple Technologies Competing

As multi-chip designs become mainstream, chip-to-chip interconnect is now a performance bottleneck, and multiple companies have showcased their solutions at this ISSCC.

TSMC presented its active local silicon interconnect (aLSI) technology, which introduces edge-triggered transceiver (ETT) circuits in the bridge die to improve signal integrity. It shrinks PHY depth from 1043μm to 850μm, with total energy consumption of just 0.36 pJ/b. SemiAnalysis noted that the package design of the test vehicle closely matches AMD's MI450 GPU, suggesting that aLSI might be the packaging solution for AMD's next-generation products.

Intel presented a chip-to-chip interface compatible with the UCIe-S standard. Built on a 22nm process, it offers up to 48 Gb/s per channel and a reach of 30mm in standard organic packages, and is regarded as a prototype of the interface for future Diamond Rapids Xeon CPUs.

Microsoft disclosed details of a chip-to-chip interconnect built on the TSMC N3P process, with system energy consumption of 0.33 pJ/b at 24 Gb/s. SemiAnalysis believes this is the custom high-bandwidth interconnect that links the two compute chiplets in Microsoft's Cobalt 200 CPU.
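Energy-per-bit figures such as those quoted for TSMC and Microsoft translate into interface power once a bandwidth is chosen. The sketch below does that conversion. The 1 Tb/s reference bandwidth is an arbitrary value picked for illustration, and treating 24 Gb/s as a per-lane rate is an assumption. The two figures come from different papers with different measurement conditions, so the comparison is indicative only.

```python
def interface_power_mw(energy_pj_per_bit: float, rate_gbps: float) -> float:
    """Interface power in mW: (pJ/b) x (Gb/s) = mW."""
    return energy_pj_per_bit * rate_gbps


ENERGY_PJ_PER_BIT = {
    "TSMC aLSI": 0.36,
    "Microsoft N3P interconnect": 0.33,
}

REFERENCE_RATE_GBPS = 1000  # hypothetical 1 Tb/s of die-to-die traffic

power_at_reference_mw = {
    name: interface_power_mw(energy, REFERENCE_RATE_GBPS)
    for name, energy in ENERGY_PJ_PER_BIT.items()
}

# Assumption: the quoted 24 Gb/s is the data rate of a single lane.
microsoft_lane_mw = interface_power_mw(0.33, 24)

for name, milliwatts in power_at_reference_mw.items():
    print(f"{name}: {milliwatts:.0f} mW per Tb/s")
print(f"Microsoft, one 24 Gb/s lane: {microsoft_lane_mw:.2f} mW")
```

At these efficiencies, every terabit per second of die-to-die traffic costs roughly a third of a watt, which is why sub-pJ/b links matter once aggregate bandwidth reaches tens of terabits per second.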

AI Accelerators: AMD, Microsoft, and Rebellions Architecture Details Revealed for the First Time

AMD provided details on improvements in the MI355X GPU over MI300X: the core compute chip (XCD) moved from N5 to N3P process, doubling matrix throughput with no area change; the IO chip (IOD) was merged from four chips to two, reducing chip-to-chip interconnect overhead and cutting interconnect power consumption by about 20%.

Microsoft's Maia 200 was another major AI accelerator disclosed at the conference. The last mainstream HBM accelerator to keep a reticle-sized monolithic design, Maia 200 is built on the TSMC N3P process and integrates over 10 PFLOPS of FP4 compute, six HBM3E stacks, and 28 lanes of 400 Gb/s full-duplex chip interconnect. Its packaging is similar to that of the Nvidia H100, using a CoWoS-S interposer.

Korean AI chip startup Rebellions revealed architectural details of its Rebel100 accelerator for the first time. The chip uses Samsung SF4X process and I-CubeS advanced packaging, with four compute chips and four HBM3E chips, and integrates silicon capacitors to improve HBM3E power quality. SemiAnalysis notes that Samsung may bundle I-CubeS packaging with its frontend process and use HBM supply conditions as leverage to push this packaging technology, not yet adopted by mainstream AI accelerators, into the market.
