Advanced packaging solutions in HPC and consumer – The chronicles by Yole SystemPlus

Advanced packaging technology plays a pivotal role in enabling the development of smaller, more powerful, and energy-efficient electronic devices, such as smartphones, tablets, and HPC (High-Performance Computing) systems. According to Yole Intelligence, 2.5D & 3D packaging revenue is expected to grow at a 29% CAGR, from US$3.25 billion in 2021 to over US$15 billion in 2027, with telecom and infrastructure applications having the largest market share.
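The growth figures above can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes annual compounding over the six years from 2021 to 2027, and the helper names `cagr` and `project` are hypothetical, not taken from any Yole Intelligence tooling.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value reached after compounding `rate` annually for `years` periods."""
    return start_value * (1 + rate) ** years


start_usd_bn = 3.25      # 2021 revenue quoted in the text
end_usd_bn = 15.0        # 2027 revenue; the text says "over US$15 billion"
years = 2027 - 2021      # assumption: six annual compounding periods

implied_rate = cagr(start_usd_bn, end_usd_bn, years)
projected_2027 = project(start_usd_bn, 0.29, years)

print(f"Implied CAGR 2021-2027: {implied_rate:.1%}")
print(f"2027 revenue at a 29% CAGR: US${projected_2027:.2f} billion")
```

Under these assumptions the two quoted endpoints and the 29% rate agree to within rounding, which suggests the published CAGR is a rounded figure.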

Although the structure of the devices analyzed by Yole SystemPlus looks the same from the outside, deeper analyses reveal the complexity of the packaging techniques used and their key differences.

Yole SystemPlus and Yole Intelligence are both part of Yole Group.

Advanced packaging in HPC systems

In its report 2.5D & 3D Packaging Comparison 2022, Yole SystemPlus focused on advanced packaging innovations by comparing two accelerator cards, the A100 from NVIDIA and the Instinct MI 210 from AMD. Both devices target data center, server, and workstation applications and feature the same structure, including a GPU (Graphics Processing Unit) framed by some HBM (High Bandwidth Memory) components, enabling high memory storage capacity and high-speed communication.

NVIDIA, considered the worldwide leader in the field since 2017, has evolved towards successfully integrating increased memory storage density to achieve 80 GB capacity with the A100 using the CoWoS (Chip-on-Wafer-on-Substrate) process. AMD, fairly new on the market, has adopted a Fan-Out embedded bridge technology and succeeded in providing a 64 GB device.

  • Global view comparison

With a well-optimized design, the A100 hosts five HBM stacks. NVIDIA opts for a market-timing strategy, leaving a sixth position, currently occupied by a filler die, available for potential additional HBM integration. The AMD device features only four HBM stacks, with no possibility of further memory integration without a redesign.

  • Cross-sectional view comparison

The cross-section reveals a significant difference on the interconnection front. In the case of the A100, communication between the HBM stacks, the GPU, and the substrate is handled by a single Si interposer that occupies the entire component area. TSVs (through-silicon vias) are processed within the interposer to enable even more efficient communication between the stacked chips. For the MI 210, AMD has developed a very innovative solution based on four small Si bridges, each sitting partly beneath one HBM stack and partly beneath the processor. A molding material with copper posts fills the space left around the Si bridges and provides communication between the HBM stacks and the substrate. The Si bridge solution, also referred to as an organic or mold interposer, features a ‘silicon bridge-interposer’/‘total chip area’ ratio of less than 0.05, compared with over 0.25 for the A100 interposer. A closer examination of the cross-section shows that, in both cases, each HBM stack hosts eight stacked DRAM dies interconnected with TSVs.
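The memory figures quoted so far (80 GB over five HBM stacks for the A100, 64 GB over four for the MI 210, eight DRAM dies per stack) can be combined into a per-stack and per-die view. This is a minimal sketch under one stated assumption: capacity is split evenly across the active stacks and across the dies within each stack. The `HbmConfig` class and its field names are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HbmConfig:
    name: str
    total_gb: int         # total memory capacity quoted in the text
    active_stacks: int    # HBM stacks quoted in the text (filler die excluded)
    dies_per_stack: int   # stacked DRAM dies per HBM stack, from the cross-section

    @property
    def gb_per_stack(self) -> float:
        # Assumption: capacity is evenly distributed across active stacks.
        return self.total_gb / self.active_stacks

    @property
    def total_dram_dies(self) -> int:
        return self.active_stacks * self.dies_per_stack

    @property
    def gb_per_die(self) -> float:
        # Assumption: every DRAM die in a stack contributes equally.
        return self.gb_per_stack / self.dies_per_stack


a100 = HbmConfig("NVIDIA A100", total_gb=80, active_stacks=5, dies_per_stack=8)
mi210 = HbmConfig("AMD Instinct MI 210", total_gb=64, active_stacks=4, dies_per_stack=8)

for cfg in (a100, mi210):
    print(
        f"{cfg.name}: {cfg.gb_per_stack:.0f} GB per stack, "
        f"{cfg.total_dram_dies} DRAM dies, {cfg.gb_per_die:.0f} GB per die"
    )
```

Under that even-split assumption, both devices come out at the same capacity per stack and per die, so the difference in total capacity would come from the number of stacks rather than from denser stacks.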

Due to the use of a large interposer chip and a greater number of DRAM dies, the total embedded chip area in the A100 is almost 1.5 times greater than that of the MI 210. Because the A100 nevertheless has a smaller package surface area than the MI 210, its optimized footprint is confirmed. On the cost side, the additional manufacturing steps required to integrate TSVs, together with a bigger Si die that is subject to low yields, result in an interposer Si die costing 90% more than a single Si bridge die.

As the need for more memory storage capacity grows, the interposer solution may reach a dead end, and NVIDIA will surely have to change its approach.

Advanced packaging for consumer devices

As with HPC systems, packaging for high-end smartphones seems similar from one manufacturer to another at first glance. Whatever the device, PoP (Package-on-Package) technology is used to integrate the DRAM stack on top of the SoC (System-on-Chip) die package. With a more detailed analysis, Yole SystemPlus has identified four types of SoC die assembly – InFO (Integrated Fan-Out), Standard PoP, FO PLP (Fan-Out Panel Level Packaging), and MCeP (Molded Core Embedded Package) – used by key players Samsung, TSMC, Shinko and other OSATs. The interconnection technique is the main factor that differentiates the four packaging processes. The FO PLP integrates a thick interconnect layer based on a PCB frame. For the interconnect between the top and bottom substrates, the standard PoP uses tin-based solder balls and the MCeP uses copper core solder balls. InFO packaging features TIVs (Through InFO Vias) for the interconnect between the DRAM package ball and the RDL (Redistribution Layer). Interconnect height is also a key parameter: a taller interconnect allows integration of a thicker SoC die, which is expected to offer better mechanical reliability. On this point, since the total package height is almost the same from one package to another, the FO PLP, designed by Samsung for the Google Tensor G2, leads the race, followed by the InFO by TSMC for the Apple A16.

Since saving space is less critical than in smartphones, the tablet, laptop, and desktop chips analyzed show a structure similar to that of HPC systems, with DRAM packages placed laterally beside the SoC die. A comparative analysis of the M1 Pro and M2 Pro revealed that Apple uses a Flip Chip BGA (Ball Grid Array) package to assemble the SoCs in both systems. The Apple M2 Pro integrates two more DRAM stacks of reduced size, which is expected to increase the flexibility of the chip configuration.

About the authors

Ying-Wu Liu is a Technology & Cost Analyst at Yole SystemPlus, part of Yole Group. Ying-Wu’s core expertise is Integrated Circuit technologies. With solid expertise in the physical and electronic analysis of devices and experience in wafer manufacturing and technical support with international clients, Ying’s mission is to develop reverse engineering & costing reports. She works closely with different laboratories to set up significant physical & chemical analyses of innovative IC chips. Based on the results, Ying identifies and analyzes the overall manufacturing process and all technical choices made by the IC makers to understand the structure of the device and point out the link between cost and technology. Prior to Yole SystemPlus, Ying worked as a Technical Support Manager at KEOLABS, where she had the opportunity to develop her ability to cooperate with clients from different cultures. Ying holds a master’s in Theoretical Physics from the National Tsing Hua University (Taiwan) and a master’s in Integration, Security, and Trust in Embedded systems from the Grenoble INP, ESISAR (France).

Belinda Dube serves as a Technology & Cost Analyst at Yole SystemPlus, part of Yole Group. Belinda’s core expertise is memory technology, especially DRAM and 3D NAND flash memory. At the same time, she also investigates IC technologies as well as advanced packaging. Belinda’s mission is to develop reverse engineering & costing reports. She also contributes to custom projects, working closely with the laboratory team to set up significant physical & chemical analyses of innovative memory chips. Belinda holds a master’s degree in Instrumentation & Nanotechnology Engineering from INSA (France).