Technology, Process and Cost

NVIDIA Tesla P100 Graphics Processing Unit (GPU) with HBM2

By Yole SystemPlus — Oct 2017

TSMC CoWoS – Samsung HBM2 – 2.5D and 3D Packaging

Aimed at high performance computing (HPC) and deep learning, the NVIDIA Tesla P100 is billed as the world’s first artificial intelligence supercomputing data center GPU.

It uses several leading-edge technologies, including 3D stacked memory and 2.5D integration on a silicon interposer using a Chip-on-Wafer-on-Substrate (CoWoS) process.

Tesla P100 accelerators are equipped with 12GB or 16GB of second-generation high bandwidth memory (HBM2), which improves memory performance threefold over the NVIDIA Maxwell architecture.

HBM2 greatly increases memory capacity and bandwidth over first-generation HBM (HBM1). HBM1 was limited to 1GB of memory per stack, built from four dynamic random access memory (DRAM) dies of up to 256MB each, with 125GB/sec of bandwidth per stack.

HBM2 supports up to 8GB of memory per stack, built from eight DRAM dies of up to 1GB each, with 180GB/sec of bandwidth per stack.
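The per-stack comparison can be checked with a short calculation. This is a minimal sketch that uses only the figures quoted above; the class and field names are illustrative assumptions, not part of the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HbmStack:
    """Per-stack figures as quoted in the text (not taken from a datasheet)."""
    name: str
    dies_per_stack: int
    die_capacity_mb: int
    bandwidth_gb_s: int

    @property
    def capacity_gb(self) -> float:
        # Stack capacity is the number of DRAM dies times the capacity of each die.
        return self.dies_per_stack * self.die_capacity_mb / 1024


hbm1 = HbmStack("HBM1", dies_per_stack=4, die_capacity_mb=256, bandwidth_gb_s=125)
hbm2 = HbmStack("HBM2", dies_per_stack=8, die_capacity_mb=1024, bandwidth_gb_s=180)

capacity_ratio = hbm2.capacity_gb / hbm1.capacity_gb
bandwidth_ratio = hbm2.bandwidth_gb_s / hbm1.bandwidth_gb_s

for stack in (hbm1, hbm2):
    print(f"{stack.name}: {stack.capacity_gb:g} GB per stack, {stack.bandwidth_gb_s} GB/sec")
print(f"Capacity ratio: {capacity_ratio:g}x, bandwidth ratio: {bandwidth_ratio:.2f}x")
```

On these figures, the maximum capacity per stack grows eightfold while the bandwidth per stack grows by a factor of 1.44.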

The single 55mm x 55mm 12-layer ball grid array (BGA) package of the NVIDIA Tesla P100 includes more than 3,500 mm² of silicon area. Two industry leaders, TSMC and Samsung, had to come together to deliver this much silicon area in a package.
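A quick calculation shows why this much silicon can only fit through stacking. The sketch below compares the package footprint with the quoted silicon area; treating 3,500 mm² as a lower bound follows from the "more than" wording, and the variable names are illustrative.

```python
# Package dimensions and silicon area as quoted in the text.
package_side_mm = 55
silicon_area_floor_mm2 = 3500  # "more than 3,500 mm²", so this is a lower bound

package_footprint_mm2 = package_side_mm ** 2
area_ratio = silicon_area_floor_mm2 / package_footprint_mm2

print(f"Package footprint: {package_footprint_mm2} mm²")
print(f"Silicon area is at least {area_ratio:.2f}x the package footprint")
```

Since the silicon area exceeds the 3,025 mm² footprint of the package itself, the dies cannot all sit side by side, which is consistent with the stacked construction described here.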

TSMC is the main supplier for the Tesla P100. It manufactures the GP100 GPU die, which packs 15.3 billion transistors, on its 16nm FinFET process and integrates it using its 2.5D CoWoS platform.

It also produces the large silicon interposer on which the GPU and its four HBM2 stacks are assembled at wafer level.

Samsung provides the HBM2 stacks. For the Tesla P100, a 3D assembly process yields HBM2 stacks each composed of four 1GB DRAM dies and one buffer die, connected with via-middle through-silicon vias (TSVs) and micro-bumps.
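The stack composition ties back to the memory sizes quoted earlier. The following sketch is an estimate under two assumptions that the text does not state: that aggregate bandwidth is simply the 180GB/sec per-stack figure summed over the stacks, and that the 12GB variant corresponds to three active stacks. The function name and parameters are hypothetical.

```python
def memory_config(stacks, dies_per_stack=4, die_capacity_gb=1, stack_bandwidth_gb_s=180):
    """Return (total capacity in GB, estimated aggregate bandwidth in GB/sec).

    Defaults follow the figures in the text: four 1GB DRAM dies per stack
    and 180GB/sec per HBM2 stack. Summing bandwidth across stacks is an
    assumption made for illustration.
    """
    capacity_gb = stacks * dies_per_stack * die_capacity_gb
    bandwidth_gb_s = stacks * stack_bandwidth_gb_s
    return capacity_gb, bandwidth_gb_s


# Four stacks, as assembled on the interposer.
capacity_16, bandwidth_16 = memory_config(stacks=4)
print(f"4 stacks: {capacity_16} GB, about {bandwidth_16} GB/sec")

# Assumption: the 12GB variant uses three active stacks.
capacity_12, bandwidth_12 = memory_config(stacks=3)
print(f"3 stacks: {capacity_12} GB, about {bandwidth_12} GB/sec")
```

Four stacks of four 1GB dies give the 16GB configuration; the bandwidth values are rough estimates rather than measured or specified figures.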

The report includes a complete physical analysis of the packaging process, with details on all technical choices regarding process, equipment and materials.

It also describes the complete manufacturing supply chain and calculates the manufacturing costs.

In addition, the report compares the Tesla P100 with AMD’s Fury X, which uses HBM1 and 2.5D assembly on a silicon interposer, to show what the move to HBM2 and the CoWoS 2.5D platform brings.

Finally, it describes NVIDIA’s key module design and related process choices.

REVERSE COSTING WITH

Detailed photos and cross-sections
Precise measurements
Material analysis
Manufacturing process flow
Supply chain evaluation
Manufacturing cost analysis
Estimated sales price

Overview / Introduction

Manufacturing Process Flow

Cost Analysis

Estimated Price Analysis