Wednesday 10 May 2017

Nvidia Details Volta GV100 GPU, Tesla V100 Accelerator

Nvidia pulled part of the curtain off its long-anticipated Volta GPU architecture, revealing the GV100 GPU and its first derivative product, the Tesla V100, here at GTC in San Jose today. Nvidia first dropped the Volta name at GTC in 2013, and it has taken the company four years to reveal the juicy details. If you’re a gamer, don’t get too excited yet; Nvidia is still pitching Pascal-derived products (only a year old, or less). If you work in the AI and high performance computing (HPC) markets, however, this first wave of Volta is coming your way.

The Volta GV100 GPU Architecture

The Volta GV100 GPU is built on TSMC’s 12nm FFN process, packs over 21 billion transistors, and is designed for deep learning applications. We’re talking about an 815mm² die here, which pushes the limits of TSMC’s current capabilities; Nvidia said it’s not possible to build a larger GPU on the current process technology. Before GV100, the GP100 was the largest GPU Nvidia had ever produced, at 610mm² and 15.3 billion transistors. The GV100 is more than 30% larger.
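As a quick back-of-the-envelope check on that size comparison, here is a sketch using only the die areas quoted above (plain C++, nothing GPU-specific):

// Die-area comparison, GV100 vs. GP100, using the figures quoted above.
#include <cstdio>

int main() {
    const double gv100_mm2 = 815.0;  // GV100 die area
    const double gp100_mm2 = 610.0;  // GP100 die area
    double growth = (gv100_mm2 / gp100_mm2 - 1.0) * 100.0;
    printf("GV100 is %.1f%% larger than GP100\n", growth);  // ~33.6%
    return 0;
}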

Volta’s full GV100 GPU sports 84 SMs (each SM features four texture units, 64 FP32 cores, 64 INT32 cores, and 32 FP64 cores) fed by 128KB of combined L1 cache/shared memory per SM, which can be configured in varying cache-to-shared-memory ratios. The GP100, by comparison, featured 60 SMs and a total of 3,840 CUDA cores. The Volta SMs also introduce a new type of core, the Tensor core, which specializes in the 4×4 matrix operations at the heart of deep learning workloads. The GV100 contains eight Tensor cores per SM, and in the Tesla V100 they collectively deliver up to 120 TFLOPS for training and inference operations. To save you some math, this brings the full GV100 GPU to an impressive 5,376 FP32 cores, 5,376 INT32 cores, 2,688 FP64 cores, and 336 texture units.
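To make the Tensor core operation concrete, here is a minimal scalar sketch of the 4×4 multiply-accumulate each core performs, D = A × B + C. This illustrates the math only, not Nvidia’s programming interface, and it glosses over the mixed-precision detail (the hardware takes FP16 inputs and accumulates at higher precision):

// Sketch of the 4x4 matrix multiply-accumulate a Tensor core performs
// each clock: D = A * B + C. Plain scalar code, for illustration only.
#include <cstdio>

const int N = 4;

void tensor_op(const float A[N][N], const float B[N][N],
               const float C[N][N], float D[N][N]) {
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            float acc = C[i][j];              // accumulator input
            for (int k = 0; k < N; ++k)
                acc += A[i][k] * B[k][j];     // 4 multiply-adds per output element
            D[i][j] = acc;                    // 64 FMAs per 4x4 tile in total
        }
}

int main() {
    float A[N][N], B[N][N], C[N][N], D[N][N];
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            A[i][j] = 1.0f; B[i][j] = 2.0f; C[i][j] = 0.5f;
        }
    tensor_op(A, B, C, D);
    printf("D[0][0] = %.1f\n", D[0][0]);  // 4 * (1 * 2) + 0.5 = 8.5
    return 0;
}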

As with GP100, each TPC contains two SMs, which gives GV100 42 TPCs in total, grouped into six GPCs (seven TPCs per GPC).

GV100 also features four HBM2 memory stacks, like GP100, with each stack controlled by a pair of memory controllers. There are eight 512-bit memory controllers in total, giving the GPU a 4,096-bit memory bus. Each memory controller is attached to 768KB of L2 cache, for a total of 6MB of L2 cache (versus 4MB for Pascal).
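Rolling the per-SM and per-controller figures above into full-chip totals gives the following quick arithmetic sketch, based solely on the numbers Nvidia disclosed:

// Full GV100 resource roll-up from the per-unit figures above.
#include <cstdio>

int main() {
    const int sms = 84;
    printf("FP32 cores:    %d\n", sms * 64);             // 5,376
    printf("INT32 cores:   %d\n", sms * 64);             // 5,376
    printf("FP64 cores:    %d\n", sms * 32);             // 2,688
    printf("Tensor cores:  %d\n", sms * 8);              // 672 in the full GV100
    printf("Texture units: %d\n", sms * 4);              // 336
    printf("TPCs: %d, GPCs: %d\n", sms / 2, sms / 2 / 7);  // 42 TPCs, 6 GPCs
    printf("Memory bus:    %d-bit\n", 8 * 512);          // 4,096-bit
    printf("L2 cache:      %d KB (%.0f MB)\n", 8 * 768, 8 * 768 / 1024.0);  // 6 MB
    return 0;
}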


Tesla V100

The new Nvidia Tesla V100 ships with 80 of the GV100’s 84 SMs enabled, for a total of 5,120 CUDA cores. Even so, it can reach 7.5, 15, and 120 TFLOPS in FP64, FP32, and Tensor computations, respectively.
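Those figures follow from the core counts and the 1,455MHz boost clock listed in the spec table below, counting each fused multiply-add as two operations and assuming each Tensor core completes one 4×4×4 matrix multiply-accumulate (64 FMAs) per clock. A rough sketch of the arithmetic:

// Peak-throughput sketch for the Tesla V100 at its 1,455MHz boost clock.
#include <cstdio>

int main() {
    const double clock_hz  = 1.455e9;   // boost clock from the spec table
    const int fp32_cores   = 5120;
    const int fp64_cores   = 2560;
    const int tensor_cores = 80 * 8;    // 8 Tensor cores per SM x 80 SMs = 640

    // Each fused multiply-add counts as two floating-point operations.
    double fp32 = 2.0 * fp32_cores * clock_hz / 1e12;          // ~14.9 (quoted as 15)
    double fp64 = 2.0 * fp64_cores * clock_hz / 1e12;          // ~7.45 (quoted as 7.5)
    // Each Tensor core: one 4x4x4 matrix multiply-accumulate (64 FMAs) per clock.
    double tensor = 2.0 * 64 * tensor_cores * clock_hz / 1e12; // ~119 (quoted as 120)

    printf("FP32 %.1f TFLOPS, FP64 %.1f TFLOPS, Tensor %.0f TFLOPS\n",
           fp32, fp64, tensor);
    return 0;
}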

The Tesla V100 sports 16GB of HBM2 memory delivering up to 900 GB/s of bandwidth. The Samsung memory Nvidia installed on the Tesla V100 offers 180 GB/s more bandwidth than the memory found on Tesla P100 cards. Nvidia said it used the fastest memory available on the market.
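That bandwidth figure lines up with the 4,096-bit bus; the per-pin data rate computed below is an inference from the quoted numbers, not an official specification:

// Implied HBM2 per-pin data rate for the Tesla V100's 900 GB/s over a 4,096-bit bus.
#include <cstdio>

int main() {
    const double bandwidth_gbytes = 900.0;  // GB/s, quoted peak
    const int bus_bits = 4096;
    double per_pin_gbit = bandwidth_gbytes * 8.0 / bus_bits;  // ~1.76 Gb/s per pin
    printf("Implied per-pin rate: %.2f Gb/s\n", per_pin_gbit);
    printf("P100 -> V100 gain: %.0f GB/s\n", 900.0 - 720.0);  // 180 GB/s
    return 0;
}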

The Tesla V100 also introduces the second generation of NVLink: six links per GPU, each delivering 25 GB/s in each direction, for a total of 300 GB/s of bidirectional bandwidth.

To put those numbers into perspective, Nvidia’s Pascal-derived Tesla P100 sports 56 SMs and 3,584 CUDA cores, which produce up to 5.3 TFLOPS in FP64 computations and 10.6 TFLOPS in FP32 computations. The V100 therefore offers roughly 40% more FP32 and FP64 computational capability than the P100. Nvidia also nearly doubled aggregate NVLink bandwidth by fitting the Tesla V100 with two more links than the Tesla P100 (six versus four) and raising each link’s per-direction bandwidth by 5 GB/s (25 GB/s versus 20 GB/s).
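For reference, the aggregate NVLink arithmetic, assuming the per-direction link rates above (25 GB/s for V100’s second-generation links, 20 GB/s for P100’s first-generation links):

// Aggregate bidirectional NVLink bandwidth, Tesla V100 vs. Tesla P100.
#include <cstdio>

int main() {
    // Per-direction bandwidth per link, times two directions, times link count.
    double v100 = 6 * 25.0 * 2;  // six NVLink 2.0 links -> 300 GB/s
    double p100 = 4 * 20.0 * 2;  // four first-generation links -> 160 GB/s
    printf("V100: %.0f GB/s, P100: %.0f GB/s (+%.0f%%)\n",
           v100, p100, (v100 / p100 - 1) * 100);  // roughly 88% more
    return 0;
}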

Nvidia said the Tesla V100 carries a TDP of 300W, which is the same power requirement as the Tesla P100.

                        Tesla V100                         Tesla P100
SMs                     80                                 56
CUDA Cores              5,120 FP32 / 2,560 FP64            3,584 FP32 / 1,792 FP64
Boost Clock             1,455MHz                           1,480MHz
Peak TFLOPS             7.5 FP64 / 15 FP32 / 120 Tensor    5.3 FP64 / 10.6 FP32
Texture Units           320                                224
Memory                  16GB HBM2 (4,096-bit)              16GB HBM2 (4,096-bit)
Memory Bandwidth        900 GB/s                           720 GB/s
Transistors             21.1 billion                       15.3 billion
Manufacturing Process   12nm FFN                           16nm FinFET+





source https://news.gigarefurb.co.uk/nvidia-details-volta-gv100-gpu-tesla-v100-accelerator/
