NVIDIA Announces Tesla V100 with Volta GPU at GTC 2017

Subject: Graphics Cards | May 10, 2017 - 01:32 PM
Tagged: v100, tesla, nvidia, gv100, gtc 2017

During the opening keynote of NVIDIA’s GPU Technology Conference, CEO Jen-Hsun Huang formally unveiled the company's latest GPU architecture and the first product based on it. The Tesla V100 accelerator is based on the Volta GPU architecture and features some impressive specifications. Let’s take a look.

| | Tesla V100 | GTX 1080 Ti | Titan X (Pascal) | GTX 1080 | GTX 980 Ti | TITAN X | GTX 980 | R9 Fury X | R9 Fury |
| GPU | GV100 | GP102 | GP102 | GP104 | GM200 | GM200 | GM204 | Fiji XT | Fiji Pro |
| GPU Cores | 5120 | 3584 | 3584 | 2560 | 2816 | 3072 | 2048 | 4096 | 3584 |
| Base Clock | - | 1480 MHz | 1417 MHz | 1607 MHz | 1000 MHz | 1000 MHz | 1126 MHz | 1050 MHz | 1000 MHz |
| Boost Clock | 1455 MHz | 1582 MHz | 1480 MHz | 1733 MHz | 1076 MHz | 1089 MHz | 1216 MHz | - | - |
| Texture Units | 320 | 224 | 224 | 160 | 176 | 192 | 128 | 256 | 224 |
| ROP Units | 128 (?) | 88 | 96 | 64 | 96 | 96 | 64 | 64 | 64 |
| Memory | 16GB | 11GB | 12GB | 8GB | 6GB | 12GB | 4GB | 4GB | 4GB |
| Memory Clock | 878 MHz (?) | 11000 MHz | 10000 MHz | 10000 MHz | 7000 MHz | 7000 MHz | 7000 MHz | 500 MHz | 500 MHz |
| Memory Interface | 4096-bit (HBM2) | 352-bit | 384-bit G5X | 256-bit G5X | 384-bit | 384-bit | 256-bit | 4096-bit (HBM) | 4096-bit (HBM) |
| Memory Bandwidth | 900 GB/s | 484 GB/s | 480 GB/s | 320 GB/s | 336 GB/s | 336 GB/s | 224 GB/s | 512 GB/s | 512 GB/s |
| TDP | 300 watts | 250 watts | 250 watts | 180 watts | 250 watts | 250 watts | 165 watts | 275 watts | 275 watts |
| Peak Compute | 15 TFLOPS | 10.6 TFLOPS | 10.1 TFLOPS | 8.2 TFLOPS | 5.63 TFLOPS | 6.14 TFLOPS | 4.61 TFLOPS | 8.60 TFLOPS | 7.20 TFLOPS |
| Transistor Count | 21.1B | 12.0B | 12.0B | 7.2B | 8.0B | 8.0B | 5.2B | 8.9B | 8.9B |
| Process Tech | 12nm | 16nm | 16nm | 16nm | 28nm | 28nm | 28nm | 28nm | 28nm |
| MSRP (current) | lol | $699 | $1,200 | $599 | $649 | $999 | $499 | $649 | $549 |

While we are low on details today, it appears that the fundamental compute units of Volta are similar to those of Pascal. The GV100 has 80 SMs with 40 TPCs and 5120 total CUDA cores, roughly 43% more than both the GP100 GPU used on the Tesla P100 and the GP102 GPU used on the GeForce GTX 1080 Ti (3584 cores each). The structure of the GPU remains the same as GP100, with the CUDA cores organized as 64 single precision (FP32) and 32 double precision (FP64) per SM.
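The core counts in that paragraph follow from simple multiplication. Here is a minimal Python sketch that uses only figures from the text and the table; the variable and function names are illustrative, not anything NVIDIA publishes.

```python
# Figures taken from the article; names are illustrative only.
SMS = 80             # streaming multiprocessors enabled on Tesla V100
FP32_PER_SM = 64     # single-precision CUDA cores per SM
FP64_PER_SM = 32     # double-precision units per SM
PASCAL_CORES = 3584  # GTX 1080 Ti core count from the table


def percent_increase(new: int, old: int) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


fp32_cores = SMS * FP32_PER_SM  # total FP32 CUDA cores
fp64_units = SMS * FP64_PER_SM  # total FP64 units

print(f"FP32 cores: {fp32_cores}")
print(f"FP64 units: {fp64_units}")
print(f"Increase over {PASCAL_CORES} cores: "
      f"{percent_increase(fp32_cores, PASCAL_CORES):.1f}%")
```

With these inputs the increase comes out just under 43%.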

Interestingly, NVIDIA has already told us the clock speed of this new product as well, coming in at 1455 MHz Boost, more than 100 MHz lower than the GeForce GTX 1080 Ti and 25 MHz lower than the Tesla P100.

Volta also adds support for a brand-new compute unit, known as Tensor Cores. With 640 of these on the GPU die, NVIDIA directly targets the neural network and deep learning fields. If this is your first time hearing about Tensor, you should read up on TensorFlow, the open-source software library for machine learning, and its influence on the hardware market. Google has already invested in a Tensor-specific processor, and now NVIDIA throws its hat in the ring.

Adding Tensor Cores to Volta allows the GPU to do mass processing for deep learning, on the order of a 12x improvement over Pascal’s capabilities using CUDA cores only.
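To put that figure in context, here is a rough Python estimate of peak Tensor Core throughput. It assumes each Tensor Core performs one 4x4x4 matrix fused multiply-add (64 FMAs) per clock, which comes from NVIDIA's public description of Volta rather than from this article, and it reuses the 1455 MHz boost clock and the 10.6 TFLOPS Pascal figure quoted here. Treat it as a sketch under those assumptions, not a measured result.

```python
TENSOR_CORES = 640             # from the article
BOOST_CLOCK_HZ = 1455e6        # 1455 MHz boost clock from the table
FMAS_PER_CORE_PER_CLOCK = 64   # ASSUMPTION: one 4x4x4 matrix FMA per clock
FLOPS_PER_FMA = 2              # one multiply plus one add
PASCAL_FP32_TFLOPS = 10.6      # Pascal FP32 figure quoted in the article


def tensor_tflops(cores: int, clock_hz: float,
                  fmas_per_clock: int = FMAS_PER_CORE_PER_CLOCK) -> float:
    """Theoretical peak Tensor Core throughput in TFLOPS."""
    return cores * fmas_per_clock * FLOPS_PER_FMA * clock_hz / 1e12


v100_tensor = tensor_tflops(TENSOR_CORES, BOOST_CLOCK_HZ)
ratio = v100_tensor / PASCAL_FP32_TFLOPS

print(f"Peak Tensor Core throughput: {v100_tensor:.1f} TFLOPS")
print(f"Ratio to Pascal FP32: {ratio:.1f}x")
```

Under these assumptions the estimate lands near 119 TFLOPS, a bit over 11 times the Pascal FP32 figure, which is in the same ballpark as the 12x claim.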

For users interested in standard usage models, including gaming, the GV100 GPU offers a 1.5x improvement in FP32 computing, up to 15 TFLOPS of theoretical performance, and 7.5 TFLOPS of FP64. Other relevant specifications include 320 texture units, a 4096-bit HBM2 memory interface and 16GB of on-module memory. NVIDIA claims a memory bandwidth of 900 GB/s, which works out to a memory clock of roughly 878 MHz.
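These headline numbers can be reproduced from the table. The Python sketch below assumes the usual conventions for theoretical peaks: each core retires one fused multiply-add (2 FLOPs) per clock, and HBM2 transfers data twice per memory clock. The function names are illustrative.

```python
def peak_tflops(cores: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak, assuming one FMA (2 FLOPs) per core per clock."""
    return cores * flops_per_clock * clock_mhz * 1e6 / 1e12


def bandwidth_gbs(bus_bits: int, clock_mhz: float,
                  transfers_per_clock: int = 2) -> float:
    """Bus width in bytes times transfers per second (double data rate assumed)."""
    return bus_bits / 8 * transfers_per_clock * clock_mhz * 1e6 / 1e9


fp32 = peak_tflops(5120, 1455)  # 80 SMs x 64 FP32 cores at boost clock
fp64 = peak_tflops(2560, 1455)  # 80 SMs x 32 FP64 units at boost clock
bw = bandwidth_gbs(4096, 878)   # 4096-bit HBM2 at 878 MHz

print(f"FP32: {fp32:.2f} TFLOPS")
print(f"FP64: {fp64:.2f} TFLOPS")
print(f"Memory bandwidth: {bw:.0f} GB/s")
```

The results round to the 15 TFLOPS, 7.5 TFLOPS and 900 GB/s figures quoted above.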

Maybe more impressive is the transistor count: 21.1 BILLION! NVIDIA claims that this is the largest chip you can physically make with today’s technology. Considering it is being built on TSMC's 12nm FinFET technology and has an 815 mm² die size, I see no reason to doubt them.

Shipping for Tesla V100 is scheduled for Q3 – at least, that is when NVIDIA is promising the DGX-1 system using the chip to developers.

I know many of you are interested in the gaming implications and timelines – sorry, I don’t have an answer for you yet. I will say that the bump from 10.6 TFLOPS to 15 TFLOPS is an impressive boost! But if the server variant of Volta isn’t due until Q3 of this year, I find it hard to think NVIDIA would bring the consumer version out faster than that. And whether or not NVIDIA offers gamers the chip with non-HBM2 memory is still a question mark for me and could directly impact performance and timing.

More soon!!

Source: NVIDIA

May 10, 2017 | 02:03 PM - Posted by StephanS

I would expect this to be priced at ~$3000+, and it won't enter the consumer market in a reduced form for at least 6+ months after availability.
NVIDIA only has itself to compete with at those levels, so there's no point in undercutting the GTX 1080 Ti.

Volta only needs to come to consumer products when FP16 is fully leveraged. Might be over a year?

May 10, 2017 | 11:05 PM - Posted by bria5544

This card isn't going to be anywhere near $3,000. Probably close to $8,000 MSRP. The current P100 is $7,000 MSRP.

May 10, 2017 | 02:07 PM - Posted by boidsonly

JHC, nVidia are just smacking AMD around like a...

May 10, 2017 | 03:08 PM - Posted by analogue

So now we know that AMD Vega will sit between Pascal and Volta, with 12.5 TFLOPS FP32, but let's see about the rapid packed math... whether it's as efficient as Volta's, so that it can reach close to 100 TFLOPS.
However, I do believe that price-wise Vega will be much cheaper! A die size of 815mm^2 is just insane and massively expensive...
Assuming that the GTX 2080 is going to be a cut-down version of this beast, it will probably come out around 12.5-13.0 TFLOPS, just to hedge against AMD's Vega!
So Vega doesn't look that bad if the pricing is correct.

May 10, 2017 | 03:59 PM - Posted by renz

Titan Xp is already rated at 12 TFLOPS in its stock configuration. If you can push the clock to 2 GHz (which pretty much every Pascal chip is capable of), the performance will peak at 15 TFLOPS! This new chip's main focus is FP64 and deep learning stuff, just like GP100. They are not gaming chips like GP102/104/106/107/108 are.

May 10, 2017 | 04:36 PM - Posted by analogue

As I said, the 2080 will be a cut-down version of GV100, and if NVIDIA delivers the expected bang, I mean a (20)80 model with the performance of the previous gen Xp or Ti version, then we are looking at 12.5 TFLOPS of raw performance for the 2080.
I also doubt that the Volta architecture will clock as high as Pascal. This is a wider GPU, so it will be difficult to clock as high; it's 3840 CUDA cores vs 5120. A cut-down version of GV100 will probably come with higher clocks but also fewer SMs.
20% fewer SMs than the GV100 gives something like 64 SMs times 64 CUDA cores = 4096. Isn't this the same as Vega 10?
Die size something around 600mm^2?? Still massive!
The GPU world is on fire!!! :)

May 10, 2017 | 05:22 PM - Posted by psuedonymous

"So now we know that AMD Vega will sit between Pascal and Volta"

Based on information that AMD has actually released (rather than Wild Speculation), we know that Vega will be called Vega.

May 10, 2017 | 03:57 PM - Posted by mLocke

on track for Summit!


May 10, 2017 | 05:25 PM - Posted by CB

2018 for GPU's....

Color me disappointed.

Nvidia waiting for AMD to make them produce just like Intel.

May 10, 2017 | 06:49 PM - Posted by JohnGR

That's a big chip. 3 billion in development. So Vega will be 30% faster than GV100? Just kidding.

People should look at that chip, that 3 billion dollar number, consider the fact that TSMC even created a variant of 12nm FinFET just for Nvidia, and then realize that demanding that AMD offer something much faster at a much lower price is just stupid.

Anyway, Nvidia is going full steam ahead for the AI and deep learning market. If they succeed there, they will have enough money to challenge Intel in the future. They don't worry about the GPU market of course. AMD had started creating Vega with GTX 1080 in mind, not 1080 Ti or 2080 in a few months from now.

PS The FULL GV100 comes with 5376 CUDA cores. V100 uses a cut down version.

May 11, 2017 | 08:33 AM - Posted by willmore


Best table filler value ever.

May 11, 2017 | 11:13 AM - Posted by razor512

Hopefully they will figure out how to stop price gouging, and sell it for $300.

May 13, 2017 | 11:10 AM - Posted by xendrome

You literally have no idea what this type of hardware is used for, do you?
