NVIDIA Launches Titan V, the World's First Consumer Volta GPU with HBM2

Subject: Graphics Cards | December 7, 2017 - 11:44 PM |
Tagged: Volta, titan, nvidia, graphics card, gpu

NVIDIA made a surprising move late Thursday with the simultaneous announcement and launch of the Titan V, the first consumer/prosumer graphics card based on the Volta architecture.

Like recent flagship Titan-branded cards, the Titan V will be available exclusively from NVIDIA for $2,999. Labeled "the most powerful graphics card ever created for the PC," Titan V sports 12GB of HBM2 memory, 5120 CUDA cores, and a 1455MHz boost clock, giving the card 110 teraflops of maximum compute performance. Check out the full specs below:

6 Graphics Processing Clusters
80 Streaming Multiprocessors
5120 CUDA Cores (single precision)
320 Texture Units
640 Tensor Cores
1200 MHz Base Clock
1455 MHz Boost Clock
850 MHz Memory Clock
1.7 Gbps Memory Data Rate
4608 KB L2 Cache Size
12288 MB HBM2 Total Video Memory
3072-bit Memory Interface
652.8 GB/s Total Memory Bandwidth
384 GigaTexels/sec Texture Rate (Bilinear)
12 nm Fabrication Process (TSMC 12nm FFN High Performance)
21.1 Billion Transistor Count
3 x DisplayPort, 1 x HDMI Connectors
Dual Slot Form Factor
One 6-pin, One 8-pin Power Connectors
600 Watts Recommended Power Supply
250 Watts Thermal Design Power (TDP)
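
Two of the figures in the list can be reproduced from the others, which is a quick sanity check on the spec sheet. The sketch below assumes the usual rules of thumb: bandwidth is bus width times per-pin data rate divided by 8, and texture rate is one bilinear texel per texture unit per clock. The function names are made up for illustration. Under those assumptions the listed 384 GigaTexels/sec matches the base clock rather than the boost clock.

```python
# Spec values copied from the list above.
MEMORY_INTERFACE_BITS = 3072
MEMORY_DATA_RATE_GBPS = 1.7   # per pin
TEXTURE_UNITS = 320
BASE_CLOCK_MHZ = 1200


def memory_bandwidth_gbs(bus_bits, gbps_per_pin):
    """Total bandwidth in GB/s: bits per second across the bus, divided by 8."""
    return bus_bits * gbps_per_pin / 8


def texture_rate_gtexels(texture_units, clock_mhz):
    """Texture rate in GTexel/s, assuming one texel per unit per clock."""
    return texture_units * clock_mhz / 1000


bandwidth = memory_bandwidth_gbs(MEMORY_INTERFACE_BITS, MEMORY_DATA_RATE_GBPS)
texture_rate = texture_rate_gtexels(TEXTURE_UNITS, BASE_CLOCK_MHZ)

print(f"Memory bandwidth: {bandwidth:.1f} GB/s")       # listed as 652.8 GB/s
print(f"Texture rate:     {texture_rate:.0f} GTexel/s")  # listed as 384 GTexel/s
```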

The NVIDIA Titan V's 110 teraflops of compute performance compares to a maximum of about 12 teraflops on the Titan Xp, a greater than 9X increase in a single generation. Note that this is a very specific claim, though: it references the AI compute capability of the Tensor cores rather than what we traditionally measure for GPUs (single precision FLOPS). By that metric, the Titan V offers a more modest jump to 14 TFLOPS. The addition of expensive HBM2 memory also adds to the high price compared to its predecessor.
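
For a rough feel of where the single precision number comes from, here is a back-of-the-envelope sketch. It assumes the common convention that each CUDA core retires one fused multiply-add per clock, counted as two floating point operations; the function name is hypothetical, and the 110 and 12 teraflop figures are simply the ones quoted above. Under that assumption the 14 TFLOPS figure sits between the results at the listed base and boost clocks.

```python
CUDA_CORES = 5120
BASE_CLOCK_MHZ = 1200
BOOST_CLOCK_MHZ = 1455


def peak_fp32_tflops(cuda_cores, clock_mhz):
    """Theoretical peak, assuming 2 FLOPs (one FMA) per core per clock."""
    return 2 * cuda_cores * clock_mhz / 1e6


at_base = peak_fp32_tflops(CUDA_CORES, BASE_CLOCK_MHZ)
at_boost = peak_fp32_tflops(CUDA_CORES, BOOST_CLOCK_MHZ)

# Ratio behind the "greater than 9X" claim: Tensor figure vs. Titan Xp FP32.
tensor_vs_titan_xp = 110 / 12

print(f"FP32 at base clock:  {at_base:.1f} TFLOPS")
print(f"FP32 at boost clock: {at_boost:.1f} TFLOPS")
print(f"Tensor claim vs. Titan Xp: {tensor_vs_titan_xp:.1f}x")
```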

The Titan V is available now from NVIDIA.com for $2,999, with a limit of 2 per customer. And hey, there's free shipping too.

Source: NVIDIA

December 8, 2017 | 12:12 AM - Posted by arbiter

TPU said it also has "640 Tensor cores (specialized units that accelerate neural-net building/training)" which wasn't listed here.

December 8, 2017 | 12:18 AM - Posted by Jim Tanous

Thanks for catching that. I've added it to the specs list.

December 8, 2017 | 12:28 AM - Posted by TheROPsWhereAreTheROPs (not verified)

384 GigaTexels/sec Texture Rate (Bilinear), and the Vega 64 FE's texture fill rate is 409.6 GTexel/s on 256 TMUs! What's up with Titan V's texture fill rate?

December 8, 2017 | 12:19 AM - Posted by TheROPsWhereAreTheROPs (not verified)

320 Texture Units listed and no ROP figures, so 320 TMUs and no ROPs and 0 FPS without some ROPs!

This SKU can not game without ROPs!

December 8, 2017 | 12:24 AM - Posted by BleedingEdgeYes...

So every AAA game at ultra settings and 4K/240Hz rock solid? I can only imagine the limited quantities for a $3000 consumer-focused GPU, but the fact that they didn't go whole hog with 2x8-pin power and AIO liquid cooling only means even more performance could be had. Quite surprised by the 110 teraflop spec for sure, I'm not sure if the supported APIs will be relevant when we need that kind of power to run games. Also wondering if the HDMI port is 2.1 compliant. Looking forward to the upcoming coverage in the months to come.

December 8, 2017 | 12:29 AM - Posted by TheROPsWhereAreTheROPs (not verified)

Not without any ROPs!

December 9, 2017 | 03:29 AM - Posted by Joakim Löwenadler (not verified)

Read again mate, it's 14 TFLOPS (single precision) compared to 12 for Titan Xp. That's the difference in gaming performance.

December 8, 2017 | 12:30 AM - Posted by Anony mouse (not verified)

What's with the "5120 CUDA Cores (single precision)"?

Is it going to be able to do DP and HP at all or if it does is it going to be emulated?

December 8, 2017 | 12:44 AM - Posted by ButWhereAreTheROPsTheFriggingROPs (not verified)

Single Precision 15 TFLOPS
7.5 TFLOPS? (1/2 rate)

According to Anandtech, with Tensor Performance(Deep Learning) 110 TFLOPS and that may be just 8 bit tensor operations.

But without ROPs no frames can be flung!

December 8, 2017 | 01:13 AM - Posted by Anony mouse (not verified)

They have "?" following those numbers. I'm assuming they are just guesses at this point.

December 8, 2017 | 12:51 AM - Posted by razor512

How well does it overclock, and can it handle Pac-Man at 4K resolution?

December 8, 2017 | 12:55 AM - Posted by NoROPsNoRaster (not verified)

It can't even play Canasta without ROPs to raster!

It's got 0 ROPs!

December 8, 2017 | 01:25 AM - Posted by Chaitanya (not verified)

At this rate we can expect x60 class of GPUs at current 1080 prices. Guess lack of competition and too many suckers in the world.

December 8, 2017 | 02:02 AM - Posted by HEXiT (not verified)

surely they're having a laugh calling this a consumer card because most consumers don't have a spare body part to sell to get 1.

as for gaming its just not needed.
in that i mean, no current game engine will push the card hard enough to warrant the outlay.
and any game that does push this card will not work well on current hardware without being seriously compromised in some way.

i mean who will be willing to pay 150 for a game that runs like garbage on a $350-$1200 gpu just because it has assets that run on a $3k gpu. we are already seeing a price increase on AAA titles that most gamers cant max out, but still have to pay the inflated prices because the game contains assets for 4k gaming even though most gamers are still on 1080 or less.

its a productivity card pretending to be enthusiast grade IMO.

December 8, 2017 | 02:17 AM - Posted by Hakuren

I know that target audience is not gaming per se even if some will buy it anyway (yes, yes before no ROPs start screaming about no ROPs, yes no ROPs I got the message LOL).

I always thought that because of HBM we wouldn't have to deal with stupidly long VGAs any more. Perhaps this is only related to the blower-style cooler, where you need a certain amount of PCB just to mount this monstrosity. Anyway, if there is going to be something with HBM for "normal" users who don't exactly have $3,000 in every pocket, I expect it to be much shorter.

Waiting to see how the situation unfolds. The Titan is always first, and then the wait for 'standard' cards begins.

Of course there is something exciting to shout about too. We have new letter for Titan! I hope nVidia won't do another Titan Vv later. :D Probably went with V because with 3rd gen Titan they risked everybody will call it Titan XXX anyway. Ha, ha oh you sneaky folks. LOL

December 8, 2017 | 11:22 AM - Posted by ItsGotLoadsOfROPsButWithAzzezToKiss (not verified)

Oh it's got ROP it's just the Technology press kissing Nvidia's(any other maker's) A$$!

As Nvidia will be announcing GV102, and GV104/GV106/GV108 and this Titan-V SKU is just the first bin of the GV100 dies that did not make the grade for $10,000+ compute/HPC market SKUs.

GV102 will be used to make the Quadro/Pro Graphics cards at first with the GV104 base die the one to use for the GTX 2080/2070 variants and then the lower cost GV104/GV106 base die designs will round out the low cost market variants. With the GV102 having the most total ROP's available if Nvidia needs to spin up a GTX 2080Ti way later in the game if AMD's GPUs are getting a little too close in that FPS metric that Bubba Gamer lives and dies by.

GV100's got ROPs so the press needs to be asking about ROPs, but the press is between their review samples and a hard review manual place and that translates into shut up and toe the company line.

December 8, 2017 | 11:44 AM - Posted by Anonymously Anonymous (not verified)

per Anandtech's blurb, it mentions ROP's, but doesn't give an exact number:
https://www.anandtech.com/show/12135/nvidia-announces-nvidia-titan-v-vid...

" ... The differences come with the memory and ROPs. In what's clearly a salvage part for NVIDIA, one of the card's 4 memory partitions has been cut, leaving Titan V with 12GB of HBM2 attached via a 3072-bit memory bus. As each memory controller is associated with a ROP partition and 768 KB of L2 cache, this in turn brings L2 down to 4.5 MB, as well as cutting down the ROP count. ... "

maybe someone else has reported it?

December 8, 2017 | 12:32 PM - Posted by FPSmetricsAndTheDragRaceBubbasMoney (not verified)

The technology press should tell Nvidia/others no full specifications no reporting on Volta. Look at Semiaccurate's Qualcomm Snapdragon 845 "Deep Dive" article and it's just a jab at Qualcomm for not providing the proper CPU specifications of their so called Kryo "Custom" ARM Cores.

So the ARM "Custom" Core makers are not so up front compared to the x86 CPU makers with the proper CPU core specifications made public. Look at Apple's A7 Cyclone design Apple reported no proper information on that design as Anand Lal Shimpi had to get out the benchmarking and other software tools and look through the header files and try and get that Apple A7 CPU core specification information to the best of his knowledge! And to this very day no one at Anandtech(No Longer Owned BY Anand) mostly uses the A7/Cyclone specifications that Anand himself sussed out for any of Apple's newer A series CPU variants.

It's ROPs that get the FPS and no FPS can be had without ROPs and Nvidia just keeps many base GP102/other, now GV102/other, base die designs handy with plenty of extra ROPs for FPS wins if needed for any GTX/"TI" variants if its GP104, now GV104, base die variants do not provide any excess ROPs to get any higher FPS metrics. AMD Radeon graphics has plenty of TMU/Texture fill rates even compared to this Volta variant.

I suspect that the ARM "Custom Core" makers are not custom core makers at all and they are mostly semi-custom dependents of Arm Holding's reference designs and unlike Apple, via Apple's Palo Alto Semiconductor acquisition, who got plenty of top notch CPU design engineers including Jim Keller who was Vice President of Engineering at P.A. Semi at the time.

But Nvidia not releasing the ROP counts makes me think that Volta is more about those tensor cores and not much more about being that much different than Pascal as far as gaming workloads go. But Nvidia does not appear to care as much about texture quality more than it cares about ROP numbers and that FPS metric that gets the Bubba Gamers all up and about with the FPS metrics drag race. 12nm is nice but how much nicer than 16nm and maybe all the performance gains can be explained with a process node shrink rather than any new GPU Micro-arch engineering.

December 8, 2017 | 12:36 PM - Posted by FPSmetricsAndTheDragRaceBubbasMoney (not verified)

Edit: no one at Anandtech
To: everyone at Anandtech

December 8, 2017 | 01:46 PM - Posted by Tim Verry

Well if true and the Tesla V100 ROP count we have in that table is accurate I'm guessing Titan V is going to have 96 ROPs. I guess we'll have to wait and see though :/.

December 8, 2017 | 02:44 AM - Posted by Power (not verified)

With all the AI built in it should be able to play the owner out of her/his money even better than EA.

December 8, 2017 | 04:21 AM - Posted by throAU (not verified)

Haha.

Let's quote single precision TFLOP numbers to make them look bigger.

This is an impressive card, sure - but for AI. For gaming, it is pants.

A 1080 Ti will outperform it in gaming, I suspect.

December 8, 2017 | 05:10 AM - Posted by Martin (not verified)

You wanted to say half-precision, right?

15 SP TFLOPS is definitely nothing to sneeze at. 30+% over both TitanX/1080Ti and Vega64 LE.

December 8, 2017 | 11:08 PM - Posted by Anonymouse (not verified)

Vega 64 LC is 13.7 TFLOPS, which makes a $3000 GPU a hair less than 9% faster for 3.75x the price, even against an inflated $800. If you don't need Tensor, you shouldn't buy this card.

But that won't stop people with more money than sense who only want the bragging rights.

December 8, 2017 | 10:46 AM - Posted by ET (not verified)

I would want one of these for the SETI@home alien search!

December 8, 2017 | 11:10 AM - Posted by AnonymousLocust (not verified)

Weird... why is Nvidia trying to sell Quadros in disguise to consumers?

December 8, 2017 | 11:43 AM - Posted by psl2c (not verified)

I can't wait for PCPer to review this.

I wonder if someone will come up with yet another cryptocurrency that, this time, utilizes the tensor processing units...

December 8, 2017 | 01:39 PM - Posted by Tim Verry

You didn't hear about the Tensorcoin ICO? haha jk.

December 8, 2017 | 03:23 PM - Posted by Anonymouse (not verified)

It still can't max PUBG.

December 8, 2017 | 10:45 PM - Posted by Kingkookaluke (not verified)

Please, miners, buy this card and let us get our hands on reasonably priced 1080 Tis!

December 9, 2017 | 10:40 AM - Posted by Rich Fuck (not verified)

It must suck to be poor.

I got 8 in my first shipment and already have customers' money for those; in the next shipment we're getting 24!!

I'll get two of them for project cars and VR porn.

December 9, 2017 | 02:48 PM - Posted by BubbaBragginRightsFerDollars (not verified)

BUBBA GAMER loves them ROPs!

AMD needs ROPs(1) to compete with Nvidia in That FPS contest. If Nvidia has more ROPs then Nvidia can fling more Frames Per Second, no matter the Texture quality or FP processing advantage that AMD may have over even Volta based GPU SKUs. Bubba Gamer can not understand frame/texture quality as at 45-120 FPS Bubba Gamer is not going to notice that as much. AMD has a serious ROP deficiency and even though AMD's texture rate far exceeds Nvidia's on average, that texture fill rate only matters for other non FPS necessary graphics workloads. AMD's GPUs have great compute and texture performance but Bubba Gamer only sees FPS and nothing else!

So to sell to Bubba Gamers AMD needs more ROPs! Bubba Gamer don't care a lick about nothing but FPS and bragging to Vern about them FPS metrics and how many flashing lights his gaming rig has.

Now Bubba Coin Miner, well that's a whole different set of Bubbas that don't care 'bout no ROPs!

"The render output unit, often abbreviated as "ROP", and sometimes called (perhaps more properly) raster operations pipeline, is a hardware component in modern graphics processing units (GPUs) and one of the final steps in the rendering process of modern graphics cards. The pixel pipelines take pixel (each pixel is a dimensionless point), and texel information and process it, via specific matrix and vector operations, into a final pixel or depth value. This process is called rasterization. So ROPs control antialiasing, when more than one sample is merged into one pixel. The ROPs perform the transactions between the relevant buffers in the local memory – this includes writing or reading values, as well as blending them together. Dedicated antialiasing hardware used to perform hardware-based antialiasing methods like MSAA is contained in ROPs.

All data rendered has to travel through the ROP in order to be written to the framebuffer, from there it can be transmitted to the display." (1)

(1)

"Render output unit"

https://en.wikipedia.org/wiki/Render_output_unit

December 9, 2017 | 09:42 PM - Posted by MoreVsNewzBlurbz (not verified)

CPU-Z listed in article has the Pixel fill rate at 123.4 GPixel/s and a much better Texture fill rate of 547.4 GTexel/s. The GTX 1080 Ti's (88 ROPs) pixel rate is 139.2 GPixel/s so that's interesting. That Volta Texture fill rate is definitely higher than was listed by some websites before they removed the texture info. Who knows until the hardware is in the independent testers' hands!

"Overclocked NVIDIA TITAN V benchmarks emerge"

https://videocardz.com/74382/overclocked-nvidia-titan-v-benchmarks-emerge

December 12, 2017 | 01:11 AM - Posted by James

This is not a consumer card.
