NVIDIA TITAN V Review Part 1: Gaming

Author: Ryan Shrout
Manufacturer: NVIDIA

Dirt Rally

Dirt Rally (DirectX 11)

DiRT Rally is the most authentic, challenging and thrilling rally game ever made, road-tested over 80 million miles by the DiRT community. It captures the essence of what makes rally unique like no other game: that white knuckle feeling of racing on the edge; trying to remain in control of your emotions as you hurtle along dangerous, undulating roads at breakneck speed, aiming to squeeze everything out of your car whilst knowing that one crash could irreparably harm your stage time.

It’s the ultimate test of a driver’s skill, and the ultimate in high risk, high reward gameplay.

Settings used for Dirt Rally


The Titan V takes its first swing at the gaming world and comes out ahead. At 4K it is 15% faster than the Titan Xp, 21% faster than the 1080 Ti, and 82% faster than the 1080!

TITAN V 12GB, Average FPS Comparisons, Dirt Rally
Resolution   Titan Xp 12GB   GTX 1080 Ti 11GB   GTX 1080 8GB   Vega 64 Liquid 8GB
2560x1440    +6%             +14%               +64%           +51%
3840x2160    +15%            +21%               +82%           +76%

This table presents the above data in a more basic way, focusing only on the average FPS, so keep that in mind. 
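The percentages in the table can also be combined to estimate how the other cards compare with each other. The following is a minimal Python sketch, assuming each figure means "Titan V average FPS divided by the other card's average FPS, minus one". The function name `implied_lead` is made up for this illustration, and the inputs are the rounded figures from the table, so the results are only approximate.

```python
# Titan V's lead over each card, in percent, copied from the table above.
TITAN_V_LEAD = {
    "2560x1440": {"Titan Xp": 6, "GTX 1080 Ti": 14, "GTX 1080": 64, "Vega 64 Liquid": 51},
    "3840x2160": {"Titan Xp": 15, "GTX 1080 Ti": 21, "GTX 1080": 82, "Vega 64 Liquid": 76},
}


def implied_lead(resolution, card_a, card_b):
    """Percent by which card_a leads card_b, implied by each card's gap to the Titan V."""
    leads = TITAN_V_LEAD[resolution]
    ratio_a = 1 + leads[card_a] / 100  # Titan V FPS / card_a FPS
    ratio_b = 1 + leads[card_b] / 100  # Titan V FPS / card_b FPS
    # card_a FPS / card_b FPS = ratio_b / ratio_a
    return (ratio_b / ratio_a - 1) * 100


if __name__ == "__main__":
    lead = implied_lead("3840x2160", "Titan Xp", "GTX 1080 Ti")
    print(f"Titan Xp over GTX 1080 Ti at 4K: {lead:+.1f}%")
```

With the table's numbers, the Titan Xp's implied lead over the GTX 1080 Ti at 4K comes out at roughly 5%. Because the source percentages are already rounded, treat the decimal places as noise.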

December 13, 2017 | 10:22 PM - Posted by Kokolordas15 (not verified)

Ryan, they added an FPS cap at 100fps for all configs in the HITMAN GOTY patch. Others and I have tried to message the devs, but I don't know what they are doing.

The FPS cap is not very stable, and I have a theory, but it's irrelevant.

December 13, 2017 | 10:28 PM - Posted by Kokolordas15 (not verified)

Seeing this thermal throttling, I am also interested to know if the fan speed or the die itself is causing this poor cooling performance. This cooler is supposed to be a bit better than previous FE coolers, which could handle 250W. (Correct me if I am wrong.)

February 9, 2018 | 08:36 AM - Posted by Some random guy (not verified)

This is not because of poor cooling performance. This is because the GPU has a much higher TDP than its predecessors and generates A LOT more heat. It's not designed to be used for gaming.

December 13, 2017 | 11:07 PM - Posted by 5150Joker (not verified)

These results show just how far behind AMD is lagging. If the die shrink of Vega doesn't provide at least a 70% uplift, they're dead next round.

December 14, 2017 | 01:32 AM - Posted by Hishnash (not verified)

That is only if Nvidia can produce the GV100 at yields (and volumes) that let it come close to the consumer market.

I think it's much more likely we'll see a refresh of Pascal on 12nm for gaming (this would still be a big boost) with more CUDA cores, thanks to the big power savings of the new process. The question is whether this would perform the same as Volta in games. Possibly.

But AMD is also scheduled to do a Vega refresh on a new (lower-power) process. This will reduce Vega's power consumption quite a lot. Sure, Vega 2 (or whatever the name will be) will not be beating Volta, but very, very few gamers buy the top-end cards, so to say AMD is dead is a little pointless and blind. After all, I'm sure AMD sells a load of GPUs (in all of those consoles people buy). The majority of people don't buy Ti-level GPUs, so it is sort of OK for AMD not to target that market.

December 14, 2017 | 06:08 AM - Posted by 5150Joker (not verified)

There’s not a chance in hell we’ll see another Pascal release after what we have now. I can guarantee that 100%.

December 14, 2017 | 09:02 AM - Posted by Anonymously Anonymous (not verified)

Unless you work in a position that gives you the power to make decisions about what Nvidia will do, and/or you own Nvidia, you have absolutely zero ability to guarantee anything about what Nvidia sells or does not sell.

December 14, 2017 | 10:56 AM - Posted by Anonymousanonymity (not verified)

He is right though. No more Pascal is the reasonable conclusion. They have exhausted Pascal with the XP, Xp, and Quadro Pascal cards.

December 14, 2017 | 01:03 PM - Posted by Power (not verified)

Seems like the gaming efficiency gains of Volta can be attributed almost exclusively to HBM. A GDDR6- or HBM-equipped Pascal plus some marketing spin will be enough for the "next generation".

December 16, 2017 | 02:55 PM - Posted by RGB LED fan (not verified)

Only HBM? I'm sure the 5120 shaders help somewhat too.

December 14, 2017 | 04:49 PM - Posted by MobbenHobbenEnBeboben (not verified)

AMD can always do a dual-GPU-die configuration on one PCIe card with Vega. Vega 20 is going to be even more DP FP heavy, with a 1/2 DP FP to SP FP ratio. And Vega speaks Infinity Fabric, so a dual-GPU-die configuration on a single PCIe card may not need to worry about any software/driver/API CF support, as 2 GPU dies wired up via the Infinity Fabric IP would look to the software/drivers like a single monolithic logical GPU.

Look at how the Infinity Fabric ties all those Zen/Zeppelin dies together on TR/Epyc; that part of Navi is already here. Navi is more about producing scalable GPUs from smaller GPU dies that can be wired up Infinity Fabric style to look like one big single GPU than it is about a major micro-arch change over Vega. Navi is more about that scalable Zen/Zeppelin sort of modular design taken to the next level, and the Infinity Fabric IP is in all of AMD's new Zen/Vega products currently.

So any Vega refresh dies on 12nm, including Vega 20 with its higher FP64 number crunching, will have the Infinity Fabric IP, as Vega has had it since the first Vega SKUs were introduced. And that gives AMD the option of wiring up some dual-GPU-die designs on one PCIe card that can scale up and look to any software/driver just like a single bigger logical GPU.

AMD does not have to wait for Navi to go modular; it's just that Navi will be using more, smaller GPU die chiplets that can be fabbed with very high yields and give AMD a finer-grained ability to scale GPU power from mobile to flagship using a smaller, modular, common GPU die design.

That Radeon Pro Duo (Fiji XT) has 2× 4096:256:64 shaders:TMUs:ROPs for plenty of compute power and non-gaming graphics rendering power. So maybe a dual Vega 64, or even a dual Vega 20 for the professional markets, that makes more use of the Infinity Fabric, which the Fiji XT Radeon Pro Duo did not have the option of using.

December 14, 2017 | 12:33 AM - Posted by ROPsOrL2CacheOrHBM2OrAllOfThemAndTMUs (not verified)

96 ROPs for Titan V, a little more memory bandwidth than the Titan Xp, and a lot more shaders. Wikipedia lists the L2 cache size on the Titan V as 4608 KB and the Titan Xp's L2 as 4096 KB, and both the Titan Xp and the Titan V have 96 ROPs. So is it the Titan V's higher effective memory bandwidth and much wider HBM2 interface that is giving it the most help in gaming, or is it the larger cache relative to the Titan Xp that is really helping keep latency to a minimum? Titan V has more TMUs than the Titan Xp, and those 320 TMUs shore up Nvidia's texture fill rates even relative to AMD's Vega 64/56 SKUs.

Titan V's shader counts are overkill for gaming, and my money is on the Titan V's larger L2 cache helping to lower latency, because Titan V's ROP count is the same as Titan Xp's. Titan V's lower base/boost clocks are more than made up for by other factors such as more shader cores, more L2 cache, and higher texture throughput. I'd like to see Titan V's shader core utilization rates. The average clock rate is not too bad on Titan V, and I wish there were some Titan Xp average clock rates for comparison.

It looks like maybe the games do not need the shader counts as much as they like any extra L2 cache that Titan V has available to keep memory access latency issues to a minimum. All that extra HBM2 effective bandwidth that the Titan V has over the Titan Xp has to count for some uplift over the GDDR5X used on the Titan Xp. And this is the first time HBM2 can be tested for gaming on any Nvidia GPU using gaming drivers, and that has to count for some of Titan V's performance delta over Titan Xp.

So the big question still remains as to just what extra ROP resources Nvidia will have on GV102 and GV104 based variants and just what higher clock speeds can be had on any GV104 based Volta variants that will very likely have the shader cores pruned back a good bit.

The ROP counts on any GV102/GV104 based variants will be interesting also, as will Nvidia's choice of VRAM (GDDR or HBM2) on its GV104 gaming variants. Even with all those extra shader cores, that extra L2 cache on Titan V has to help.

Bad old Nvidia is requiring registration to view the GV100 whitepapers, so that's a big bummer.

But some other PDF online lists:

FP32 units 64
FP64 units 32
INT32 units 64
Tensor Cores 8
Register File 256 KB
Unified L1/Shared 128 KB
Active Threads 2048

Completely new ISA
Twice the schedulers
Simplified Issue Logic
Large, fast L1 cache
Improved SIMT model
Tensor acceleration
The easiest SM to program yet
Redesigned for Productivity" (1)


Olivier Giroux and Luke Durant
May 10, 2017"

December 14, 2017 | 01:38 AM - Posted by Anonymous_666! (not verified)

"1700 MHz"

What? Surely you mean 17000 MHZ? Or else it's 10x slower RAM than the Titan XP and 1080Ti.

December 14, 2017 | 02:37 AM - Posted by SirMaster (not verified)

No, he means 1700MHz.

It's not slower. Titan V uses HBM2 which has a much wider bus than GDDR5X.

The 1080Ti has an 11008MHz memory clock on a 352-bit bus width, resulting in a memory bandwidth of 484GB/s

The Titan Xp has an 11408MHz memory clock on a 384-bit bus width, resulting in a memory bandwidth of 547.6GB/s

The Titan V has a 1700MHz memory clock on a 3072-bit bus width, resulting in a memory bandwidth of 652.8GB/s
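The arithmetic behind those three figures can be checked with a short Python sketch. It assumes the quoted "memory clock" values are effective per-pin transfer rates in MT/s and that bandwidth is simply rate times bus width in bytes; the function name `memory_bandwidth_gbs` is invented for this example.

```python
def memory_bandwidth_gbs(effective_rate_mhz, bus_width_bits):
    """Peak bandwidth in GB/s from an effective per-pin rate (MT/s) and bus width (bits)."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_rate_mhz * 1e6 * bytes_per_transfer / 1e9


# (effective rate, bus width) pairs as quoted in the comment above.
cards = {
    "GTX 1080 Ti": (11008, 352),
    "Titan Xp": (11408, 384),
    "Titan V": (1700, 3072),
}

if __name__ == "__main__":
    for name, (rate, width) in cards.items():
        print(f"{name}: {memory_bandwidth_gbs(rate, width):.1f} GB/s")
```

The Titan V's per-pin rate is about a seventh of the GDDR5X cards', but its bus is eight times wider, which is why it still ends up with the highest total.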

December 14, 2017 | 11:00 AM - Posted by Jabbadap

Yeah, I hate it when people use MHz in the wrong places. The clock speed for HBM2 in this thing is 850MHz (this is the real clock, which one can overclock), and it transfers two bits per pin per clock, thus 1.7Gbps per pin. The card's bandwidth is therefore 3 * 1.7Gbps * 1024 bit / (8 bit/Byte) = 652.8 GB/s.

Edit: corrected memory freq.

December 14, 2017 | 11:35 AM - Posted by LotsMoreFeaturesInHBM2 (not verified)

800MHz and data on the falling and rising edges of the clock, for a Double Data Rate (DDR) of 1600MHz effective. The clock speed is in base 10 and the bandwidth is in base 2 units, and do not forget any overhead and parity. Each JEDEC-standard HBM2 stack gets its own 1024-bit-wide interface, subdivided into 8 independently operating 128-bit channels. And in the JEDEC HBM2 standard only (not HBM), there is a 64-bit pseudo addressing mode where each 128-bit memory channel can be split into 2 64-bit pseudo channels for finer-grained memory access. Each HBM2 stack can have a total bandwidth of 256GB/s when clocked at the maximum JEDEC speed.

According to AnandTech/SK Hynix, the pseudo channel mode improves latency via optimized memory accesses:

"The second-generation HBM (HBM2) technology, which is outlined by the JESD235A standard, inherits physical 128-bit DDR interface with 2n prefetch architecture, internal organization, 1024-bit input/output, 1.2 V I/O and core voltages as well as all the crucial parts of the original tech. Just like the predecessor, HBM2 supports two, four or eight DRAM devices on a base logic die (2Hi, 4Hi, 8Hi stacks) per KGSD. HBM Gen 2 expands capacity of DRAM devices within a stack to 8 Gb and increases supported data-rates up to 1.6 Gb/s or even to 2 Gb/s per pin. In addition, the new technology brings an important improvement to maximize actual bandwidth.

One of the key enhancements of HBM2 is its Pseudo Channel mode, which divides a channel into two individual sub-channels of 64 bit I/O each, providing 128-bit prefetch per memory read and write access for each one. Pseudo channels operate at the same clock-rate, they share row and column command bus as well as CK and CKE inputs. However, they have separated banks, they decode and execute commands individually. SK Hynix says that the Pseudo Channel mode optimizes memory accesses and lowers latency, which results in higher effective bandwidth.

If, for some reason, an ASIC developer believes that Pseudo Channel mode is not optimal for their product, then HBM2 chips can also work in Legacy mode. While memory makers expect HBM2 to deliver higher effective bandwidth than predecessors, it depends on developers of memory controllers how efficient next-generation memory sub-systems will be. In any case, we will need to test actual hardware before we can confirm that HBM2 is better than HBM1 at the same clock-rate." (1)


"JEDEC Publishes HBM2 Specification as Samsung Begins Mass Production of Chips"
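The per-stack numbers in this comment and the quote can be tied together with a small Python sketch. It is an illustration only: the class `Hbm2Stack` and the helper `ddr_pin_rate_gbps` are hypothetical names, the 2 Gb/s top pin rate is taken from the quoted article, and the three-stack, 850MHz configuration for the Titan V is taken from the earlier comment in this thread.

```python
from dataclasses import dataclass


@dataclass
class Hbm2Stack:
    pin_rate_gbps: float        # data rate per pin, in Gb/s
    interface_bits: int = 1024  # interface width per stack
    channels: int = 8           # independent channels per stack

    @property
    def channel_bits(self) -> int:
        # 1024 bits split across 8 channels gives 128 bits each.
        return self.interface_bits // self.channels

    @property
    def pseudo_channels(self) -> int:
        # Pseudo Channel mode splits every channel into two 64-bit halves.
        return self.channels * 2

    @property
    def bandwidth_gbs(self) -> float:
        # Peak bandwidth of one stack in GB/s.
        return self.pin_rate_gbps * self.interface_bits / 8


def ddr_pin_rate_gbps(clock_mhz: float) -> float:
    """Double data rate: two transfers per clock cycle, returned in Gb/s per pin."""
    return clock_mhz * 2 / 1000


if __name__ == "__main__":
    fastest = Hbm2Stack(pin_rate_gbps=2.0)
    print(fastest.channel_bits, fastest.pseudo_channels, fastest.bandwidth_gbs)

    titan_v_stack = Hbm2Stack(pin_rate_gbps=ddr_pin_rate_gbps(850))
    print(f"Three stacks: {3 * titan_v_stack.bandwidth_gbs:.1f} GB/s")
```

At 2 Gb/s per pin a single stack reaches the 256GB/s mentioned above, and three stacks at an 850MHz clock reproduce the 652.8GB/s figure quoted for the Titan V.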

December 14, 2017 | 08:39 AM - Posted by Anonymous_666! (not verified)

Sorry, I totally didn't realize the 1080Ti and especially the Xp product don't use HBM2 as well (and that HBM2 has a lower clock speed but much wider bus).

December 14, 2017 | 01:43 AM - Posted by khanmein

Ryan, can you run with the latest driver? 388.59? Thanks.

December 14, 2017 | 12:41 PM - Posted by Ryan Shrout

Oops, actually, we DID use 388.59, just updated the table.

December 14, 2017 | 01:48 AM - Posted by Anonymous_666! (not verified)

You do ensure Fallout 4 is running in Fullscreen Exclusive Display Mode right? Every time you hit Okay in the configuration utility it will re-enable Borderless Fullscreen (and the option to turn it off in the utility is stupidly grayed out so you need to disable Borderless Fullscreen by editing the config file)

December 14, 2017 | 02:45 PM - Posted by Jeremy Hellstrom

Really? Didn't realize that, wonder if it will change my performance on those rare occasions I get to play.

December 14, 2017 | 06:02 AM - Posted by Hypetrain (not verified)

Sniper Elite 4 in DX11?
Thought it was one of the better Async-implementations - or were there Problems with Performance or Stability in DX12?

December 14, 2017 | 12:32 PM - Posted by CNote

I was a little disappointed in not seeing DX12 vs DX11 or even a Vulkan game like Wolfenstein 2. I know it will blow away a Vega 64, but it's still interesting.

December 14, 2017 | 06:05 AM - Posted by Anonymous1564986 (not verified)

Why does the gap get smaller at 4K? Shouldn't it get bigger since it uses HBM?

December 14, 2017 | 07:48 AM - Posted by Anonymous2 (not verified)

That's not how it works. You still have a set amount of ROPs and CUDA cores to do work. The only way Titan V is going to max out its memory is during HPC operations. My guess is that the 1180 Ti, etc. will all use GDDR5X or GDDR6, not HBM.

December 14, 2017 | 09:14 AM - Posted by Anonymously Anonymous (not verified)

The performance is impressive with the card as it is. However, and I'm sure most would agree, we'd all like to see the performance of this card with a good air cooler or with water cooling, and not this underwhelming reference cooler.
Wonder how long until one of the big custom water cooling suppliers has a kit out for this card.

December 14, 2017 | 12:08 PM - Posted by Anonymouse (not verified)

Why are the clock speeds for RX Vega Liquid set to 1406 MHz in the GTA V slides? That card does 1677 stock with a 1750 boost.

December 14, 2017 | 12:09 PM - Posted by TensorThisGooFromTheBigAdSpaffersAtTheBigG (not verified)

And Google's TPU Version 2 does FP 32 bit Tensor Tango at 45 TFlops.

"•Two cores, each with a 128x128 mixed multiply unit (MXU) and 8GB of high-bandwidth memory, adding up to 64GB of HBM for one four-chip device.
•600 GB/s memory bandwidth.
•32-bit floating-point precision math units for scalars and vectors, and 32-bit floating-point-precision matrix multiplication units with reduced precision for multipliers.
•Some 45 TFLOPS of max performance, adding up to 180 TFLOPS for one four-chip device." (1)


"Google boffins tease custom AI math-chip TPU2 stats: 45 TFLOPS, 16GB HBM, benchmarks"

December 14, 2017 | 12:19 PM - Posted by AMDrules (not verified)

Wow, the performance is disappointing. Just 20% after 2 years. I guess this is what a lack of competition results in...

December 14, 2017 | 01:30 PM - Posted by BubbasROPLoveAndePeenExtention (not verified)

It needs more ROPs, and lack of ROPs is why Vega is only just competing with the GTX 1080. AMD needs to start an ROP increase crash plan and get more ROPs to push out as many FPS as possible. Doesn't AMD realize by now that frame quality does not matter to gamers as much as frame flinging metrics? ROPs are what fling out those frame/FPS metrics that Bubba gamer likes, and Bubba gamer likes them FPS bragging rights more than any actual gaming. Just look at how much Bubba Gamer spends on making his Rig a showpiece like some pickup truck all dolled up to look like an 18 wheeler!

Bubba is in a drag race of ROPs against ROPs and he will pay top dollar for them FPS bragging rights. Ha ha ha, old JHH ain't added any extra ROPs this time around to Nvidia's SKUs, so that extra Frame Flinging is not so much above the previous generation's SKUs. That GTX 2080 or GTX 1180/whatever they call it Volta SKU based on the GV104 die better at least get 88 ROPs or it will not outperform the GTX 1080 Ti with its 88 ROPs.

ROPs, ROPs, Bubba gamer loves them ROPs! Hey Vern look at my FPS matrics, dat's top notch 20lbs golden belt buckle good! Dat's dem ROP's do'en all that frame flinging and I get more than you, he he haw! Hey Vern my gaming rigs got running lights and mud flaps, Yosemite Sam/Get Back mud flaps with LEDs on ol' Sam's belt buckel, yeehaw!

December 14, 2017 | 12:48 PM - Posted by Palingenesis21 (not verified)

And here I am, having just acquired a pair of Titan Xp Star Wars Edition cards to set up in SLI soon (with a Core i9 7900X) ...

Titan V vs. 2-way SLI Titan Xp: how would that turn out? Are tests expected soon?

December 14, 2017 | 01:45 PM - Posted by Anonymouse (not verified)

Give headache

December 14, 2017 | 02:00 PM - Posted by Anonymous55 (not verified)

For reals, you measure with fraps and can't even get the specs for the Vega right.

I trust these results.

December 14, 2017 | 06:28 PM - Posted by Anonymouse (not verified)

The 5960X and X99 are a pretty dated platform. Hopefully we see some updated results with an 8700K and OC, as these results look like they are hitting a CPU/platform bottleneck.

December 18, 2017 | 12:21 PM - Posted by 124cores (not verified)

Based on these results I don't think we will see any mainstream gaming Volta cards. They made a killing selling a tiny 300mm² Pascal die as a high-end part due to lack of competition. A 300mm² Volta card would only be marginally faster than the 1080 and not worth upgrading to for most people. They need a 300mm² part that is 25-30% faster than the 1080 to maintain their huge margins, and that chip will require a brand new architecture and a move to 10nm or 7nm.

December 21, 2017 | 01:39 PM - Posted by Dugom (not verified)

388.71 is here and now officially supports the Titan V, whereas 388.51 doesn't.

Will the min framerate be better?
