Author:
Manufacturer: NVIDIA

A Look Back and Forward

Although NVIDIA's new GPU architecture, revealed previously as Turing, has been speculated about for what seems like an eternity at this point, we finally have our first look at exactly what NVIDIA is positioning as the future of gaming.

geforce-rtx-2080.png

Unfortunately, we can't talk about this card just yet, but we can talk about what powers it.

First though, let's take a look at the journey to get here over the past 30 months or so.

Unveiled in early 2016 and marked by the launch of the GTX 1070 and 1080, Pascal was NVIDIA's long-awaited 16nm successor to Maxwell. Constrained by the oft-delayed 16nm process node, Pascal refined the shader unit design originally found in Maxwell, while lowering power consumption and increasing performance.

Next, in May 2017 came Volta, the next (and last) GPU architecture outlined in NVIDIA's public roadmaps since 2013. However, instead of the traditional launch with a new GeForce gaming card, Volta saw a different approach.

Click here to continue reading our analysis of NVIDIA's Turing Graphics Architecture

Podcast #510 - NVIDIA 2080 Launch, blockchain gaming, and more!

Subject: General Tech | August 23, 2018 - 03:54 PM |
Tagged: Volta, video, turing, Threadripper, rtx, podcast, nzxt, nvidia, logitech, arm, amd

PC Perspective Podcast #510 - 08/23/18

Join us this week for discussion on NVIDIA 2080 Launch, blockchain gaming, and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

Hosts: Jeremy Hellstrom, Josh Walrath, Allyn Malventano

Peanut Gallery: Alex Lustenberg

Program length: 1:24:43

Podcast topics of discussion:
  1. Week in Review:
  2. News items of interest:
  3. Picks of the Week:
    1. 1:14:15 Jeremy: I love 14cm fans!
  4. Closing/outro
 
 

Turing vs Volta: Two Chips Enter. No One Dies.

Subject: Graphics Cards | August 21, 2018 - 08:43 PM |
Tagged: nvidia, Volta, turing, tu102, gv100

In the past, when NVIDIA launched a new GPU architecture, they would make a few designs for each of their market segments. All SKUs would be one of those chips, with varying amounts of it disabled or re-clocked to hit multiple price points. The mainstream enthusiast (GTX -70/-80) chip of each generation is typically 300mm², and the high-end enthusiast (Titan / -80 Ti) chip is often around 600mm².

nvidia-2016-gtc-pascal-banner.png

Kepler used quite a bit of that die space for FP64 calculations, but that did not happen with consumer versions of Pascal. Instead, GP100 supported 1:2:4 FP64:FP32:FP16 performance ratios. This is great for the compute community, such as scientific researchers, but games are focused on FP32. Shortly thereafter, NVIDIA released GP102, which had the same number of FP32 cores (3840) as GP100 but with much-reduced 64-bit performance… and much-reduced die area. GP100 was 610mm², but GP102 was just 471mm².

At this point, I’m thinking that NVIDIA is pulling scientific computing chips away from the common user to increase the value of their Tesla parts. There was no reason to make a cheap 6XXmm² card available to the public when a 471mm² part could take the performance crown, so why not reap extra dies from your wafer (and be able to clock them higher because of better binning)?
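To put a rough number on "extra dies from your wafer", here is a minimal sketch using a common textbook approximation for gross dies per wafer. The 300mm wafer diameter and the formula itself are assumptions, not figures from NVIDIA or TSMC, and the estimate ignores yield, scribe lines, and the actual rectangular die shape.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Estimate gross (pre-yield) dies per wafer.

    Uses the common approximation
        DPW = pi * (d / 2)^2 / S  -  pi * d / sqrt(2 * S)
    where d is the wafer diameter and S is the die area. The second term
    approximates partial dies lost around the wafer edge.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


# Die areas quoted in the post; the wafer size is an assumption.
gp100_dies = gross_dies_per_wafer(610.0)
gp102_dies = gross_dies_per_wafer(471.0)

print(f"GP100 (610mm2): ~{gp100_dies} candidate dies per wafer")
print(f"GP102 (471mm2): ~{gp102_dies} candidate dies per wafer")
```

Under these assumptions the smaller die yields roughly a third more candidates per wafer, before counting the yield advantage that smaller dies usually enjoy.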

nvidia-2017-sc17-japanaisuper.jpg

And then Volta came out. And it was massive (815mm²).

At this point, you really cannot manufacture a larger integrated circuit. You are at the limit of what TSMC (and other fabs) can focus onto your silicon. Again, it’s a 1:2:4 FP64:FP32:FP16 ratio. Again, there is no consumer version in sight. Again, it looked as if NVIDIA was going to fragment their market and leave consumers behind.

And then Turing was announced. Apparently, NVIDIA still plans on making big chips for consumers… just not with 64-bit performance. The big draw of this 754mm² chip is its dedicated hardware for raytracing. We knew this technology was coming, and we knew that the next generation would have technology to make this useful. I figured that meant consumer-Volta, and NVIDIA had somehow found a way to use Tensor cores to cast rays. Apparently not… but, don’t worry, Turing has Tensor cores too… they’re just for machine-learning gaming applications. Those are above and beyond the raytracing ASICs, and the CUDA cores, and the ROPs, and the texture units, and so forth.

nvidia-2018-geforce-rtx-turing-630-u.jpg

But, raytracing hype aside, let’s think about the product stack:

  1. NVIDIA now has two roughly 800mm² chips… and
  2. They serve two completely different markets.

In fact, I cannot see either FP64 or raytracing going anywhere any time soon. As such, it’s my assumption that NVIDIA will maintain two different architectures of GPUs going forward. The only way that I can see this changing is if they figure out a multi-die solution, because neither design can get any bigger. And even then, what workload would it even perform? (Moment of silence for 10km x 10km video game maps.)

What do you think? Will NVIDIA keep two architectures going forward? If not, how will they serve all of their customers?

NVIDIA Releases GeForce 397.31. RTX for Developers.

Subject: Graphics Cards | April 25, 2018 - 08:27 PM |
Tagged: nvidia, graphics drivers, rtx, Volta

It’s quite the jump in version number, from 391.35 to 397.31, but NVIDIA has just released a new graphics driver. Interestingly, its “Game Ready” status is tied to BattleTech, which I have been looking forward to, although I was always under the impression that no-one else was. Apparently I was wrong.

nvidia-geforce.png

As for its new features? The highlight is a developer preview of NVIDIA RTX Technology. This requires a Volta GPU, which currently means Titan V unless your team was seeded something that doesn’t necessarily exist, as well as 396.xx+ drivers, the new Windows 10 update, and Microsoft’s DXR developer package. Speaking of which, I’m wondering how much of the version number bump could be attributed to RTX being on the 396.xx branch. Even then, it still feels like a branch or two never left NVIDIA’s dev team. Hmm.

Moving on, the driver also conforms with the Vulkan 1.1 test suite (version 1.1.0.3). If you remember back from early March, the Khronos Group released the new standard, which integrated a bunch of features into core, and brought Subgroup Operations into the mix. This could allow future shaders to perform quicker by being compiled with new intrinsic functions.

Also – the standalone installer will apparently clean up after itself better than it used to. Often I can find a few gigabytes of old NVIDIA folders when I’m looking for space to free up, so it’s good for NVIDIA to finally address at least some of that.

Pick up the new drivers on NVIDIA’s website or through GeForce Experience.

Source: NVIDIA

Podcast #493 - New XPS 13, Noctua NH-L9a, News from NVIDIA GTC and more!

Subject: General Tech | March 29, 2018 - 02:37 PM |
Tagged: podcast, nvidia, GTC 2018, Volta, quadro gv100, dgx-2, noctua, NH-L9a-AM4

PC Perspective Podcast #493 - 03/29/18

Join us this week for our review of the new XPS 13, Noctua NH-L9a, news from NVIDIA GTC and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

Hosts: Allyn Malventano, Jeremy Hellstrom, Josh Walrath

Peanut Gallery: Ken Addison

Program length: 0:59:35

Podcast topics of discussion:

  1. Week in Review:
  2. News items of interest:
  3. Picks of the Week:
    1. Allyn: retro game music remixed - ocremix.org (torrents)

NVIDIA Announces DGX-2 with 16 GV100s & 8 100Gb NICs

Subject: Systems | March 27, 2018 - 08:04 PM |
Tagged: Volta, nvidia, dgx-2, DGX

So… this is probably not for your home.

NVIDIA has just announced their latest pre-built system for enterprise customers: the DGX-2. In it, sixteen Volta-based Tesla V100 graphics devices are connected using NVSwitch. This allows groups of graphics cards to communicate to and from every other group at 300GB/s, which, to give a sense of scale, is about as much bandwidth as the GTX 1080 has available to communicate with its own VRAM. NVSwitch treats all 512GB as a unified memory space, too, which means that the developer doesn’t need redundant copies across multiple boards just so it can be seen by the target GPU.

nvidia-2018-dgx2-explode.png

Note: 512GB is 16 x 32GB. This is not a typo. 32GB Tesla V100s are now available.

For a little recap, Tesla V100 cards run a Volta-based GV100 GPU, which has 5120 CUDA cores and runs them at ~15 TeraFLOPs of 32-bit performance. Each of these cores also scales exactly to FP64 and FP16, as has been the case since Pascal’s high-end offering, leading to ~7.5 TeraFLOPs of 64-bit or ~30 TeraFLOPs of 16-bit computational throughput. Multiply that by sixteen and you get 480 TeraFLOPs of FP16, 240 TeraFLOPs of FP32, or 120 TeraFLOPs of FP64 performance for the whole system. If you count the tensor units, then we’re just under 2 PetaFLOPs of tensor instructions. This is powered by a pair of Xeon Platinum CPUs (Skylake) and backed by 1.5TB of system RAM – which is only 3x the amount of RAM that the GPUs have if you stop and think about it.
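The system-level figures above are simple multiplication, which a short sketch makes easy to check. The per-GPU numbers are the approximate ones quoted in the post; treating 1.5TB as 1536GB is an assumption made so the ratio comes out in whole numbers.

```python
# Approximate per-GPU throughput quoted in the post (TeraFLOPs).
PER_GPU_TFLOPS = {"FP64": 7.5, "FP32": 15.0, "FP16": 30.0}

GPU_COUNT = 16
HBM2_PER_GPU_GB = 32
SYSTEM_RAM_GB = 1.5 * 1024  # assumption: 1.5TB counted as 1536GB

# Scale each precision by the number of GPUs in the chassis.
system_tflops = {precision: tflops * GPU_COUNT
                 for precision, tflops in PER_GPU_TFLOPS.items()}

total_gpu_memory_gb = GPU_COUNT * HBM2_PER_GPU_GB
ram_to_vram_ratio = SYSTEM_RAM_GB / total_gpu_memory_gb

for precision, tflops in system_tflops.items():
    print(f"{precision}: {tflops:.0f} TFLOPs across {GPU_COUNT} GPUs")
print(f"GPU memory: {total_gpu_memory_gb}GB, system RAM ratio: {ram_to_vram_ratio:.1f}x")
```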

nvidia-2018-dgx-list.png

The device communicates with the outside world through eight EDR InfiniBand NICs. NVIDIA claims that this yields 1600 gigabits of bi-directional bandwidth. Given how much data this device is crunching, it makes sense to keep data flowing in and out as fast as possible, especially for real-time applications. While the Xeons are fast and have many cores, I’m curious to see how much overhead the networking adds to the system when under full load, minus any actual processing.
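The 1600 gigabit figure can be reproduced with a quick calculation, assuming each EDR InfiniBand NIC carries 100Gb/s in each direction (consistent with the "100Gb NICs" in the headline). The helper name below is hypothetical.

```python
NIC_COUNT = 8
EDR_GBPS_PER_DIRECTION = 100  # assumption: 100Gb/s each way per EDR NIC


def aggregate_bidirectional_gbps(nic_count: int, gbps_per_direction: int) -> int:
    """Sum send and receive bandwidth across all NICs, in gigabits per second."""
    return nic_count * gbps_per_direction * 2


total_gbps = aggregate_bidirectional_gbps(NIC_COUNT, EDR_GBPS_PER_DIRECTION)
total_gigabytes_per_s = total_gbps / 8  # 8 bits per byte

print(f"{total_gbps} Gb/s bi-directional, or {total_gigabytes_per_s:.0f} GB/s")
```

Seen this way, the whole chassis talks to the outside world at less than the 300GB/s the post quotes for NVSwitch traffic between GPU groups.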

NVIDIA’s DGX-2 is expected to ship in Q3.

Source: NVIDIA

Blender Foundation Releases Blender 2.79a

Subject: General Tech | March 4, 2018 - 04:55 PM |
Tagged: Blender, Volta, nvidia

Normally the “a” patch of Blender arrives much closer to the number release – about a month or so.

Five months after 2.79, however, the Blender Foundation has released 2.79a. It seemed likely that it would happen at some point, because it looks like they are aiming for 2.80 to be the next full release, and that will take some time. I haven’t had a chance to use 2.79a yet, but the release notes are mostly bug fixes and performance improvements.

blender-2017-cyclesdenoise.png

Glancing through the release notes, one noteworthy addition is that Blender 2.79a now includes the CUDA 9 SDK in its build process, and it includes work-arounds for “performance loss” on Volta devices. While I haven’t heard any complaints from Titan V owners, the lack of CUDA 8 SDK support was a big problem for early owners of GeForce GTX 10X0 cards, so Volta users might have been suffering in silence until now. If you were having issues with the Titan V, then you should try 2.79a.

If you’re interested, be sure to check out the latest release. As always, it’s free.

Author:
Manufacturer: NVIDIA

Looking Towards the Professionals

This is one part of a multi-part story on the NVIDIA Titan V.

Earlier this week we dove into the new NVIDIA Titan V graphics card and looked at its performance from a gaming perspective. Our conclusions were more or less what we expected: the card was on average ~20% faster than the Titan Xp and ~80% faster than the GeForce GTX 1080. But with that $3000 price tag, the Titan V isn't going to win any enthusiasts over.

In reality, the Titan V is meant for the compute space: developers, coders, engineers, and professionals that use GPU hardware for research, for profit, or for both. In that case, $2999 for the Titan V is simply an investment that needs to show value in select workloads. And though $3000 is still a lot of money, keep in mind that the NVIDIA Quadro GP100, the most recent part with full-performance double precision compute from the Pascal chip, is still selling for well over $6000 today.

IMG_5009.JPG

The Volta GV100 GPU offers 1:2 double precision performance, equating to 2560 FP64 cores. That is a HUGE leap over the GP102 GPU used on the Titan Xp that uses a 1:32 ratio, giving us just 120 FP64 cores equivalent.

| | Titan V | Titan Xp | GTX 1080 Ti | GTX 1080 | GTX 1070 Ti | GTX 1070 | RX Vega 64 Liquid | Vega Frontier Edition |
|---|---|---|---|---|---|---|---|---|
| GPU Cores | 5120 | 3840 | 3584 | 2560 | 2432 | 1920 | 4096 | 4096 |
| FP64 Cores | 2560 | 120 | 112 | 80 | 76 | 60 | 256 | 256 |
| Base Clock | 1200 MHz | 1480 MHz | 1480 MHz | 1607 MHz | 1607 MHz | 1506 MHz | 1406 MHz | 1382 MHz |
| Boost Clock | 1455 MHz | 1582 MHz | 1582 MHz | 1733 MHz | 1683 MHz | 1683 MHz | 1677 MHz | 1600 MHz |
| Texture Units | 320 | 240 | 224 | 160 | 152 | 120 | 256 | 256 |
| ROP Units | 96 | 96 | 88 | 64 | 64 | 64 | 64 | 64 |
| Memory | 12GB | 12GB | 11GB | 8GB | 8GB | 8GB | 8GB | 16GB |
| Memory Clock | 1700 MHz | 11400 MHz | 11000 MHz | 10000 MHz | 8000 MHz | 8000 MHz | 1890 MHz | 1890 MHz |
| Memory Interface | 3072-bit HBM2 | 384-bit G5X | 352-bit G5X | 256-bit G5X | 256-bit | 256-bit | 2048-bit HBM2 | 2048-bit HBM2 |
| Memory Bandwidth | 653 GB/s | 547 GB/s | 484 GB/s | 320 GB/s | 256 GB/s | 256 GB/s | 484 GB/s | 484 GB/s |
| TDP | 250 watts | 250 watts | 250 watts | 180 watts | 180 watts | 150 watts | 345 watts | 300 watts |
| Peak Compute | 12.2 (base) TFLOPS, 14.9 (boost) TFLOPS | 12.1 TFLOPS | 11.3 TFLOPS | 8.2 TFLOPS | 7.8 TFLOPS | 5.7 TFLOPS | 13.7 TFLOPS | 13.1 TFLOPS |
| Peak DP Compute | 6.1 (base) TFLOPS, 7.45 (boost) TFLOPS | 0.37 TFLOPS | 0.35 TFLOPS | 0.25 TFLOPS | 0.24 TFLOPS | 0.17 TFLOPS | 0.85 TFLOPS | 0.81 TFLOPS |
| MSRP (current) | $2999 | $1299 | $699 | $499 | $449 | $399 | $699 | $999 |

The current AMD Radeon RX Vega 64 and the Vega Frontier Edition both ship with a 1:16 FP64 ratio, giving us the equivalent of 256 DP cores per card.
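The FP64 core counts and peak compute figures in this comparison follow from the core counts, clocks, and FP64 ratios. The sketch below reproduces a few of them, assuming each core retires one fused multiply-add (2 floating-point operations) per clock; that factor and the helper names are assumptions for illustration, not something stated in the article.

```python
def fp64_core_equivalent(fp32_cores: int, fp32_per_fp64: int) -> int:
    """FP64 'core equivalent' for a 1:N FP64:FP32 ratio (N = fp32_per_fp64)."""
    return fp32_cores // fp32_per_fp64


def peak_tflops(cores: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak in TFLOPS: cores x FLOPs per clock x clock rate.

    flops_per_clock=2 assumes one fused multiply-add per core per clock.
    """
    return cores * flops_per_clock * clock_mhz * 1e6 / 1e12


titan_v_fp64 = fp64_core_equivalent(5120, 2)     # 1:2 ratio on GV100
titan_xp_fp64 = fp64_core_equivalent(3840, 32)   # 1:32 ratio on GP102
vega_fp64 = fp64_core_equivalent(4096, 16)       # 1:16 ratio on Vega

titan_v_fp32_boost = peak_tflops(5120, 1455)
titan_v_fp64_boost = peak_tflops(titan_v_fp64, 1455)

print(titan_v_fp64, titan_xp_fp64, vega_fp64)
print(f"Titan V boost: {titan_v_fp32_boost:.2f} TFLOPS FP32, "
      f"{titan_v_fp64_boost:.2f} TFLOPS FP64")
```

The results land within rounding of the listed values, which suggests the spec-sheet peaks are theoretical numbers derived from rated clocks rather than measurements.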

Test Setup and Benchmarks

Our testing setup remains the same as in our gaming tests, but obviously the software stack is quite different.

PC Perspective GPU Testbed
Processor: Intel Core i7-5960X Haswell-E
Motherboard: ASUS Rampage V Extreme X99
Memory: G.Skill Ripjaws 16GB DDR4-3200
Storage: OCZ Agility 4 256GB (OS), Adata SP610 500GB (games)
Power Supply: Corsair AX1500i 1500 watt
OS: Windows 10 x64
Drivers: AMD 17.10.2, NVIDIA 388.59

Applications in use include:

  • Luxmark 
  • Cinebench R15
  • VRay
  • Sisoft Sandra GPU Compute
  • SPECviewperf 12.1
  • FAHBench

Let's not drag this along - I know you are hungry for results! (Thanks to Ken for running most of these tests for us!!)

Continue reading part 2 of our Titan V review on compute performance!!

Author:
Manufacturer: NVIDIA

A preview of potential Volta gaming hardware

This is one part of a multi-part story on the NVIDIA Titan V.

As a surprise to most of us in the media community, NVIDIA launched a new graphics card to the world, the TITAN V. No longer sporting the GeForce brand, NVIDIA has returned the Titan line of cards to where it began – clearly targeted at the world of developers and general purpose compute. And if that branding switch isn’t enough to drive that home, I’m guessing the $2999 price tag will be.

Today’s article is going to look at the TITAN V from the angle that is likely most interesting to the majority of our readers, which also happens to be the angle that NVIDIA is least interested in us discussing. Though targeted at machine learning and the like, there is little doubt in my mind that some crazy people will want to take on the $3000 price to see what kind of gaming power this card can provide. After all, this marks the first time that a Volta-based GPU from NVIDIA has shipped in a form a consumer can get their hands on, and the first time it has shipped with display outputs. (That’s kind of important to build a PC around it…)

IMG_4999.JPG

From a scientific standpoint, we wanted to look at the Titan V for the same reasons we tested the AMD Vega Frontier Edition cards upon their launch: using it to estimate how future consumer-class cards will perform in gaming. And, just as we had to do then, we purchased this Titan V from NVIDIA.com with our own money. (If anyone wants to buy this from me to recoup the costs, please let me know! Ha!)

| | Titan V | Titan Xp | GTX 1080 Ti | GTX 1080 | GTX 1070 Ti | GTX 1070 | RX Vega 64 Liquid | Vega Frontier Edition |
|---|---|---|---|---|---|---|---|---|
| GPU Cores | 5120 | 3840 | 3584 | 2560 | 2432 | 1920 | 4096 | 4096 |
| Base Clock | 1200 MHz | 1480 MHz | 1480 MHz | 1607 MHz | 1607 MHz | 1506 MHz | 1406 MHz | 1382 MHz |
| Boost Clock | 1455 MHz | 1582 MHz | 1582 MHz | 1733 MHz | 1683 MHz | 1683 MHz | 1677 MHz | 1600 MHz |
| Texture Units | 320 | 240 | 224 | 160 | 152 | 120 | 256 | 256 |
| ROP Units | 96 | 96 | 88 | 64 | 64 | 64 | 64 | 64 |
| Memory | 12GB | 12GB | 11GB | 8GB | 8GB | 8GB | 8GB | 16GB |
| Memory Clock | 1700 MHz | 11400 MHz | 11000 MHz | 10000 MHz | 8000 MHz | 8000 MHz | 1890 MHz | 1890 MHz |
| Memory Interface | 3072-bit HBM2 | 384-bit G5X | 352-bit G5X | 256-bit G5X | 256-bit | 256-bit | 2048-bit HBM2 | 2048-bit HBM2 |
| Memory Bandwidth | 653 GB/s | 547 GB/s | 484 GB/s | 320 GB/s | 256 GB/s | 256 GB/s | 484 GB/s | 484 GB/s |
| TDP | 250 watts | 250 watts | 250 watts | 180 watts | 180 watts | 150 watts | 345 watts | 300 watts |
| Peak Compute | 12.2 (base) TFLOPS, 14.9 (boost) TFLOPS | 12.1 TFLOPS | 11.3 TFLOPS | 8.2 TFLOPS | 7.8 TFLOPS | 5.7 TFLOPS | 13.7 TFLOPS | 13.1 TFLOPS |
| MSRP (current) | $2999 | $1299 | $699 | $499 | | $399 | $699 | $999 |

The Titan V is based on the GV100 GPU though with some tweaks that lower performance and capability slightly when compared to the Tesla-branded equivalent hardware. Though our add-in card iteration has the full 5120 CUDA cores enabled, the HBM2 memory bus is reduced from 4096-bit to 3072-bit and it has one of the four stacks on the package disabled. This also drops the memory capacity from 16GB to 12GB, and memory bandwidth to 652.8 GB/s.
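The 652.8 GB/s figure can be derived from the bus width and memory speed. This is a minimal sketch that assumes the "Memory Clock" values in the spec comparison are effective per-pin data rates in megatransfers per second; the function name is hypothetical.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_mtps: float) -> float:
    """Peak memory bandwidth in GB/s.

    bus_width_bits * data_rate_mtps gives megabits per second across the bus;
    divide by 8 for megabytes per second and by 1000 for gigabytes per second.
    """
    return bus_width_bits * data_rate_mtps / 8 / 1000


titan_v_bw = memory_bandwidth_gbs(3072, 1700)     # three active HBM2 stacks
titan_xp_bw = memory_bandwidth_gbs(384, 11400)    # GDDR5X
gtx_1080_ti_bw = memory_bandwidth_gbs(352, 11000)

print(f"Titan V: {titan_v_bw:.1f} GB/s")
print(f"Titan Xp: {titan_xp_bw:.1f} GB/s")
print(f"GTX 1080 Ti: {gtx_1080_ti_bw:.1f} GB/s")
```

The same formula shows why the wide but slow HBM2 bus still outruns GDDR5X: eight times the width more than offsets a per-pin rate that is under a sixth as fast.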

Continue reading our gaming review of the NVIDIA Titan V!!

Video: What does a $3000 GPU look like? NVIDIA TITAN V Unboxing and Teardown!

Subject: Graphics Cards | December 12, 2017 - 07:51 PM |
Tagged: nvidia, titan, titan v, Volta, video, teardown, unboxing

NVIDIA launched the new Titan V graphics card last week, a $2999 part targeted not at gamers (thankfully) but instead at developers of machine learning applications. Based on the GV100 GPU and 12GB of HBM2 memory, the Titan V is an incredibly powerful graphics card. We have every intention of looking at the gaming performance of this card as a "preview" of potential consumer Volta cards that may come out next year. (This is identical to our stance of testing the Vega Frontier Edition cards.)

But for now, enjoy this unboxing and teardown video that takes apart the card to get a good glimpse of that GV100 GPU.

A couple of quick interesting notes:

  • This implementation has 25% of the memory and ROPs disabled, giving us 12GB of HBM2, a 3072-bit bus, and 96 ROPs.
  • Clock speeds in our testing look to be much higher than the base AND boost ratings.
  • So far, even though the price takes this out of the gaming segment completely, we are impressed with some of the gaming results we have found.
  • The cooler might LOOK the same, but it definitely is heavier than the cooler and build for the Titan Xp.
  • Champagne. It's champagne colored.
  • Double precision performance is insanely good, spanking the Titan Xp and Vega so far in many tests.
  • More soon!

gv100.png

Source: NVIDIA