Podcast #493 - New XPS 13, Noctua NH-L9a, News from NVIDIA GTC and more!

Subject: General Tech | March 29, 2018 - 02:37 PM |
Tagged: podcast, nvidia, GTC 2018, Volta, quadro gv100, dgx-2, noctua, NH-L9a-AM4

PC Perspective Podcast #493 - 03/29/18

Join us this week for our reviews of the new XPS 13 and the Noctua NH-L9a, news from NVIDIA GTC, and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

Hosts: Allyn Malventano, Jeremy Hellstrom, Josh Walrath

Peanut Gallery: Ken Addison

Program length: 0:59:35

Podcast topics of discussion:

  1. Week in Review:
  2. News items of interest:
  3. Picks of the Week:
    1. Allyn: retro game music remixed - ocremix.org (torrents)

NVIDIA Announces DGX-2 with 16 GV100s & 8 100Gb NICs

Subject: Systems | March 27, 2018 - 08:04 PM |
Tagged: Volta, nvidia, dgx-2, DGX

So… this is probably not for your home.

NVIDIA has just announced their latest pre-built system for enterprise customers: the DGX-2. In it, sixteen Volta-based Tesla V100 graphics devices are connected using NVSwitch. This allows groups of graphics cards to communicate to and from every other group at 300GB/s, which, to give a sense of scale, is about as much bandwidth as the GTX 1080 has available to communicate with its own VRAM. NVSwitch treats all 512GB as a unified memory space, too, which means that developers don’t need redundant copies across multiple boards just so the data can be seen by the target GPU.

nvidia-2018-dgx2-explode.png

Note: 512GB is 16 x 32GB. This is not a typo. 32GB Tesla V100s are now available.

For a little recap, Tesla V100 cards run a Volta-based GV100 GPU with 5120 CUDA cores, delivering ~15 TeraFLOPs of 32-bit performance. FP64 and FP16 throughput also scale at the expected 1:2 and 2:1 ratios relative to FP32, as has been the case since Pascal’s high-end offering, leading to ~7.5 TeraFLOPs of 64-bit or ~30 TeraFLOPs of 16-bit computational throughput. Multiply that by sixteen and you get 480 TeraFLOPs of FP16, 240 TeraFLOPs of FP32, or 120 TeraFLOPs of FP64 performance for the whole system. If you count the tensor units, then we’re at just under 2 PetaFLOPs of tensor throughput. This is powered by a pair of Xeon Platinum CPUs (Skylake) and backed by 1.5TB of system RAM – which is only 3x the amount of RAM that the GPUs have, if you stop and think about it.
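For anyone who wants to double-check that multiplication, here is a minimal sketch in Python that reproduces the aggregate figures from the approximate per-card ratings quoted above (the per-card numbers are the article's round figures, not official specifications):

```python
# Rough sanity check of the DGX-2 throughput figures quoted above.
NUM_GPUS = 16

fp32_per_gpu = 15.0               # TFLOPS, single precision (approximate V100 rating)
fp64_per_gpu = fp32_per_gpu / 2   # 1:2 double precision ratio
fp16_per_gpu = fp32_per_gpu * 2   # 2:1 half precision ratio
tensor_per_gpu = 120.0            # TFLOPS, mixed-precision tensor cores

print(f"FP32:   {NUM_GPUS * fp32_per_gpu:.0f} TFLOPS")             # 240
print(f"FP64:   {NUM_GPUS * fp64_per_gpu:.0f} TFLOPS")             # 120
print(f"FP16:   {NUM_GPUS * fp16_per_gpu:.0f} TFLOPS")             # 480
print(f"Tensor: {NUM_GPUS * tensor_per_gpu / 1000:.2f} PFLOPS")    # 1.92 - "just under 2"

gpu_memory_gb = NUM_GPUS * 32     # 16 x 32GB Tesla V100s
system_ram_gb = 1536              # 1.5TB
print(f"System RAM is {system_ram_gb / gpu_memory_gb:.0f}x total GPU memory")  # 3x
```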

nvidia-2018-dgx-list.png

The device communicates with the outside world through eight EDR InfiniBand NICs. NVIDIA claims that this yields 1600 gigabits of bi-directional bandwidth. Given how much data this device is crunching, it makes sense to keep data flowing in and out as fast as possible, especially for real-time applications. While the Xeons are fast and have many cores, I’m curious to see how much overhead the networking adds to the system when under full load, minus any actual processing.

NVIDIA’s DGX-2 is expected to ship in Q3.

Source: NVIDIA

Blender Foundation Releases Blender 2.79a

Subject: General Tech | March 4, 2018 - 04:55 PM |
Tagged: Blender, Volta, nvidia

Normally the “a” patch of Blender arrives much closer to the numbered release – within a month or so.

Five months after 2.79, however, the Blender Foundation has released 2.79a. It seemed likely that it would happen at some point, because it looks like they are aiming for 2.80 to be the next full release, and that will take some time. I haven’t had a chance to use 2.79a yet, but the release notes are mostly bug fixes and performance improvements.

blender-2017-cyclesdenoise.png

Glancing through the release notes, one noteworthy addition is that Blender 2.79a now includes the CUDA 9 SDK in its build process, and it includes work-arounds for “performance loss” with those devices. While I haven’t heard any complaints from Titan V owners, the lack of CUDA 8 SDK support was a big problem for early owners of GeForce GTX 10X0 cards, so Volta users might have been suffering in silence until now. If you were having issues with the Titan V, then you should try 2.79a.

If you’re interested, be sure to check out the latest release. As always, it’s free.

Author:
Manufacturer: NVIDIA

Looking Towards the Professionals

This is a multi-part story for the NVIDIA Titan V:

Earlier this week we dove into the new NVIDIA Titan V graphics card and looked at its performance from a gaming perspective. Our conclusions were more or less what we expected - the card was on average ~20% faster than the Titan Xp and ~80% faster than the GeForce GTX 1080. But with that $3000 price tag, the Titan V isn't going to win any enthusiasts over.

What the Titan V is meant for in reality is the compute space: developers, coders, engineers, and professionals who use GPU hardware for research, for profit, or for both. In that case, $2999 for the Titan V is simply an investment that needs to show value in select workloads. And though $3000 is still a lot of money, keep in mind that the NVIDIA Quadro GP100, the most recent part with full-performance double precision compute from a Pascal chip, is still selling for well over $6000 today.

IMG_5009.JPG

The Volta GV100 GPU offers 1:2 double precision performance, equating to 2560 FP64 cores. That is a HUGE leap over the GP102 GPU used on the Titan Xp that uses a 1:32 ratio, giving us just 120 FP64 cores equivalent.

                  | Titan V | Titan Xp | GTX 1080 Ti | GTX 1080 | GTX 1070 Ti | GTX 1070 | RX Vega 64 Liquid | Vega Frontier Edition
GPU Cores         | 5120 | 3840 | 3584 | 2560 | 2432 | 1920 | 4096 | 4096
FP64 Cores        | 2560 | 120 | 112 | 80 | 76 | 60 | 256 | 256
Base Clock        | 1200 MHz | 1480 MHz | 1480 MHz | 1607 MHz | 1607 MHz | 1506 MHz | 1406 MHz | 1382 MHz
Boost Clock       | 1455 MHz | 1582 MHz | 1582 MHz | 1733 MHz | 1683 MHz | 1683 MHz | 1677 MHz | 1600 MHz
Texture Units     | 320 | 240 | 224 | 160 | 152 | 120 | 256 | 256
ROP Units         | 96 | 96 | 88 | 64 | 64 | 64 | 64 | 64
Memory            | 12GB | 12GB | 11GB | 8GB | 8GB | 8GB | 8GB | 16GB
Memory Clock      | 1700 MHz | 11400 MHz | 11000 MHz | 10000 MHz | 8000 MHz | 8000 MHz | 1890 MHz | 1890 MHz
Memory Interface  | 3072-bit HBM2 | 384-bit G5X | 352-bit G5X | 256-bit G5X | 256-bit | 256-bit | 2048-bit HBM2 | 2048-bit HBM2
Memory Bandwidth  | 653 GB/s | 547 GB/s | 484 GB/s | 320 GB/s | 256 GB/s | 256 GB/s | 484 GB/s | 484 GB/s
TDP               | 250 watts | 250 watts | 250 watts | 180 watts | 180 watts | 150 watts | 345 watts | 300 watts
Peak Compute      | 12.2 (base) / 14.9 (boost) TFLOPS | 12.1 TFLOPS | 11.3 TFLOPS | 8.2 TFLOPS | 7.8 TFLOPS | 5.7 TFLOPS | 13.7 TFLOPS | 13.1 TFLOPS
Peak DP Compute   | 6.1 (base) / 7.45 (boost) TFLOPS | 0.37 TFLOPS | 0.35 TFLOPS | 0.25 TFLOPS | 0.24 TFLOPS | 0.17 TFLOPS | 0.85 TFLOPS | 0.81 TFLOPS
MSRP (current)    | $2999 | $1299 | $699 | $499 | $449 | $399 | $699 | $999

The current AMD Radeon RX Vega 64 and the Vega Frontier Edition both ship with a 1:16 FP64 ratio, giving us the equivalent of 256 DP cores per card.
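To make those ratios concrete, here is a small sketch using values from the table above, showing how the equivalent FP64 core counts and the peak double precision figures are derived (assuming one fused multiply-add, i.e. 2 FLOPs, per FP64 core per clock):

```python
# Values taken from the table above. Equivalent FP64 cores = CUDA cores x FP64 ratio;
# peak DP TFLOPS = FP64 cores x 2 FLOPs (one FMA) x boost clock. The results land
# within rounding of the table's Peak DP Compute row.
cards = {
    # name: (CUDA cores, FP64 ratio, boost clock in GHz)
    "Titan V":           (5120, 1 / 2,  1.455),
    "Titan Xp":          (3840, 1 / 32, 1.582),
    "GTX 1080 Ti":       (3584, 1 / 32, 1.582),
    "RX Vega 64 Liquid": (4096, 1 / 16, 1.677),
}

for name, (cores, ratio, boost_ghz) in cards.items():
    fp64_cores = int(cores * ratio)
    peak_dp_tflops = fp64_cores * 2 * boost_ghz / 1000
    print(f"{name:18s} {fp64_cores:5d} FP64 cores  ~{peak_dp_tflops:.2f} TFLOPS DP")
```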

Test Setup and Benchmarks

Our testing setup remains the same from our gaming tests, but obviously the software stack is quite different. 

  PC Perspective GPU Testbed
Processor    | Intel Core i7-5960X Haswell-E
Motherboard  | ASUS Rampage V Extreme X99
Memory       | G.Skill Ripjaws 16GB DDR4-3200
Storage      | OCZ Agility 4 256GB (OS), Adata SP610 500GB (games)
Power Supply | Corsair AX1500i 1500 watt
OS           | Windows 10 x64
Drivers      | AMD: 17.10.2, NVIDIA: 388.59

Applications in use include:

  • Luxmark 
  • Cinebench R15
  • VRay
  • Sisoft Sandra GPU Compute
  • SPECviewperf 12.1
  • FAHBench

Let's not drag this along - I know you are hungry for results! (Thanks to Ken for running most of these tests for us!!)

Continue reading part 2 of our Titan V review on compute performance!!

Author:
Manufacturer: NVIDIA

A preview of potential Volta gaming hardware

This is a multi-part story for the NVIDIA Titan V:

As a surprise to most of us in the media community, NVIDIA launched a new graphics card to the world, the TITAN V. No longer sporting the GeForce brand, NVIDIA has returned the Titan line of cards to where it began – clearly targeted at the world of developers and general purpose compute. And if that branding switch isn’t enough to drive that home, I’m guessing the $2999 price tag will be.

Today’s article is going to look at the TITAN V from the angle that is likely most interesting to the majority of our readers, which also happens to be the angle that NVIDIA is least interested in us discussing. Though the card is targeted at machine learning and the like, there is little doubt in my mind that some crazy people will take on the $3000 price to see what kind of gaming power it can provide. After all, this marks the first time that a Volta-based GPU from NVIDIA has shipped in a form a consumer can get their hands on, and the first time one has shipped with display outputs. (That’s kind of important if you want to build a PC around it…)

IMG_4999.JPG

From a scientific standpoint, we wanted to look at the Titan V for the same reasons we tested the AMD Vega Frontier Edition cards upon their launch: using it to estimate how future consumer-class cards will perform in gaming. And, just as we had to do then, we purchased this Titan V from NVIDIA.com with our own money. (If anyone wants to buy this from me to recoup the costs, please let me know! Ha!)

                  | Titan V | Titan Xp | GTX 1080 Ti | GTX 1080 | GTX 1070 Ti | GTX 1070 | RX Vega 64 Liquid | Vega Frontier Edition
GPU Cores         | 5120 | 3840 | 3584 | 2560 | 2432 | 1920 | 4096 | 4096
Base Clock        | 1200 MHz | 1480 MHz | 1480 MHz | 1607 MHz | 1607 MHz | 1506 MHz | 1406 MHz | 1382 MHz
Boost Clock       | 1455 MHz | 1582 MHz | 1582 MHz | 1733 MHz | 1683 MHz | 1683 MHz | 1677 MHz | 1600 MHz
Texture Units     | 320 | 240 | 224 | 160 | 152 | 120 | 256 | 256
ROP Units         | 96 | 96 | 88 | 64 | 64 | 64 | 64 | 64
Memory            | 12GB | 12GB | 11GB | 8GB | 8GB | 8GB | 8GB | 16GB
Memory Clock      | 1700 MHz | 11400 MHz | 11000 MHz | 10000 MHz | 8000 MHz | 8000 MHz | 1890 MHz | 1890 MHz
Memory Interface  | 3072-bit HBM2 | 384-bit G5X | 352-bit G5X | 256-bit G5X | 256-bit | 256-bit | 2048-bit HBM2 | 2048-bit HBM2
Memory Bandwidth  | 653 GB/s | 547 GB/s | 484 GB/s | 320 GB/s | 256 GB/s | 256 GB/s | 484 GB/s | 484 GB/s
TDP               | 250 watts | 250 watts | 250 watts | 180 watts | 180 watts | 150 watts | 345 watts | 300 watts
Peak Compute      | 12.2 (base) / 14.9 (boost) TFLOPS | 12.1 TFLOPS | 11.3 TFLOPS | 8.2 TFLOPS | 7.8 TFLOPS | 5.7 TFLOPS | 13.7 TFLOPS | 13.1 TFLOPS
MSRP (current)    | $2999 | $1299 | $699 | $499 | $449 | $399 | $699 | $999

The Titan V is based on the GV100 GPU, though with some tweaks that slightly lower performance and capability compared to the Tesla-branded equivalent hardware. While our add-in card iteration has the full 5120 CUDA cores enabled, the HBM2 memory bus is reduced from 4096-bit to 3072-bit, with one of the four stacks on the package disabled. This also drops the memory capacity from 16GB to 12GB, and memory bandwidth to 652.8 GB/s.
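That bandwidth figure falls straight out of the bus width and data rate; here is a quick sketch using the 1700 MHz (1.7 Gbps effective) memory clock from the table above:

```python
# Memory bandwidth = bus width (bits) x per-pin data rate (Gbps) / 8 bits per byte.
def hbm2_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits * data_rate_gbps / 8

print(hbm2_bandwidth_gbs(3072, 1.7))  # 652.8 GB/s - the Titan V as shipped (three HBM2 stacks)
print(hbm2_bandwidth_gbs(4096, 1.7))  # 870.4 GB/s - what the full four-stack bus would give at the same rate
```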

Continue reading our gaming review of the NVIDIA Titan V!!

Video: What does a $3000 GPU look like? NVIDIA TITAN V Unboxing and Teardown!

Subject: Graphics Cards | December 12, 2017 - 07:51 PM |
Tagged: nvidia, titan, titan v, Volta, video, teardown, unboxing

NVIDIA launched the new Titan V graphics card last week, a $2999 part targeted not at gamers (thankfully) but instead at developers of machine learning applications. Based on the GV100 GPU with 12GB of HBM2 memory, the Titan V is an incredibly powerful graphics card. We have every intention of looking at the gaming performance of this card as a "preview" of potential consumer Volta cards that may come out next year. (This is identical to our approach with the Vega Frontier Edition cards.)

But for now, enjoy this unboxing and teardown video that takes apart the card to get a good glimpse of that GV100 GPU.

A couple of quick interesting notes:

  • This implementation has 25% of the memory and ROPs disabled, giving us 12GB of HBM2, a 3072-bit bus, and 96 ROPs.
  • Clock speeds in our testing look to be much higher than the base AND boost ratings.
  • So far, even though the price takes this out of the gaming segment completely, we are impressed with some of the gaming results we have found.
  • The cooler might LOOK the same, but it is definitely heavier than the cooler built for the Titan Xp.
  • Champagne. It's champagne colored.
  • Double precision performance is insanely good, spanking the Titan Xp and Vega so far in many tests.
  • More soon!

gv100.png

Source: NVIDIA

NVIDIA Launches Titan V, the World's First Consumer Volta GPU with HBM2

Subject: Graphics Cards | December 7, 2017 - 11:44 PM |
Tagged: Volta, titan, nvidia, graphics card, gpu

NVIDIA made a surprising move late Thursday with the simultaneous announcement and launch of the Titan V, the first consumer/prosumer graphics card based on the Volta architecture.

NVIDIA_TITAN V_KV.jpeg

Like recent flagship Titan-branded cards, the Titan V will be available exclusively from NVIDIA for $2,999. Labeled "the most powerful graphics card ever created for the PC," Titan V sports 12GB of HBM2 memory, 5120 CUDA cores, and a 1455MHz boost clock, giving the card 110 teraflops of maximum compute performance. Check out the full specs below:

6 Graphics Processing Clusters
80 Streaming Multiprocessors
5120 CUDA Cores (single precision)
320 Texture Units
640 Tensor Cores
1200 MHz Base Clock
1455 MHz Boost Clock
850 MHz Memory Clock
1.7 Gbps Memory Data Rate
4608 KB L2 Cache Size
12288 MB HBM2 Total Video Memory
3072-bit Memory Interface
652.8 GB/s Total Memory Bandwidth
384 GigaTexels/sec Texture Rate (Bilinear)
12 nm Fabrication Process (TSMC 12nm FFN High Performance)
21.1 Billion Transistor Count
3 x DisplayPort, 1 x HDMI Connectors
Dual Slot Form Factor
One 6-pin, One 8-pin Power Connectors
600 Watts Recommended Power Supply
250 Watts Thermal Design Power (TDP)

The NVIDIA Titan V's 110 teraflops of compute performance compares to a maximum of about 12 teraflops on the Titan Xp, a greater than 9X increase in a single generation. Note that this is a very specific claim, though, and references the AI compute capability of the Tensor cores rather than what we traditionally measure for GPUs (single precision FLOPS). By that metric, the Titan V only offers a jump to roughly 14 TFLOPS. The addition of expensive HBM2 memory also adds to the high price compared to its predecessor.
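To see where both headline numbers come from, here is a back-of-the-envelope sketch. It assumes the usual 2 FLOPs (one fused multiply-add) per CUDA core per clock and one 4x4x4 matrix FMA (128 FLOPs) per tensor core per clock; the gap between the ~119 TFLOPS this computes and NVIDIA's quoted 110 TFLOPS presumably comes down to the clock speed NVIDIA assumes for the rating.

```python
# Back-of-the-envelope peak throughput for the Titan V, from the specs above.
cuda_cores = 5120
tensor_cores = 640
boost_ghz = 1.455

# FP32: one fused multiply-add (2 FLOPs) per CUDA core per clock.
fp32_tflops = cuda_cores * 2 * boost_ghz / 1000
print(f"FP32 peak:   ~{fp32_tflops:.1f} TFLOPS")    # ~14.9 TFLOPS

# Tensor cores: one 4x4x4 matrix FMA per clock = 64 FMAs = 128 FLOPs per clock.
tensor_tflops = tensor_cores * 128 * boost_ghz / 1000
print(f"Tensor peak: ~{tensor_tflops:.0f} TFLOPS")  # ~119 TFLOPS at the rated boost clock;
                                                    # the 110 TFLOPS figure implies a lower assumed clock.
```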

titan-v-stylized-photography-6.jpeg

The Titan V is available now from NVIDIA.com for $2,999, with a limit of 2 per customer. And hey, there's free shipping too.

Source: NVIDIA

NVIDIA's SC17 Keynote: Data Center Business on Cloud 9

Subject: Graphics Cards | November 13, 2017 - 10:35 PM |
Tagged: nvidia, data center, Volta, tesla v100

There have been a few NVIDIA datacenter stories popping up over the last couple of months. A month or so after Google started integrating Pascal-based Tesla P100s into their cloud, Amazon announced Tesla V100s for their rent-a-server service. NVIDIA has also announced Volta-based solutions available or coming from Dell EMC, Hewlett Packard Enterprise, Huawei, IBM, Lenovo, Alibaba Cloud, Baidu Cloud, Microsoft Azure, Oracle Cloud, and Tencent Cloud.

nvidia-2017-sc17-money.jpg

This apparently translates to boatloads of money. Eyeball-estimating from their graph, it looks as though NVIDIA has already made about 50% more from datacenter sales in the first three quarters of fiscal year 2018 than in all of last year.

nvidia-2017-sc17-japanaisuper.jpg

They are seeing supercomputer design wins, too. Earlier this year, Japan announced that it would get back into supercomputing, having lost ground to other nations in recent years, with a giant, AI-focused offering. It turns out that this design will use 4352 Tesla V100 GPUs to crank out 0.55 ExaFLOPs of (tensor mixed-precision) performance.

nvidia-2017-sc17-cloudcontainer.jpg

As for product announcements, this one isn’t too exciting for our readers, but it should be very important for enterprise software developers. NVIDIA is creating optimized containers for various programming environments, such as TensorFlow and GAMESS, with their recommended blend of driver version, runtime libraries, and so forth, for various generations of GPUs (Pascal and higher). Moreover, NVIDIA claims that they will support it “for as long as they live”. Getting the right container for your hardware is just a matter of filling out a simple form and downloading the blob.

NVIDIA’s keynote is available on UStream, but they claim it will also be uploaded to their YouTube soon.

Source: NVIDIA

Podcast #474 - Optane 900P, Cord Cutting, 1070 Ti, and more!

Subject: General Tech | November 2, 2017 - 12:11 PM |
Tagged: Volta, video, podcast, PCI-e 4, nvidia, msi, Microsoft Andromeda, Memristors, Mali-D71, Intel Optane, gtx 1070 ti, cord cutting, arm, aegis 3, 8th generation core

PC Perspective Podcast #474 - 11/02/17

Join us for discussion on Optane 900P, Cord Cutting, 1070 Ti, and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

Hosts: Ryan Shrout, Josh Walrath, Jeremy Hellstrom, Allyn Malventano

Peanut Gallery: Ken Addison, Alex Lustenberg

Program length: 1:32:19

Podcast topics of discussion:
  1. Week in Review:
  2. News items of interest:
  3. Hardware/Software Picks of the Week
    1. 1:17:00 Ryan: Intel 900P Optane SSD
    2. 1:26:45 Allyn: Sony RX10 Mk IV. Pricey, but damn good.
  4. Closing/outro

NVIDIA Partners with AWS for Volta V100 in the Cloud

Subject: Graphics Cards | October 31, 2017 - 09:58 PM |
Tagged: nvidia, amazon, google, pascal, Volta, gv100, tesla v100

Remember last month? Remember when I said that Google’s introduction of Tesla P100s would be good leverage over Amazon, as the latter is still back in the Kepler days (because Maxwell was 32-bit focused)?

Amazon has leapfrogged them by introducing Volta-based V100 GPUs.

nvidia-2017-voltatensor.jpg

To compare the two parts, the Tesla P100 has 3584 CUDA cores, yielding just under 10 TFLOPs of single-precision performance. The Tesla V100, with its ridiculous die size, pushes that up over 14 TFLOPs. As with Pascal, it supports full 1:2:4 FP64:FP32:FP16 performance scaling. It also has access to NVIDIA’s tensor cores, which are specialized for 16-bit, 4x4 multiply-add matrix operations that are apparently common in neural networks, for both training and inferencing.
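If the tensor core operation sounds abstract, here is a tiny illustrative sketch in Python/NumPy of the math involved (not how the hardware is actually programmed): the D = A x B + C operation on 4x4 FP16 matrices, with higher-precision accumulation, that each tensor core performs every clock.

```python
import numpy as np

# Illustrative only: the per-clock tensor core operation D = A x B + C,
# where A and B are 4x4 FP16 matrices and accumulation happens at FP32.
A = np.random.rand(4, 4).astype(np.float16)
B = np.random.rand(4, 4).astype(np.float16)
C = np.random.rand(4, 4).astype(np.float32)

# 64 multiply-add operations, done in FP32 to mimic the wider accumulator.
D = A.astype(np.float32) @ B.astype(np.float32) + C
print(D)
```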

Amazon allows up to eight of them at once (with their P3.16xlarge instances).

So that’s cool. While Google has again been quickly leapfrogged by Amazon, it’s good to see NVIDIA getting wins in multiple cloud providers. This keeps money rolling in that will fund new chip designs for all the other segments.

Source: Amazon