Author:
Manufacturer: AMD

Vega meets Radeon Pro

Professional graphics cards are a segment of the industry that can look strange to gamers and PC enthusiasts. From the outside, it appears that businesses are paying more for hardware that is almost identical to the gaming counterparts offered by both NVIDIA and AMD.

However, a lot goes into a professional-level graphics card that makes all the difference to the customers these products target. From the addition of ECC memory to protect against data corruption, all the way to a completely different driver stack with specific optimizations for professional applications, there is plenty of work behind these particular products.

The professional graphics market has gotten particularly interesting in the last few years with the rise of NVIDIA's TITAN-level GPUs and "Frontier Edition" graphics cards from AMD. While lacking ECC memory, these new GPUs have brought over some of the application-level optimizations while offering a lower price for more hobbyist-level consumers.

However, if you're a professional that depends on a graphics card for mission-critical work, these options are no replacement for the real thing.

Today we're looking at one of AMD's latest Pro graphics offerings, the AMD Radeon Pro WX 8200. 


Click here to continue reading our review of the AMD Radeon Pro WX 8200.

Author:
Manufacturer: XFX

Overview

While 2018 has so far contained lots of talk about graphics cards and new GPU architectures, little of that talk has revolved around AMD. After launching their long-awaited Vega GPUs in 2017, AMD has remained mostly quiet on the graphics front.

As we headed into summer 2018, the talk around graphics started to turn to NVIDIA's next-generation Turing architecture, the RTX 2070, 2080, and 2080 Ti, and the accompanying price creep within each product segment.

However, there has been one segment in particular that has been lacking any excitement in 2018—mid-range GPUs for gamers on a budget.


AMD is aiming to change that today with the release of the RX 590. Join us as we discuss the current state of affordable graphics cards.

| | RX 590 | RX 580 | GTX 1060 6GB | GTX 1060 3GB |
|---|---|---|---|---|
| GPU | Polaris 30 | Polaris 20 | GP106 | GP106 |
| GPU Cores | 2304 | 2304 | 1280 | 1152 |
| Rated Clock | 1469 MHz Base / 1545 MHz Boost | 1257 MHz Base / 1340 MHz Boost | 1506 MHz Base / 1708 MHz Boost | 1506 MHz Base / 1708 MHz Boost |
| Texture Units | 144 | 144 | 80 | 80 |
| ROP Units | 32 | 32 | 48 | 48 |
| Memory | 8GB | 8GB | 6GB | 3GB |
| Memory Clock | 8000 MHz | 8000 MHz | 8000 MHz | 8000 MHz |
| Memory Interface | 256-bit | 256-bit | 192-bit | 192-bit |
| Memory Bandwidth | 256 GB/s | 256 GB/s | 192 GB/s | 192 GB/s |
| TDP | 225 watts | 185 watts | 120 watts | 120 watts |
| Peak Compute | 7.1 TFLOPS | 6.17 TFLOPS | 3.85 TFLOPS (Base) | 2.4 TFLOPS (Base) |
| Process Tech | 12nm | 14nm | 16nm | 16nm |
| MSRP (of retail cards) | $239 | $219 | $249 | $209 |

Click here to continue reading our review of the AMD RX 590!

Author:
Manufacturer: MSI

Overview

With the launch of the GeForce RTX 2070, NVIDIA seems to have applied some pressure to their partners to get SKUs that actually hit the advertised "starting at $499" price. Compared to the $599 Founders Edition RTX 2070, these lower-cost options have the potential to bring significantly more value to the consumer, especially taking into account the RTX 2070's performance relative to the GTX 1080 that we observed in our initial review.

Earlier this week, we took a look at the EVGA RTX 2070 Black Edition, but it's not the only card hitting the $499 price point that we've received.

Today, we are taking a look at MSI's low-cost RTX 2070 offering, the MSI RTX 2070 Armor.


| MSI RTX 2070 ARMOR 8G | |
|---|---|
| Base Clock Speed | 1410 MHz |
| Boost Clock Speed | 1620 MHz |
| Memory Clock Speed | 14000 MHz GDDR6 |
| Outputs | DisplayPort x 3 (v1.4) / HDMI 2.0b x 1 / USB Type-C x 1 (VirtualLink) |
| Dimensions | 12.1 x 6.1 x 1.9 inches (309 x 155 x 50 mm) |
| Price | $499.99 |

Click here to continue reading our review of the MSI RTX 2070 Armor!

Author:
Manufacturer: NVIDIA

TU106 joins the party

In general, the launch of RTX 20-series GPUs from NVIDIA in the form of the RTX 2080 and RTX 2080 Ti has been a bit of a mixed bag.

While these new products did give us the fastest gaming GPU available, the RTX 2080 Ti, they are also some of the most expensive video cards ever to launch. With a value proposition that is partially tied to the adoption of new hardware features in games, the reception of these new RTX cards has been rocky.

To say this puts a bit of pressure on the RTX 2070 launch would be an apt assessment. The community wants to see a reason to get excited for new graphics cards, without having to wait for applications to take advantage of the new hardware features like Tensor and RT cores. Conversely, NVIDIA would surely love to see an RTX launch with a bit more praise from the press and community than their previous release has garnered.

The wait is over: today we are taking a look at the RTX 2070, the last of the RTX-series graphics cards announced by NVIDIA back in August.


| | RTX 2080 Ti | GTX 1080 Ti | RTX 2080 | RTX 2070 | GTX 1080 | GTX 1070 | RX Vega 64 (Air) |
|---|---|---|---|---|---|---|---|
| GPU | TU102 | GP102 | TU104 | TU106 | GP104 | GP104 | Vega 64 |
| GPU Cores | 4352 | 3584 | 2944 | 2304 | 2560 | 1920 | 4096 |
| Base Clock | 1350 MHz | 1408 MHz | 1515 MHz | 1410 MHz | 1607 MHz | 1506 MHz | 1247 MHz |
| Boost Clock | 1545 MHz / 1635 MHz (FE) | 1582 MHz | 1710 MHz / 1800 MHz (FE) | 1620 MHz / 1710 MHz (FE) | 1733 MHz | 1683 MHz | 1546 MHz |
| Texture Units | 272 | 224 | 184 | 144 | 160 | 120 | 256 |
| ROP Units | 88 | 88 | 64 | 64 | 64 | 64 | 64 |
| Tensor Cores | 544 | -- | 368 | 288 | -- | -- | -- |
| Ray Tracing Speed | 10 GRays/s | -- | 8 GRays/s | 6 GRays/s | -- | -- | -- |
| Memory | 11GB | 11GB | 8GB | 8GB | 8GB | 8GB | 8GB |
| Memory Clock | 14000 MHz | 11000 MHz | 14000 MHz | 14000 MHz | 10000 MHz | 8000 MHz | 1890 MHz |
| Memory Interface | 352-bit G6 | 352-bit G5X | 256-bit G6 | 256-bit G6 | 256-bit G5X | 256-bit G5 | 2048-bit HBM2 |
| Memory Bandwidth | 616 GB/s | 484 GB/s | 448 GB/s | 448 GB/s | 320 GB/s | 256 GB/s | 484 GB/s |
| TDP | 250 W / 260 W (FE) | 250 W | 215 W / 225 W (FE) | 175 W / 185 W (FE) | 180 W | 150 W | 292 W |
| Peak Compute (FP32) | 13.4 TFLOPS / 14.2 TFLOPS (FE) | 10.6 TFLOPS | 10 TFLOPS / 10.6 TFLOPS (FE) | 7.5 TFLOPS / 7.9 TFLOPS (FE) | 8.2 TFLOPS | 6.5 TFLOPS | 13.7 TFLOPS |
| Transistor Count | 18.6 B | 12.0 B | 13.6 B | 10.8 B | 7.2 B | 7.2 B | 12.5 B |
| Process Tech | 12nm | 16nm | 12nm | 12nm | 16nm | 16nm | 14nm |
| MSRP (current) | $1200 (FE) / $1000 | $699 | $800 (FE) / $700 | $599 (FE) / $499 | $549 | $379 | $499 |

Click here to continue reading our review of the NVIDIA GeForce RTX 2070!

Author:
Manufacturer: ASUS

Overview

With the release of the NVIDIA GeForce RTX 2080 and 2080 Ti just last week, the graphics card vendors have awakened with a flurry of new products based on the Turing GPUs.

Today, we're taking a look at ASUS's flagship option, the ASUS Republic of Gamers STRIX 2080 Ti.

| ASUS ROG STRIX 2080 Ti | |
|---|---|
| Base Clock Speed | 1350 MHz |
| Boost Clock Speed | 1665 MHz |
| Memory Clock Speed | 14000 MHz GDDR6 |
| Outputs | DisplayPort x 2 (v1.4) / HDMI 2.0b x 2 / USB Type-C x 1 (VirtualLink) |
| Dimensions | 12 x 5.13 x 2.13 inches (30.47 x 13.04 x 5.41 cm) |
| Price | $1249.99 |


For those of you familiar with the most recent STRIX video cards, such as the GTX 1080 Ti and the RX Vega 64, the design of this RTX 2080 Ti will be immediately familiar. The same symmetric triple-fan setup is present, in contrast to some of the recent triple-fan designs we've seen from other manufacturers that use different fan sizes.


Just as with the STRIX GTX 1080 Ti, the RTX 2080 Ti version features RGB lighting along the fan shroud of the card. 

Continue reading our review of the ASUS ROG STRIX RTX 2080 Ti!

Author:
Manufacturer: MSI

Our First Look

Over the years, the general trend for new GPU launches, especially GPUs based on a new graphics architecture, has been to launch with only the "reference" graphics card designs developed by AMD or NVIDIA. While the idea of a "reference" design has changed over the years, with the introduction of NVIDIA's Founders Edition cards and different special edition designs at launch from AMD like we saw with Vega 56 and Vega 64, there generally aren't any custom designs from partners available at launch.

However, with the launch of NVIDIA's Turing architecture in the form of the RTX 2080 and RTX 2080 Ti, we've been presented with an embarrassment of riches: plenty of custom cooler and custom PCB designs from add-in board (AIB) manufacturers.

Today, we're taking a look at our first custom RTX 2080 design, the MSI RTX 2080 Gaming X Trio.

| MSI GeForce RTX 2080 Gaming X Trio | |
|---|---|
| Base Clock Speed | 1515 MHz |
| Boost Clock Speed | 1835 MHz |
| Memory Clock Speed | 14000 MHz GDDR6 |
| Outputs | DisplayPort x 3 (v1.4) / HDMI 2.0b x 1 / USB Type-C x 1 (VirtualLink) |
| Dimensions | 12.9 x 5.5 x 2.1 inches (327 x 140 x 55.6 mm) |
| Weight | 3.42 lbs (1553 g) |
| Price | $849.99 |

Introduced with the GTX 1080 Ti, the Gaming X Trio is, as you might expect, a triple-fan design that serves as MSI's highest-performance graphics card offering.


Click here to continue reading our review of the MSI GeForce RTX 2080 Gaming X Trio!

Author:
Manufacturer: NVIDIA

New Generation, New Founders Edition

At this point, it seems that calling NVIDIA's 20-series GPUs highly anticipated would be a bit of an understatement. Between months and months of speculation about what these new GPUs would be called, what architecture they would be based on, and what features they would bring, the NVIDIA GeForce RTX 2080 and RTX 2080 Ti were officially unveiled in August, alongside the Turing architecture.


We've already posted our deep dive into the Turing architecture and the TU102 and TU104 GPUs powering these new graphics cards, but here's a short takeaway. Turing provides efficiency improvements in both memory and shader performance, and adds specialized hardware to accelerate deep learning (Tensor cores) and enable real-time ray tracing (RT cores).

| | RTX 2080 Ti | Quadro RTX 6000 | GTX 1080 Ti | RTX 2080 | Quadro RTX 5000 | GTX 1080 | TITAN V | RX Vega 64 (Air) |
|---|---|---|---|---|---|---|---|---|
| GPU | TU102 | TU102 | GP102 | TU104 | TU104 | GP104 | GV100 | Vega 64 |
| GPU Cores | 4352 | 4608 | 3584 | 2944 | 3072 | 2560 | 5120 | 4096 |
| Base Clock | 1350 MHz | 1455 MHz | 1408 MHz | 1515 MHz | 1620 MHz | 1607 MHz | 1200 MHz | 1247 MHz |
| Boost Clock | 1545 MHz / 1635 MHz (FE) | 1770 MHz | 1582 MHz | 1710 MHz / 1800 MHz (FE) | 1820 MHz | 1733 MHz | 1455 MHz | 1546 MHz |
| Texture Units | 272 | 288 | 224 | 184 | 192 | 160 | 320 | 256 |
| ROP Units | 88 | 96 | 88 | 64 | 64 | 64 | 96 | 64 |
| Tensor Cores | 544 | 576 | -- | 368 | 384 | -- | 640 | -- |
| Ray Tracing Speed | 10 GRays/s | 10 GRays/s | -- | 8 GRays/s | 8 GRays/s | -- | -- | -- |
| Memory | 11GB | 24GB | 11GB | 8GB | 16GB | 8GB | 12GB | 8GB |
| Memory Clock | 14000 MHz | 14000 MHz | 11000 MHz | 14000 MHz | 14000 MHz | 10000 MHz | 1700 MHz | 1890 MHz |
| Memory Interface | 352-bit G6 | 384-bit G6 | 352-bit G5X | 256-bit G6 | 256-bit G6 | 256-bit G5X | 3072-bit HBM2 | 2048-bit HBM2 |
| Memory Bandwidth | 616 GB/s | 672 GB/s | 484 GB/s | 448 GB/s | 448 GB/s | 320 GB/s | 653 GB/s | 484 GB/s |
| TDP | 250 W / 260 W (FE) | 260 W | 250 W | 215 W / 225 W (FE) | 230 W | 180 W | 250 W | 292 W |
| Peak Compute (FP32) | 13.4 TFLOPS / 14.2 TFLOPS (FE) | 16.3 TFLOPS | 10.6 TFLOPS | 10 TFLOPS / 10.6 TFLOPS (FE) | 11.2 TFLOPS | 8.2 TFLOPS | 14.9 TFLOPS | 13.7 TFLOPS |
| Transistor Count | 18.6 B | 18.6 B | 12.0 B | 13.6 B | 13.6 B | 7.2 B | 21.0 B | 12.5 B |
| Process Tech | 12nm | 12nm | 16nm | 12nm | 12nm | 16nm | 12nm | 14nm |
| MSRP (current) | $1200 (FE) / $1000 | $6,300 | $699 | $800 / $700 | $2,300 | $549 | $2,999 | $499 |

 

As unusual as it is for them, NVIDIA has decided to release both the RTX 2080 and RTX 2080 Ti at the same time as the first products in the Turing family.

The TU102-based RTX 2080 Ti features 4352 CUDA cores, while the TU104-based RTX 2080 features 2944, fewer than the GTX 1080 Ti's 3584. These new RTX GPUs have also moved from the GDDR5X found on the GTX 10-series to GDDR6.
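
If you want a quick sanity check on the bandwidth numbers in the table above, peak memory bandwidth falls straight out of the data rate and bus width. Here's a minimal sketch of that arithmetic (values pulled from the table, nothing measured):

```cpp
#include <cstdio>

// Peak memory bandwidth = effective data rate (MT/s) * bus width (bits) / 8.
// The figures below come straight from the comparison table above.
double bandwidth_gbs(double data_rate_mhz, int bus_width_bits) {
    return data_rate_mhz * 1e6 * bus_width_bits / 8.0 / 1e9;
}

int main() {
    std::printf("RTX 2080 Ti (14 Gbps GDDR6, 352-bit): %.0f GB/s\n", bandwidth_gbs(14000, 352)); // ~616
    std::printf("RTX 2080    (14 Gbps GDDR6, 256-bit): %.0f GB/s\n", bandwidth_gbs(14000, 256)); // ~448
    std::printf("GTX 1080 Ti (11 Gbps G5X,   352-bit): %.0f GB/s\n", bandwidth_gbs(11000, 352)); // ~484
}
```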


Click here to continue reading our review of the RTX 2080 and 2080 Ti.

Author:
Manufacturer: NVIDIA

A Look Back and Forward

Although NVIDIA's new GPU architecture, revealed previously as Turing, has been speculated about for what seems like an eternity at this point, we finally have our first look at exactly what NVIDIA is positioning as the future of gaming.


Unfortunately, we can't talk about this card just yet, but we can talk about what powers it.

First though, let's take a look at the journey to get here over the past 30 months or so.

Unveiled in early 2016, Pascal, marked by the launch of the GTX 1070 and 1080, was NVIDIA's long-awaited 16nm successor to Maxwell. Constrained by the oft-delayed 16nm process node, Pascal refined the shader unit design originally found in Maxwell while lowering power consumption and increasing performance.

Next, in May 2017 came Volta, the next (and last) GPU architecture outlined in NVIDIA's public roadmaps since 2013. However, instead of the traditional launch with a new GeForce gaming card, Volta saw a different approach.

Click here to continue reading our analysis of NVIDIA's Turing Graphics Architecture

Author:
Manufacturer: NVIDIA

Retesting the 2990WX

Earlier today, NVIDIA released version 399.24 of their GeForce drivers for Windows, citing Game Ready support for some newly released games including Shadow of the Tomb Raider, The Call of Duty: Black Ops 4 Blackout Beta, and Assetto Corsa Competizione early access. 


While this in and of itself is a normal event, we soon started to get tips from readers about an interesting bug fix found in NVIDIA's release notes for this specific driver revision.


Specifically addressing performance differences between 16-core/32-thread and 32-core/64-thread processors, this patched issue immediately recalled our experience benchmarking the AMD Ryzen Threadripper 2990WX back in August, where we saw some games running at frame rates around 50% lower than on the 16-core Threadripper 2950X.

This particular patch note led us to update our Ryzen Threadripper 2990WX test platform to this latest NVIDIA driver release and see if there were any noticeable changes in performance.

The full testbed configuration is listed below:

| Test System Setup | |
|---|---|
| CPU | AMD Ryzen Threadripper 2990WX |
| Motherboard | ASUS ROG Zenith Extreme - BIOS 1304 |
| Memory | 16GB Corsair Vengeance DDR4-3200 (operating at DDR4-2933) |
| Storage | Corsair Neutron XTi 480 SSD |
| Sound Card | On-board |
| Graphics Card | NVIDIA GeForce GTX 1080 Ti 11GB |
| Graphics Drivers | NVIDIA 398.26 and 399.24 |
| Power Supply | Corsair RM1000x |
| Operating System | Windows 10 Pro x64 RS4 (17134.165) |

Included at the end of this article are the full results from our entire suite of game benchmarks from our CPU testbed, but first, let's take a look at some of the games that provided particularly bad issues with the 2990WX previously.

The interesting data points for this testing are the 2990WX scores on the driver revision we tested across every CPU, 398.26, alongside the results from the 1/4-core compatibility mode and from the Ryzen Threadripper 2950X. From the wording of the patch notes, we would expect gaming performance between the 16-core 2950X and the 32-core 2990WX to be very similar.

Grand Theft Auto V


GTA V was previously one of the worst offenders in our original 2990WX testing, with the frame rate almost halving compared to the 2950X.

However, with the newest GeForce driver update, we see this gap shrinking to around a 20% difference.

Continue reading our revised look at Threadripper 2990WX gaming performance!!

Author:
Manufacturer: AMD

Your Mileage May Vary

One of the most interesting things going around the computer hardware communities this past weekend was the revelation from a user named bryf50 on Reddit that he had somehow gotten his FreeSync display working with his NVIDIA GeForce GPU.

For those of you that might not be familiar with the particular ins-and-outs of these variable refresh technologies, getting FreeSync displays to work on NVIDIA GPUs is potentially a very big deal.

While NVIDIA GPUs support the NVIDIA G-SYNC variable refresh rate standard, they are not compatible with Adaptive Sync (the technology on which FreeSync is based) displays. Despite Adaptive Sync being an open standard, and an optional extension to the DisplayPort specification, NVIDIA so far has chosen not to support these displays.

However, this presents some major downsides to consumers looking to purchase displays and graphics cards. Due to the lack of interoperability, consumers can get locked into a GPU vendor if they want to continue to use the variable refresh functionality of their display. Plus, Adaptive Sync/FreeSync monitors generally seem to be significantly less expensive for similar specifications.


Click here to continue reading our exploration into FreeSync support on NVIDIA GPUs!

 

Author:
Manufacturer: ASUS

A long time coming

To say that the ASUS ROG Swift PG27UQ has been a long time coming is a bit of an understatement. In a computer hardware world where we are generally lucky to know about a product six months ahead of time, the PG27UQ has been around in some form or another for at least 18 months.

Originally demonstrated at CES 2017, the ASUS ROG Swift PG27UQ debuted alongside the Acer Predator X27 as the world's first G-SYNC displays supporting HDR. With promised brightness levels of 1000 nits, G-SYNC HDR was a surprising and aggressive announcement considering that HDR was just starting to pick up steam on TVs, and was unheard of for PC monitors. On top of the HDR support, these monitors were the first announced displays sporting a 144Hz refresh rate at 4K, due to their DisplayPort 1.4 connections.
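
To put that DisplayPort 1.4 requirement in perspective, here's a rough back-of-the-envelope calculation for 4K at 144Hz with 10-bit HDR color. This ignores blanking overhead, so the real requirement is even a bit higher:

```cpp
#include <cstdio>

int main() {
    // Uncompressed pixel data rate for 4K at 144 Hz with 10-bit HDR color (3 x 10 bits per pixel).
    const double pixels_per_frame = 3840.0 * 2160.0;
    const double bits_per_pixel   = 3 * 10;          // RGB, 10 bits per channel
    const double refresh_hz       = 144.0;

    const double required_gbps = pixels_per_frame * bits_per_pixel * refresh_hz / 1e9;

    // DisplayPort 1.4 (HBR3): 32.4 Gbps raw across four lanes, ~25.92 Gbps of usable
    // payload after 8b/10b encoding.
    const double dp14_payload_gbps = 25.92;

    std::printf("4K144 10-bit RGB needs ~%.1f Gbps, DP 1.4 carries ~%.1f Gbps\n",
                required_gbps, dp14_payload_gbps); // ~35.8 vs ~25.9
}
```

That gap is why these first G-SYNC HDR panels end up leaning on chroma subsampling at the very top of their refresh range.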

However, delays led to the PG27UQ being shown yet again at CES this year, with a promised release date of Q1 2018. Further slips bring us to today, with the ASUS PG27UQ available for pre-order for a staggering $2,000 and set to ship at some point this month.

In some ways, the launch of the PG27UQ very much mirrors the launch of the original G-SYNC display, the ROG Swift PG278Q. Both displays represented the launch of a long-awaited technology in a 27" form factor, and both were seen as extremely expensive at the time of their release.


Finally, we have our hands on a production model of the ASUS PG27UQ, the first monitor to support G-SYNC HDR, as well as 144Hz refresh rate at 4K. Can a PC monitor really be worth a $2,000 price tag? 

Continue reading our review of the ASUS ROG PG27UQ G-SYNC HDR Monitor!

Author:
Manufacturer: Intel

System Overview

Announced at Intel's Developer Forum in 2012, and launched later that year, the Next Unit of Computing (NUC) project was initially a bit confusing to the enthusiast PC press. In a market that appeared to be discarding traditional desktops in favor of notebooks, it seemed a bit odd to launch a product that still depended on a monitor, mouse, and keyboard, yet didn't provide any more computing power.

Despite this criticism, the NUC lineup has rapidly expanded over the years, seeing success in areas such as digital signage and enterprise environments. However, the enthusiast PC market has mostly resisted the lure of the NUC.

Intel's Skylake-based Skull Canyon NUC was the company's first attempt to cater to the enthusiast market, with a slight departure from the traditional 4-in x 4-in form factor and the adoption of their best-ever integrated graphics solution in Iris Pro. Additionally, the ability to connect external GPUs via Thunderbolt 3 meant Skull Canyon offered more of a focus on high-end PC graphics.

However, Skull Canyon mostly fell on deaf ears among hardcore PC users, and it seemed that Intel lacked the proper solution to make a "gaming-focused" NUC device—until now.


Announced at CES 2018, the lengthily named 8th Gen Intel® Core™ processors with Radeon™ RX Vega M Graphics (henceforth referred to by their code name, Kaby Lake-G) mark a new direction for Intel. By partnering with one of the leaders in high-end PC graphics, AMD, Intel can now pair their processors with graphics capable of playing modern games at high resolutions and frame rates.


The first product to launch using the new Kaby Lake-G family of processors is Intel's own NUC, the NUC8i7HVK (Hades Canyon). Will the marriage of Intel and AMD finally provide a NUC capable of at least moderate gaming? Let's dig a bit deeper and find out.

Click here to continue reading our review of the Intel Hades Canyon NUC!

Manufacturer: Microsoft

O Rayly? Ya Rayly. No Ray!

Microsoft has just announced a raytracing extension to DirectX 12, called DirectX Raytracing (DXR), at the 2018 Game Developers Conference in San Francisco.


The goal is not to completely replace rasterization… at least not yet. Raytracing will mostly be used for effects that require supplementary datasets, such as reflections, ambient occlusion, and refraction. Rasterization, the typical way that 3D geometry gets drawn on a 2D display, converts triangle coordinates into screen coordinates, and then a point-in-triangle test runs across every sample. This will likely occur once per AA sample (minus pixels that the triangle can’t possibly cover -- such as a pixel outside of the triangle's bounding box -- but that's just optimization).
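
If you've never looked at how that point-in-triangle test works, here's a heavily simplified sketch using edge functions. This is illustrative C++ only, not how any particular GPU actually implements it:

```cpp
#include <cstdio>

struct Vec2 { float x, y; };

// Signed area of the parallelogram spanned by (b - a) and (p - a).
// The sign tells us which side of edge a->b the point p lies on.
static float edge(const Vec2& a, const Vec2& b, const Vec2& p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// A sample is covered if it sits on the same side of all three edges
// (counter-clockwise winding assumed; real rasterizers also apply fill rules).
static bool covered(const Vec2& v0, const Vec2& v1, const Vec2& v2, const Vec2& p) {
    return edge(v0, v1, p) >= 0.0f && edge(v1, v2, p) >= 0.0f && edge(v2, v0, p) >= 0.0f;
}

int main() {
    Vec2 v0{0, 0}, v1{8, 0}, v2{0, 8};
    // Walk the triangle's bounding box and test each pixel center --
    // the bounding-box optimization mentioned above.
    for (int y = 0; y < 8; ++y) {
        for (int x = 0; x < 8; ++x)
            std::putchar(covered(v0, v1, v2, {x + 0.5f, y + 0.5f}) ? '#' : '.');
        std::putchar('\n');
    }
}
```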


For rasterization, each triangle is laid on a 2D grid corresponding to the draw surface.
If any sample is in the triangle, the pixel shader is run.
This example shows the rotated grid MSAA case.

A program, called a pixel shader, is then run with some set of data that the GPU could gather on every valid pixel in the triangle. This set of data typically includes things like world coordinate, screen coordinate, texture coordinates, nearby vertices, and so forth. This lacks a lot of information, especially things that are not visible to the camera. The application is free to provide other sources of data for the shader to crawl… but what?

  • Cubemaps are useful for reflections, but they don’t necessarily match the scene.
  • Voxels are useful for lighting, as seen with NVIDIA’s VXGI and VXAO.

This is where DirectX Raytracing comes in. There are quite a few components to it, but it’s basically a new pipeline that handles how rays are cast into the environment. After being queued, it starts out with a ray-generation stage, and then, depending on what happens to the ray in the scene, there are closest-hit, any-hit, and miss shaders. Ray generation allows the developer to set up how the rays are cast, where they call an HLSL intrinsic instruction, TraceRay (which is a clever way of invoking them, by the way). This function takes an origin and a direction, so you can choose to, for example, cast rays only in the direction of lights if your algorithm were to, for instance, approximate partially occluded soft shadows from a non-point light. (There are better algorithms to do that, but it's just the first example that came off the top of my head.) The closest-hit, any-hit, and miss shaders occur at the point where the traced ray ends.

To connect this with current technology, imagine that ray-generation is like a vertex shader in rasterization, where it sets up the triangle to be rasterized, leading to pixel shaders being called.

microsoft-2018-gdc-directx12raytracing-multibounce.png

Even more interesting – the closest-hit, any-hit, and miss shaders can call TraceRay themselves, which is used for multi-bounce and other recursive algorithms (see: figure above). The obvious use case might be reflections, which is the headline of the GDC talk, but they want it to be as general as possible, aligning with the evolution of GPUs. Looking at NVIDIA’s VXAO implementation, it also seems like a natural fit for a raytracing algorithm.
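
To make that recursion a bit more concrete, here's a deliberately simplified CPU-side sketch of the control flow. The Scene, findClosestHit, and shadeMiss names are entirely hypothetical stand-ins, not the DXR API; the point is only the shape of a closest-hit stage spawning a secondary ray:

```cpp
#include <cmath>
#include <cstdio>
#include <optional>

struct Vec3 { float x, y, z; };
struct Ray  { Vec3 origin, direction; };
struct Hit  { Vec3 position, normal, albedo; };

static Vec3 reflect(const Vec3& d, const Vec3& n) {
    float k = 2.0f * (d.x * n.x + d.y * n.y + d.z * n.z);
    return {d.x - k * n.x, d.y - k * n.y, d.z - k * n.z};
}

// Hypothetical stand-in for the acceleration-structure traversal that the DXR
// runtime performs for you: here the "scene" is just a checkered ground plane at y = 0.
struct Scene {
    std::optional<Hit> findClosestHit(const Ray& r) const {
        if (r.direction.y >= 0.0f) return std::nullopt;
        float t = -r.origin.y / r.direction.y;
        Vec3 p{r.origin.x + t * r.direction.x, 0.0f, r.origin.z + t * r.direction.z};
        bool check = (int(std::floor(p.x)) + int(std::floor(p.z))) & 1;
        return Hit{p, {0, 1, 0}, check ? Vec3{0.9f, 0.9f, 0.9f} : Vec3{0.1f, 0.1f, 0.1f}};
    }
};

static Vec3 shadeMiss(const Ray&) { return {0.4f, 0.6f, 0.9f}; }  // miss shader: flat sky color

// Analogous to a closest-hit shader that spawns a secondary ray: shade the hit,
// then call trace() again for a reflection bounce (multi-bounce recursion).
static Vec3 trace(const Scene& scene, const Ray& ray, int depth) {
    if (depth <= 0) return {0, 0, 0};               // recursion cut-off
    auto hit = scene.findClosestHit(ray);
    if (!hit) return shadeMiss(ray);                // miss shader path
    Ray bounce{hit->position, reflect(ray.direction, hit->normal)};
    Vec3 r = trace(scene, bounce, depth - 1);       // the recursive TraceRay-style call
    return {0.5f * (hit->albedo.x + r.x), 0.5f * (hit->albedo.y + r.y), 0.5f * (hit->albedo.z + r.z)};
}

int main() {
    // The ray-generation stage: build one camera ray and kick off the recursion,
    // much like TraceRay is first invoked from a ray-generation shader.
    Scene scene;
    Vec3 c = trace(scene, {{0, 1, 0}, {0.3f, -0.5f, 1.0f}}, 3);
    std::printf("shaded color: %.2f %.2f %.2f\n", c.x, c.y, c.z);
}
```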

Speaking of data structures, Microsoft also detailed what they call the acceleration structure. Each object's entry is composed of two levels. The top level contains per-object metadata, like its transformation and whatever other data the developer wants to add to it. The bottom level contains the geometry. The briefing states, “essentially vertex and index buffers,” so we asked for clarification. DXR requires that triangle geometry be specified as vertex positions in either 32-bit float3 or 16-bit float3 values. There is also a stride property, so developers can tweak data alignment and reuse their rasterization vertex buffer, as long as it's HLSL float3, either 16-bit or 32-bit.
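
Purely as an illustration of that two-level layout, the data an application hands over looks roughly like the sketch below. The type and field names here are hypothetical, not the actual DXR structures, which live in the D3D12 headers and differ in detail:

```cpp
#include <cstdint>
#include <vector>

struct Float3 { float x, y, z; };
struct Matrix3x4 { float m[3][4]; };

// Bottom level: the geometry itself, "essentially vertex and index buffers".
// Positions are 16-bit or 32-bit float3 values, and the stride lets an existing
// rasterization vertex buffer be reused as-is.
struct BottomLevelGeometry {
    const uint8_t*  vertexBuffer;       // first position in the application's vertex buffer
    uint32_t        vertexCount;
    uint32_t        vertexStrideBytes;  // distance between consecutive positions
    const uint32_t* indexBuffer;        // optional index buffer
    uint32_t        indexCount;
};

// Top level: one entry per object instance, carrying per-object metadata such as
// its transform plus whatever else the developer wants to attach.
struct TopLevelInstance {
    Matrix3x4                  objectToWorld;
    uint32_t                   instanceId;   // user-defined metadata
    const BottomLevelGeometry* geometry;     // which bottom-level structure to trace against
};

using AccelerationStructure = std::vector<TopLevelInstance>;

int main() {
    // The real vertex/index data would live in the application's own buffers;
    // nullptrs just keep the illustration compact.
    BottomLevelGeometry cube{nullptr, 8, static_cast<uint32_t>(sizeof(Float3)), nullptr, 36};
    AccelerationStructure tlas = { TopLevelInstance{Matrix3x4{}, 0, &cube} };
    return tlas.empty() ? 1 : 0;
}
```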

As for the tools to develop this in…


Microsoft announced PIX back in January 2017. This is a debugging and performance analyzer for 64-bit, DirectX 12 applications. Microsoft will upgrade it to support DXR as soon as the API is released (specifically, “Day 1”). This includes the API calls, the raytracing pipeline resources, the acceleration structure, and so forth. As usual, you can expect Microsoft to support their APIs with quite decent – not perfect, but decent – documentation and tools. They do it well, and they want to make sure it’s available when the API is.

ea-2018-SEED screenshot (002).png

Example of DXR via EA's in-development SEED engine.

In short, raytracing is here, but it’s not taking over rasterization. It doesn’t need to. Microsoft is just giving game developers another, standardized mechanism to gather supplementary data for their games. Several game engines have already announced support for this technology, including the usual suspects of anything top-tier game technology:

  • Frostbite (EA/DICE)
  • SEED (EA)
  • 3DMark (Futuremark)
  • Unreal Engine 4 (Epic Games)
  • Unity Engine (Unity Technologies)

They also said, “and several others we can’t disclose yet”, so this list is not even complete. But, yeah, if you have Frostbite, Unreal Engine, and Unity, then you have a sizeable market as it is. There is always a question about how much each of these engines will support the technology. Currently, raytracing is not portable outside of DirectX 12, because it’s literally being announced today, and each of these engines intends to support more than just Windows 10 and Xbox.

Still, we finally have a standard for raytracing, which should drive vendors to optimize in a specific direction. From there, it's just a matter of someone taking the risk to actually use the technology for a cool work of art.

If you want to read more, check out Ryan's post about the also-announced RTX, NVIDIA's raytracing technology.

Manufacturer: Microsoft

It's all fun and games until something something AI.

Microsoft announced the Windows Machine Learning (WinML) API about two weeks ago, but they did so in a sort-of abstract context. This week, alongside the 2018 Game Developers Conference, they are grounding it in a practical application: video games!


Specifically, the API provides the mechanisms for game developers to run inference on the target machine. The trained models that it runs against would be in the Open Neural Network Exchange (ONNX) format from Microsoft, Facebook, and Amazon. Like the initial announcement suggests, it can be used for any application, not just games, but… you know. If you want to get a technology off the ground, and it requires a high-end GPU, then video game enthusiasts are good lead users. When run in a DirectX application, WinML kernels are queued on the DirectX 12 compute queue.

We’ve discussed the concept before. When you’re rendering a video game, simulating an accurate scenario isn’t your goal – the goal is to look like you are. The direct way of looking like you’re doing something is to do it. The problem is that some effects are too slow (or, sometimes, too complicated) to correctly simulate. In these cases, it might be viable to make a deep-learning AI hallucinate a convincing result, even though no actual simulation took place.

Fluid dynamics, global illumination, and up-scaling are three examples.

Previously mentioned SIGGRAPH demo of fluid simulation without fluid simulation...
... just a trained AI hallucinating a scene based on input parameters.

Another place where AI could be useful is… well… AI. One way of making AI is to give it some set of data from the game environment, often including information that a player in its position would not be able to know, and having it run against a branching logic tree. Deep learning, on the other hand, can train itself on billions of examples of good and bad play, and produce results based on input parameters. While the two methods do not sound that different, the difference between logic being designed versus logic being assembled from an abstract good/bad dataset somewhat abstracts away the potential for assumptions and programmer error. Of course, it shifts that potential for error into the training dataset, but that’s a whole other discussion.

The third area that AI could be useful is when you’re creating the game itself.

There’s a lot of grunt and grind work when developing a video game. Licensing prefab solutions (or commissioning someone to do a one-off asset for you) helps ease this burden, but that gets expensive in terms of both time and money. If some of those assets could be created by giving parameters to a deep-learning AI, then those are assets that you would not need to make, allowing you to focus on other assets and how they all fit together.

These are three of the use cases that Microsoft is aiming WinML at.


Sure, these are smooth curves of large details, but the antialiasing pattern looks almost perfect.

For instance, Microsoft is pointing to an NVIDIA demo where they up-sample a photo of a car, once with bilinear filtering and once with a machine learning algorithm (although not WinML-based). The bilinear algorithm behaves exactly as someone who has used Photoshop would expect. The machine learning algorithm, however, was able to identify the objects that the image intended to represent, and it drew the edges that it thought made sense.
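
For reference, the "dumb" baseline in that comparison is plain bilinear filtering, which only ever blends the four nearest source pixels and has no idea what the image depicts. Here's a minimal single-channel sketch (the flat row-major image layout is just an assumption for the example):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sample a single-channel source image at a fractional coordinate by blending
// the four surrounding texels -- no notion of what the image depicts, which is
// exactly why edges come out soft compared to a learned upscaler.
float bilinearSample(const std::vector<float>& src, int width, int height, float x, float y) {
    int x0 = static_cast<int>(x), y0 = static_cast<int>(y);
    int x1 = std::min(x0 + 1, width - 1), y1 = std::min(y0 + 1, height - 1);
    float fx = x - x0, fy = y - y0;
    auto at = [&](int px, int py) { return src[static_cast<std::size_t>(py) * width + px]; };
    float top    = at(x0, y0) * (1 - fx) + at(x1, y0) * fx;
    float bottom = at(x0, y1) * (1 - fx) + at(x1, y1) * fx;
    return top * (1 - fy) + bottom * fy;
}

// Upscale by mapping each destination pixel back into source coordinates.
std::vector<float> upscale(const std::vector<float>& src, int sw, int sh, int dw, int dh) {
    std::vector<float> dst(static_cast<std::size_t>(dw) * dh);
    for (int y = 0; y < dh; ++y)
        for (int x = 0; x < dw; ++x)
            dst[static_cast<std::size_t>(y) * dw + x] =
                bilinearSample(src, sw, sh, x * float(sw) / dw, y * float(sh) / dh);
    return dst;
}

int main() {
    std::vector<float> src = {0, 1, 1, 0};   // 2x2 checker
    auto big = upscale(src, 2, 2, 4, 4);     // 4x4 result with blended edges
    return big.size() == 16 ? 0 : 1;
}
```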


Like their DirectX Raytracing (DXR) announcement, Microsoft plans to have PIX support WinML “on Day 1”. As for partners? They are currently working with Unity Technologies to provide WinML support in Unity’s ML-Agents plug-in. That’s all the game industry partners they have announced at the moment, though. It’ll be interesting to see who jumps in and who doesn’t over the next couple of years.

Author:
Manufacturer: AMD

Overview

It's clear by now that AMD's latest CPU releases, the Ryzen 3 2200G and the Ryzen 5 2400G, are compelling products. We've already taken a look at them in our initial review, as well as investigated how memory speed affects the graphics performance of the integrated GPU, but it seemed there was something missing.

Recently, it's been painfully clear that GPUs excel at more than just graphics rendering. With the rise of cryptocurrency mining, OpenCL and CUDA performance are as important as ever.

Cryptocurrency mining certainly isn't the only application where having a powerful GPU can help system performance. We set out to see how much of an advantage the Radeon Vega 11 graphics in the Ryzen 5 2400G provided over the significantly less powerful UHD 630 graphics in the Intel i5-8400.


| Test System Setup | |
|---|---|
| CPU | AMD Ryzen 5 2400G / Intel Core i5-8400 |
| Motherboard | Gigabyte AB350N-Gaming WiFi / ASUS STRIX Z370-E Gaming |
| Memory | 2 x 8GB G.SKILL FlareX DDR4-3200 (all memory running at 3200 MHz) |
| Storage | Corsair Neutron XTi 480 SSD |
| Sound Card | On-board |
| Graphics Card | AMD Radeon Vega 11 Graphics / Intel UHD 630 Graphics |
| Graphics Drivers | AMD 17.40.3701 / Intel 23.20.16.4901 |
| Power Supply | Corsair RM1000x |
| Operating System | Windows 10 Pro x64 RS3 |

 

GPGPU Compute

Before we take a look at some real-world examples of where a powerful GPU can be utilized, let's look at the relative power of the Vega 11 graphics on the Ryzen 5 2400G compared to the UHD 630 graphics on the Intel i5-8400.


SiSoft Sandra is a suite of benchmarks covering a wide array of system hardware and functionality, including an extensive range of GPGPU tests, which we are looking at today. 


Comparing the raw shader performance of the Ryzen 5 2400G and the Intel i5-8400 provides a clear snapshot of what we are dealing with. In every precision category, the Vega 11 graphics in the AMD part are significantly more powerful than the Intel UHD 630 graphics. This all combines to provide a 175% increase in aggregate shader performance over Intel for the AMD part. 

Now that we've taken a look at the theoretical power of these GPUs, let's see how they perform in real-world applications.

Continue reading our look at the GPU compute performance of the Ryzen 5 2400G!

Author:
Manufacturer: AMD

Memory Matters

Memory speed is not a factor that the average gamer thinks about when building their PC. For the most part, memory performance hasn't had much of an effect on modern processors running high-speed memory such as DDR3 and DDR4.

With last year's launch of AMD's Ryzen processors, a platform emerged that was more sensitive to memory speed. By running Ryzen processors with higher-frequency, lower-latency memory, users should see significant performance improvements, especially in 1080p gaming scenarios.

However, the Ryzen processors are not the only ones to exhibit this behavior.

Gaming on integrated GPUs is a perfect example of a memory-starved situation. Take, for instance, the new AMD Ryzen 5 2400G and its Vega-based GPU cores. In a full Vega 56 or 64, these Vega cores utilize blazingly fast HBM2 memory. However, due to constraints such as die space and cost, this processor does not integrate HBM.


Instead, the CPU portion and the graphics portion of the APU must both depend on the same pool of DDR4 system memory. DDR4 is significantly slower than the memory traditionally found on graphics cards, such as GDDR5 or HBM. As a result, APU performance is usually memory limited to some extent.
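
The scale of that handicap is easy to put into numbers. Here's a rough sketch comparing a dual-channel DDR4 pool against the kind of GDDR5 configuration a mid-range discrete card ships with (nominal peak rates, not measured bandwidth):

```cpp
#include <cstdio>

// Peak bandwidth = transfer rate (MT/s) * bus width (bits) * channels / 8.
double gb_per_s(double mega_transfers, int bus_bits, int channels = 1) {
    return mega_transfers * 1e6 * bus_bits * channels / 8.0 / 1e9;
}

int main() {
    // Dual-channel DDR4-3200: two 64-bit channels shared by the CPU *and* the GPU.
    std::printf("DDR4-3200 dual channel : %.1f GB/s\n", gb_per_s(3200, 64, 2));   // ~51.2 GB/s
    // Typical mid-range discrete card: 8 Gbps GDDR5 on a 256-bit bus, GPU-exclusive.
    std::printf("8 Gbps GDDR5, 256-bit  : %.1f GB/s\n", gb_per_s(8000, 256));     // ~256 GB/s
}
```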

In the past, we've done memory speed testing with AMD's older APUs, however with the launch of the new Ryzen and Vega based R3 2200G and R5 2400G, we decided to take another look at this topic.

For our testing, we are running the Ryzen 5 2400G at three different memory speeds, 2400 MHz, 2933 MHz, and 3200 MHz. While the maximum supported JEDEC memory standard for the R5 2400G is 2933, the memory provided by AMD for our processor review will support overclocking to 3200MHz just fine.

Continue reading our look at memory speed scaling with the Ryzen 5 2400G!

Author:
Manufacturer: ASUS

Specifications and Design

With all of the activity in both the GPU and CPU markets this year, it's hard to remember some of the launches from the first half of the year—including NVIDIA's GTX 1080 Ti. Little has challenged NVIDIA's GP102-based offering, which has held the rank of fastest gaming GPU for the majority of the year, making it the de facto choice for high-end gamers.

Even though we've been giving a lot of attention to NVIDIA's new flagship TITAN V graphics card, its $3,000 price tag puts it out of reach of almost every gamer who doesn't have a day job involving deep learning.


Today, we're taking a look back at the (slightly) more reasonable GP102 and one of the most premium offerings to feature it, the ASUS ROG Strix GTX 1080 Ti.

Hardware Specifications

While the actual specifications of the GP102 GPU onboard the ASUS Strix GTX 1080 Ti haven't changed at all, let's take a moment to refresh ourselves on where it sits relative to the rest of the market.

| | RX Vega 64 Liquid | RX Vega 56 | GTX 1080 Ti | GTX 1080 | GTX 1070 Ti | GTX 1070 |
|---|---|---|---|---|---|---|
| GPU Cores | 4096 | 3584 | 3584 | 2560 | 2432 | 1920 |
| Base Clock | 1406 MHz | 1156 MHz | 1480 MHz | 1607 MHz | 1607 MHz | 1506 MHz |
| Boost Clock | 1677 MHz | 1471 MHz | 1582 MHz | 1733 MHz | 1683 MHz | 1683 MHz |
| Texture Units | 256 | 256 | 224 | 160 | 152 | 120 |
| ROP Units | 64 | 64 | 88 | 64 | 64 | 64 |
| Memory | 8GB | 8GB | 11GB | 8GB | 8GB | 8GB |
| Memory Clock | 1890 MHz | 1600 MHz | 11000 MHz | 10000 MHz | 8000 MHz | 8000 MHz |
| Memory Interface | 2048-bit HBM2 | 2048-bit HBM2 | 352-bit G5X | 256-bit G5X | 256-bit | 256-bit |
| Memory Bandwidth | 484 GB/s | 410 GB/s | 484 GB/s | 320 GB/s | 256 GB/s | 256 GB/s |
| TDP | 345 watts | 210 watts | 250 watts | 180 watts | 180 watts | 150 watts |
| Peak Compute | 13.7 TFLOPS | 10.5 TFLOPS | 11.3 TFLOPS | 8.2 TFLOPS | 7.8 TFLOPS | 5.7 TFLOPS |
| MSRP (current) | $699 | $399 | $699 | $499 | $449 | $399 |

If you'd like some additional details on the NVIDIA GTX 1080 Ti, or its GP102 GPU, take a look at our review of the reference Founders Edition.

The GTX 10 series of products from NVIDIA has marked a consolidation in ASUS's GPU offerings. Instead of having both Strix and Matrix products available, the Strix line has supplanted everything else as the most premium option from ASUS for any given GPU, and the Strix GTX 1080 Ti doesn't disappoint.


While it might not be the largest graphics card we've ever seen, the ASUS Strix GTX 1080 Ti is larger in every dimension than both the NVIDIA Founders Edition card and the EVGA ICX option we took a look at earlier this year. Compared to the Founders Edition, the Strix GTX 1080 Ti is 1.23 in longer, 0.9 in taller, and takes up an extra PCIe slot in width.

Continue reading our review of the ASUS ROG Strix GTX 1080 Ti!!

How deep is your learning?

Recently, we've had some hands-on time with NVIDIA's new TITAN V graphics card. Equipped with the GV100 GPU, the TITAN V has shown us some impressive results in both gaming and GPGPU compute workloads.

However, one of the most interesting areas that NVIDIA has been touting for GV100 has been deep learning. With a 1.33x increase in single-precision FP32 compute over the Titan Xp, and the addition of specialized Tensor Cores for deep learning, the TITAN V is well positioned for deep learning workflows.

In mathematics, a tensor is a multi-dimensional array of numerical values with respect to a given basis. While we won't go deep into the math behind it, tensors are a crucial data structure for deep learning applications.


NVIDIA's Tensor Cores aim to accelerate tensor-based math by performing 4x4 matrix multiply-accumulate operations on half-precision FP16 inputs as a single operation. The GV100 GPU contains 640 of these Tensor Cores to accelerate FP16 neural network training.
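
Conceptually, the primitive each Tensor Core executes is a small fused matrix multiply-accumulate, D = A x B + C, on 4x4 tiles with FP16 inputs and a wider accumulator. Here's a scalar reference of that math (just the arithmetic, not how the hardware schedules it, and with plain float standing in for both precisions):

```cpp
#include <array>

using Mat4 = std::array<std::array<float, 4>, 4>;

// Scalar reference for the Tensor Core primitive D = A * B + C on a 4x4 tile.
// On GV100 the A and B inputs are FP16 while the accumulation (C and D) can be FP32.
Mat4 tensorCoreMma(const Mat4& A, const Mat4& B, const Mat4& C) {
    Mat4 D{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j) {
            float acc = C[i][j];
            for (int k = 0; k < 4; ++k)
                acc += A[i][k] * B[k][j];   // 4 multiply-adds per output element, 64 per tile
            D[i][j] = acc;
        }
    return D;
}

// Larger neural-network matrix multiplies are decomposed into many of these tiles.
int main() {
    Mat4 I{};                          // identity matrix
    for (int i = 0; i < 4; ++i) I[i][i] = 1.0f;
    Mat4 C{};                          // zero accumulator
    Mat4 D = tensorCoreMma(I, I, C);   // D should come back as the identity
    return D[0][0] == 1.0f ? 0 : 1;
}
```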

It's worth noting that this is not the first hardware dedicated to tensor operations, with others such as Google developing chips for these specific functions.

Test Setup

| PC Perspective Deep Learning Testbed | |
|---|---|
| Processor | AMD Ryzen Threadripper 1920X |
| Motherboard | GIGABYTE X399 AORUS Gaming 7 |
| Memory | 64GB Corsair Vengeance RGB DDR4-3000 |
| Storage | Samsung SSD 960 Pro 2TB |
| Power Supply | Corsair AX1500i 1500 watt |
| OS | Ubuntu 16.04.3 LTS |
| Drivers | AMD: AMD GPU Pro 17.50 / NVIDIA: 387.34 |

For our NVIDIA testing, we used the NVIDIA GPU Cloud 17.12 Docker containers for both TensorFlow and Caffe2 inside of our Ubuntu 16.04.3 host operating system.

AMD testing was done using the hiptensorflow port from the AMD ROCm GitHub repositories.

For all tests, we are using the ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) data set.

Continue reading our look at deep learning performance with the NVIDIA Titan V!!

Author:
Manufacturer: NVIDIA

Looking Towards the Professionals

This is a multi-part story for the NVIDIA Titan V:

Earlier this week we dove into the new NVIDIA Titan V graphics card and looked at its performance from a gaming perspective. Our conclusions were more or less what we expected - the card was on average ~20% faster than the Titan Xp and about ~80% faster than the GeForce GTX 1080. But with that $3000 price tag, the Titan V isn't going to win any enthusiasts over.

What the Titan V is meant for in reality is the compute space. Developers, coders, engineers, and professionals that use GPU hardware for research, for profit, or for both. In that case, $2999 for the Titan V is simply an investment that needs to show value in select workloads. And though $3000 is still a lot of money, keep in mind that the NVIDIA Quadro GP100, the most recent part with full-performance double precision compute from the Pascal chip, is still selling for well over $6000 today. 


The Volta GV100 GPU offers 1:2 double precision performance, equating to 2560 FP64 cores. That is a HUGE leap over the GP102 GPU used in the Titan Xp, which uses a 1:32 ratio, giving us the equivalent of just 120 FP64 cores.

| | Titan V | Titan Xp | GTX 1080 Ti | GTX 1080 | GTX 1070 Ti | GTX 1070 | RX Vega 64 Liquid | Vega Frontier Edition |
|---|---|---|---|---|---|---|---|---|
| GPU Cores | 5120 | 3840 | 3584 | 2560 | 2432 | 1920 | 4096 | 4096 |
| FP64 Cores | 2560 | 120 | 112 | 80 | 76 | 60 | 256 | 256 |
| Base Clock | 1200 MHz | 1480 MHz | 1480 MHz | 1607 MHz | 1607 MHz | 1506 MHz | 1406 MHz | 1382 MHz |
| Boost Clock | 1455 MHz | 1582 MHz | 1582 MHz | 1733 MHz | 1683 MHz | 1683 MHz | 1677 MHz | 1600 MHz |
| Texture Units | 320 | 240 | 224 | 160 | 152 | 120 | 256 | 256 |
| ROP Units | 96 | 96 | 88 | 64 | 64 | 64 | 64 | 64 |
| Memory | 12GB | 12GB | 11GB | 8GB | 8GB | 8GB | 8GB | 16GB |
| Memory Clock | 1700 MHz | 11400 MHz | 11000 MHz | 10000 MHz | 8000 MHz | 8000 MHz | 1890 MHz | 1890 MHz |
| Memory Interface | 3072-bit HBM2 | 384-bit G5X | 352-bit G5X | 256-bit G5X | 256-bit | 256-bit | 2048-bit HBM2 | 2048-bit HBM2 |
| Memory Bandwidth | 653 GB/s | 547 GB/s | 484 GB/s | 320 GB/s | 256 GB/s | 256 GB/s | 484 GB/s | 484 GB/s |
| TDP | 250 watts | 250 watts | 250 watts | 180 watts | 180 watts | 150 watts | 345 watts | 300 watts |
| Peak Compute | 12.2 TFLOPS (base) / 14.9 TFLOPS (boost) | 12.1 TFLOPS | 11.3 TFLOPS | 8.2 TFLOPS | 7.8 TFLOPS | 5.7 TFLOPS | 13.7 TFLOPS | 13.1 TFLOPS |
| Peak DP Compute | 6.1 TFLOPS (base) / 7.45 TFLOPS (boost) | 0.37 TFLOPS | 0.35 TFLOPS | 0.25 TFLOPS | 0.24 TFLOPS | 0.17 TFLOPS | 0.85 TFLOPS | 0.81 TFLOPS |
| MSRP (current) | $2999 | $1299 | $699 | $499 | $449 | $399 | $699 | $999 |

The current AMD Radeon RX Vega 64 and the Vega Frontier Edition both ship with a 1:16 FP64 ratio, giving us the equivalent of 256 DP cores per card.
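
Those FP64 core counts and the peak double-precision figures in the table above fall straight out of the ratios; here's a quick sketch of the arithmetic (small rounding differences aside):

```cpp
#include <cstdio>

// FP64 cores = FP32 cores / ratio; peak TFLOPS = cores * 2 ops per clock (FMA) * clock.
void report(const char* name, int fp32_cores, int ratio, double boost_mhz) {
    int fp64_cores = fp32_cores / ratio;
    double dp_tflops = fp64_cores * 2.0 * boost_mhz * 1e6 / 1e12;
    std::printf("%-10s %4d FP64 cores, %.2f DP TFLOPS\n", name, fp64_cores, dp_tflops);
}

int main() {
    report("Titan V",  5120,  2, 1455);   // 2560 cores, ~7.45 TFLOPS
    report("Titan Xp", 3840, 32, 1582);   //  120 cores, ~0.38 TFLOPS
    report("Vega 64",  4096, 16, 1677);   //  256 cores, ~0.86 TFLOPS
}
```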

Test Setup and Benchmarks

Our testing setup remains the same from our gaming tests, but obviously the software stack is quite different. 

| PC Perspective GPU Testbed | |
|---|---|
| Processor | Intel Core i7-5960X Haswell-E |
| Motherboard | ASUS Rampage V Extreme X99 |
| Memory | G.Skill Ripjaws 16GB DDR4-3200 |
| Storage | OCZ Agility 4 256GB (OS), Adata SP610 500GB (games) |
| Power Supply | Corsair AX1500i 1500 watt |
| OS | Windows 10 x64 |
| Drivers | AMD: 17.10.2 / NVIDIA: 388.59 |

Applications in use include:

  • Luxmark 
  • Cinebench R15
  • VRay
  • Sisoft Sandra GPU Compute
  • SPECviewperf 12.1
  • FAHBench

Let's not drag this along - I know you are hungry for results! (Thanks to Ken for running most of these tests for us!!)

Continue reading part 2 of our Titan V review on compute performance!!

Author:
Manufacturer: NVIDIA

A preview of potential Volta gaming hardware

This is a multi-part story for the NVIDIA Titan V:

As a surprise to most of us in the media community, NVIDIA launched a new graphics card to the world, the TITAN V. No longer sporting the GeForce brand, NVIDIA has returned the Titan line of cards to where it began – clearly targeted at the world of developers and general purpose compute. And if that branding switch isn’t enough to drive that home, I’m guessing the $2999 price tag will be.

Today’s article is going to look at the TITAN V from the angle that is likely most interesting to the majority of our readers, which also happens to be the angle that NVIDIA is least interested in us discussing. Though targeted at machine learning and the like, there is little doubt in my mind that some crazy people will want to take on the $3000 price to see what kind of gaming power this card can provide. After all, this marks the first time that a Volta-based GPU from NVIDIA has shipped in a form consumers can get their hands on, and the first time it has shipped with display outputs. (That’s kind of important if you want to build a PC around it…)


From a scientific standpoint, we wanted to look at the Titan V for the same reasons we tested the AMD Vega Frontier Edition cards upon their launch: using it to estimate how future consumer-class cards will perform in gaming. And, just as we had to do then, we purchased this Titan V from NVIDIA.com with our own money. (If anyone wants to buy this from me to recoup the costs, please let me know! Ha!)

| | Titan V | Titan Xp | GTX 1080 Ti | GTX 1080 | GTX 1070 Ti | GTX 1070 | RX Vega 64 Liquid | Vega Frontier Edition |
|---|---|---|---|---|---|---|---|---|
| GPU Cores | 5120 | 3840 | 3584 | 2560 | 2432 | 1920 | 4096 | 4096 |
| Base Clock | 1200 MHz | 1480 MHz | 1480 MHz | 1607 MHz | 1607 MHz | 1506 MHz | 1406 MHz | 1382 MHz |
| Boost Clock | 1455 MHz | 1582 MHz | 1582 MHz | 1733 MHz | 1683 MHz | 1683 MHz | 1677 MHz | 1600 MHz |
| Texture Units | 320 | 240 | 224 | 160 | 152 | 120 | 256 | 256 |
| ROP Units | 96 | 96 | 88 | 64 | 64 | 64 | 64 | 64 |
| Memory | 12GB | 12GB | 11GB | 8GB | 8GB | 8GB | 8GB | 16GB |
| Memory Clock | 1700 MHz | 11400 MHz | 11000 MHz | 10000 MHz | 8000 MHz | 8000 MHz | 1890 MHz | 1890 MHz |
| Memory Interface | 3072-bit HBM2 | 384-bit G5X | 352-bit G5X | 256-bit G5X | 256-bit | 256-bit | 2048-bit HBM2 | 2048-bit HBM2 |
| Memory Bandwidth | 653 GB/s | 547 GB/s | 484 GB/s | 320 GB/s | 256 GB/s | 256 GB/s | 484 GB/s | 484 GB/s |
| TDP | 250 watts | 250 watts | 250 watts | 180 watts | 180 watts | 150 watts | 345 watts | 300 watts |
| Peak Compute | 12.2 TFLOPS (base) / 14.9 TFLOPS (boost) | 12.1 TFLOPS | 11.3 TFLOPS | 8.2 TFLOPS | 7.8 TFLOPS | 5.7 TFLOPS | 13.7 TFLOPS | 13.1 TFLOPS |
| MSRP (current) | $2999 | $1299 | $699 | $499 | $449 | $399 | $699 | $999 |

The Titan V is based on the GV100 GPU, though with some tweaks that slightly lower performance and capability compared to the Tesla-branded equivalent hardware. Though our add-in card iteration has the full 5120 CUDA cores enabled, the HBM2 memory bus is reduced from 4096-bit to 3072-bit, with one of the four stacks on the package disabled. This also drops the memory capacity from 16GB to 12GB and memory bandwidth to 652.8 GB/s.
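
That 652.8 GB/s figure follows directly from the narrower bus. Here's the quick arithmetic, using the 1700 MHz (1.7 Gbps per pin) figure from the table:

```cpp
#include <cstdio>

// HBM2 bandwidth = per-pin data rate (Gbps) * bus width (bits) / 8.
double hbm2_gbs(double gbps_per_pin, int bus_bits) {
    return gbps_per_pin * bus_bits / 8.0;
}

int main() {
    // Titan V: three of four stacks enabled -> 3072-bit bus at 1.7 Gbps per pin.
    std::printf("Titan V (3072-bit) : %.1f GB/s\n", hbm2_gbs(1.7, 3072)); // 652.8 GB/s
    // The same memory on the full 4096-bit GV100 bus would give:
    std::printf("Full bus (4096-bit): %.1f GB/s\n", hbm2_gbs(1.7, 4096)); // 870.4 GB/s
}
```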

Continue reading our gaming review of the NVIDIA Titan V!!