Author:
Manufacturer: NVIDIA

A preview of potential Volta gaming hardware

As a surprise to most of us in the media community, NVIDIA launched a new graphics card to the world, the TITAN V. No longer sporting the GeForce brand, NVIDIA has returned the Titan line of cards to where it began – clearly targeted at the world of developers and general purpose compute. And if that branding switch isn’t enough to drive that home, I’m guessing the $2999 price tag will be.

Today’s article is going to look at the TITAN V from the angle that is likely most interesting to the majority of our readers, which also happens to be the angle that NVIDIA is least interested in us discussing. Though the card is targeted at machine learning and the like, there is little doubt in my mind that some crazy people will want to take on the $3000 price to see what kind of gaming power this card can provide. After all, this marks the first time that a Volta-based GPU from NVIDIA has shipped in a form a consumer can actually get their hands on, and the first time it has shipped with display outputs. (That’s kind of important if you want to build a PC around it…)

IMG_4999.JPG

From a scientific standpoint, we wanted to look at the Titan V for the same reasons we tested the AMD Vega Frontier Edition cards upon their launch: using it to estimate how future consumer-class cards will perform in gaming. And, just as we had to do then, we purchased this Titan V from NVIDIA.com with our own money. (If anyone wants to buy this from me to recoup the costs, please let me know! Ha!)

  Titan V Titan Xp GTX 1080 Ti GTX 1080 GTX 1070 Ti GTX 1070 RX Vega 64 Liquid Vega Frontier Edition
GPU Cores 5120 3840 3584 2560 2432 1920 4096 4096
Base Clock 1200 MHz 1480 MHz 1480 MHz 1607 MHz 1607 MHz 1506 MHz 1406 MHz 1382 MHz
Boost Clock 1455 MHz 1582 MHz 1582 MHz 1733 MHz 1683 MHz 1683 MHz 1677 MHz 1600 MHz
Texture Units 320 240 224 160 152 120 256 256
ROP Units 96 96 88 64 64 64 64 64
Memory 12GB 12GB 11GB 8GB 8GB 8GB 8GB 16GB
Memory Clock 1700 MHz 11400 MHz 11000 MHz 10000 MHz 8000 MHz 8000 MHz 1890 MHz 1890 MHz
Memory Interface 3072-bit HBM2 384-bit G5X 352-bit G5X 256-bit G5X 256-bit 256-bit 2048-bit HBM2 2048-bit HBM2
Memory Bandwidth 653 GB/s 547 GB/s 484 GB/s 320 GB/s 256 GB/s 256 GB/s 484 GB/s 484 GB/s
TDP 250 watts 250 watts 250 watts 180 watts 180 watts 150 watts 345 watts 300 watts
Peak Compute 12.2 (base) / 14.9 (boost) TFLOPS 12.1 TFLOPS 11.3 TFLOPS 8.2 TFLOPS 7.8 TFLOPS 5.7 TFLOPS 13.7 TFLOPS 13.1 TFLOPS
MSRP (current) $2999 $1299 $699 $499 $449 $399 $699 $999

The Titan V is based on the GV100 GPU, though with some tweaks that slightly lower performance and capability compared to the Tesla-branded equivalent hardware. While our add-in card iteration has the full 5120 CUDA cores enabled, the HBM2 memory bus is cut from 4096-bit to 3072-bit, with one of the four stacks on the package disabled. This also drops the memory capacity from 16GB to 12GB, and memory bandwidth to 652.8 GB/s.
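
As a quick sanity check of the table figures above (my own arithmetic, not anything from NVIDIA), both the bandwidth and peak compute numbers fall out of simple math: bandwidth is the bus width in bytes times the per-pin data rate, and peak FP32 compute is two operations (a fused multiply-add) per CUDA core per clock. A minimal sketch:

```python
# Back-of-the-envelope check of the Titan V figures in the table above.
bus_width_bits = 3072          # three of the four 1024-bit HBM2 stacks enabled
data_rate_gbps = 1.7           # 1.7 Gbps per pin ("1700 MHz" effective)
bandwidth_gbs = bus_width_bits / 8 * data_rate_gbps
print(f"Memory bandwidth: {bandwidth_gbs:.1f} GB/s")    # 652.8 GB/s

cuda_cores = 5120
for label, clock_ghz in [("base", 1.200), ("boost", 1.455)]:
    tflops = cuda_cores * 2 * clock_ghz / 1000          # 2 FP32 ops (FMA) per core per clock
    print(f"Peak FP32 compute ({label}): {tflops:.1f} TFLOPS")
# base -> ~12.3 TFLOPS (the table rounds down to 12.2), boost -> 14.9 TFLOPS
```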

Continue reading our gaming review of the NVIDIA Titan V!!

Author:
Manufacturer: AMD

The flower, not the hormone

It was way back in December of 2014 that AMD and the Radeon group first started down the path of major driver updates on an annual cadence. The Catalyst Omega release marked the beginning of a recommitment to the needs of gamers (and now professionals) with more frequent, and more dramatic, software updates and improvements. Cognizant of its previous reputation for drivers and software, often a distant second to the success NVIDIA had created with its GeForce drivers, AMD promised Radeon users continuous improvement.

And make no mistake, the team at AMD had an uphill battle. But with releases like Omega, Crimson, ReLive, and now Adrenalin, it’s clear that the leadership has received the message and put emphasis on the portion of its product that can have the most significant impact on experience.

AMD joins us at the PCPer offices to talk through all the new features and capabilities!

Named after the adrenalin rose, rather than the drug that flows through your body when being chased by feral cats, this latest major software release for Radeon users includes a host of new and upgraded features that should bring a fresh coat of paint to any existing GPU. Two big features will steal the show: the new Radeon Overlay and a mobile app called AMD Link. But expansions to ReLive, Wattman, Enhanced Sync, and Chill are equally compelling.

Let’s start with what I think will get the most attention, and deservedly so: the Radeon Overlay. As the name would suggest, the overlay can be brought up through an in-game hotkey and allows the gamer to access graphics card monitoring tools and many driver settings without leaving the game, alt-tabbing, or closing the game to apply changes. By hitting Alt-R, a panel will show up on the right-hand side of the display, with the game continuing to run in the background. The user can interact with the menu via mouse or keyboard, and then hit the same hotkey or Esc to return.

adrenalin-49.jpg

Continue reading our look at the new AMD Radeon Software Adrenalin Edition driver!!

A New Frontier

Console game performance has always been an area we've been interested in here at PC Perspective, but it has mostly been out of our reach to evaluate with any kind of scientific tilt. Our Frame Rating methodology for PC-based game analysis relies on an overlay application running during screen capture, with the captured video later analyzed by a series of scripts. Obviously, we cannot take this approach with consoles, as we cannot install our own code on the consoles to run that overlay.

A few other publications such as Eurogamer with their Digital Foundry subsite have done fantastic work developing their internal toolsets for evaluating console games, but this type of technology has mostly remained out of reach of the everyman.

trdrop2.PNG_.png

Recently, we came across an open source project that aims to address this. trdrop is open source software built on OpenCV, a stalwart library in the world of computer vision. Using OpenCV, trdrop analyzes the frames of ordinary gameplay footage (no overlay required), comparing consecutive frames to detect dropped frames and tearing and to derive a real-time frame rate.
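
To make the idea a bit more concrete, here is a minimal sketch of the frame-difference approach that tools like trdrop are built around. To be clear, this is not trdrop's actual code; the capture filename and the difference threshold are placeholder assumptions, and real tools also handle tear detection and per-second bucketing.

```python
# Minimal sketch of the frame-difference idea behind tools like trdrop.
# Not trdrop's actual code; the filename and threshold below are placeholders.
import cv2
import numpy as np

cap = cv2.VideoCapture("capture.mp4")        # e.g. a 60 fps capture of console output
capture_fps = cap.get(cv2.CAP_PROP_FPS)

prev_gray = None
new_frames = 0
captured_frames = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if prev_gray is not None:
        # Mean absolute pixel difference between consecutive captured frames;
        # a (near) zero difference means the game repeated the previous frame.
        if np.mean(cv2.absdiff(gray, prev_gray)) > 1.0:
            new_frames += 1
    prev_gray = gray
    captured_frames += 1

cap.release()
# If only half of the captured frames are new, the effective frame rate is
# half of the capture rate, and so on.
effective_fps = capture_fps * new_frames / max(captured_frames - 1, 1)
print(f"Average effective frame rate: {effective_fps:.1f} FPS")
```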

trdrop1.PNG

This means that trdrop can analyze gameplay footage from any source, be it console, PC, or anything in-between from which you can get a direct video capture feed. Now that PC capture cards capable of 1080p60, and even 4K60, are coming down in price, software like this allows more gamers to peek at the performance of their games, which we think is always a good thing.

It's worth noting that trdrop is still listed as "alpha" software on its GitHub repo, but we have found it to be very stable and flexible in its current iteration.

  Xbox One S Xbox One X PS4 PS4 Pro
CPU 8x Jaguar @ 1.75 GHz 8x Jaguar @ 2.3 GHz 8x Jaguar @ 1.6 GHz 8x Jaguar @ 2.1 GHz
GPU CUs 12x GCN @ 914 MHz 40x Custom @ 1172 MHz 18x GCN @ 800 MHz 36x GCN @ 911 MHz
GPU Compute 1.4 TFLOPS 6.0 TFLOPS 1.84 TFLOPS 4.2 TFLOPS
Memory 8 GB DDR3 + 32MB ESRAM 12 GB GDDR5 8 GB GDDR5 8 GB GDDR5
Memory Bandwidth 219 GB/s 326 GB/s 176 GB/s 218 GB/s
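
As a quick aside, the "GPU Compute" row above falls out of the standard GCN math: each compute unit carries 64 shaders, and each shader can retire two FP32 operations (a fused multiply-add) per clock. A small sketch of that calculation, using the clocks from the table:

```python
# How the "GPU Compute" row above is derived for these GCN-based consoles:
# each CU has 64 shaders, and each shader can do 2 FP32 ops (an FMA) per clock.
consoles = {
    "Xbox One S": (12, 0.914),   # (compute units, GPU clock in GHz)
    "Xbox One X": (40, 1.172),
    "PS4":        (18, 0.800),
    "PS4 Pro":    (36, 0.911),
}
for name, (cus, clock_ghz) in consoles.items():
    tflops = cus * 64 * 2 * clock_ghz / 1000
    print(f"{name}: {tflops:.2f} TFLOPS")
# Xbox One S ~1.40, Xbox One X ~6.00, PS4 ~1.84, PS4 Pro ~4.20
```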

Now that the Xbox One X is out, we figured it would be a good time to take a look at the current generation of consoles and their performance in a few games as a way to get our feet wet with this new software and method. We are only testing 1080p here, but we now have our hands on a 4K HDMI capture card capable of 60Hz for some future testing! (More on that soon.)

Continue reading our look at measuring performance of the Xbox One X!

Author:
Manufacturer: Intel

The Expected Unexpected

Last night we received word that Raja had resigned from AMD, having been on a sabbatical since shortly after the launch of Vega.  The initial statement at the time was that Raja would return to his position at AMD in a December/January timeframe.  During this period there was some doubt as to whether Raja would in fact come back to AMD, as “sabbaticals” in the tech world often lead the individual to take stock of their situation and move on to what they consider to be greener pastures.

raja_ryan.JPG

Raja has dropped by the PCPer offices in the past.

Initially it was thought that Raja would take the time off and then eventually jump to another company and tackle the issues there.  This behavior is quite common in Silicon Valley, and Raja is no stranger to it.  Raja cut his teeth on 3D graphics at S3, but in 2001 he moved to ATI.  While there he worked on a variety of programs, including the original Radeon, the industry-changing Radeon 9700 series, and finishing up with the strong HD 4000 series of parts.  During this time ATI was acquired by AMD and he became one of the top graphics gurus at that company.  In 2009 he left AMD and moved on to Apple.  He was Director of Graphics Architecture at Apple, but little is known about what he actually did.  During that time Apple utilized AMD GPUs and licensed Imagination Technologies graphics technology.  Apple could have been working on developing its own architecture at this point, which has recently shown up in the latest iPhone products.

In 2013 Raja rejoined AMD and became a corporate VP of Visual Computing, but in 2015 he was promoted to lead the Radeon Technologies Group after Lisa Su became CEO of the company. While there, Raja worked to get AMD back on an even footing under pretty strained conditions. AMD had not had the greatest of years and had seen its primary moneymakers start taking on water.  AMD had competitive graphics for the most part, and the Radeon technology integrated into AMD’s APUs truly was class leading.  On the discrete side AMD was able to compare favorably to NVIDIA with the HD 7000 and later R9 200 series of cards, but after NVIDIA released its Maxwell-based chips, AMD had a hard time keeping up.  The general consensus is that the RTG group saw its headcount reduced by the company-wide cuts, as well as a decrease in R&D funds.

Continue reading about Raja Koduri joining Intel...

Author:
Manufacturer: NVIDIA

Here comes a new challenger

The release of the GeForce GTX 1070 Ti has been an odd adventure. Launched into a narrow window of the product stack between the GTX 1070 and the GTX 1080, the GTX 1070 Ti is a result of competition from the AMD RX Vega product line. Sure, NVIDIA might have specced out and prepared an in-between product for some time, but it was the release of competitive high-end graphics cards from AMD (for the first time in forever, it seems) that pushed NVIDIA to launch what you see before us today.

With MSRPs of $399 and $499 for the GTX 1070 and GTX 1080 respectively, a new product that fits between them performance-wise has very little room to stretch its legs. Because of that, there are some interesting peculiarities involved with the release cycle surrounding overclocks, partner cards, and more.

IMG_4944.JPG

But before we get into that concoction, let’s first look at the specifications of this new GPU option from NVIDIA as well as the reference Founders Edition and EVGA SC Black Edition cards that made it to our offices!

GeForce GTX 1070 Ti Specifications

We start with our classic table of details.

  RX Vega 64 Liquid RX Vega 64 Air RX Vega 56 Vega Frontier Edition GTX 1080 Ti GTX 1080 GTX 1070 Ti GTX 1070
GPU Cores 4096 4096 3584 4096 3584 2560 2432 1920
Base Clock 1406 MHz 1247 MHz 1156 MHz 1382 MHz 1480 MHz 1607 MHz 1607 MHz 1506 MHz
Boost Clock 1677 MHz 1546 MHz 1471 MHz 1600 MHz 1582 MHz 1733 MHz 1683 MHz 1683 MHz
Texture Units 256 256 256 256 224 160 152 120
ROP Units 64 64 64 64 88 64 64 64
Memory 8GB 8GB 8GB 16GB 11GB 8GB 8GB 8GB
Memory Clock 1890 MHz 1890 MHz 1600 MHz 1890 MHz 11000 MHz 10000 MHz 8000 MHz 8000 MHz
Memory Interface 2048-bit HBM2 2048-bit HBM2 2048-bit HBM2 2048-bit HBM2 352-bit G5X 256-bit G5X 256-bit 256-bit
Memory Bandwidth 484 GB/s 484 GB/s 410 GB/s 484 GB/s 484 GB/s 320 GB/s 256 GB/s 256 GB/s
TDP 345 watts 295 watts 210 watts 300 watts 250 watts 180 watts 180 watts 150 watts
Peak Compute 13.7 TFLOPS 12.6 TFLOPS 10.5 TFLOPS 13.1 TFLOPS 11.3 TFLOPS 8.2 TFLOPS 7.8 TFLOPS 5.7 TFLOPS
MSRP (current) $699 $499 $399 $999 $699 $499 $449 $399

If you have followed the leaks and stories over the last month or so, the information here isn’t going to be a surprise. The CUDA core count of the GTX 1070 Ti is 2432, only one SM unit fewer than the GTX 1080. The base clock matches the GTX 1080 at 1607 MHz, while the 1683 MHz boost clock matches the GTX 1070. The memory system includes 8GB of GDDR5 running at 8 GHz, matching the GTX 1070 in this case. The TDP gets a bump up to 180 watts, in line with the GTX 1080 and slightly higher than the GTX 1070.

Continue reading our review of the GeForce GTX 1070 Ti!

Forza Motorsport 7 Performance

The first full Forza Motorsport title available for the PC, Forza Motorsport 7 on Windows 10 launched simultaneously with the Xbox version earlier this month. With native 4K assets, HDR support, and new visual features like fully dynamic weather, this title is an excellent showcase of what modern PC hardware can do.

forza7-screen.png

Now that both AMD and NVIDIA have released drivers optimized for Forza 7, we've taken the opportunity to measure performance across an array of different GPUs. After the significant performance mishaps with last year's Forza Horizon 3 at launch on PC, we are excited to see whether Forza Motorsport 7 brings some much-needed improvements.

For this testing, we used our standard GPU testbed, including an 8-core Haswell-E processor and plenty of memory and storage.

  PC Perspective GPU Testbed
Processor Intel Core i7-5960X Haswell-E
Motherboard ASUS Rampage V Extreme X99
Memory G.Skill Ripjaws 16GB DDR4-3200
Storage OCZ Agility 4 256GB (OS)
Adata SP610 500GB (games)
Power Supply Corsair AX1500i 1500 watt
OS Windows 10 x64 
Drivers AMD: 17.10.1 (Beta)
NVIDIA: 387.92

As with a lot of modern console-first titles, Forza 7 defaults to "Dynamic" image quality settings. This means that the game engine is supposed to find the best image settings for your hardware automatically, and dynamically adjust them so that you hit a target frame rate (adjustable between 30 and 60fps) no matter what is going on in the current scene that is being rendered.

While this is a good strategy for consoles, and even for casual PC gamers, it poses a problem for us trying to measure equivalent performance across GPUs. Luckily the developers of Forza Motorsport 7, Turn 10 Studios, still let you disable the dynamic control and configure the image quality settings as you desire.

One quirk, however, is that in order for V-Sync to be disabled, the rendering resolution within the game must match the native resolution of your monitor. This means that if you want to run at 2560x1440 on your 4K monitor, you must first set the desktop resolution in Windows to 2560x1440 in order to run the game in V-Sync off mode.

forza7-settings.png

We did our testing with an array of three different resolutions (1080p, 1440p, and 4K) at maximum image quality settings. We tested both AMD and NVIDIA graphics cards in similar price and performance segments. The built-in benchmark mode for this game was used, which does feature some variance due to dynamic weather patterns. However, our testing within the full game matched the results of the benchmark mode closely, so we used it for our final results.

forza7-avgfps.png

Right off the bat, I have been impressed at how well optimized Forza Motorsport 7 seems to be on the PC. Compared to the unoptimized disaster that was Forza Horizon 3 when it launched on PC last year, it's clear that Turn 10 Studios and Microsoft have come a long way.

Even gamers looking to play on a 4K display at 60Hz can seemingly get away with cheaper, more mainstream GPUs such as the RX 580 or the GTX 1060 with acceptable performance in most scenarios.

Gamers on high-refresh-rate displays don't have the same luxury. If you want to game at a resolution such as 2560x1440 at a full 144Hz, neither the RX Vega 64 nor the GTX 1080 will get you there with maximum image quality settings, although both GPUs are close enough that turning down a few settings should let you reach your full refresh rate.

For some reason, the RX Vega cards didn't seem to show any scaling in performance when moving from 2560x1440 to 1920x1080, unlike the Polaris-based RX 580 and the NVIDIA options. We aren't quite sure of the cause of this and have reached out to AMD for clarification.

As far as frame times are concerned, we also gathered some data with our Frame Rating capture analysis system.

Forza7_2560x1440_PLOT.png

Forza7_2560x1440_STUT.png

Taking a look at the first chart, we can see that while the GTX 1080 frame times are extremely consistent, the RX Vega 64 shows some additional variance.

However, the frame time variance chart shows that over 95% of the RX Vega 64's frame times come in at under 2ms of variance, which will still provide a smooth gameplay experience in most scenarios. This matches our experience playing on both AMD and NVIDIA hardware, where we saw no major issues with gameplay smoothness.
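
For readers curious how a "percent of frames under X ms of variance" figure can be derived from captured frame times, here is a simplified sketch. It is not the exact math behind our Frame Rating scripts, just one straightforward way to express the idea, and the sample frame times are hypothetical.

```python
# Simplified take on a frame-time variance metric (not the exact math behind
# our Frame Rating scripts): how many frames differ from the previous frame
# by less than a given number of milliseconds?
import numpy as np

def fraction_within_band(frame_times_ms, band_ms=2.0):
    deltas = np.abs(np.diff(np.asarray(frame_times_ms, dtype=float)))
    return float(np.mean(deltas < band_ms))

# Hypothetical capture: mostly ~16.7 ms frames with a couple of slow outliers.
sample = [16.7, 16.6, 16.9, 18.5, 16.7, 16.8, 21.0, 16.7, 16.6, 16.7]
print(f"{fraction_within_band(sample, 2.0) * 100:.0f}% of frames within 2 ms of variance")
```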

forza7-screen2.png

Forza Motorsport 7 seems to be a great addition to the PC gaming world (if you don't mind using the Microsoft Store exclusively) and will run great on a wide array of hardware. Whether you have an NVIDIA or AMD GPU, you should be able to enjoy this fantastic racing simulator.

Author:
Manufacturer: NVIDIA

Can you hear me now?

One of the more significant downsides to modern gaming notebooks is noise. These devices normally have small fans that have to spin quickly to cool the high-performance components found inside. While the answer for a loud gaming desktop might be a nice set of headphones, that does nothing for the friends or loved ones nearby when a notebook is used in more public spaces.

Attempting to address the problem of loud gaming notebooks, NVIDIA released a technology called WhisperMode. WhisperMode launched alongside NVIDIA's Max-Q design notebooks earlier this year, but it will work with any notebook enabled with an NVIDIA GTX 1060 or higher. This software solution aims to limit noise and power consumption of notebooks by restricting the frame rate of your game to a reasonable compromise of performance, noise, and power levels. NVIDIA has profiled over 400 games to find this sweet spot and added profiles for those games to WhisperMode technology.
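
WhisperMode itself lives in the driver, but the core idea of a frame rate target is simple: render no faster than the cap and let the GPU sit idle (and clock down) for the rest of each frame interval. The sketch below is a generic, application-level illustration of that concept, not NVIDIA's implementation.

```python
# Generic illustration of a frame rate cap, the concept WhisperMode builds on.
# This is not NVIDIA's implementation, just an application-level sketch.
import time

def run_capped(render_frame, target_fps=60, duration_s=2.0):
    frame_budget = 1.0 / target_fps
    end = time.perf_counter() + duration_s
    while time.perf_counter() < end:
        start = time.perf_counter()
        render_frame()                        # do the work for one frame
        elapsed = time.perf_counter() - start
        # Any budget left over is spent idle instead of rendering extra frames,
        # which is what lets the GPU drop to lower clocks and fan speeds.
        if elapsed < frame_budget:
            time.sleep(frame_budget - elapsed)

# Example: a stand-in workload that "renders" in ~4 ms, capped to 60 FPS.
run_capped(lambda: time.sleep(0.004), target_fps=60)
```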

WhisperMode is enabled through the NVIDIA GeForce Experience application.

GFE-whisper.PNG

From GFE, you can also choose to "Optimize games for WhisperMode." This will automatically adjust settings (in-game) to complement the frame rate target control of WhisperMode.

NVCP_whisper.PNG

If you want to adjust the Frame Rate Target, that must be done in the traditional NVIDIA Control Panel and is set on a per-app basis. The target can be set at intervals of 5 FPS from 30 up to the maximum refresh rate of your display. Having to move between two pieces of software to tweak these settings seems overly complex, and hopefully an upcoming revamp of the NVIDIA software stack will address this user interface shortcoming.

To put WhisperMode through its paces, we tried it on two notebooks - one with a GTX 1070 Max-Q (the MSI GS63VR) and one with a GTX 1080 Max-Q (the ASUS ROG Zephyrus). Our testing consisted of two games, Metro: Last Light and Hitman. Both of these games were run for 15 minutes to get the system up to temperature and achieve sound measurements that are more realistic to extended gameplay sessions. Sound levels were measured with our Extech 407739 Sound Level Meter placed at a distance of 6 inches from the given notebooks, above the keyboard and offset to the right.

Continue reading our review of the new NVIDIA WhisperMode technology!

Author:
Manufacturer: Intel

A surprise twist from Intel

Any expectations I had of a slower and less turbulent late summer and fall for the technology and hardware segments are getting shattered today with the beginning of Intel’s 8th Generation Core processor rollout. If you happen to think that this 8th generation is coming hot on the heels of the 7th generation that only just released to the consumer desktop market in January of this year, you’d be on the same page as me. If you are curious how Intel plans to balance Kaby Lake, Coffee Lake, and Cannon Lake, all releasing in similar time frames, while still using terms like “generation,” then again, we are on the same page.

badge.jpg

Today Intel launches the 15-watt version of its 8th Generation Core Processors, based on a refresh of the Kaby Lake CPU design. This is not a new architecture, nor is it a new process node, though Intel does talk about slight changes in design and manufacturing that make the improvements possible. The U-series processors that make up the majority of thin-and-light and 2-in-1 designs for consumers and businesses are getting a significant upgrade in performance with this release. The Core i7 and Core i5 processors being announced will all be quad-core, HyperThreaded designs, moving us away from the world of dual-core processors in the 7th generation. Doubling core and thread count, while remaining inside the 15-watt thermal envelope, is an incredible move and will strengthen Intel’s claim to this very important and very profitable segment.

Let’s look at the specifications table first. After all, we’re all geeks here.

  Core i7-8650U Core i7-8550U Core i5-8350U Core i5-8250U Core i7-7600U Core i7-7500U
Architecture Kaby Lake Refresh Kaby Lake Refresh Kaby Lake Refresh Kaby Lake Refresh Kaby Lake Kaby Lake
Process Tech 14nm+ 14nm+ 14nm+ 14nm+ 14nm+ 14nm+
Socket BGA1356 BGA1356 BGA1356 BGA1356 BGA1356 BGA1356
Cores/Threads 4/8 4/8 4/8 4/8 2/4 2/4
Base Clock 1.9 GHz 1.8 GHz 1.7 GHz 1.6 GHz 2.8 GHz 2.7 GHz
Max Turbo Clock 4.2 GHz 4.0 GHz 3.8 GHz 3.6 GHz 3.9 GHz 3.5 GHz
Memory Tech DDR4/LPDDR3 DDR4/LPDDR3 DDR4/LPDDR3 DDR4/LPDDR3 DDR4/LPDDR3 DDR4/LPDDR3
Memory Speeds 2400/2133 2400/2133 2400/2133 2400/2133 2133/1866 2133/1866
Cache (L3) 8MB 8MB 6MB 6MB 4MB 4MB
System Bus DMI3 - 8.0 GT/s DMI3 - 8.0 GT/s DMI2 - 6.4 GT/s DMI2 - 5.0 GT/s DMI2 - 5.0 GT/s DMI2 - 5.0 GT/s
Graphics UHD Graphics 620 UHD Graphics 620 UHD Graphics 620 UHD Graphics 620 HD Graphics 620 HD Graphics 620
Max Graphics Clock 1.15 GHz 1.15 GHz 1.1 GHz 1.1 GHz 1.15 GHz 1.05 GHz
TDP 15W 15W 15W 15W 15W 15W
MSRP $409 $409 $297 $297 $393 $393

The only differences between the Core i7 and Core i5 designs will be cache size (Core i5 has 6MB, Core i7 has 8MB) and clock speeds. All of them feature four true Kaby Lake cores with HyperThreading enabled to support 8 simultaneous threads in a notebook. Dual-channel memory support at speeds of 2400 MHz for DDR4 and 2133 MHz for LPDDR3 remains. The integrated graphics portion offers the same performance as the 7th generation designs, though the branding has moved from Intel HD Graphics to Intel UHD Graphics. Because Ultra.

8th Gen Intel Core U-series front.jpg

But take a gander at the clock speeds. The base clocks on the four new CPUs range from 1.6 GHz to 1.9 GHz, with 100 MHz steps as you go up the SKU ladder. Those are low frequencies for modern processors, no doubt, but Intel has always been very conservative when it comes to setting base frequency specs. This is the speed that Intel guarantees the processors will run at when the CPU is fully loaded in a 15-watt TDP cooling design. Keeping in mind that we moved from dual-core to quad-core processors, it makes sense that these base frequencies would drop. Intel doesn’t expect users of thin and light machines to utilize all 8 threads for very long, or very often, and instead focuses on shorter multi-threaded workloads (photo manipulation, for example) that might run at 3.x GHz. If that period of time is short enough, the cooling solution will be able to “catch up” and keep the core within a reasonable range.

Continue reading about the new 8th Generation Intel Core Processors!!

Author:
Manufacturer: AMD

A confusing market

I feel like I have been writing about AMD non-stop in 2017. Starting with the release of Ryzen 7 and following through last week’s review of the HEDT Threadripper processor, AMD has gone from a nearly-dormant state in 2015-2016 to a wildly active and successful organization with more than a dozen new product launches under its belt. Today we will reveal the company's first consumer products based on the new Vega GPU architecture, thrusting the Radeon brand back into the fight at the $400+ price segments.

At this point, with architecture teases, product unboxings, professional card reviews, and pricing and availability reveals, we almost know everything we need to know about the new Radeon RX Vega 64 and RX Vega 56 products. Almost. Today we can show you the performance.

I want to be honest with our readers: AMD gave me so little time with these cards that I am going to gloss over some of the more interesting technological and architectural changes that Vega brings to market. I will come back to them at a later time, but I feel it is most important for us to talk about the performance and power characteristics of these cards as consumers finally get the chance to spend their hard-earned money on them.

01.jpg

If you already know about the specifications and pricing peculiarities of Vega 64 and Vega 56 and instead want direct access to performance results, I encourage you to skip ahead. If you want a refresher on those details, check out the summary below.

Interesting statistics from the creation of this review in a VERY short window:

  • 175 graphs 
  • 8 cards, 8 games, 2 resolutions, 3 runs = 384 test runs
  • >9.6 TB of raw captured video (average ~25 GB/min)

Radeon RX Vega 64 and Vega 56 Specifications

Much of the below is sourced from our Vega 64/56 announcement story last month.

Though the leaks have been frequent and getting closer to reality, as it turns out AMD was in fact holding back quite a bit of information about the positioning of RX Vega for today. Radeon will launch the Vega 64 and Vega 56 today, with three different versions of the Vega 64 on the docket. Vega 64 uses the full Vega 10 chip with 64 CUs and 4096 stream processors. Vega 56 will come with 56 CUs enabled (get it?) and 3584 stream processors.

Pictures of the various product designs have already made it out to the field including the Limited Edition with the brushed anodized aluminum shroud, the liquid cooled card with a similar industrial design, and the more standard black shroud version that looks very similar to the previous reference cards from AMD.

  RX Vega 64 Liquid RX Vega 64 Air RX Vega 56 Vega Frontier Edition GTX 1080 Ti GTX 1080 TITAN X GTX 980 R9 Fury X
GPU Vega 10 Vega 10 Vega 10 Vega 10 GP102 GP104 GM200 GM204 Fiji XT
GPU Cores 4096 4096 3584 4096 3584 2560 3072 2048 4096
Base Clock 1406 MHz 1247 MHz 1156 MHz 1382 MHz 1480 MHz 1607 MHz 1000 MHz 1126 MHz 1050 MHz
Boost Clock 1677 MHz 1546 MHz 1471 MHz 1600 MHz 1582 MHz 1733 MHz 1089 MHz 1216 MHz -
Texture Units 256 256 224 256 224 160 192 128 256
ROP Units 64 64 64 64 88 64 96 64 64
Memory 8GB 8GB 8GB 16GB 11GB 8GB 12GB 4GB 4GB
Memory Clock 1890 MHz 1890 MHz 1600 MHz 1890 MHz 11000 MHz 10000 MHz 7000 MHz 7000 MHz 1000 MHz
Memory Interface 2048-bit HBM2 2048-bit HBM2 2048-bit HBM2 2048-bit HBM2 352-bit G5X 256-bit G5X 384-bit 256-bit 4096-bit (HBM)
Memory Bandwidth 484 GB/s 484 GB/s 410 GB/s 484 GB/s 484 GB/s 320 GB/s 336 GB/s 224 GB/s 512 GB/s
TDP 345 watts 295 watts 210 watts 300 watts 250 watts 180 watts 250 watts 165 watts 275 watts
Peak Compute 13.7 TFLOPS 12.6 TFLOPS 10.5 TFLOPS 13.1 TFLOPS 10.6 TFLOPS 8.2 TFLOPS 6.14 TFLOPS 4.61 TFLOPS 8.60 TFLOPS
Transistor Count 12.5B 12.5B 12.5B 12.5B 12.0B 7.2B 8.0B 5.2B 8.9B
Process Tech 14nm 14nm 14nm 14nm 16nm 16nm 28nm 28nm 28nm
MSRP (current) $699 $499 $399 $999 $699 $599 $999 $499 $649

If you are a frequent reader of PC Perspective, you have already seen our reviews of the Vega Frontier Edition air cooled and liquid cards, so some of this is going to look very familiar. Looking at the Vega 64 first, we need to define the biggest change to the performance ratings between the RX and FE versions of the Vega architecture. When we listed the “boost clock” of the Vega FE cards, and really any Radeon cards previous to RX Vega, we were referring to the maximum clock speed of the card in its out-of-box state. This was counter to the method NVIDIA used for its “boost clock” rating, which pointed towards a “typical” clock speed that the card would run at in a gaming workload. Essentially, the NVIDIA method gave consumers a more realistic look at how fast the card would be running, while AMD was marketing the theoretical peak under perfect thermals and perfect workloads, a scenario that, to be clear, never happened.

Continue reading our review of the Radeon RX Vega 64, Vega 64 Liquid, and Vega 56!!

Author:
Manufacturer: AMD

RX Vega is here

Though we are still a couple of weeks from availability and benchmarks, today we finally have the details on the Radeon RX Vega product line. That includes specifications, details on the clock speed changes, pricing, some interesting bundle programs, and how AMD plans to attack NVIDIA through performance experience metrics.

There is a lot going on today and I continue to have less time to tell you about more products, so I’m going to defer a story on the architectural revelations that AMD made to media this week and instead focus on what I think more of our readers will want to know. Let’s jump in.

Radeon RX Vega Specifications

Though the leaks have been frequent and getting closer to reality, as it turns out AMD was in fact holding back quite a bit of information about the positioning of RX Vega for today. Radeon will launch the Vega 64 and Vega 56 today, with three different versions of the Vega 64 on the docket. Vega 64 uses the full Vega 10 chip with 64 CUs and 4096 stream processors. Vega 56 will come with 56 CUs enabled (get it?) and 3584 stream processors.

Pictures of the various product designs have already made it out to the field including the Limited Edition with the brushed anodized aluminum shroud, the liquid cooled card with a similar industrial design, and the more standard black shroud version that looks very similar to the previous reference cards from AMD.

  RX Vega 64 Liquid RX Vega 64 Air RX Vega 56 Vega Frontier Edition GTX 1080 Ti GTX 1080 TITAN X GTX 980 R9 Fury X
GPU Vega 10 Vega 10 Vega 10 Vega 10 GP102 GP104 GM200 GM204 Fiji XT
GPU Cores 4096 4096 3584 4096 3584 2560 3072 2048 4096
Base Clock 1406 MHz 1247 MHz 1156 MHz 1382 MHz 1480 MHz 1607 MHz 1000 MHz 1126 MHz 1050 MHz
Boost Clock 1677 MHz 1546 MHz 1471 MHz 1600 MHz 1582 MHz 1733 MHz 1089 MHz 1216 MHz -
Texture Units 256 256 256 256 224 160 192 128 256
ROP Units 64 64 ? 64 88 64 96 64 64
Memory 8GB 8GB 8GB 16GB 11GB 8GB 12GB 4GB 4GB
Memory Clock 1890 MHz 1890 MHz 1600 MHz 1890 MHz 11000 MHz 10000 MHz 7000 MHz 7000 MHz 1000 MHz
Memory Interface 2048-bit HBM2 2048-bit HBM2 2048-bit HBM2 2048-bit HBM2 352-bit G5X 256-bit G5X 384-bit 256-bit 4096-bit (HBM)
Memory Bandwidth 484 GB/s 484 GB/s 484 GB/s 484 GB/s 484 GB/s 320 GB/s 336 GB/s 224 GB/s 512 GB/s
TDP 345 watts 295 watts 210 watts 300 watts 250 watts 180 watts 250 watts 165 watts 275 watts
Peak Compute 13.7 TFLOPS 12.6 TFLOPS 10.5 TFLOPS 13.1 TFLOPS 10.6 TFLOPS 8.2 TFLOPS 6.14 TFLOPS 4.61 TFLOPS 8.60 TFLOPS
Transistor Count 12.5B 12.5B 12.5B 12.5B 12.0B 7.2B 8.0B 5.2B 8.9B
Process Tech 14nm 14nm 14nm 14nm 16nm 16nm 28nm 28nm 28nm
MSRP (current) $699 $499 $399 $999 $699 $599 $999 $499 $649

If you are a frequent reader of PC Perspective, you have already seen our reviews of the Vega Frontier Edition air cooled and liquid cards, so some of this is going to look very familiar. Looking at the Vega 64 first, we need to define the biggest change to the performance ratings between the RX and FE versions of the Vega architecture. When we listed the “boost clock” of the Vega FE cards, and really any Radeon cards previous to RX Vega, we were referring to the maximum clock speed of the card in its out-of-box state. This was counter to the method NVIDIA used for its “boost clock” rating, which pointed towards a “typical” clock speed that the card would run at in a gaming workload. Essentially, the NVIDIA method gave consumers a more realistic look at how fast the card would be running, while AMD was marketing the theoretical peak under perfect thermals and perfect workloads, a scenario that, to be clear, never happened.

vega-44.jpg

With the RX Vega cards and their specifications, the “boost clock” is now a typical clock rate. AMD has told me that this is what they estimate the average clock speed of the card will be during a typical gaming workload with a typical thermal and system design. This is great news! It means that gamers will have a more realistic indication of performance, both theoretical and expected, and the listings on the retailers and partner sites will be accurate. It also means that just looking at the spec table above will give you an impression that the performance gap between Vega FE and RX Vega is smaller than it will be in testing. (This is, of course, if AMD’s claims are true; I haven’t tested it myself yet.)

Continue reading our preview of the Radeon RX Vega 64 and Vega 56!

Author:
Manufacturer: AMD

Software Iteration

The software team at AMD and the Radeon Technologies Group is releasing Radeon Crimson ReLive Edition 17.7.2 this evening, and it includes a host of new features, improved performance capabilities, and stability improvements to boot. This isn’t the major reboot of the software that we have come to expect on an annual basis, but rather an attempt to get the software team’s work out in front of media and gamers before the onslaught of RX Vega and Threadripper steals the attention.

radeonsw-4.jpg

AMD’s software team is big on its user satisfaction ratings, which it should be after the many years of falling behind NVIDIA in this department. With 16 individual driver releases in 2017 (so far) and 20 new games optimized and supported with day-one releases, the 90% rating seems to be about right. Much of the work needed to improve multi-GPU and address other critical problems is now more than a calendar year behind us, so it seems reasonable that Radeon gamers would be in a good place in terms of software support.

radeonsw-7.jpg

One big change for Crimson ReLive today is that all of those lingering settings that remained in the old Catalyst Control Panel will now reside in the proper Radeon Settings. This means a matching UI and a more streamlined interface.

radeonsw-14.jpg

The ReLive capture and streaming capability sees a handful of upgrades today, including a bump from 50 Mbps to 100 Mbps maximum bit rate, transparency support for webcams, improved optimization to lower memory usage (and thus the overhead of running ReLive), notifications for replays and record timers, and audio controls for microphone volume and push-to-talk.

Continue reading about the latest Crimson ReLive driver updates!

Author:
Manufacturer: AMD

Specifications and Design

Just a couple of short weeks ago we looked at the Radeon Vega Frontier Edition 16GB graphics card in its air-cooled variety. The results were interesting – gaming performance proved to fall somewhere between the GTX 1070 and the GTX 1080 from NVIDIA’s current generation of GeForce products. That is below many of the estimates from players in the market, including media, fans, and enthusiasts. But before we get to the RX Vega product family that is targeted at gamers, AMD has another data point for us to look at with a water-cooled version of the Vega Frontier Edition. At a $1500 MSRP, which we shelled out ourselves, we are very interested to see how it changes the face of performance for the Vega GPU and architecture.

Let’s start with a look at the specifications of this version of the Vega Frontier Edition, which will be…familiar.

  Vega Frontier Edition (Liquid) Vega Frontier Edition Titan Xp GTX 1080 Ti Titan X (Pascal) GTX 1080 TITAN X GTX 980 R9 Fury X
GPU Vega Vega GP102 GP102 GP102 GP104 GM200 GM204 Fiji XT
GPU Cores 4096 4096 3840 3584 3584 2560 3072 2048 4096
Base Clock 1382 MHz 1382 MHz 1480 MHz 1480 MHz 1417 MHz 1607 MHz 1000 MHz 1126 MHz 1050 MHz
Boost Clock 1600 MHz 1600 MHz 1582 MHz 1582 MHz 1480 MHz 1733 MHz 1089 MHz 1216 MHz -
Texture Units ? ? 224 224 224 160 192 128 256
ROP Units 64 64 96 88 96 64 96 64 64
Memory 16GB 16GB 12GB 11GB 12GB 8GB 12GB 4GB 4GB
Memory Clock 1890 MHz 1890 MHz 11400 MHz 11000 MHz 10000 MHz 10000 MHz 7000 MHz 7000 MHz 1000 MHz
Memory Interface 2048-bit HBM2 2048-bit HBM2 384-bit G5X 352-bit 384-bit G5X 256-bit G5X 384-bit 256-bit 4096-bit (HBM)
Memory Bandwidth 483 GB/s 483 GB/s 547.7 GB/s 484 GB/s 480 GB/s 320 GB/s 336 GB/s 224 GB/s 512 GB/s
TDP 300 / ~350 watts 300 watts 250 watts 250 watts 250 watts 180 watts 250 watts 165 watts 275 watts
Peak Compute 13.1 TFLOPS 13.1 TFLOPS 12.0 TFLOPS 10.6 TFLOPS 10.1 TFLOPS 8.2 TFLOPS 6.14 TFLOPS 4.61 TFLOPS 8.60 TFLOPS
Transistor Count ? ? 12.0B 12.0B 12.0B 7.2B 8.0B 5.2B 8.9B
Process Tech 14nm 14nm 16nm 16nm 16nm 16nm 28nm 28nm 28nm
MSRP (current) $1499 $999 $1200 $699 $1,200 $599 $999 $499 $649

The base specs remain unchanged and AMD lists the same memory frequency and even GPU clock rates across both models. In practice though, the liquid cooled version runs at higher sustained clocks and can overclock a bit easier as well (more details later). What does change with the liquid cooled version is a usable BIOS switch on top of the card that allows you to move between two distinct power draw states: 300 watts and 350 watts.

IMG_4728.JPG

First, it’s worth noting this is a change from the “375 watt” TDP that this card was listed at during the launch and announcement. AMD was touting a 300-watt and a 375-watt version of the Frontier Edition, but it appears the company backed off a bit, erring on the side of caution to avoid breaking any of the specifications of PCI Express for the board slot or auxiliary connectors (75 watts from the slot plus 150 watts from each of two 8-pin connectors puts that ceiling right at 375 watts). Even more concerning is that AMD chose to have the default state of the switch on the Vega FE Liquid card at 300 watts rather than the more aggressive 350 watts. AMD claims this is to avoid any problems with lower quality power supplies that may struggle to deliver slightly more than 150 watts (and the resulting current) through each 8-pin power connection. I would argue that any system that is going to install a $1500 graphics card can and should be prepared to provide the necessary power, but for the professional market, AMD leans towards caution. (It’s worth pointing out that the RX 480 power issues that may have prompted this internal decision making were more problematic because they impacted power delivery through the motherboard, while the 6- and 8-pin connectors are generally much safer to exceed.)

Even without clock speed changes, the move to water cooling should result in better and more consistent performance by removing the overheating concerns that surrounded our first Radeon Vega Frontier Edition review. But let’s dive into the card itself and see how the design process created a unique liquid cooled solution.

Continue reading our review of the Radeon Vega Frontier Edition Liquid-Cooled card!!

Author:
Manufacturer: Sapphire

Overview

There has been a lot of news lately about the release of cryptocurrency-specific graphics cards from both NVIDIA and AMD add-in board partners. While we covered the current cryptomining phenomenon in an earlier article, today we are taking a look at one of these cards geared towards miners.

IMG_4681.JPG

It's worth noting that I purchased this card myself from Newegg, and neither AMD nor Sapphire was involved in this article. I saw this card pop up on Newegg a few days ago, and my curiosity got the best of me.

There has been a lot of speculation, and little official information from vendors about what these mining cards will actually entail.

From the outward appearance, it is virtually impossible to distinguish this "new" RX 470 from the previous Sapphire Nitro+ RX 470, besides the lack of additional display outputs beyond the DVI connection. Even the branding and labels on the card identify it as a Nitro+ RX 470.

In order to test the hashing rates of this GPU, we are using Claymore's Dual Miner Version 9.6 (mining Ethereum only) against a reference design RX 470, also from Sapphire.

IMG_4684.JPG

On the reference RX 470 out of the box, we hit rates of about 21.8 MH/s while mining Ethereum. 

Once we moved to the Sapphire mining card, we saw at least 24 MH/s right from the start, roughly a 10% improvement out of the box.

Continue reading about the Sapphire Radeon RX 470 Mining Edition!

Author:
Manufacturer: AKiTiO

A long time coming

External video cards for laptops have long been a dream of many PC enthusiasts, and for good reason. It’s compelling to have a thin-and-light notebook with great battery life for things like meetings or class, with the ability to plug it into a dock at home and enjoy your favorite PC games.

Many times we have been promised that external GPUs for notebooks would be a viable option. Over the years there have been many commercial solutions involving both industry-standard protocols like ExpressCard as well as proprietary connections to allow you to externally connect PCIe devices. Enterprising hackers have also tried their hand at this for many years, cobbling together interesting solutions using mPCIe and M.2 ports on their notebooks that were meant for other devices.

With the introduction of Intel’s Thunderbolt standard in 2011, there was a hope that we would finally achieve external graphics nirvana. A modern, Intel-backed protocol promising PCIe x4 speeds (PCIe 2.0 at that point) sounded like it would be ideal for connecting GPUs to notebooks, and in some ways it was. Once again the external graphics communities managed to get it to work through the use of enclosures meant to connect other non-GPU PCIe devices such as RAID and video capture cards to systems. However, software support was still a limiting factor. You were required to use an external monitor to display your video, and it still felt like you were just riding the line between usability and a total hack. It felt like we were never going to get true universal support for external GPUs on notebooks.

Then, seemingly out of nowhere, Intel decided to promote native support for external GPUs as a priority when it introduced Thunderbolt 3. Fast forward, and we've already seen a much larger adoption of Thunderbolt 3 on PC notebooks than we ever did with the previous Thunderbolt implementations. Taking all of this into account, we figured it was time to finally dip our toes into the eGPU market.

For our testing, we decided on the AKiTio Node for several reasons. First, at around $300, it's by far the lowest cost enclosure built to support GPUs. Additionally, it seems to be one of the most compatible devices currently on the market according to the very helpful comparison chart over at eGPU.io. The eGPU site is a wonderful resource for everything external GPU, over any interface possible, and I would highly recommend heading over there to do some reading if you are interested in trying out an eGPU for yourself.

The Node unit itself is a very utilitarian design. Essentially you get a folded sheet metal box with a Thunderbolt controller and 400W SFX power supply inside.

DSC03490.JPG

In order to install a GPU into the Node, you must first unscrew the enclosure from the back and slide the outer shell off of the device.

DSC03495.JPG

Once inside, we can see that there is ample room for any graphics card you might want to install in this enclosure. In fact, it seems a little too large for any of the GPUs we installed, including GTX 1080 Ti models. Here, you can see a more reasonable RX 570 installed.

Beyond opening up the enclosure to install a GPU, there is very little configuration required. My unit required a firmware update, but that was easily applied with the tools from the AKiTio site.

From here, I simply connected the Node to a ThinkPad X1, installed the NVIDIA drivers for our GTX 1080 Ti, and everything seemed to work — including using the 1080 Ti with the integrated notebook display and no external monitor!

Now that we've got the Node working, let's take a look at some performance numbers.

Continue reading our look at external graphics with the Thunderbolt 3 AKiTiO Node!

Author:
Manufacturer: AMD

Two Vegas...ha ha ha

When the preorders for the Radeon Vega Frontier Edition went up last week, I made the decision to place orders in a few different locations to make sure we got it in as early as possible. Well, as it turned out, we actually had the cards show up very quickly…from two different locations.

dualvega.jpg

So, what is a person to do if TWO of the newest, most coveted GPUs show up on their doorstep? After you do the first, full review of the single GPU iteration, you plug those both into your system and do some multi-GPU CrossFire testing!

There of course needs to be some discussion up front about this testing and our write-up. If you read my first review of the Vega Frontier Edition, you will clearly note my stance on the idea that “this is not a gaming card” and that “the drivers aren’t ready.” Essentially, I said these potential excuses for performance were a distraction and unwarranted based on the current state of Vega development and the proximity of the consumer iteration, Radeon RX.

IMG_4688.JPG

But for multi-GPU, it’s a different story. Both competitors in the GPU space will tell you that developing drivers for CrossFire and SLI is incredibly difficult. Much more than simply splitting the work across different processors, multi-GPU requires extra attention to specific games, game engines, and effects rendering that are not required in single GPU environments. Add to that the fact that the market size for CrossFire and SLI has been shrinking, from an already small state, and you can see why multi-GPU is going to get less attention from AMD here.

Even more, when CrossFire and SLI support gets a focus from the driver teams, it is often late in the process, nearly last in the list of technologies to address before launch.

With that in mind, we should all understand that the results we are going to show you might be indicative of the CrossFire scaling when Radeon RX Vega launches, but they very well might not be. I would look at the data we are presenting today as the “current state” of CrossFire for Vega.

Continue reading our look at a pair of Vega Frontier Edition cards in CrossFire!

Manufacturer: NVIDIA

Performance not two-die four.

When designing an integrated circuit, you are attempting to fit as much complexity as possible within your budget of space, power, and so forth. One harsh limitation for GPUs is that, while your workloads could theoretically benefit from more and more processing units, the number of usable chips from a batch shrinks as designs grow, and the reticle limit of a fab’s manufacturing node is basically a brick wall.

What’s one way around it? Split your design across multiple dies!

nvidia-2017-multidie.png

NVIDIA published a research paper discussing just that. In it, they show two example configurations. In the first, the GPU is a single, typical die surrounded by four stacks of HBM, like GP100; the second breaks the GPU into five dies: four GPU modules and an I/O controller, with each GPU module attached to a pair of HBM stacks.

NVIDIA ran simulations to determine how this chip would perform and, across various workloads, found that it out-performed the largest possible single-chip GPU by about 45.5%. They also scaled up the single-chip design until it had the same number of compute units as the multi-die design, even though this wouldn’t work in the real world because no fab could actually lithograph it. Regardless, that hypothetical, impossible design was only ~10% faster than the actually-possible multi-chip one, showing that the overhead of splitting the design is only around that much, according to their simulation. The multi-chip design was also 26.8% faster than the equivalent multi-card setup.
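
To put those percentages on a single scale, here is the arithmetic with the largest buildable single-die GPU normalized to 1.00; the numbers are simply derived from the figures quoted above.

```python
# The paper's quoted percentages on one scale, with the largest buildable
# single-die GPU normalized to 1.00 (derived from the figures above).
largest_buildable_die = 1.00
multi_chip_module     = largest_buildable_die * 1.455   # 45.5% faster than the buildable die
hypothetical_monolith = multi_chip_module * 1.10        # the "impossible" die is only ~10% faster
multi_card            = multi_chip_module / 1.268       # the MCM beats multi-card by 26.8%

for name, perf in [("Largest buildable die", largest_buildable_die),
                   ("Multi-chip module", multi_chip_module),
                   ("Hypothetical monolith", hypothetical_monolith),
                   ("Multi-card setup", multi_card)]:
    print(f"{name}: {perf:.2f}x")
# Roughly 1.00x, 1.46x, 1.60x, and 1.15x respectively
```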

While NVIDIA’s simulations, run on 48 different benchmarks, have accounted for this, I still can’t visualize how this would work in an automated way. I don’t know how the design would automatically account for fetching data that’s associated with other GPU modules, as this would probably be a huge stall. That said, they spent quite a bit of time discussing how much bandwidth is required within the package, and figures of 768 GB/s to 3TB/s were mentioned, so it’s possible that it’s just the same tricks as fetching from global memory. The paper touches on the topic several times, but I didn’t really see anything explicit about what they were doing.

amd-2017-epyc-breakdown.jpg

If you’ve been following the site over the last couple of months, you’ll note that this is basically the same as AMD is doing with Threadripper and EPYC. The main difference is that CPU cores are isolated, so sharing data between them is explicit. In fact, when that product was announced, I thought, “Huh, that would be cool for GPUs. I wonder if it’s possible, or if it would just end up being Crossfire / SLI.”

Apparently not? It should be possible?

I should note that I doubt this will be relevant for consumers. The GPU is the most expensive part of a graphics card. While the thought of four GP102-level chips working together sounds great for 4K (which is 4x1080p in resolution) gaming, quadrupling the expensive part sounds like a giant price-tag. That said, the market of GP100 (and the upcoming GV100) would pay five-plus digits for the absolute fastest compute device for deep-learning, scientific research, and so forth.

The only way I could see this working for gamers is if NVIDIA finds the sweet-spot for performance-to-yield (for a given node and time) and they scale their product stack with multiples of that. In that case, it might be cost-advantageous to hit some level of performance, versus trying to do it with a single, giant chip.

This is just my speculation, however. It’ll be interesting to see where this goes, whenever it does.

Author:
Manufacturer: Galax

GTX 1060 keeps on kicking

Despite the market for graphics cards being disrupted by the cryptocurrency mining craze, board partners like Galax continue to build high quality options for gamers...if they can get their hands on them. We recently received a new Galax GTX 1060 EXOC White 6GB card that offers impressive performance and features as well as a visual style to help it stand out from the crowd.

We have worked with GeForce GTX 1060 graphics cards quite a bit at PC Perspective, so there is no need to dive into the history of the GPU itself. If you need a refresher on this GP106 GPU and where it stands in the pantheon of the current GPU market, check out my launch review of the GTX 1060 from last year. The release of AMD’s Radeon RX 580 did change things a bit in the market landscape, so that review might be worth looking at too.

Our quick review of the Galax GTX 1060 EXOC White will look at performance (briefly), overclocking, and cost. But first, let’s take a look at this thing.

The Galax GTX 1060 EXOC White

As the name implies, the EXOC White card from Galax is both overclocked and uses a white fan shroud to add a little flair to the design. The PCB is a standard black color, but with the fan and back plate both a bright white, the card will be a point of interest for nearly any PC build. Pairing this with a white-accented motherboard, like the recent ASUS Prime series, would be an excellent visual combination.

IMG_4674.JPG

The fans on the EXOC White have clear-ish white blades that are illuminated by the white LEDs that shine through the fan openings on the shroud.

IMG_4675.JPG

IMG_4676.JPG

The cooler that Galax has implemented is substantial, with three heatpipes used to distribute the load from the GPU across the fins. There is a 6-pin power connector (standard for the GTX 1060) but that doesn’t appear to hold back the overclocking capability of the GPU.

IMG_4677.JPG

There is a lot of detail on the heatsink shroud – and either you like it or you don’t.

IMG_4678.JPG

Galax has included a white backplate that doubles as artistic style and heatsink. I do think that with most users’ cases showcasing the rear of the graphics card more than the front, a good quality back plate is a big selling point.

IMG_4680.JPG

The output connectivity includes a pair of DVI ports, a full-size HDMI and a full-size DisplayPort; more than enough for nearly any buyer of this class of GPU.

Continue reading about the Galax GTX 1060 EXOC White 6GB!

Author:
Manufacturer: AMD

An interesting night of testing

Last night I did our first ever live benchmarking session using the just-arrived Radeon Vega Frontier Edition air-cooled graphics card. Purchasing the card directly from a reseller, rather than being sampled by AMD, gave us the opportunity to test a new flagship product without an NDA in place to keep us silenced, so I thought it would be fun to let the audience and community go along for the ride of a traditional benchmarking session. Though I didn’t get everything I wanted done in that 4.5-hour window, it was great to see the interest and excitement for the product and the results that we were able to generate.

But to the point of the day – our review of the Radeon Vega Frontier Edition graphics card. Based on the latest flagship GPU architecture from AMD, the Radeon Vega FE card has a lot riding on its shoulders, despite not being aimed at gamers. It is the FIRST card to be released with Vega at its heart. It is the FIRST instance of HBM2 being utilized in a consumer graphics card. It is the FIRST in a new attempt from AMD to target the group of users between gamers and professional users (like NVIDIA has addressed with Titan previously). And, it is the FIRST to command as much attention and expectation for the future of a company, a product line, and a fan base.

IMG_4621.JPG

Other than the architectural details that AMD gave us previously, we honestly haven’t been briefed on the performance expectations or the advancements in Vega that we should know about. The Vega FE products were released to the market with very little background, only well-spun turns of phrase emphasizing the value of the high performance and compatibility for creators. There has been no typical “tech day” for the media to learn fully about Vega and there were no samples from AMD to media or analysts (that I know of). Unperturbed by that, I purchased one (several actually, seeing which would show up first) and decided to do our testing.

On the following pages, you will see a collection of tests and benchmarks that range from 3DMark to The Witcher 3 to SPECviewperf to LuxMark, attempting to give as wide a viewpoint of the Vega FE product as I can in a rather short time window. The card is sexy (maybe the best looking I have yet seen), but will disappoint many on the gaming front. For professional users that are okay not having certified drivers, performance there is more likely to raise some impressed eyebrows.

Radeon Vega Frontier Edition Specifications

Through leaks and purposeful information dumps over the past couple of months, we already knew a lot about the Radeon Vega Frontier Edition card prior to the official sale date this week. But now with final specifications in hand, we can start to dissect what this card actually is.

|   | Vega Frontier Edition | Titan Xp | GTX 1080 Ti | Titan X (Pascal) | GTX 1080 | TITAN X | GTX 980 | R9 Fury X | R9 Fury |
|---|---|---|---|---|---|---|---|---|---|
| GPU | Vega | GP102 | GP102 | GP102 | GP104 | GM200 | GM204 | Fiji XT | Fiji Pro |
| GPU Cores | 4096 | 3840 | 3584 | 3584 | 2560 | 3072 | 2048 | 4096 | 3584 |
| Base Clock | 1382 MHz | 1480 MHz | 1480 MHz | 1417 MHz | 1607 MHz | 1000 MHz | 1126 MHz | 1050 MHz | 1000 MHz |
| Boost Clock | 1600 MHz | 1582 MHz | 1582 MHz | 1480 MHz | 1733 MHz | 1089 MHz | 1216 MHz | - | - |
| Texture Units | ? | 240 | 224 | 224 | 160 | 192 | 128 | 256 | 224 |
| ROP Units | 64 | 96 | 88 | 96 | 64 | 96 | 64 | 64 | 64 |
| Memory | 16GB | 12GB | 11GB | 12GB | 8GB | 12GB | 4GB | 4GB | 4GB |
| Memory Clock | 1890 MHz | 11400 MHz | 11000 MHz | 10000 MHz | 10000 MHz | 7000 MHz | 7000 MHz | 1000 MHz | 1000 MHz |
| Memory Interface | 2048-bit HBM2 | 384-bit G5X | 352-bit G5X | 384-bit G5X | 256-bit G5X | 384-bit | 256-bit | 4096-bit (HBM) | 4096-bit (HBM) |
| Memory Bandwidth | 483 GB/s | 547.7 GB/s | 484 GB/s | 480 GB/s | 320 GB/s | 336 GB/s | 224 GB/s | 512 GB/s | 512 GB/s |
| TDP | 300 watts | 250 watts | 250 watts | 250 watts | 180 watts | 250 watts | 165 watts | 275 watts | 275 watts |
| Peak Compute | 13.1 TFLOPS | 12.0 TFLOPS | 10.6 TFLOPS | 10.1 TFLOPS | 8.2 TFLOPS | 6.14 TFLOPS | 4.61 TFLOPS | 8.60 TFLOPS | 7.20 TFLOPS |
| Transistor Count | ? | 12.0B | 12.0B | 12.0B | 7.2B | 8.0B | 5.2B | 8.9B | 8.9B |
| Process Tech | 14nm | 16nm | 16nm | 16nm | 16nm | 28nm | 28nm | 28nm | 28nm |
| MSRP (current) | $999 | $1,200 | $699 | $1,200 | $599 | $999 | $499 | $649 | $549 |

The Vega FE shares enough of its specification list with the Fury X that the comparison deserves special attention. Both cards sport 4096 stream processors, 64 ROPs and 256 texture units. The Vega FE runs at much higher clock speeds (a 32% higher base clock and a 52% higher peak clock), upgrades to the next generation of high-bandwidth memory, and quadruples the capacity. Still, there will be plenty of comparisons between the two products, looking to measure IPC changes from Fiji's CUs (compute units) to the NCUs built for Vega.
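As a sanity check on the peak compute figures in the table, the math is simple: stream processors × 2 FLOPs per clock (one fused multiply-add) × clock speed. A quick sketch, using the clocks listed above:

```python
# Peak FP32 rate = stream processors x 2 FLOPs per clock (one FMA) x clock speed.
def peak_tflops(stream_processors, clock_mhz):
    return stream_processors * 2 * clock_mhz * 1e6 / 1e12

print(f"R9 Fury X at 1050 MHz: {peak_tflops(4096, 1050):.2f} TFLOPS")  # ~8.60
print(f"Vega FE at 1600 MHz:   {peak_tflops(4096, 1600):.2f} TFLOPS")  # ~13.11
# With identical shader counts, the entire gap in peak rate comes from clocks;
# anything Vega gains per clock in real workloads would be down to the NCU changes.
```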

DSC03536 copy.jpg

The Radeon Vega GPU

Clock speeds also see a shift this time around with the adoption of a “typical” clock rating. This is something NVIDIA has been doing for a few generations with the introduction of GPU Boost, and it tells the consumer how high they should expect clocks to run in a nominal workload. Normally I would say a gaming workload, but since this card is aimed at professional users and the like, I assume it applies across the board. So even though the GPU is rated at a “peak” clock rate of 1600 MHz, the “typical” clock rate is 1382 MHz. (As an early aside, I did NOT see 1600 MHz at any point in my testing time with our Vega FE, but it settled in at a ~1440 MHz clock most of the time.)
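To make the “typical versus peak” distinction concrete, here is a minimal sketch of how you can summarize a log of clock samples from a sustained run; the sample values below are hypothetical stand-ins, not measurements from our card:

```python
from statistics import median

# Hypothetical clock samples (MHz), polled once per second during a sustained run.
# These are made-up illustrative values, not data logged from our card.
samples = [1528, 1471, 1442, 1441, 1440, 1439, 1438, 1438, 1437, 1436]

print(f"Peak observed:    {max(samples)} MHz")
print(f"Typical (median): {median(samples):.0f} MHz")
# A GPU can brush its rated peak for a moment and still spend nearly all of a long
# run at the lower clock it can sustain thermally, which is what "typical" describes.
```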

Continue reading our review of the AMD Radeon Vega Frontier Edition!

Author:
Manufacturer: PC Perspective

Why?

Astute readers of the site might remember the original story we did on Bitcoin mining in 2011, the good ol’ days when the concept of the blockchain was new and exciting and mining Bitcoin on a GPU was still plenty viable.

gpu-bitcoin.jpg

However, that didn't last long, as the race for cash led people to develop Application Specific Integrated Circuits (ASICs) dedicated solely to mining Bitcoin quickly while sipping power. The use of these expensive ASICs drove the difficulty of mining Bitcoin through the roof and killed any real chance of profitability for mere mortals mining cryptocurrency.
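For anyone who never looked at what those ASICs actually compute, Bitcoin-style proof of work boils down to grinding nonces until a double SHA-256 hash falls below a difficulty target. A toy sketch of that loop (real mining hashes an 80-byte block header, and the real target is astronomically lower):

```python
import hashlib

def mine(header: bytes, target: int):
    """Toy proof of work: find a nonce whose double SHA-256 falls below the target."""
    nonce = 0
    while True:
        attempt = header + nonce.to_bytes(8, "little")
        digest = hashlib.sha256(hashlib.sha256(attempt).digest()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
        nonce += 1

# A deliberately easy target (top 16 bits must be zero, ~65k attempts on average).
# Real Bitcoin difficulty shrinks the target so far that only SHA-256 ASICs keep up.
nonce, digest = mine(b"example block header", target=1 << 240)
print(nonce, digest)
```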

Cryptomining saw a resurgence in late 2013 with the popular adoption of alternate cryptocurrencies, specifically Litecoin, which was based on the Scrypt algorithm instead of SHA-256 like Bitcoin. This meant that the ASICs developed for mining Bitcoin were useless. This is also the period of time that many of you may remember as the "Dogecoin" era, home of my personal favorite cryptocurrency of all time.

dogecoin-300.png

Defenders of these new "altcoins" claimed that Scrypt was different enough that ASICs would never be developed for it, and GPU mining would remain viable for a larger portion of users. As it turns out, the promise of money always wins out, and we soon saw Scrypt ASICs. Once again, the market for GPU mining crashed.

That brings us to today, and what I am calling "Third-wave Cryptomining." 

While the mass populace stopped caring about cryptocurrency as a whole, the dedicated group that was left continued to develop altcoins. These currencies are based on various algorithms and other proofs of work (see technologies like Storj, which uses the blockchain for a decentralized Dropbox-like service!).

As you may have predicted, for various reasons that might be difficult to quantify historically, there is another very popular cryptocurrency from this wave of development: Ethereum.

ETHEREUM-LOGO_LANDSCAPE_Black.png

Ethereum is based on the Dagger-Hashimoto algorithm and has a whole host of quirks that make it different from other cryptocurrencies. We aren't here to get deep into the weeds on the methods behind different blockchain implementations, but if you have some time, check out the Ethereum White Paper. It's all very fascinating.
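The property that matters for GPU miners is that Ethash, the Dagger-Hashimoto derivative Ethereum actually uses, is memory-hard: every hash attempt requires pseudo-random reads from a dataset measured in gigabytes, so fast memory matters as much as fast hashing logic. The toy below is emphatically not the real algorithm, just a sketch of that shape:

```python
import hashlib
import random

# Toy "memory-hard" hash: NOT real Ethash, just the shape of the idea. Each attempt
# performs a chain of data-dependent lookups into a large dataset, so throughput is
# limited by memory accesses rather than by raw hashing speed. Real DAGs are gigabytes.
DATASET = [random.getrandbits(64) for _ in range(1_000_000)]

def memory_hard_hash(nonce: int, rounds: int = 64) -> int:
    mix = nonce
    for _ in range(rounds):
        index = mix % len(DATASET)  # pseudo-random, cache-unfriendly read
        data = DATASET[index].to_bytes(8, "little")
        mix = int.from_bytes(hashlib.sha256(mix.to_bytes(16, "little") + data).digest()[:8], "little")
    return mix

print(hex(memory_hard_hash(nonce=42)))
```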

Continue reading our look at this third wave of cryptocurrency!

Author:
Manufacturer: AMD

We are up to two...

UPDATE (5/31/2017): Crystal Dynamics was able to get back to us with a couple of points on the changes that were made with this patch to affect the performance of AMD Ryzen processors.

  1. Rise of the Tomb Raider splits rendering tasks to run on different threads. By tuning the size of those tasks – breaking some up, allowing multicore CPUs to contribute in more cases, and combining some others, to reduce overheads in the scheduler – the game can more efficiently exploit extra threads on the host CPU.
     
  2. An optimization was identified in texture management that improves the combination of AMD CPU and NVIDIA GPU.  Overhead was reduced by packing texture descriptor uploads into larger chunks.

There you have it, a bit more detail on the software changes made to help adapt the game engine to AMD's Ryzen architecture. Not only that, but it does confirm our information that there was slightly MORE to address in the Ryzen+GeForce combinations.
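Point 1 above is essentially a chunk-size tuning exercise: split the per-frame work too finely and scheduler overhead eats the gains, split it too coarsely and spare cores sit idle. A generic sketch of the idea (nothing here comes from the actual engine):

```python
from concurrent.futures import ProcessPoolExecutor

def render_task(work_units):
    # Stand-in for one slice of per-frame CPU work (culling, command generation, etc.).
    return sum(i * i for i in range(work_units))

def run_frame(tasks, chunksize):
    # chunksize controls how many tasks each worker pulls at once: bigger chunks mean
    # less scheduling overhead per task, smaller chunks spread load across more cores.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(render_task, tasks, chunksize=chunksize))

if __name__ == "__main__":
    tasks = [20_000] * 256
    print(len(run_frame(tasks, chunksize=8)), "tasks completed")
```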

END UPDATE

Despite a couple of growing pains out of the gate, the Ryzen processor launch appears to have been a success for AMD. Both the Ryzen 7 and the Ryzen 5 releases proved to be very competitive with Intel’s dominant CPUs in the market and took significant leads in areas of massive multi-threading and performance per dollar. An area that AMD has struggled in though has been 1080p gaming – performance in those instances on both Ryzen 7 and 5 processors fell behind comparable Intel parts by (sometimes) significant margins.

Our team continues to watch the story to see how AMD and game developers work through the issue. Most recently I posted a look at the memory latency differences between Ryzen and Intel Core processors. As it turns out, the memory latency differences are a significant part of the initial problem for AMD:

Because of this, I think it is fair to claim that some, if not most, of the 1080p gaming performance deficits we have seen with AMD Ryzen processors are a result of this particular memory system intricacy. You can combine memory latency with the thread-to-thread communication issue we discussed previously into one overall system level complication: the Zen memory system behaves differently than anything we have seen prior and it currently suffers in a couple of specific areas because of it.
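The standard way to expose that kind of latency difference is a dependent pointer chase, where each load's address comes from the previous load so the CPU cannot overlap the round trips. The sketch below shows the pattern; in Python the interpreter adds a large constant cost per step, so treat the gap between the two numbers as illustrative rather than a precise latency measurement (a C version is what you would use for real data):

```python
import random
import time
from array import array

def chase(num_elements, steps=2_000_000):
    # Link the elements into one random cycle so every access depends on the previous
    # one and the hardware prefetcher cannot guess the next address.
    order = list(range(num_elements))
    random.shuffle(order)
    nxt = array("q", [0] * num_elements)
    for a, b in zip(order, order[1:] + order[:1]):
        nxt[a] = b
    i = 0
    start = time.perf_counter()
    for _ in range(steps):
        i = nxt[i]
    return (time.perf_counter() - start) / steps * 1e9  # ns per dependent access

# A chain that fits in cache versus one that spills out to DRAM; the gap between the
# two figures is roughly the added memory latency, since the interpreter overhead is
# the same constant in both cases.
print(f"small chain: {chase(1_000):.1f} ns/step")
print(f"large chain: {chase(8_000_000):.1f} ns/step")
```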

In that story I detailed our coverage of the Ryzen processor and its gaming performance succinctly:

Our team has done quite a bit of research and testing on this topic. This included a detailed look at the first asserted reason for the performance gap, the Windows 10 scheduler. Our summary there was that the scheduler was working as expected and that minimal difference was seen when moving between different power modes. We also talked directly with AMD to find out its then-current stance on the results, backing up our claims on the scheduler and presenting a better outlook for gaming going forward. When AMD wanted to test a new custom Windows 10 power profile to help improve performance in some cases, we took part in that too. In late March we saw the first gaming performance update occur courtesy of Ashes of the Singularity: Escalation, where an engine update to utilize more threads resulted in as much as a 31% increase in average frame rate.

Quick on the heels of the Ryzen 7 release, AMD worked with the developer Oxide on the Ashes of the Singularity: Escalation engine. Through tweaks and optimizations, the game was able to showcase as much as a 30% increase in average frame rate on the integrated benchmark. While this was only a single use case, it does prove that through work with the developers, AMD has the ability to improve the 1080p gaming positioning of Ryzen against Intel.

rotr-screen4-small.jpg

Fast forward to today, and I was surprised to find a new patch for Rise of the Tomb Raider, a game that was actually one of the worst-case scenarios for AMD with Ryzen. (Patch #12, v1.0.770.1) The patch notes mention the following:

The following changes are included in this patch

- Fix certain DX12 crashes reported by users on the forums.

- Improve DX12 performance across a variety of hardware, in CPU bound situations. Especially performance on AMD Ryzen CPUs can be significantly improved.

While we expect this patch to be an improvement for everyone, if you do have trouble with this patch and prefer to stay on the old version we made a Beta available on Steam, build 767.2, which can be used to switch back to the previous version.

We will keep monitoring for feedback and will release further patches as it seems required. We always welcome your feedback!

Obviously the data point that stood out for me was the improved DX12 performance “in CPU bound situations. Especially on AMD Ryzen CPUs…”

Remember how the situation appeared in April?

rotr.png

The Ryzen 7 1800X was 24% slower than the Intel Core i7-7700K – a dramatic difference for a processor that should only have been ~8-10% slower in single threaded workloads.

How does this new patch to RoTR affect performance? We tested it on the same Ryzen 7 1800X benchmark platform from our previous testing, including the ASUS Crosshair VI Hero motherboard, 16GB of DDR4-2400 memory, and a GeForce GTX 1080 Founders Edition on the 378.78 driver. All testing was done under the DX12 code path.

tr-1.png

tr-2.png

The Ryzen 7 1800X score jumps from 107 FPS to 126.44 FPS, an increase of 17%! That is a significant boost in performance at 1080p while still running the Very High image quality preset, indicating that the developer (and likely AMD) found substantial inefficiencies in the engine. For comparison, the 8-core / 16-thread Intel Core i7-6900K only sees a 2.4% increase from this new game revision. This tells us that the changes to the game were specific to Ryzen processors and their design, but that no performance was taken away from the Intel platforms.

Continue reading our look at the new Rise of the Tomb Raider patch for Ryzen!