To the Max?
Much of the PC enthusiast internet, including our comments section, has been abuzz with “Asynchronous Shader” discussion. Normally, I would explain what it is and then outline the issues that surround it, but I would like to swap that order this time. Basically, the Ashes of the Singularity benchmark utilizes Asynchronous Shaders in DirectX 12, but its developer, Oxide Games, disables the feature (by Vendor ID) on NVIDIA hardware. Oxide says that this is because, while the driver reports compatibility, “attempting to use it was an unmitigated disaster in terms of performance and conformance”.
AMD's Robert Hallock claims that NVIDIA GPUs, including Maxwell, cannot support the feature in hardware at all, while all AMD GCN graphics cards do. NVIDIA has yet to respond to our requests for an official statement, although we haven't poked every one of our contacts yet. We will certainly update and/or follow up if we hear from them. For now though, we have no idea whether this is a hardware or software issue. Either way, it seems more than just politics.
So what is it?
Simply put, Asynchronous Shaders allows a graphics driver to cram workloads into portions of the GPU that are idle, but not otherwise available. For instance, if a graphics task is hammering the ROPs, the driver would be able to toss an independent physics or post-processing task into the shader units alongside it. Kollock from Oxide Games used the analogy of HyperThreading, which allows two CPU threads to be executed on the same core at the same time, as long as it has the capacity for it.
Kollock also notes that compute is becoming more important in the graphics pipeline, and that it is possible to bypass graphics altogether. The fixed-function bits may never go away, but it's possible that at least some engines will completely bypass them -- maybe even their own engine, several years down the road.
But, like always, you will not get an infinite amount of performance by reducing your waste. You are always bound by the theoretical limits of your components, and you cannot optimize past that (except for obviously changing the workload itself). The interesting part is: you can measure that. You can absolutely observe how long a GPU is idle, and represent it as a percentage of a time-span (typically a frame).
And, of course, game developers profile GPUs from time to time...
According to Kollock, he has heard of some console developers getting up to 30% increases in performance using Asynchronous Shaders. Again, this is on console hardware, so this amount may increase or decrease on the PC. I also had an informal chat with a developer at Epic Games, so a massive grain of salt is required: his late-night, “totally speculative” ballpark guesstimate is that, on the Xbox One, the GPU could theoretically accept a maximum of ~10-25% more work in Unreal Engine 4, depending on the scene. He also said that memory bandwidth gets in the way, which Asynchronous Shaders would be fighting against. It is something that they are interested in and investigating, though.
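To put rough numbers on the "you can measure that" point, here is a minimal Python sketch that turns busy intervals within a frame into an idle percentage, and then into an upper bound on how much extra work could be packed in. The interval list and the 16 ms frame are made up for illustration (they are not profiler output from any real game), the function names are hypothetical, and the bound assumes every idle slot could be filled perfectly, which memory bandwidth and scheduling would prevent in practice.

```python
def idle_fraction(busy_intervals_ms, frame_ms):
    """Fraction of a frame during which a GPU unit was idle.

    busy_intervals_ms: list of (start, end) times in milliseconds,
    assumed to be non-overlapping and to lie within the frame.
    """
    busy = sum(end - start for start, end in busy_intervals_ms)
    return 1.0 - busy / frame_ms


def max_extra_work(idle):
    """Upper bound on additional work, relative to the work already
    being done, if all idle time could be filled."""
    return idle / (1.0 - idle)


# Hypothetical 16 ms frame in which the shader units are busy for 12.8 ms.
intervals = [(0.0, 6.0), (7.0, 12.0), (14.2, 16.0)]
idle = idle_fraction(intervals, frame_ms=16.0)
headroom = max_extra_work(idle)
print(f"idle: {idle:.0%}, theoretical headroom: {headroom:.0%}")
```

Under these made-up numbers, a unit that sits idle for a fifth of the frame could take on at most a quarter more work, which is the same order of magnitude as the ballpark figures quoted above. Note that idle time and extra work are not the same percentage, because the extra work is measured against the busy portion of the frame.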
This is where I speculate on drivers. When Mantle was announced, I looked at its features and said “wow, this is everything that a high-end game developer wants, and a graphics developer absolutely does not”. From the OpenCL-like multiple GPU model taking much of the QA out of SLI and CrossFire, to the memory and resource binding management, this should make graphics drivers so much easier.
It might not be free, though. Graphics drivers might still have a bunch of games to play to make sure that work is stuffed through the GPU as tightly packed as possible. We might continue to see “Game Ready” drivers in the coming years, even though much of that burden has been shifted to the game developers. On the other hand, maybe these APIs will level the whole playing field and let all players focus on chip design and efficient ingestion of shader code. As always, painfully always, time will tell.
Subject: Graphics Cards | August 31, 2015 - 07:19 PM | Scott Michaud
Tagged: nvidia, graphics drivers, geforce, drivers
Unlike last week's 355.80 Hotfix, today's driver is fully certified by both NVIDIA and Microsoft (WHQL). According to users on GeForce Forums, this driver includes the hotfix changes, although I am still seeing a few users complain about memory issues under SLI. The general consensus seems to be that a number of bugs were fixed, and that driver quality is steadily increasing. This is also a “Game Ready” driver for Mad Max and Metal Gear Solid V: The Phantom Pain.
NVIDIA's GeForce Game Ready 355.82 WHQL Mad Max and Metal Gear Solid V: The Phantom Pain drivers (inhale, exhale, inhale) are now available for download at their website. Note that Windows 10 drivers are separate from Windows 7 and Windows 8.x ones, so be sure to not take shortcuts when filling out the “select your driver” form. That, or just use GeForce Experience.
Subject: Graphics Cards, Processors | August 30, 2015 - 09:14 PM | Scott Michaud
Tagged: amd, carrizo, Fiji, opencl, opencl 2.0
Apart from manufacturers with a heavy first-party focus, such as Apple and Nintendo, hardware is useless without developer support. In this case, AMD has updated their App SDK to include support for OpenCL 2.0, with code samples. It also updates the SDK for Windows 10, Carrizo, and Fiji, but it is not entirely clear how.
That said, OpenCL is important to those two products. Fiji has a very high compute throughput compared to any other GPU at the moment, and its memory bandwidth is often even more important for GPGPU workloads. It is also useful for Carrizo, because parallel compute and HSA features are what make it a unique product. AMD has been creating first-party software and helping popular third-party developers such as Adobe, but a little support to the world at large could bring a killer application or two, especially from the open-source community.
The SDK has been available in pre-release form for quite some time now, but it has finally graduated out of beta. OpenCL 2.0 allows for work to be generated on the GPU, which is especially useful for tasks that vary upon previous results without contacting the CPU again.
Subject: Graphics Cards | August 27, 2015 - 05:23 PM | Scott Michaud
Tagged: windows 10, nvidia, geforce, drivers, graphics drivers
While GeForce Hotfix driver 355.80 is not certified, or even beta, I know that a lot of our readers have issues with SLI in Windows 10. Especially in games like Battlefield 4, memory usage would expand until, apparently, a crash occurred. Since I run a single GPU, I have not experienced this issue and so I cannot comment on what happens. I just know that it was very common in the GeForce forums and in our comment section, so it was probably a big problem for many users.
If you are not experiencing this problem, then you probably should not install this driver. This is a hotfix that, as stated above, was released outside of NVIDIA's typical update process. You might experience new, unknown issues. Affected users, on the other hand, have the choice to install the fix now, which could very well be stable, or wait for a certified release later.
You can pick it up from NVIDIA's support site.
The Tiniest Fiji
Way back on June 16th, AMD held a live stream event during E3 to announce a host of new products. In that group was the AMD Radeon R9 Fury X, R9 Fury and the R9 Nano. Of the three, the Nano was the most intriguing to most of the online press as it was the one we knew the least about. AMD promised a full Fiji GPU in a package with a 6-in PCB and a 175 watt TDP. Well today, AMD is, uh, re-announcing (??) the AMD Radeon R9 Nano with more details on specifications, performance and availability.
First, let’s get this out of the way: AMD is making this announcement today because they publicly promised the R9 Nano for August. And with the final days of summer creeping up on them, rather than answer questions about another delay, AMD is instead going the route of a paper launch, but one with a known end date. We will apparently get our samples of the hardware in early September with reviews and the on-sale date following shortly thereafter. (Update: AMD claims the R9 Nano will be on store shelves on September 10th and should have "critical mass" of availability.)
Now let’s get to the details that you are really here for. And rather than start with the marketing spin on the specifications that AMD presented to the media, let’s dive into the gory details right now.
| | R9 Nano | R9 Fury | R9 Fury X | GTX 980 Ti | TITAN X | GTX 980 | R9 290X |
|---|---|---|---|---|---|---|---|
| GPU | Fiji XT | Fiji Pro | Fiji XT | GM200 | GM200 | GM204 | Hawaii XT |
| Rated Clock | 1000 MHz | 1000 MHz | 1050 MHz | 1000 MHz | 1000 MHz | 1126 MHz | 1000 MHz |
| Memory Clock | 500 MHz | 500 MHz | 500 MHz | 7000 MHz | 7000 MHz | 7000 MHz | 5000 MHz |
| Memory Interface | 4096-bit (HBM) | 4096-bit (HBM) | 4096-bit (HBM) | 384-bit | 384-bit | 256-bit | 512-bit |
| Memory Bandwidth | 512 GB/s | 512 GB/s | 512 GB/s | 336 GB/s | 336 GB/s | 224 GB/s | 320 GB/s |
| TDP | 175 watts | 275 watts | 275 watts | 250 watts | 250 watts | 165 watts | 290 watts |
| Peak Compute | 8.19 TFLOPS | 7.20 TFLOPS | 8.60 TFLOPS | 5.63 TFLOPS | 6.14 TFLOPS | 4.61 TFLOPS | 5.63 TFLOPS |
AMD wasn’t fooling around: the Radeon R9 Nano graphics card does indeed include a full implementation of the Fiji GPU and HBM, including 4096 stream processors, 256 texture units and 64 ROPs. The GPU core clock is rated “up to” 1.0 GHz, nearly the same as the Fury X (1050 MHz), and the only difference that I can see in the specifications on paper is that the Nano is rated at 8.19 TFLOPS of theoretical compute performance while the Fury X is rated at 8.60 TFLOPS.
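The peak compute and memory bandwidth rows in the table follow from the other specifications, and a short Python sketch makes the arithmetic explicit. Two assumptions are baked in and are not stated in AMD's materials quoted here: each stream processor is counted as two floating-point operations per clock (one fused multiply-add), and HBM transfers data twice per memory clock, so 500 MHz is treated as 1000 MT/s. The GDDR5 figures in the table are taken as already-effective transfer rates. The function names are hypothetical.

```python
def peak_tflops(stream_processors, clock_mhz, flops_per_clock=2):
    """Theoretical single-precision throughput in TFLOPS.

    flops_per_clock=2 assumes one fused multiply-add per
    stream processor per cycle, counted as two operations.
    """
    return stream_processors * flops_per_clock * clock_mhz * 1e6 / 1e12


def bandwidth_gbs(bus_width_bits, effective_mtps):
    """Theoretical memory bandwidth in GB/s from bus width (bits)
    and effective transfer rate (million transfers per second)."""
    return bus_width_bits / 8 * effective_mtps * 1e6 / 1e9


nano_tflops = peak_tflops(4096, 1000)    # R9 Nano, "up to" 1000 MHz
fury_x_tflops = peak_tflops(4096, 1050)  # R9 Fury X, 1050 MHz

hbm_gbs = bandwidth_gbs(4096, 500 * 2)   # HBM: 500 MHz, two transfers per clock
gm200_gbs = bandwidth_gbs(384, 7000)     # GTX 980 Ti / TITAN X

print(f"R9 Nano:   {nano_tflops:.2f} TFLOPS")
print(f"R9 Fury X: {fury_x_tflops:.2f} TFLOPS")
print(f"HBM:       {hbm_gbs:.0f} GB/s")
print(f"GM200:     {gm200_gbs:.0f} GB/s")
```

Since the Nano's clock is an "up to" rating, its 8.19 TFLOPS should be read as a ceiling rather than a sustained figure.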
Retail Card Design
AMD is in an interesting spot right now. The general consensus is that both the AMD Radeon R9 Fury X and the R9 Fury graphics cards had successful launches into the enthusiast community. We found that the performance of the Fury X was slightly under that of the GTX 980 Ti from NVIDIA, but also that the noise levels and power draw were so improved on Fiji over Hawaii that many users would dive head first into the new flagship from the red team.
The launch of the non-X AMD Fury card was even more interesting: here was a card with a GPU performing better than the competition at a price point for which NVIDIA didn’t have an exact answer. The performance gap between the GTX 980 and GTX 980 Ti resulted in a $550 graphics card that AMD had a victory with. Add in the third Fiji-based product due out in a few short weeks, the R9 Nano, and you have a robust family of products that don’t exactly dominate the market but do put AMD in a positive position unlike any it has seen in recent years.
But there are some problems. First and foremost for AMD is the continuing drop in market share. With the most recent reports from multiple sources claiming that AMD’s Q2 2015 share has dropped to 18%, an all-time low in the last decade or so, AMD needs some growth and they need it now. Here’s the catch: AMD can’t make enough of the Fiji chip to affect that number at all. The Fury X, Fury and Nano are going to be hard to find for the foreseeable future thanks to production limits on the HBM (high bandwidth memory) integration, the same feature that helps make Fiji the compelling product it is. I have been keeping an eye on the stock of the Fury and Fury X products and found that they often can’t be found anywhere in the US for purchase. Maybe even more damning is the fact that the Radeon R9 Fury, the card that is supposed to be the model customizable by AMD board partners, still only has two options available: the Sapphire, which we reviewed when it launched, and the ASUS Strix R9 Fury that we are reviewing today.
AMD’s product and financial issues aside, the fact is that the Radeon R9 Fury 4GB and the ASUS Strix iteration of it are damned good products. ASUS has done its usual job of improving on the design of the reference PCB and cooler, added in some great features and packaged it up at a price that is competitive and well worth the investment for enthusiast gamers. Our review today will only lightly touch on out-of-box performance of the Strix card, mostly because it is so similar to that of the initial Fury review we posted in July. Instead I will look at the changes to the positioning of the AMD Fury product (if any) and how the cooler and design of the Strix product help it stand out. Overclocking, power consumption and noise will all be evaluated as well.
Subject: Graphics Cards | August 25, 2015 - 02:23 PM | Sebastian Peak
Tagged: Radeon R9 Nano, radeon, r9 nano, hbm, graphics, gpu, amd
New detailed photos of the upcoming Radeon R9 Nano have surfaced, and Ryan has confirmed with AMD that these are in fact real.
We've seen the outside of the card before, but for the first time we are provided a detailed look under the hood.
The cooler is quite compact and has copper heatpipes for both core and VRM
The R9 Nano is a very small card and it will be powered with a single 8-pin power connector directed toward the back.
Connectivity is provided via three DisplayPort outputs and a single HDMI port
And fans of backplates will need to seek 3rd-party offerings as it looks like this will have a bare PCB around back.
We will keep you updated if any official specifications become available, and of course we'll have complete coverage once the R9 Nano is officially launched!
Subject: Graphics Cards | August 24, 2015 - 03:43 PM | Jeremy Hellstrom
Tagged: nvidia, moba, maxwell, gtx 950, GM206, geforce, DOTA 2
It is more fun testing at the high end and the number of MOBA gamers here at PCPer could be described as very sparse, to say the least. Perhaps you are a MOBA gamer looking to play on a 1080p screen and have less than $200 to invest in a GPU and feel that Ryan somehow missed a benchmark that is important to you. One of the dozens of reviews linked to below is likely to have covered that game or specific feature which you are looking for. They also represent the gamut of cards available at launch from a wide variety of vendors, both stock and overclocked models. If you just want a quick refresher on the specifications and what has happened to the pricing on already released models, The Tech Report has handy tables for you to reference here.
"For most of this summer, much of the excitement in the GPU market has been focused on pricey, high-end products like the Radeon Fury and the GeForce GTX 980 Ti. Today, Nvidia is turning the spotlight back on more affordable graphics cards with the introduction of the GeForce GTX 950, a $159.99 offering that promises to handle the latest games reasonably well at the everyman's resolution of 1080p."
Here are some more Graphics Card articles from around the web:
- ASUS GeForce GTX 950 STRIX Graphics Card Review @ Techgage
- MSI GTX 950 Gaming 2G @ Modders-Inc
- EVGA GeForce GTX 950 FTW Edition Video Card Review @ HiTech Legion
- Nvidia GTX 950 Round-Up Review: Three Cards Go Head to Head @ eTeknix
- NVIDIA GeForce GTX 950 Roundup featuring ASUS and MSI @ Neoseeker
- Palit GTX 950 2GB StormX Dual @ Kitguru
- GeForce GTX 950 @ HardwareHeaven
- Asus STRIX Gaming GTX 950 2GB DC2 OC @ Kitguru
- NVIDIA Introduces the GeForce GTX 950 for MOBA Gamers and Shares the GeForce Experience @ OCC
- ZOTAC GeForce GTX 950 AMP! Edition 2 GB @ techPowerUp
- Gigabyte GeForce GTX 950 OC 2 GB @ techPowerUp
- EVGA GeForce GTX 950 SSC 2 GB @ techPowerUp
- ASUS GTX 950 STRIX OC 2 GB @ techPowerUp
- Nvidia GeForce GTX 950 @ Legion Hardware
- The NVIDIA GTX 950 Review @ Hardware Canucks
- Asus, MSI, EVGA GTX 950 Review @ OCC
- The Extreme Cases Where A Sub-$200 NVIDIA GPU Can Beat A $550+ AMD R9 Fury On Linux @ Phoronix
- NVIDIA's GeForce GTX 950 Is A $150+ Bargain For Linux Gamers @ Phoronix
- The NVIDIA GPUs Delivering The Best Performance Per Watt & Per Dollar For Linux Gamers @ Phoronix
- The ASUS GTX 980 Ti STRIX OC Review @ Hardware Canucks
- Inno3D iChill GeForce GTX 980 4GB Ultra Review @ NikKTech
- HIS R7 370 IceQ X2 OC 2GB Video Card Review @ Madshrimps
- ASUS R9 390X STRIX DirectCU III 8G OC @ [H]ard|OCP
Subject: Graphics Cards | August 24, 2015 - 02:37 PM | Sebastian Peak
Tagged: rumor, report, Radeon R9 Nano, R9 290X, leak, hot chips, hbm, amd
A report from German-language tech site Golem contains what appears to be a slide leaked from AMD's GPU presentation at Hot Chips in Cupertino, and the results paint a very efficient picture of the upcoming Radeon R9 Nano GPU.
The spelling of "performance" doesn't mean this is fake, does it?
While only managing 3 FPS better than the Radeon R9 290X in this particular benchmark, this result was achieved with 1.9x the performance per watt of the baseline 290X in the test. The article speculates on the possible clock speed of the R9 Nano based on the relative performance, and estimates 850 MHz (which is of course up for debate as no official specs are known).
The most compelling part of the result has to be the ability of the Nano to match or exceed the R9 290X in performance, while only requiring a single 8-pin PCIe connector and needing an average of only 175 watts. With a mini-ITX friendly 15 cm board (5.9 inches) this could be one of the more compelling options for a mini gaming rig going forward.
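As a sanity check on the leaked 1.9x figure, here is a small Python sketch that relates performance, power, and performance per watt. It uses only numbers from this page: the 175 watts quoted for the Nano and the 290 watt TDP listed for the R9 290X in our specification table. Treating those ratings as the actual power drawn during the benchmark is an assumption, since the slide's measured power figures are not known, and the function names are hypothetical.

```python
def perf_per_watt_ratio(perf_ratio, power_a_watts, power_b_watts):
    """Performance per watt of card A relative to card B, where
    perf_ratio is A's performance divided by B's."""
    return perf_ratio * power_b_watts / power_a_watts


def required_perf_ratio(target_ppw_ratio, power_a_watts, power_b_watts):
    """Performance ratio A/B needed to hit a target perf-per-watt ratio."""
    return target_ppw_ratio * power_a_watts / power_b_watts


NANO_WATTS = 175.0     # figure quoted for the R9 Nano
R9_290X_WATTS = 290.0  # TDP listed for the R9 290X (assumed as actual draw)

# Efficiency gain if the two cards performed identically.
equal_perf_gain = perf_per_watt_ratio(1.0, NANO_WATTS, R9_290X_WATTS)

# Performance lead the Nano would need for a 1.9x efficiency gain.
needed_lead = required_perf_ratio(1.9, NANO_WATTS, R9_290X_WATTS)

print(f"perf/watt gain at equal performance: {equal_perf_gain:.2f}x")
print(f"performance ratio needed for 1.9x:   {needed_lead:.2f}x")
```

If those power assumptions hold, equal performance alone would account for roughly 1.66x, so the rest of the claimed gain would have to come from the Nano's small performance lead or from the 290X drawing more than its rated TDP in that test.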
We have a lot of questions that have yet to be answered of course, including the actual speed of both core and HBM, and just how quiet this air-cooled card might be under load. We shouldn't have to wait much longer!
Subject: Graphics Cards | August 21, 2015 - 11:30 AM | Sebastian Peak
Tagged: PC, nvidia, Matrox, jpr, graphics cards, gpu market share, desktop market share, amd, AIB, add in board
While we reported recently on the decline of overall GPU shipments, a new report out of Jon Peddie Research covers the add-in board segment to give us a look at the desktop graphics card market. So how are the big two (sorry Matrox) doing?
|GPU Supplier||Market Share This Quarter||Market Share Last Quarter||Market Share Last Year|
The big news is of course a drop in market share for AMD of 4.5 percentage points quarter-to-quarter, and down to just 18% from 37.9% last year. There will be many opinions as to why their share has been dropping in the last year, but it certainly didn't help that the 300-series GPUs are rebrands of 200-series, and the new Fury cards have had very limited availability so far.
The graph from Mercury Research illustrates what is almost a mirror image, with NVIDIA gaining 20% as AMD lost 20%, for a 40% swing in overall share. Ouch. Meanwhile (not pictured) Matrox didn't have a statistically meaningful quarter but still managed to appear on the JPR report with 0.1% market share (somehow) last quarter.
The desktop market, and specifically the enthusiast segment, isn't actually suffering quite as much as the overall PC market.
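Because share figures like these mix percentage points with relative change, a short Python sketch may help keep them apart. It uses the AMD numbers quoted above (18% this quarter, 37.9% a year ago) and assumes the 4.5% quarter-to-quarter drop is in percentage points, which would put last quarter at 22.5%. The helper names are hypothetical.

```python
def point_change(old_share, new_share):
    """Change in market share, in percentage points."""
    return new_share - old_share


def relative_change(old_share, new_share):
    """Change in market share as a fraction of the old share."""
    return (new_share - old_share) / old_share


amd_now = 18.0
amd_last_year = 37.9
amd_last_quarter = amd_now + 4.5  # assumes the 4.5% drop is in points

yoy_points = point_change(amd_last_year, amd_now)
yoy_relative = relative_change(amd_last_year, amd_now)
qoq_relative = relative_change(amd_last_quarter, amd_now)

print(f"year over year:       {yoy_points:+.1f} points ({yoy_relative:+.1%})")
print(f"quarter over quarter: -4.5 points ({qoq_relative:+.1%})")
```

Read this way, the "20%" in the mirror-image comparison is a swing of about 20 points, which corresponds to AMD losing a little over half of the share it held a year earlier.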
"The AIB market has benefited from the enthusiast segment PC growth, which has been partially fueled by recent introductions of exciting new powerful (GPUs). The demand for high-end PCs and associated hardware from the enthusiast and overclocking segments has bucked the downward trend and given AIB vendors a needed prospect to offset declining sales in the mainstream consumer space."
But not all is well, considering that the overall add-in board attach rate with desktops "has declined from a high of 63% in Q1 2008 to 37% this quarter". This is indicative of the overall trend toward integrated GPUs in the industry with AMD APUs and Intel processor graphics, as illustrated by this graphic from the report.
The year-to-year numbers show an overall drop of 18.8%, and even with their dominant 81.9% market share NVIDIA has still seen their shipments decrease by 12% this quarter. These trends seem to indicate a gloomy future for discrete graphics in the coming years, but for now we in the enthusiast community will continue to keep it afloat. It would certainly be nice to see some gains from AMD soon to keep things interesting, which might help bring prices down from their lofty $400 - $600 mark for flagship cards at the moment.