Subject: Graphics Cards | July 27, 2016 - 01:56 AM | Tim Verry
Tagged: solid state, radeon pro, Polaris, gpgpu, amd
UPDATE (July 27th, 1am ET): More information on the Radeon Pro SSG has surfaced since the original article. According to AnandTech, the prototype graphics card actually uses an AMD Fiji GPU. The Fiji GPU is paired with onboard PCI-E based storage using the same PEX8747 bridge chip used in the Radeon Pro Duo. Storage is handled by two PCI-E 3.0 x4 M.2 slots that can accommodate up to 1TB of NAND flash storage. As I mentioned below, having the storage on board the graphics card vastly reduces latency by reducing the number of hops and not having to send requests out to the rest of the system. AMD had more numbers to share following their demo, however.
From the 8K video editing demo, the dual Samsung 950 Pro PCI-E SSDs (in RAID 0) on board the Radeon Pro SSG hit 4GB/s while scrubbing through the video. That same video source stored on a Samsung 950 Pro attached to the motherboard had throughput of only 900MB/s. In theory, reaching out to system RAM still has raw throughput advantages (with DDR4 @ 3200 MHz on a Haswell-E platform theoretically capable of 62 GB/s reads and 47 GB/s writes), though that would be bottlenecked by the graphics card having to go over the PCI-E 3.0 x16 link and its maximum of 15.754 GB/s. Of course, if you can hold it in (much smaller) GDDR5 (300+GB/s depending on clocks and memory bus width) or HBM (1TB/s) and not have to go out to any other storage tier, that's ideal, but it is not always feasible, especially in the HPC world.
However, having onboard storage on the same board as the GPU only a single "hop" away vastly reduces latency and offers much more total storage space than most systems have in DDR3 or DDR4. In essence, the solid state storage on the graphics card (which developers will need to specifically code for) acts as a massive cache for streaming in assets for data sets and workloads that are highly impacted by latency. This storage is not the fastest, but is the next best thing for holding active data outside of GDDR5/x or HBM. For throughput intensive workloads, reaching out to system RAM will be better. Finally, reaching out to system attached storage should be the last resort as it will be the slowest and most latent. Several commenters mentioned using a PCI-E based SSD in a second slot on the motherboard accessed much like GPUs in CrossFire communicate now (DMA over the PCI-E bus), which is an interesting idea that I had not considered.
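For readers wondering where the 15.754 GB/s figure comes from, here is a minimal sketch of the arithmetic. It assumes the standard PCI-E 3.0 signaling rate of 8 GT/s per lane with 128b/130b encoding; the function name is just an illustration, and real-world throughput will be lower once protocol overhead is counted.

```python
def pcie3_bandwidth_gbs(lanes: int) -> float:
    """Theoretical one-direction PCI-E 3.0 bandwidth in GB/s."""
    transfers_per_second = 8e9          # 8 GT/s per lane
    encoding_efficiency = 128 / 130     # 128b/130b line coding
    bits_per_second = transfers_per_second * encoding_efficiency * lanes
    return bits_per_second / 8 / 1e9    # bits -> bytes -> GB

# The x16 link between the graphics card and the rest of the system.
x16_link = pcie3_bandwidth_gbs(16)

# Two x4 M.2 slots, as on the prototype board (link ceiling, not SSD speed).
dual_m2_ceiling = 2 * pcie3_bandwidth_gbs(4)

print(f"x16 link:        {x16_link:.3f} GB/s")
print(f"two x4 M.2 links: {dual_m2_ceiling:.3f} GB/s")
```

The second number is only the ceiling of the two M.2 links, so the 4GB/s seen in the demo sits comfortably under it.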
Per my understanding of the situation, I think that the on board SSG storage would still be slightly more beneficial than this setup but it would get you close (I am assuming the GPU would be able to directly interact and request data from the SSD controller and not have to rely on the system CPU to do this work but I may well be mistaken. I will have to look into this further and ask the experts heh). On the prototype Radeon Pro SSG the M.2 slots are actually able to be seen as drives by the system and OS so it is essentially acting as if there was a PCI-E adapter card in a slot on the motherboard holding those drives but that may not be the case should this product actually hit the market. I do question their choice to go with Fiji rather than Polaris, but it sounds like they built the prototype off of the Radeon Pro Duo platform so I suppose it would make sense there.
Hopefully the final versions in 2017 or beyond use at least Vega though :).
Alongside the launch of new Radeon Pro WX (workstation) series graphics cards, AMD teased an interesting new Radeon Pro product: the Radeon Pro SSG. This new professional graphics card pairs a Polaris GPU with up to a terabyte of on board solid state storage and seeks to solve one of the biggest hurdles in GPGPU performance when dealing with extremely large datasets, which is latency.
One of the core focuses of AMD's HSA (heterogeneous system architecture) is unified memory and the ability of various processors (CPU, GPU, specialized co-processors, et al) to work together efficiently by being able to access and manipulate data from the same memory pool without having to copy data back and forth between CPU-accessible memory and GPU-accessible memory. With the Radeon Pro SSG, this idea is not fully realized (it is more of a sidestep), but it will move performance further. It does not eliminate the need to copy data to the GPU before it can work on it, but once copied the GPU will be able to work on data stored in what AMD describes as a one terabyte frame buffer. This memory will be solid state and very fast, but more importantly it will be able to get at the data with much lower latency than previous methods. AMD claims the solid state storage (likely NAND but they have not said) will link with the GPU over a dedicated PCI-E bus. I suppose that if you can't bring the GPU to the data, you bring the data to the GPU!
Considering AMD's previous memory champ – the Radeon W9100 – maxed out at 32GB of GDDR5, the teased Radeon Pro SSG with its 1TB of purportedly low latency onboard flash storage opens up a slew of new possibilities for researchers and professionals in media, medical, and scientific roles working with massive datasets for imaging, creation, and simulations! I expect that there are many professionals out there eager to get their hands on one of these cards! They will be able to as well thanks to a beta program launching shortly, so long as they have $10,000 for the hardware!
AMD gave a couple of examples in their PR on the potential benefits of its "solid state graphics" including the ability to image a patient's beating heart in real time to allow medical professionals to examine and spot issues as early as possible and using the Radeon Pro SSG to edit and scrub through 8K video in real time at 90 FPS versus 17 with current offerings. On the scientific side of things being able to load up entire models into the new graphics memory (not as low latency as GDDR5 or HBM certainly) will be a boon as will being able to get data sets as close to the GPU as possible into servers using GPU accelerated databases powering websites accessed by millions of users.
It is not exactly the HSA future I have been waiting for ever so impatiently, but it is a nice advancement and an intriguing idea. I am very curious to see how well it pans out and whether developers and researchers will truly take advantage of it to further their projects. I suspect something like this could be great for deep learning tasks as well (such as powering the "clouds" behind self driving cars perhaps).
Stay tuned to PC Perspective for more information as it develops.
This is definitely a product that I will be watching and I hope that it does well. I am curious what Nvidia's and Intel's plans are here as well! What are your thoughts on AMD's "Solid State Graphics" card? All hype or something promising?
Radeon Software 16.7.1 Adjustments
Last week we posted a story that looked at a problem found with the new AMD Radeon RX 480 graphics card’s power consumption. The short version of the issue was that AMD’s new Polaris 10-based reference card was drawing more power than its stated 150 watt TDP and that it was drawing more power through the motherboard PCI Express slot than the connection was rated for. And sometimes that added power draw was significant, both at stock settings and overclocked. Seeing current draw over a connection rated at just 5.5A peaking over 7A at stock settings raised an alarm (validly) and our initial report detailed the problem very specifically.
AMD responded initially that “everything was fine here” but the company eventually saw the writing on the wall and started to work on potential solutions. The Radeon RX 480 is a very important product for the future of Radeon graphics and this was a launch that needed to be as perfect as it could be. Though the risk to users’ hardware with the higher than expected current draw is muted somewhat by motherboard-based over-current protection, it’s crazy to think that AMD actually believed that was the ideal scenario. Depending on the “circuit breaker” in any system to save you when standards exist for exactly that purpose is nuts.
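To put those amperage figures in more familiar terms, here is a quick sketch that converts them to watts. It assumes the current in question is on the slot's +12V rail (the driver fix discussed below talks about the +12V sources); the constant and function names are only for illustration.

```python
RAIL_VOLTS = 12.0        # assumption: the current is measured on the +12V rail
SLOT_LIMIT_AMPS = 5.5    # rating quoted in the story


def watts(amps: float, volts: float = RAIL_VOLTS) -> float:
    """Power in watts for a given current on the rail (P = V * I)."""
    return amps * volts


def percent_over(amps: float, limit: float = SLOT_LIMIT_AMPS) -> float:
    """How far a measured current sits above the rated limit, in percent."""
    return (amps / limit - 1.0) * 100.0


print(f"Rated:    {watts(SLOT_LIMIT_AMPS):.0f} W")
print(f"7A peak:  {watts(7.0):.0f} W ({percent_over(7.0):.1f}% over the rating)")
```

Under that assumption, a 7A peak works out to 84 watts against a 66 watt rating.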
Today AMD has released a new driver, version 16.7.1, that actually introduces a pair of fixes for the problem. One of them is hard coded into the software and adjusts power draw from the different +12V sources (PCI Express slot and 6-pin connector) while the other is an optional flag in the software that is disabled by default.
Reconfiguring the power phase controller
The Radeon RX 480 uses a very common power controller (IR3567B) on its PCB to cycle through the 6 power phases providing electricity to the GPU itself. Allyn did some simple multimeter trace work to tell us which phases were connected to which sources and the result is seen below.
The power controller is responsible for pacing the power coming in from the PCI Express slot and the 6-pin power connection to the GPU, in phases. Phases 1-3 come in from the power supply via the 6-pin connection, while phases 4-6 source power from the motherboard directly. At launch, the RX 480 drew nearly identical amounts of power from both the PEG slot and the 6-pin connection, essentially giving each of the 6 phases at work equal time.
That might seem okay, but it’s far from the standard of what we have seen in the past. In no other case have we measured a graphics card drawing equal power from the PEG slot as from an external power connector on the card. (Obviously for cards without external power connections, that’s a different discussion.) In general, with other AMD and NVIDIA based graphics cards, the motherboard slot would provide no more than 50-60 watts of power, while any above that would come from the 6/8-pin connections on the card. In many cases I saw that power draw through the PEG slot was as low as 20-30 watts if the external power connections provided a lot of overage for the target TDP of the product.
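The phase wiring described above can be sketched as a toy model: each phase carries a share of the total load, and each power source ends up supplying whatever its three phases carry. The 160 watt total and the biased weights below are hypothetical values picked for illustration, not measurements, and this is not a description of how the IR3567B is actually programmed.

```python
# Phase-to-source wiring as traced in the story:
# phases 1-3 are fed by the 6-pin connector, phases 4-6 by the PEG slot.
PHASE_SOURCE = {1: "6-pin", 2: "6-pin", 3: "6-pin",
                4: "slot", 5: "slot", 6: "slot"}


def split_power(total_watts: float, phase_weights: dict) -> dict:
    """Divide total power between sources according to per-phase weights."""
    total_weight = sum(phase_weights.values())
    draw = {"6-pin": 0.0, "slot": 0.0}
    for phase, weight in phase_weights.items():
        draw[PHASE_SOURCE[phase]] += total_watts * weight / total_weight
    return draw


# Launch behavior: every phase gets equal time, so the sources split evenly.
equal = split_power(160.0, {phase: 1.0 for phase in PHASE_SOURCE})

# Hypothetical rebalance that leans on the 6-pin phases.
biased = split_power(160.0, {1: 1.2, 2: 1.2, 3: 1.2, 4: 0.8, 5: 0.8, 6: 0.8})

print(equal)
print(biased)
```

With equal weights, half of the load lands on the slot, which is the behavior that stood out in our measurements. Any shift of weight toward phases 1-3 moves that load to the 6-pin cable instead.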
Subject: General Tech | July 7, 2016 - 02:20 PM | Allyn Malventano
Tagged: xbox play, video, Thrustmaster, technology, Samsung 840, rx 480, review, radeon 490, radeon, power, Polaris, podcast, pcper, news, Micron 9100 MAX SSD, lenovo thinkpad x1 yoga, Kinetic, gtx 1060, EVO, cooler, coolchip, alcantera
PC Perspective Podcast #407 - 07/07/2016
Join us this week as we discuss RX 480 Power Concerns, X1 Yoga, Thrustmaster, Micron 9100 MAX, and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store (audio only)
- Google Play - Subscribe to our audio podcast directly through Google Play!
- RSS - Subscribe through your regular RSS reader (audio only)
- MP3 - Direct download link to the MP3 file
This episode of the PC Perspective Podcast is sponsored by Kaspersky! (promo code pcper)
Hosts: Ryan Shrout, Allyn Malventano, Jeremy Hellstrom, and Josh Walrath
Week in Review:
News items of interest:
Hardware/Software Picks of the Week
Jeremy: Canuck with no patience? Gigabyte GeForce GTX 1070 G1 Gaming
Subject: Graphics Cards | July 6, 2016 - 09:37 PM | Scott Michaud
Tagged: amd, linux, graphics drivers, rx 480, Polaris
Linux support from AMD seems to be improving, as it has been on Windows. We'll be combining two separate, tiny stories into one, so bear with us. The first is from Fudzilla, and it states that AMD has AMDGPU-PRO 16.30 drivers for the RX 480 out on day one. It's nice to see that their Radeon driver initiative applies to Linux, too.
That brings us to the second story, this one from Phoronix. On Windows, the Crimson 16.7.1 drivers will include a fix for the RX 480 power issues (which we will obviously test, of course). Michael Larabel was apparently talking with AMD's Linux team, and it seems likely that this update will roll into the Linux driver as well. They "are still investigating", of course, but it is apparently on their radar.
Subject: Graphics Cards | July 6, 2016 - 08:11 PM | Scott Michaud
Tagged: rx 480, Polaris, graphics drivers, amd
In the next 24 hours or so, AMD will publish Radeon Software 16.7.1, which addresses the power distribution issues in the AMD Radeon RX 480. The driver makes two major changes. First, AMD claims that it will lower the draw from the PCIe bus. While they don't explicitly say how, it sounds like it will increase the load on the 6-pin PCIe cable, which is typically over-provisioned. In fact, many power supplies have 6-pin connectors that have the extra two pins of an 8-pin connector hanging off of it.
Second, seemingly for those who aren't comfortable with the extra load on the 6-pin PCIe connector, a UI control has been added to lower overall power. Being that the option's called “compatibility”, it sounds like it should put the RX 480 back into spec on both slot and the extra power connector. Again, AMD says that they believe it's not necessary, and it seems to be true, because that option is off by default.
Beyond these changes, the driver also adds a bunch of game optimizations. Allyn and Ryan have been working on this coverage, so expect more content from them in the very near future.
Subject: Graphics Cards | July 6, 2016 - 05:32 PM | Scott Michaud
Tagged: amd, Polaris, rx 460, rx 470, rx 480, RX 490, sapphire
Unfortunately, I don't have a Sapphire SSC ID, so I cannot verify these myself. That said, a Reddit user by the name of CBwardog found a few extra listings on the company's drop-down menu for products which really shouldn't exist yet. The product name doesn't really have much associated with it, but it does have video RAM and display outputs.
Image Credit: CBwardog on Reddit
According to Sapphire, the Radeon RX 460 will launch in 2GB and 4GB versions, each of which have one HDMI, one DVI, and one DisplayPort connector. The RX 470 will come in 4GB and 8GB versions. The 4GB version of the RX 470 will have HDMI and three DisplayPorts, while the 8GB version of the RX 470 will have two HDMI ports, one DVI port, and two DisplayPort connectors. Lastly, ignoring the RX 480 that we already know about, a “RADEON 490” (which an earlier leak by AMD called the RX 490) will be available in just an 8GB version, with one HDMI and three DisplayPorts.
As always, rumors should be taken with a grain of salt. Also, it is possible that port configuration could be specific to Sapphire, as we've seen AIB partners modify outputs before, but you would think that there would be at least one reference design per model, so, chances are, it should be fairly uniform across vendors.
Subject: Graphics Cards | July 6, 2016 - 07:01 AM | Scott Michaud
Tagged: rx 480, Polaris, amd
Apparently, some people think that AMD will be releasing an RX 490 based on Polaris 10 with an extra four compute units, bringing the total number of stream processors to 2560. I'm guessing that people expected it to be a nice, round number or something, but that's not the case. According to Evan Groenke, Senior Product Manager at AMD, the die has 36 compute units, and there is “nothing else hidden on the product that end users might be looking forward to unlocking”.
Really, this kind-of makes sense. AMD seems to have designed this chip around the performance target of VR, which the RX 480 hits. I don't think that it would really make sense to push about 11% more compute processors into the design, decreasing their yield per wafer for such a relatively small gain.
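For anyone who wants to check the math on that, here is a short sketch. The 64 stream processors per compute unit falls out of the rumor itself (2560 across 36 + 4 compute units); the variable names are just for illustration.

```python
# 2560 stream processors across 40 compute units implies 64 per unit.
STREAM_PROCESSORS_PER_CU = 2560 // 40


def stream_processors(compute_units: int) -> int:
    """Total stream processors for a given compute unit count."""
    return compute_units * STREAM_PROCESSORS_PER_CU


polaris10 = stream_processors(36)        # what the die actually has
rumored = stream_processors(36 + 4)      # the rumored "hidden" configuration
extra_percent = (40 - 36) / 36 * 100     # relative increase in compute units

print(polaris10, rumored, f"{extra_percent:.1f}%")
```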
We are expecting an RX 490 card to land at some point though, thanks to a mistake in publishing on AMD's part. It won't be Polaris 10 or 11.
Subject: Graphics Cards | July 5, 2016 - 12:38 PM | Tim Verry
Tagged: rx 480, Radeon RX 480, polaris 10, Polaris, msi, gcn4
It appears that MSI will be one of the first AIB partners to get a reference version of the AMD RX 480 graphics card out. Available as soon as next week, the MSI Radeon RX 480 8G pairs AMD’s Polaris-based GPU with 8GB of GDDR5 memory on a reference platform and cooler.
The MSI card uses the AMD reference cooler with a blower style fan and measures 9.45” in length. It is a dual slot design with a red and black aesthetic. Rear IO includes three DisplayPort and one HDMI ports. It is powered by a single 6-pin PCI-E power connector.
There is not much to say with regards to clocks on this GCN4-based card as there are no factory overclocks to speak of. The base clock sits at 1120 MHz (which is an average expected clock, not necessarily the minimum) and the GPU can boost up to a maximum of 1266 MHz out of the box. MSI is clocking the memory at the full 8 GHz though, which is good (AMD stated that partners could clock memory anywhere from seven to eight GHz).
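To see what that seven to eight GHz range means for bandwidth, here is a rough sketch. It assumes the RX 480's 256-bit memory bus, which is not stated in this post, and treats the quoted clock as the effective per-pin data rate in Gbps; the function name is only illustrative.

```python
def gddr5_bandwidth_gbs(effective_gbps_per_pin: float,
                        bus_width_bits: int = 256) -> float:
    """Peak memory bandwidth in GB/s (assumes a 256-bit bus by default)."""
    return effective_gbps_per_pin * bus_width_bits / 8


full_speed = gddr5_bandwidth_gbs(8.0)    # memory at the full 8 GHz effective
low_end = gddr5_bandwidth_gbs(7.0)       # the slowest clock partners may use

print(f"8 GHz: {full_speed:.0f} GB/s, 7 GHz: {low_end:.0f} GB/s")
```

Under those assumptions, the full 8 GHz clock is worth 32 GB/s over the slowest allowed configuration.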
Looking around various retailers, it appears that you will be able to get your hands on it as soon as July 9th from Newegg for $240.
Watch out for pricing before clicking that buy button though, because some sites that allow third party sellers have jacked up the prices quite a bit! If you are looking for a reference design, this card should be as good as the rest. Personally, I am looking forward to custom RX 480 cards from MSI and other AIB partners, which should have much higher overclocking potential and a better power phase setup that should alleviate any power consumption concerns of the reference design’s VRM setup. That is not to say that the reference MSI is going to blow up your PC or anything, but from a buyer's perspective I would rather wait for the custom boards with better coolers that I can push further and faster for only a fairly slight premium. If you need a blower style cooler, this card should work.
- The AMD Radeon RX 480 Review - The Polaris Promise
- PCPer Live! Radeon RX 480 Live Stream with Raja Koduri!
- AMD's Raja Koduri talks moving past CrossFire, smaller GPU dies, HBM2 and more.
Subject: Graphics Cards | June 30, 2016 - 07:54 PM | Scott Michaud
Tagged: amd, nvidia, FinFET, Polaris, polaris 10, pascal
If you're trying to purchase a Pascal or Polaris-based GPU, then you are probably well aware that patience is a required virtue. The problem is that, as a hardware website, we don't really know whether the issue is high demand or low supply. Both are manufactured on a new process node, which could mean that yield is a problem. On the other hand, it's been about four years since the last fabrication node, which means that chips got much smaller for the same performance.
Over time, manufacturing processes will mature, and yield will increase. But what about right now? AMD made a very small chip that produces ~GTX 970-level performance. NVIDIA is sticking with their typical, 3XXmm2 chip, which ended up producing higher than Titan X levels of performance.
It turns out that, according to online retailer, Overclockers UK, via Fudzilla, both the RX480 and GTX 1080 have sold over a thousand units at that location alone. That's quite a bit, especially when you consider that it only considers one (large) online retailer from Europe. It's difficult to say how much stock other stores (and regions) received compared to them, but it's still a thousand units in a day.
It's sounding like, for both vendors, pent-up demand might be the dominant factor.
Too much power to the people?
UPDATE (7/1/16): I have added a third page to this story that looks at the power consumption and power draw of the ASUS GeForce GTX 960 Strix card. This card was pointed out by many readers on our site and on reddit as having the same problem as the Radeon RX 480. As it turns out...not so much. Check it out!
UPDATE 2 (7/2/16): We have an official statement from AMD this morning.
As you know, we continuously tune our GPUs in order to maximize their performance within their given power envelopes and the speed of the memory interface, which in this case is an unprecedented 8Gbps for GDDR5. Recently, we identified select scenarios where the tuning of some RX 480 boards was not optimal. Fortunately, we can adjust the GPU's tuning via software in order to resolve this issue. We are already testing a driver that implements a fix, and we will provide an update to the community on our progress on Tuesday (July 5, 2016).
Honestly, that doesn't tell us much. And AMD appears to be deflecting slightly by using words like "some RX 480 boards". I don't believe this is limited to a subset of cards, or review samples only. AMD does indicate that the 8 Gbps memory on the 8GB variant might be partially to blame - which is an interesting correlation to test out later. The company does promise a fix for the problem via a driver update on Tuesday - we'll be sure to give that a test and see what changes are measured in both performance and in power consumption.
The launch of the AMD Radeon RX 480 has generally been considered a success. Our review of the new reference card shows impressive gains in architectural efficiency, improved positioning against NVIDIA’s competing parts in the same price range as well as VR-ready gaming performance starting at $199 for the 4GB model. AMD has every right to be proud of the new product and should have this lone position until the GeForce product line brings a Pascal card down into the same price category.
If you read carefully through my review, there was some interesting data that cropped up around the power consumption and delivery on the new RX 480. Looking at our power consumption numbers, measured directly from the card, not from the wall, it was using slightly more than the 150 watt TDP it was advertised as. This was done at 1920x1080 and tested in both Rise of the Tomb Raider and The Witcher 3.
When overclocked, the results were even higher, approaching the 200 watt mark in Rise of the Tomb Raider!
A portion of the review over at Tom’s Hardware produced similar results but detailed the power consumption from the motherboard PCI Express connection versus the power provided by the 6-pin PCIe power cable. There has been a considerable amount of discussion in the community about the amount of power the RX 480 draws through the motherboard, whether it is out of spec and what kind of impact it might have on the stability or life of the PC the RX 480 is installed in.
As it turns out, we have the ability to measure the exact same kind of data, albeit through a different method than Tom’s, and wanted to see if the result we saw broke down in the same way.
Our Testing Methods
This is a complex topic so it makes sense to detail the methodology of our advanced power testing capability up front.
How do we do it? It is simple in theory but surprisingly difficult in practice: we are intercepting the power being sent through the PCI Express bus as well as the ATX power connectors before they go to the graphics card and are directly measuring power draw with a 10 kHz DAQ (data acquisition) device. A huge thanks goes to Allyn for getting the setup up and running. We built a PCI Express bridge that is tapped to measure both 12V and 3.3V power and built some Corsair power cables that measure the 12V coming through those as well.
The result is data that looks like this.
What you are looking at here is the power measured from the GTX 1080. From time 0 to time 8 seconds or so, the system is idle, from 8 seconds to about 18 seconds Steam is starting up the title. From 18-26 seconds the game is at the menus, we load the game from 26-39 seconds and then we play through our benchmark run after that.
There are four lines drawn in the graph, the 12V and 3.3V results are from the PCI Express bus interface, while the one labeled PCIE is from the PCIE power connection from the power supply to the card. We have the ability to measure two power inputs there but because the GTX 1080 only uses a single 8-pin connector, there is only one shown here. Finally, the blue line is labeled total and is simply that: a total of the other measurements to get combined power draw and usage by the graphics card in question.
From this we can see a couple of interesting data points. First, the idle power of the GTX 1080 Founders Edition is only about 7.5 watts. Second, under a gaming load of Rise of the Tomb Raider, the card is pulling about 165-170 watts on average, though there are plenty of intermittent spikes. Keep in mind we are sampling the power at 1000/s so this kind of behavior is more or less expected.
Different games and applications impose different loads on the GPU and can cause it to draw drastically different power. Even if a game runs slowly, it may not be drawing maximum power from the card if a certain system on the GPU (memory, shaders, ROPs) is bottlenecking other systems.
One interesting note on our data compared to what Tom’s Hardware presents – we are using a second order low pass filter to smooth out the data to make it more readable and more indicative of how power draw is handled by the components on the PCB. Tom’s story reported “maximum” power draw at 300 watts for the RX 480 and while that is technically accurate, those figures represent instantaneous power draw. That is interesting data in some circumstances, and may actually indicate other potential issues with excessively noisy power circuitry, but to us, it makes more sense to sample data at a high rate (10 kHz) but to filter it and present it in a more readable way that better meshes with the continuous power delivery capabilities of the system.
Image source: E2E Texas Instruments
An example of instantaneous voltage spikes on power supply phase changes
Some gamers have expressed concern over that “maximum” power draw of 300 watts on the RX 480 that Tom’s Hardware reported. While that power measurement is technically accurate, it doesn’t represent the continuous power draw of the hardware. Instead, that measure is a result of a high frequency data acquisition system that may take a reading at the exact moment that a power phase on the card switches. Any DC switching power supply that is riding close to a certain power level is going to exceed that on the leading edges of phase switches for some minute amount of time. This is another reason why our low pass filter on power data can help represent real-world power consumption accurately. That doesn’t mean the spikes they measure are not a potential cause for concern, that’s just not what we are focused on with our testing.
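To illustrate why filtering changes the picture so much, here is a minimal sketch of a second order low pass filter, built by cascading two simple one-pole stages. This is not our actual processing code: the cascaded design, the 100 Hz cutoff, and the synthetic trace (a steady 150 watts with a single-sample 300 watt spike every 100 samples) are all assumptions chosen for illustration.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """First-order IIR low pass (discrete RC filter)."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    filtered = []
    y = samples[0]                      # start at the first sample's level
    for x in samples:
        y += alpha * (x - y)            # move a fraction of the way toward x
        filtered.append(y)
    return filtered


def second_order_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Two cascaded one-pole stages give a second-order roll-off."""
    first_pass = one_pole_lowpass(samples, cutoff_hz, sample_rate_hz)
    return one_pole_lowpass(first_pass, cutoff_hz, sample_rate_hz)


# Synthetic one-second trace at 10 kHz: 150 W with brief 300 W spikes.
raw = [300.0 if i > 0 and i % 100 == 0 else 150.0 for i in range(10_000)]
smooth = second_order_lowpass(raw, cutoff_hz=100.0, sample_rate_hz=10_000.0)

print(f"raw peak:      {max(raw):.0f} W")
print(f"filtered peak: {max(smooth):.1f} W")
print(f"raw average:   {sum(raw) / len(raw):.1f} W")
```

In this toy trace the raw "maximum" is double the steady level, yet the average barely moves, and the filtered peak stays close to that average. That is the same distinction between instantaneous and continuous draw described above.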