The AMD Radeon HD 6990 4GB card has been known to the media and even gamers since the first announcements from the Cayman launch last year, but today we are finally able to discuss the technology behind it and the gaming performance it will provide users willing to shell out the $700 it will take to acquire one. Stop in and see if this graphics card is worth your mortgage!
Graphics cards that are this well endowed don't come along very often; the last was the Radeon HD 5970 from AMD back in November of 2009. In a world where power efficiency is touted as a key feature, it has become almost a stigma to have an add-in card in your system that might pull 350-400 watts of power. Considering we were just writing about a complete AMD Fusion platform that used 34 watts IN TOTAL under load, it is an easy task to put killer gaming products like the HD 6990 in an unfair and unreasonable light.
But we aren't those people. Do most people need a $700, 400 watt graphics card? Nope. Do they want it though? Yup. And we are here to show it to you.
Both AMD and NVIDIA have written this story before: take one of your top-level GPUs, double it up on a single PCB or card design that plugs into a single PCI Express slot, and get maximum performance. CrossFire (or SLI) in a single slot - lots to like about that.
The current GPU lineup paints an interesting picture with the Fermi-based GTX 500 series from NVIDIA and the oddly segregated AMD HD 6800 and HD 6900 series of cards. Cayman, the redesigned architecture AMD released as the HD 6970 and HD 6950, brings a lot of changes to the Evergreen design used in previous cards. It has done fairly well in the market though it didn't improve the landscape for AMD discrete graphics as much as many had thought it would and NVIDIA's graphics chips have remained very relevant.
With the rumors swirling about a new dual-GPU option from AMD there was some discussion on whether it would be an HD 6800 / Evergreen based design or an HD 6900 / Cayman contraption. Let's just get that mystery out of the way:
With the VLIW4 microarchitecture on board, we are absolutely seeing a dual Cayman card, and one with a surprisingly high clock speed of 830 MHz out of the gate and lots of headroom for the overclocker in all of us. There are 1536 stream processors per GPU for a total of 3072 and a raw computing power of more than 5 TeraFLOPs. Each GPU is comparable to the one on the HD 6970, which shares the 1536 shader count but runs at a clock rate of 880 MHz.
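As a quick sanity check on the "more than 5 TeraFLOPs" figure, here is a minimal sketch of the usual peak-throughput arithmetic. It assumes each stream processor retires one fused multiply-add (2 floating point operations) per clock, which is the common convention for quoting theoretical peak numbers; the function name `peak_tflops` is made up for this illustration and real games will never hit this ceiling.

```python
def peak_tflops(stream_processors: int, clock_mhz: float, gpus: int = 1) -> float:
    """Theoretical single-precision peak in TFLOPs.

    ASSUMPTION: 2 FLOPs (one multiply-add) per stream processor per clock.
    """
    flops_per_clock = 2
    return stream_processors * gpus * flops_per_clock * clock_mhz * 1e6 / 1e12


# Figures from the text: 1536 SPs per GPU, 830 MHz on the HD 6990,
# 880 MHz on the single-GPU HD 6970.
hd6990 = peak_tflops(1536, 830, gpus=2)
hd6970 = peak_tflops(1536, 880)

print(f"HD 6990 (2 GPUs @ 830 MHz): {hd6990:.2f} TFLOPs")
print(f"HD 6970 (1 GPU  @ 880 MHz): {hd6970:.2f} TFLOPs")
```

Under that assumption the dual-GPU card lands just over the 5 TeraFLOP mark, a bit less than double a single HD 6970 because of the lower clock.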
The memory architecture runs a bit slower as well at 5.0 Gbps (versus the 5.5 Gbps on the HD 6970) but we are still getting a full 2GB per GPU for a grand-spanking-total of 4GB on this single card. Load power on the board is rated at "<375 watts" and just barely makes the budget for PCI Express based solutions with the provided dual 8-pin power connectors.
You might remember that AMD introduced a dual-BIOS switch with the HD 6900 cards as well that would allow users to easily revert to the original BIOS and settings should their overclocking attempts take a turn for the worse. For this card though, they are taking a slightly different approach by having the switch pull duty as an overclocking option directly, pushing the clock frequency up from 830 MHz to 880 MHz. That might not seem like that dramatic of a change (and it isn't) but more noticeable is the change in voltage on the GPUs (going from 1.12V to 1.175V) and what that does to the power consumption and PowerTune options on the card for further tweaking. More on that below.
AMD is definitely letting the enthusiast users have their way with the HD 6990 as even in the Overdrive settings in the control panel you will have the option of pushing the boundaries of the 3000+ stream processors even further. The software will allow clock speed settings as high as 1.2 GHz, a roughly 45% increase over the reference speed, though of course AMD isn't promising you will get there. You will remember that previous iterations of Overdrive on single GPU boards capped the overclocking potential at 1.0 GHz, a cap that actually held us back in our attempts on several occasions.
Memory speeds are overclockable as well, from the 5.0 Gbps default up to 6.0 Gbps - and again, your mileage will vary.
I mentioned PowerTune above, a technology first introduced with the Radeon HD 6900 cards that addressed the issue of "power virus" applications like Furmark. Basically, PowerTune is a way AMD has engineered to control power consumption on a card so that default clock speeds can be set as high as possible without worrying about how Furmark and other programs that draw MUCH more power than games behave.
When you move that BIOS switch on the HD 6990 from the standard setting to the overclocked setting, you aren't just changing the clock speed of the GPU but you are also changing the default settings for PowerTune. Instead of a target load power consumption of about 375 watts, the overclocked card will be able to target as high as 450 watts using some updated and improved circuitry on the board. It is worth noting though that AMD is forced to make this 450 watt option an "overclocked" setting because it exceeds the rated power delivery of the PCI Express slot and associated connectors and would cause a fit for vendors attempting to sell systems using the HD 6990 to consumers. Enthusiasts that buy this card themselves though will have that option and we are glad that AMD continues to support readers like ours by enabling this type of thing.
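To see why 375 watts "just barely makes the budget" and 450 watts does not, here is a small sketch of the power arithmetic. The 75 watt slot and 150 watt 8-pin limits are the standard PCI Express figures; the helper name `board_power_budget` is hypothetical and only here for illustration.

```python
# Per-source limits from the PCI Express specifications (watts).
PCIE_SLOT_W = 75    # power available through the x16 slot itself
EIGHT_PIN_W = 150   # power available per 8-pin auxiliary connector


def board_power_budget(eight_pin_connectors: int) -> int:
    """In-spec power available from the slot plus the 8-pin connectors."""
    return PCIE_SLOT_W + eight_pin_connectors * EIGHT_PIN_W


budget = board_power_budget(2)  # the HD 6990 uses dual 8-pin connectors

# PowerTune targets quoted in the text for the two BIOS switch positions.
for label, target_w in (("default", 375), ("overclocked", 450)):
    over = target_w - budget
    status = "within budget" if over <= 0 else f"{over} W over budget"
    print(f"{label}: {target_w} W target vs {budget} W budget -> {status}")
```

The default setting sits right at the limit, while the overclocked target goes past what the slot and connectors are rated to deliver, which is why it has to be an opt-in mode.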
Above you will see the digital programmable voltage regulators placed in the middle of the PCB, equidistant from either GPU, responsible for VDDC regulation, efficiency and the high current capacity of the card in its "unlocked" state. AMD claims that the symmetrical layout of the card will provide the most efficient power delivery to EACH of the two GPUs and associated memory so you don't have one chip able to run at 975 MHz while the other can only hit 950 MHz, for example. And you can bet that AMD is going to be putting only the very best of the Cayman GPUs on the HD 6990 in order to appease enthusiasts and to help keep that overall power consumption low(er).
Though some users have expressed concerns over what PowerTune does, the truth is that AMD has done a very good job at demonstrating why this feature is good for gamers, though somewhat bad for those of us that want to stress the company's technology to the limits. Above you see a graph that shows the clock speed on the Y axis with two dashed lines - a yellow one at about 570 MHz and a red one at 830 MHz. AMD tells us that without the PowerTune technology at work, they would have been forced to put the clock speeds of the HD 6990 card at the ~570 MHz mark or so in order to compensate for the maximum power draw seen from Furmark and OCCT stress tests. The gray bars for each game listed there then would be the performance of the HD 6990 at that speed.
The red line relates to the clock speed of the HD 6990 as we know it today and with the PowerTune technology implemented as we have discussed. The red arrows you see coming up from each game indicate the added performance a gamer gets out of this graphics card with the additional clock speed, performance it would not have been able to deliver if AMD removed the PowerTune settings. Now I realize that these results are from AMD and as such they might be a bit stingy with the 570 MHz clock speed estimate without PowerTune - it IS in their interest to make us like and understand what it is doing. But even if the real-world speed is something like 650 MHz, I think most users would easily take the extra ~150+ MHz of clock speed over the ability to run Furmark unthrottled.
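For a rough sense of what is at stake, this sketch compares the shipping 830 MHz clock against AMD's claimed 570 MHz no-PowerTune baseline and against the more skeptical 650 MHz guess above. It only measures the clock difference; game performance does not scale perfectly with clock speed, so treat the percentages as an upper bound. The function name `clock_gain_pct` is invented for this example.

```python
def clock_gain_pct(baseline_mhz: float, actual_mhz: float) -> float:
    """Percentage increase in clock speed going from baseline to actual."""
    return (actual_mhz - baseline_mhz) / baseline_mhz * 100


SHIPPING_CLOCK_MHZ = 830

# 570 MHz is AMD's own estimate; 650 MHz is a more conservative guess.
for baseline in (570, 650):
    delta = SHIPPING_CLOCK_MHZ - baseline
    gain = clock_gain_pct(baseline, SHIPPING_CLOCK_MHZ)
    print(f"{baseline} MHz -> {SHIPPING_CLOCK_MHZ} MHz: +{delta} MHz ({gain:.1f}% higher clock)")
```

Even with the more conservative baseline, the clock advantage from PowerTune is well over a quarter.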
I wasn't quite sure what to expect out of the display connection configuration on the HD 6990 and I know we had some doubts about the 6 mini-DP configuration we saw on a few of the HD 5000 series cards over the years. While offering support for 6 monitors out of the box seems great, all the dongles and connectors were kind of a hassle. The HD 6990 seems to find a decent compromise by including 4 mini-DP connections and a single dual-link DVI output. The DVI output means users of a single display will be able to just plug in and go without needing to get any of the adapters out, and the 4 mini-DP outputs will allow users to hook up a TON of monitors to their system for Eyefinity configurations.
Obviously there is the downside that you will not be able to do the 6-panel Eyefinity configuration right away here - you can still do it, but you are going to need a DisplayPort 1.2 hub of some kind, and good luck finding one of those.
In the box with the HD 6990 users will find a set of three DisplayPort adapters to help make the transition a bit easier. There are two PASSIVE dongles, one for HDMI and one for single-link DVI, that will allow you to install a second display of a maximum resolution of 1920x1200. The ACTIVE miniDP to SL-DVI adapter is helpful for users to connect a THIRD monitor to their HD 6990 for Eyefinity or other usage. Keep in mind if you are looking for more adapters that they will need to be ACTIVE for monitors 3-5.
Speaking of Eyefinity, AMD is finally introducing the 5x1 portrait configuration with this driver version that really is the best gaming option in my opinion. 3x1 landscape is also pretty good but a potential 6000x1920 screen (five 1920x1200 displays in portrait mode) provides a much better pixel ratio for game play and doesn't cause some of those annoying bezel issues that stand out in the 3x2 and 2x2 configurations.
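If you want to work out the combined desktop size for other layouts, the arithmetic is simple enough to sketch. This assumes identical 1920x1200 panels and ignores bezel compensation, which the driver can apply on top; `eyefinity_resolution` is a hypothetical helper, not anything from AMD's software.

```python
def eyefinity_resolution(cols, rows, panel=(1920, 1200), portrait=False):
    """Combined resolution of a cols x rows grid of identical panels.

    ASSUMPTION: no bezel compensation; portrait simply rotates each panel.
    """
    width, height = panel
    if portrait:
        width, height = height, width
    return cols * width, rows * height


five_by_one = eyefinity_resolution(5, 1, portrait=True)
three_by_one = eyefinity_resolution(3, 1)

for name, (w, h) in (("5x1 portrait", five_by_one), ("3x1 landscape", three_by_one)):
    print(f"{name}: {w}x{h}, width/height = {w / h:.2f}")
```

The 5x1 portrait layout comes out at 6000x1920, noticeably less stretched than the 5760x1200 you get from three landscape panels.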
Is one HD 6990 not enough for you? First, you're a nut and second, AMD has good news for you. The driver team there has continued to improve the scalability of Radeon GPUs in CrossFire configurations and the above image shows how much performance gain you are getting by moving from 2 GPUs in a single HD 6990 card to 4 GPUs with a pair of them. The results show anything from 1.6x to 1.9x and if that holds up then it definitely rivals what NVIDIA has been able to do with SLI in the past couple of years.
Finally, a specification table shows us the full details of the architecture behind the HD 6990 and what changes are made when you flip that magical switch into the overclocked settings. Obviously all of the silicon details stay the same but with the clock speed bump we see an increase to a total raw processing power of 5.4 TeraFLOPs, 169.0 GigaTexels/s texture fill rate and 56.3 GigaPixels/s fill rate.
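The two fill rate figures can be reproduced with a little arithmetic. Note that the per-GPU unit counts below (96 texture units and 32 ROPs) are not stated in this article; they are an assumption taken from the publicly listed Cayman specifications, and the function name `fill_rates` is made up for this sketch.

```python
# ASSUMPTION: per-GPU unit counts from public Cayman specs, not from the text.
TEXTURE_UNITS_PER_GPU = 96
ROPS_PER_GPU = 32
GPUS = 2


def fill_rates(clock_mhz: float):
    """Return (GigaTexels/s, GigaPixels/s) for the whole card at a given clock."""
    texel_rate = TEXTURE_UNITS_PER_GPU * GPUS * clock_mhz / 1000
    pixel_rate = ROPS_PER_GPU * GPUS * clock_mhz / 1000
    return texel_rate, pixel_rate


# The two BIOS switch positions described in the text.
for label, clock in (("default", 830), ("overclocked", 880)):
    texels, pixels = fill_rates(clock)
    print(f"{label} ({clock} MHz): {texels:.1f} GTexels/s, {pixels:.1f} GPixels/s")
```

With those unit counts, the 880 MHz setting matches the overclocked numbers in the table; the default 830 MHz setting works out proportionally lower.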