What Mantle signifies about GPU architectures
Mantle is a very interesting concept. From the various keynote speeches, it sounds like the API is being designed to address the current state (and trajectory) of graphics processors. GPUs are generalized and highly parallel computation devices which are assisted by a little bit of specialized silicon, when appropriate. The vendors have even settled on standards, such as IEEE-754 floating-point numbers, which means that the driver has much less reason to shield developers from the underlying architectures.
Still, Mantle is currently a private technology for an unknown number of developers. Without a public SDK, or anything beyond the half-dozen keynotes, we can only speculate on its specific attributes. I, for one, have technical questions and hunches which linger unanswered or unconfirmed, probably until the API is suitable for public development.
Or, until we just... ask AMD.
Our response came from Guennadi Riguer, the chief architect for Mantle. In it, he discusses the API's usage as a computation language, the future of the rendering pipeline, and whether there will be a day where Crossfire-like benefits can occur by leaving an older Mantle-capable GPU in your system when purchasing a new, also Mantle-supporting one.
Q: Mantle's shading language is said to be compatible with HLSL. How will optimizations made for DirectX, such as tweaks during shader compilation, carry over to Mantle? How much tuning will (and will not) be shared between the two APIs?
[Guennadi] The current Mantle solution relies on the same shader generation path that DirectX games use and includes an open-source component for translating DirectX shaders to a Mantle-accepted intermediate language (IL). This enables developers to quickly develop a Mantle code path without any changes to the shaders. This was one of the strongest requests we got from our ISV partners when we were developing Mantle.
Follow-Up: What does this mean, specifically, in terms of driver optimizations? Would AMD, or anyone else who supports Mantle, be able to re-use the effort they spent on tuning their shader compilers (and so forth) for DirectX?
[Guennadi] With the current shader compilation strategy in Mantle, the developers can directly leverage DirectX shader optimization efforts in Mantle. They would use the same front-end HLSL compiler for DX and Mantle, and inside of the DX and Mantle drivers we share the shader compiler that generates the shader code our hardware understands.
A quick look at performance results
Late last week, EA and DICE released the long-awaited patch for Battlefield 4 that enables support for the Mantle renderer. This new API technology was introduced by AMD back in September. Unfortunately, AMD wasn't quite ready for its release with their Catalyst 14.1 beta driver. I wrote a short article that previewed the new driver's features, its expected performance with the Mantle version of BF4, and commentary about the current state of Mantle. You should definitely read that as a primer before continuing if you haven't yet.
Today, after really just a few short hours with a useable driver, I have only limited results. Still, I know that you, our readers, clamor for ANY information on the topic. I thought I would share what we have thus far.
As I mentioned in the previous story, the Mantle version of Battlefield 4 has the biggest potential to show advantages in times where the game is more CPU limited. AMD calls this the "low hanging fruit" for this early release of Mantle and claims that further optimizations will come, especially for GPU-bound scenarios. That dependency on CPU limitations puts some non-standard requirements on our ability to showcase Mantle's performance capabilities.
For example, the level of the game and even the section of that level, in the BF4 single player campaign, can show drastic swings in Mantle's capabilities. Multiplayer matches will also show more consistent CPU utilization (and thus could be improved by Mantle) though testing those levels in a repeatable, semi-scientific method is much more difficult. And, as you'll see in our early results, I even found a couple instances in which the Mantle API version of BF4 ran a smidge slower than the DX11 instance.
For our testing, we compiled two systems that differed in CPU performance in order to simulate the range of processors installed within consumers' PCs. Our standard GPU test bed includes a Core i7-3960X Sandy Bridge-E processor specifically to remove the CPU as a bottleneck and that has been included here today. We added in a system based on the AMD A10-7850K Kaveri APU which presents a more processor-limited (especially per-thread) system, overall, and should help showcase Mantle benefits more easily.
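To make the CPU-limited argument concrete, here is a minimal sketch of a very simplified frame-time model. It assumes CPU and GPU work overlap completely, so the slower of the two sets the frame rate; the millisecond figures and the 30% CPU saving are hypothetical numbers chosen for illustration, not measurements from either test system.

```python
def frame_time_ms(cpu_ms: float, gpu_ms: float) -> float:
    """Simplified model: CPU and GPU work overlap, so the slower side sets the pace."""
    return max(cpu_ms, gpu_ms)


def fps(cpu_ms: float, gpu_ms: float) -> float:
    return 1000.0 / frame_time_ms(cpu_ms, gpu_ms)


def gain_from_cpu_saving(cpu_ms: float, gpu_ms: float, cpu_saving: float) -> float:
    """Relative FPS gain when lower API overhead trims CPU time by cpu_saving (0..1)."""
    before = fps(cpu_ms, gpu_ms)
    after = fps(cpu_ms * (1.0 - cpu_saving), gpu_ms)
    return after / before - 1.0


# Hypothetical per-frame costs, not measured values.
cpu_bound_gain = gain_from_cpu_saving(cpu_ms=20.0, gpu_ms=12.0, cpu_saving=0.30)
gpu_bound_gain = gain_from_cpu_saving(cpu_ms=8.0, gpu_ms=16.0, cpu_saving=0.30)

print(f"CPU-bound case: {cpu_bound_gain:.1%} faster")
print(f"GPU-bound case: {gpu_bound_gain:.1%} faster")
```

In this model the same CPU saving produces a large gain when the CPU is the bottleneck and none at all when the GPU is, which is why a slower-per-thread processor like the A10-7850K should expose Mantle's benefits more readily than the Core i7-3960X.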
A troubled launch to be sure
AMD has released some important new drivers with drastic feature additions over the past year. Remember back in August of 2013 when Frame Pacing was first revealed? Today’s Catalyst 14.1 beta release will actually complete the goals that AMD set for itself in early 2013 in regards to introducing (nearly) complete Frame Pacing technology integration for non-XDMA GPUs while also adding support for Mantle and HSA capability.
Frame Pacing Phase 2 and HSA Support
When AMD released the first frame pacing capable beta driver in August of 2013, it added support to existing GCN designs (HD 7000-series and a few older generations) at resolutions of 2560x1600 and below. While that definitely addressed a lot of the market, the fact was that CrossFire users were also amongst the most likely to have Eyefinity (3+ monitors spanned for gaming) or even 4K displays (quickly dropping in price). Neither of those advanced display options was supported by any Catalyst frame pacing technology.
That changes today as Phase 2 of the AMD Frame Pacing feature has finally been implemented for products that do not feature the XDMA technology (found in Hawaii GPUs for example). That includes HD 7000-series GPUs, the R9 280X and 270X cards, as well as older generation products and Dual Graphics hardware combinations such as the new Kaveri APU and R7 250. I have already tested Kaveri and the R7 250 in fact, and you can read about its scaling and experience improvements right here. That means that users of the HD 7970, R9 280X, etc., as well as those of you with HD 7990 dual-GPU cards, will finally be able to utilize the power of both GPUs in your system with 4K displays and Eyefinity configurations!
This is finally fixed!!
As of this writing I haven’t had time to do more testing (other than the Dual Graphics article linked above) to demonstrate the potential benefits of this Phase 2 update, but we’ll be targeting it later in the week. For now, it appears that you’ll be able to get essentially the same performance and pacing capabilities on the Tahiti-based GPUs as you can with Hawaii (R9 290X and R9 290).
Catalyst 14.1 beta is also the first public driver to add support for HSA technology, allowing owners of the new Kaveri APU to take advantage of the appropriately enabled applications like LibreOffice and the handful of Adobe apps. AMD has since let us know that this feature DID NOT make it into the public release of Catalyst 14.1.
The First Mantle Ready Driver (sort of)
According to AMD, Mantle has been in development for more than two years, and the newly released Catalyst 14.1 beta driver is the first to enable support for this revolutionary new API for PC gaming. Essentially, Mantle is AMD’s attempt at creating a custom API that will replace DirectX and OpenGL in order to more directly target the GPU hardware in your PC, specifically the AMD-based designs of GCN (Graphics Core Next).
Mantle runs at a lower level than DX or OGL does, able to more directly access the hardware resources of the graphics chips, and with that ability is able to better utilize the hardware in your system, both CPU and GPU. In fact, the primary benefit of Mantle is going to be seen in the form of less API overhead and bottlenecks such as real-time shader compiling and code translation.
If you are interested in the meat of what makes Mantle tick and why it was so interesting to us when it was first announced in September of 2013, you should check out our first deep-dive article written by Josh. In it you’ll get our opinion on why Mantle matters and why it has the potential for drastically changing the way the PC is thought of in the gaming ecosystem.
Hybrid CrossFire that actually works
The road to redemption for AMD and its driver team has been a tough one. Since we first started to reveal the significant issues with AMD's CrossFire technology back in January of 2013 the Catalyst driver team has been hard at work on a fix, though I will freely admit it took longer to convince them that the issue was real than I would have liked. We saw the first steps of the fix released in August of 2013 with the release of the Catalyst 13.8 beta driver. It supported DX11 and DX10 games and resolutions of 2560x1600 and under (no Eyefinity support) but was obviously still less than perfect.
In October with the release of AMD's latest Hawaii GPU the company took another step by reorganizing the internal architecture of CrossFire on the chip level with XDMA. The result was frame pacing that worked on the R9 290X and R9 290 in all resolutions, including Eyefinity, though still left out older DX9 titles.
One thing that had not been addressed, at least not until today, was the set of issues that surrounded AMD's Hybrid CrossFire technology, now known as Dual Graphics. This is the ability for an AMD APU with integrated Radeon graphics to pair with a low cost discrete GPU to improve graphics performance and gaming experiences. Recently over at Tom's Hardware they discovered that Dual Graphics suffered from the exact same scaling issues as standard CrossFire; frame rates in FRAPS looked good but the actual perceived frame rate was much lower.
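The gap between a FRAPS-style frame rate and the perceived one can be sketched with a few lines of code. This is a simplified illustration, assuming that a frame shown for less than a small threshold (a "runt") contributes nothing to perceived smoothness; the 2 ms threshold and the frame-time traces are hypothetical, not captured data.

```python
def fraps_fps(frame_times_ms):
    """Counts every frame the game submits, however briefly it is shown."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def perceived_fps(frame_times_ms, runt_threshold_ms=2.0):
    """Ignores runt frames that occupy the screen for less than the threshold."""
    visible = [t for t in frame_times_ms if t >= runt_threshold_ms]
    return 1000.0 * len(visible) / sum(frame_times_ms)


# Hypothetical traces covering the same 990 ms of gameplay.
unpaced = [1.0, 32.0] * 30   # alternating runt and long frames
paced = [16.5] * 60          # evenly spaced frames

for name, trace in (("unpaced", unpaced), ("paced", paced)):
    print(f"{name}: FRAPS {fraps_fps(trace):.1f} FPS, "
          f"perceived {perceived_fps(trace):.1f} FPS")
```

Both traces report the same average in a FRAPS-style count, but the unpaced one delivers only half as many frames that are on screen long enough to matter, which is the behavior frame pacing is meant to correct.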
A little while ago a new driver made its way into my hands under the name of Catalyst 13.35 Beta X, a driver that promised to enable Dual Graphics frame pacing with Kaveri and R7 graphics cards. As you'll see in the coming pages, the fix definitely is working. And, as I learned after doing some more probing, the 13.35 driver is actually a much more important release than it at first seemed. Not only is Kaveri-based Dual Graphics frame pacing enabled, but Richland and Trinity are included as well. And even better, this driver will apparently fix resolutions higher than 2560x1600 in desktop graphics as well - something you can be sure we are checking on this week!
Just as we saw with the first implementation of Frame Pacing in the Catalyst Control Center, with the 13.35 Beta we are using today you'll find a new set of options in the Gaming section to enable or disable Frame Pacing. The default setting is On; which makes me smile inside every time I see it.
The hardware we are using is the same basic setup we used in my initial review of the AMD Kaveri A8-7600 APU. That includes the A8-7600 APU, an Asrock A88X mini-ITX motherboard, 16GB of DDR3 2133 MHz memory and a Samsung 840 Pro SSD. Of course for our testing this time we needed a discrete card to enable Dual Graphics and we chose the MSI R7 250 OC Edition with 2GB of DDR3 memory. This card will run you an additional $89 or so on Amazon.com. You could use either the DDR3 or GDDR5 versions of the R7 250 as well as the R7 240, but in our talks with AMD they seemed to think the R7 250 DDR3 was the sweet spot for the CrossFire implementation.
Both the R7 250 and the A8-7600 actually share the same number of SIMD units at 384, otherwise known as 384 shader processors or 6 Compute Units based on the new nomenclature that AMD is creating. However, the MSI card is clocked at 1100 MHz while the GPU portions of the A8-7600 APU are running at only 720 MHz.
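As a rough way to compare the two GPUs, the usual peak-throughput formula can be sketched as follows. It assumes each shader processor retires one fused multiply-add (2 floating-point operations) per clock, which is the conventional way peak figures are quoted; it says nothing about memory bandwidth or real game performance.

```python
SHADERS_PER_CU = 64  # implied by 384 shader processors across 6 Compute Units


def peak_gflops(shaders: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical single-precision peak, assuming one FMA per shader per clock."""
    return shaders * flops_per_clock * clock_mhz / 1000.0


compute_units = 384 // SHADERS_PER_CU
r7_250_gflops = peak_gflops(384, 1100)   # MSI R7 250 OC Edition
a8_7600_gflops = peak_gflops(384, 720)   # GPU portion of the A8-7600

print(f"Compute Units: {compute_units}")
print(f"R7 250:  {r7_250_gflops:.1f} GFLOPS")
print(f"A8-7600: {a8_7600_gflops:.1f} GFLOPS")
print(f"Clock advantage for the discrete card: {1100 / 720 - 1:.0%}")
```

With identical shader counts, the difference in theoretical throughput comes down entirely to clock speed, so the discrete card is the faster half of this Dual Graphics pairing on paper.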
So the question is, has AMD truly fixed the issues with frame pacing with Dual Graphics configurations, once again making the budget gamer feature something worth recommending? Let's find out!
A Refreshing Change
Refreshes are bad, right? I guess that depends on who you talk to. In the case of AMD, it is not a bad thing. For people who live for cutting edge technology in the 3D graphics world, it is not pretty. Unfortunately for those people, reality has reared its ugly head. Process technology is slowing down, but product cycles keep moving along at a healthy pace. This essentially necessitates minor refreshes for both AMD and NVIDIA when it comes to their product stack. NVIDIA has taken the Kepler architecture to the latest GTX 700 series of cards. AMD has done the same thing with the GCN architecture, but has radically changed the nomenclature of the products.
Gone are the days of the Radeon HD 7000 series. Instead AMD has renamed their GCN based product stack with the Rx 2xx series. The products we are reviewing here are the R9 280X and the R9 270X. These products were formerly known as the HD 7970 and HD 7870 respectively. These products differ in clock speeds slightly from the previous versions, but the differences are fairly minimal. What is different are the prices for these products. The R9 280X retails at $299 while the R9 270X comes in at $199.
Asus has taken these cards and applied their latest DirectCU II technology to them. These improvements relate to design, component choices, and cooling. These are all significant upgrades from the reference designs, especially when it comes to the cooling aspects. It is good to see such a progression in design, but it is not entirely surprising given that the first HD 7000 series debuted in January, 2012.
DisplayPort to Save the Day?
During an impromptu meeting with AMD this week, the company's Corporate Vice President for Visual Computing, Raja Koduri, presented me with an interesting demonstration of a technology that allowed the refresh rate of a display on a Toshiba notebook to perfectly match with the render rate of the game demo being shown. The result was an image that was smooth and with no tearing effects. If that sounds familiar, it should. NVIDIA's G-Sync was announced in November of last year and does just that for desktop systems and PC gamers.
Since that November unveiling, I knew that AMD would need to respond in some way. The company had basically been silent since learning of NVIDIA's release but that changed for me today and the information discussed is quite extraordinary. AMD is jokingly calling the technology demonstration "FreeSync".
Variable refresh rates as discussed by NVIDIA.
During the demonstration AMD's Koduri had two identical systems side by side based on a Kabini APU. Both were running a basic graphics demo of a rotating windmill. One was a standard software configuration while the other model had a modified driver that communicated with the panel to enable variable refresh rates. As you likely know from our various discussions about variable refresh rates and G-Sync technology from NVIDIA, this setup results in a much better gaming experience as it produces smoother animation on the screen without the horizontal tearing associated with v-sync disabled.
Obviously AMD wasn't using the same controller module that NVIDIA is using on its current G-Sync displays, several of which were announced this week at CES. Instead, the internal connection on the Toshiba notebook was the key factor: Embedded Display Port (eDP) apparently has a feature to support variable refresh rates on LCD panels. This feature was included for power savings on mobile and integrated devices as refreshing the screen without new content can be a waste of valuable battery resources. But, for performance and gaming considerations, this feature can be used to initiate a variable refresh rate meant to smooth out game play, as AMD's Koduri said.
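The difference between a fixed and a variable refresh rate can be illustrated with a small timing sketch. It is a simplified model under stated assumptions: frames finish rendering at a steady, hypothetical 22 ms cadence, V-Sync holds each finished frame until the next 60 Hz tick, and the variable-refresh panel can refresh as soon as a frame is ready, limited only by a hypothetical 144 Hz maximum. It does not model minimum refresh rates or GPU stalls caused by buffering.

```python
import math


def vsync_present_times(render_done_ms, refresh_hz=60.0):
    """With V-Sync on, a finished frame waits for the next fixed refresh tick."""
    period = 1000.0 / refresh_hz
    return [math.ceil(t / period) * period for t in render_done_ms]


def variable_refresh_present_times(render_done_ms, max_refresh_hz=144.0):
    """With variable refresh, the panel refreshes when a frame is ready,
    but no sooner than the panel's minimum interval after the last refresh."""
    min_interval = 1000.0 / max_refresh_hz
    shown, last = [], -math.inf
    for t in render_done_ms:
        last = max(t, last + min_interval)
        shown.append(last)
    return shown


def intervals(times):
    """Time between consecutive frames reaching the screen."""
    return [b - a for a, b in zip(times, times[1:])]


# Hypothetical render cadence: one frame every 22 ms (about 45 FPS).
render_done = [22.0 * i for i in range(1, 7)]

vsync_gaps = intervals(vsync_present_times(render_done))
variable_gaps = intervals(variable_refresh_present_times(render_done))

print("V-Sync gaps (ms):  ", [round(g, 1) for g in vsync_gaps])
print("Variable gaps (ms):", [round(g, 1) for g in variable_gaps])
```

In this model the V-Sync gaps alternate between one and two refresh periods even though the GPU is rendering at a steady pace, while the variable-refresh gaps simply follow the render cadence. That uneven spacing is the stutter that variable refresh is meant to remove without reintroducing tearing.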
Introduction and Unboxing
We've been covering NVIDIA's new G-Sync tech for quite some time now, and displays so equipped are finally shipping. With all of the excitement going on, I became increasingly interested in the technology, especially since I'm one of those guys who is extremely sensitive to input lag and the inevitable image tearing that results from vsync-off gaming. Increased discussion on our weekly podcast, coupled with the inherent difficulty of demonstrating the effects without seeing G-Sync in action in-person, led me to pick up my own ASUS VG248QE panel for the purpose of this evaluation and review. We've generated plenty of other content revolving around the G-Sync tech itself, so let's get straight into what we're after today - evaluating the out of box installation process of the G-Sync installation kit.
All items are well packed and protected.
Included are installation instructions, a hard plastic spudger for opening the panel, a couple of stickers, and all necessary hardware bits to make the conversion.
Sapphire Triple Fan Hawaii
It was mid-December when the very first custom cooled AMD Radeon R9 290X card hit our offices in the form of the ASUS R9 290X DirectCU II. It was cooler, quieter, and faster than the reference model; this is a combination that is hard to pass up (if you could buy it yet). More and more of these custom models, both in the R9 290 and R9 290X flavor, are filtering their way into PC Perspective. Next on the chopping block is the Sapphire Tri-X model of the R9 290X.
Sapphire's triple fan cooler already made quite an impression on me when we tested a version of it on the R9 280X retail round up from October. It kept the GPU cool but it was also the loudest of the retail cards tested at the time. For the R9 290X model, Sapphire has made some tweaks to the fan speeds and the design of the cooler which makes it a better overall solution as you will soon see.
The key goals for any AMD R9 290/290X custom cooled card are to beat AMD's reference cooler in performance, noise, and variable clock rates. Does Sapphire meet these goals?
The Sapphire R9 290X Tri-X 4GB
While the ASUS DirectCU II card was taller and more menacing than the reference design, the Sapphire Tri-X cooler is longer and appears to be more sleek than the competition thus far. The bright yellow and black color scheme is both attractive and unique though it does lack the LED light that the 280X showcased.
Sapphire has overclocked this model slightly, to 1040 MHz on the GPU clock, which puts it in good company.
| | AMD Radeon R9 290X | ASUS R9 290X DirectCU II | Sapphire R9 290X Tri-X |
|---|---|---|---|
| Rated Clock | 1000 MHz | 1050 MHz | 1040 MHz |
| Memory Clock | 5000 MHz | 5400 MHz | 5200 MHz |
| TDP | ~300 watts | ~300 watts | ~300 watts |
| Peak Compute | 5.6 TFLOPS | 5.6+ TFLOPS | 5.6+ TFLOPS |
There are three fans on the Tri-X design, as the name would imply, but each is the same size, unlike the smaller central fan design of the R9 280X.
The First Custom R9 290X
It has been a crazy launch for the AMD Radeon R9 series of graphics cards. When we first reviewed both the R9 290X and the R9 290, we came away very impressed with the GPU and the performance it provided. Our reviews of both products resulted in awards of the Gold class. The 290X was a new class of single GPU performance while the R9 290 nearly matched performance at a crazy $399 price tag.
But there were issues. Big, glaring issues. Clock speeds had a huge amount of variance depending on the game and we saw a GPU that was rated as "up to 1000 MHz" running at 899 MHz in Skyrim and 821 MHz in Bioshock Infinite. Those are not insignificant deltas in clock rate, and they nearly perfectly match the deltas in performance. These speeds also changed based on the "hot" or "cold" status of the graphics card - had it warmed up and been active for 10 minutes prior to testing? If so, the performance was measurably lower than with a "cold" GPU that was just started.
That issue was not necessarily a deal killer; rather, it just made us rethink how we test GPUs. The fact that many people were seeing lower performance on retail purchased cards than with the reference cards sent to press for reviews was a much bigger deal. In our testing in November the retail card we purchased, that was using the exact same cooler as the reference model, was running 6.5% slower than we expected.
The obvious hope was the retail cards with custom PCBs and coolers would be released from AMD partners and somehow fix this whole dilemma. Today we see if that was correct.
A slightly smaller MARS
The NVIDIA GeForce GTX 760 was released in June of 2013. Based on the same GK104 GPU as the GTX 680, GTX 670 and GTX 770, the GTX 760 disabled a couple more of the clusters of processor cores to offer up impressive performance levels for a lower cost than we had seen previously. My review of the GTX 760 was very positive as NVIDIA had priced it aggressively against the competing products from AMD.
As for ASUS, they have a storied history with the MARS brand. Typically an over-built custom PCB with two of the highest end NVIDIA GPUs stapled together, the ASUS MARS cards have been limited edition products with a lot of cachet around them. The first MARS card was a dual GTX 285 product that was the first card to offer 4GB of memory (though 2GB per GPU of course). The MARS II took a pair of GTX 580 GPUs and pasted them on a HUGE card and sold just 1000 of them worldwide. It was heavy, expensive and fast; blazing fast. But at a price of $1200+ it wasn't on the radar of most PC gamers.
Interestingly, the MARS iteration for the GTX 680 never occurred and why that is the case is still a matter of debate. Some point the finger at poor sales and ASUS while others think that NVIDIA restricted ASUS' engineers from being as creative as they needed to be.
Today's release of the ASUS ROG MARS 760 is a bit different - this is still a high end graphics card but it doesn't utilize the fastest single-GPU option on the market. Instead ASUS has gone with a more reasonable design that combines a pair of GTX 760 GK104 GPUs on a single PCB with a PCI Express bridge chip between them. The MARS 760 is significantly smaller and less power hungry than previous MARS cards but it is still able to pack a punch in the performance department as you'll soon see.
Quality time with G-Sync
Readers of PC Perspective will already know quite a lot about NVIDIA's G-Sync technology. When it was first unveiled in October we were at the event and were able to listen to NVIDIA executives, product designers and engineers discuss and elaborate on what it is, how it works and why it benefits gamers. This revolutionary new take on how displays and graphics cards talk to each other enables a new class of variable refresh rate monitors that will offer up the smoothness advantages of having V-Sync off, while offering the tear-free images normally reserved for gamers enabling V-Sync.
NVIDIA's Prototype G-Sync Monitor
We were lucky enough to be at NVIDIA's Montreal tech day while John Carmack, Tim Sweeney and Johan Andersson were on stage discussing NVIDIA G-Sync among other topics. All three developers were incredibly excited about G-Sync and what it meant for gaming going forward.
Also on that day, I published a somewhat detailed editorial that dug into the background of V-sync technology, why the 60 Hz refresh rate existed and why the system in place today is flawed. This basically led up to an explanation of how G-Sync works, including integration via extending Vblank signals, and detailed how NVIDIA was enabling the graphics card to retake control over the entire display pipeline.
In reality, if you want the best explanation of G-Sync, how it works and why it is a stand-out technology for PC gaming, you should take the time to watch and listen to our interview with NVIDIA's Tom Petersen, one of the primary inventors of G-Sync. In this video we go through quite a bit of technical explanation of how displays work today, and how the G-Sync technology changes gaming for the better. It is a 1+ hour long video, but I selfishly believe that it is the most concise and well put together collection of information about G-Sync for our readers.
The story today is more about extensive hands-on testing with the G-Sync prototype monitors. The displays that we received this week were modified versions of the 144Hz ASUS VG248QE gaming panels, the same ones that will in theory be upgradeable by end users as well sometime in the future. These monitors are TN panels, 1920x1080 and though they have incredibly high refresh rates, aren't usually regarded as the highest image quality displays on the market. However, the story about what you get with G-Sync is really more about stutter (or lack thereof), tearing (or lack thereof), and a better overall gaming experience for the user.
Another retail card reveals the results
Since the release of the new AMD Radeon R9 290X and R9 290 graphics cards, we have been very curious about the latest implementation of AMD's PowerTune technology and its scaling of clock frequency as a result of the thermal levels of each graphics card. In the first article covering this topic, I addressed the questions from AMD's point of view - is this really a "configurable" GPU as AMD claims or are there issues that need to be addressed by the company?
The biggest problems I found were in the highly variable clock speeds from game to game and from a "cold" GPU to a "hot" GPU. This affects the way many people in the industry test and benchmark graphics cards as running a game for just a couple of minutes could result in average and reported frame rates that are much higher than what you see 10-20 minutes into gameplay. This was rarely something that had to be dealt with before (especially on AMD graphics cards), so it caught many off guard.
Because of the new PowerTune technology, as I have discussed several times before, clock speeds are starting off quite high on the R9 290X (at or near the 1000 MHz quoted speed) and then slowly drifting down over time.
Another wrinkle occurred when Tom's Hardware reported that retail graphics cards they had seen were showing markedly lower performance than the reference samples sent to reviewers. As a result, AMD quickly released a new driver that attempted to address the problem by normalizing to fan speeds (RPM) rather than fan voltage (percentage). The result was consistent fan speeds on different cards and thus much closer performance.
However, with all that being said, I was still testing retail AMD Radeon R9 290X and R9 290 cards that were PURCHASED rather than sampled, to keep tabs on the situation.
EVGA Brings Custom GTX 780 Ti Early
Reference cards for new graphics card releases are very important for a number of reasons. Most importantly, these are the cards presented to the media and reviewers that judge the value and performance of these cards out of the gate. These various articles are generally used by readers and enthusiasts to make purchasing decisions, and if first impressions are not good, it can spell trouble. Also, reference cards tend to be the first cards sold in the market (see the recent Radeon R9 290/290X launch) and early adopters get the same technology in their hands; again the impressions reference cards leave will live in forums for eternity.
All that being said, retail cards are where partners can differentiate and keep the various GPUs relevant for some time to come. EVGA is probably the most well known NVIDIA partner and is clearly their biggest outlet for sales. The ACX cooler is one we saw popularized with the first GTX 700-series cards and the company has quickly adopted it to the GTX 780 Ti, released by NVIDIA just last week.
I would normally have a full review for you as soon as we could but thanks to a couple of upcoming trips that will keep me away from the GPU test bed, that will take a little while longer. However, I thought a quick preview was in order to show off the specifications and performance of the EVGA GTX 780 Ti ACX.
As expected, the EVGA ACX design of the GTX 780 Ti is overclocked. While the reference card runs at a base clock of 875 MHz and a typical boost clock of 928 MHz, this retail model has a base clock of 1006 MHz and a boost clock of 1072 MHz. This means that all 2,880 CUDA cores are going to run somewhere around 15% faster on the EVGA ACX model than the reference GTX 780 Ti SKUs.
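The "around 15%" figure can be checked directly from the clocks quoted above. This is simple arithmetic on the rated base and boost clocks; it assumes performance scales with clock speed, which real games will only approximate.

```python
REFERENCE = {"base": 875, "boost": 928}    # reference GTX 780 Ti clocks, MHz
EVGA_ACX = {"base": 1006, "boost": 1072}   # EVGA GTX 780 Ti ACX clocks, MHz


def clock_uplift(custom_mhz: float, reference_mhz: float) -> float:
    """Fractional clock increase of the custom card over the reference card."""
    return custom_mhz / reference_mhz - 1.0


base_uplift = clock_uplift(EVGA_ACX["base"], REFERENCE["base"])
boost_uplift = clock_uplift(EVGA_ACX["boost"], REFERENCE["boost"])

print(f"Base clock uplift:  {base_uplift:.1%}")
print(f"Boost clock uplift: {boost_uplift:.1%}")
```

Both the base and boost clocks land within about half a percentage point of 15% over the reference card.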
We should note that though the cooler is custom built by EVGA, the PCB design of this GTX 780 Ti card remains the same as the reference models.
An issue of variance
AMD just sent along an email to the press with a new driver to use for Radeon R9 290X and Radeon R9 290 testing going forward. Here is the note:
We’ve identified that there’s variability in fan speeds across AMD R9 290 series boards. This variability in fan speed translates into variability of the cooling capacity of the fan-sink.
The flexibility of AMD PowerTune technology enables us to correct this variability in a driver update. This update will normalize the fan RPMs to the correct values.
The correct target RPM values are 2200RPM for the AMD Radeon R9 290X ‘Quiet mode’, and 2650RPM for the R9 290. You can verify these in GPU-Z.
If you’re working on stories relating to R9 290 series products, please use this driver as it will reduce any variability in fan speeds. This driver will be posted publicly tonight.
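To illustrate why targeting an RPM value rather than a fan percentage evens things out, here is a minimal sketch. It assumes fan speed scales linearly with duty cycle and uses two hypothetical boards whose fans top out at different speeds; neither the linear model nor the maximum RPM figures come from AMD. Only the 2200 RPM and 2650 RPM targets are taken from the note above.

```python
TARGET_RPM = {"R9 290X (Quiet mode)": 2200, "R9 290": 2650}

# Hypothetical boards: same fan design, different top speeds at 100% duty cycle.
BOARD_MAX_RPM = {"board A": 5500, "board B": 5000}


def rpm_at_duty(duty: float, max_rpm: float) -> float:
    """Assumed linear relationship between duty cycle (0..1) and fan speed."""
    return duty * max_rpm


def duty_for_target(target_rpm: float, max_rpm: float) -> float:
    """Duty cycle needed to reach the target RPM, capped at 100%."""
    return min(1.0, target_rpm / max_rpm)


target = TARGET_RPM["R9 290X (Quiet mode)"]
for board, max_rpm in BOARD_MAX_RPM.items():
    fixed = rpm_at_duty(0.40, max_rpm)            # old behavior: fixed percentage
    needed = duty_for_target(target, max_rpm)     # new behavior: fixed RPM target
    print(f"{board}: 40% duty gives {fixed:.0f} RPM, "
          f"{target} RPM needs {needed:.0%} duty")
```

Under these assumptions, a fixed 40% setting leaves the slower-fan board about 200 RPM short of its sibling, while a fixed RPM target gives both the same cooling capacity at slightly different duty cycles.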
Great! This is good news! Except it also creates some questions.
When we first tested the R9 290X and the R9 290, we discussed the latest iteration of AMD's PowerTune technology. That feature attempts to keep clocks as high as possible under the constraints of temperature and power. I took issue with the high variability of clock speeds on our R9 290X sample, citing this graph:
I then did some digging into the variance and the claims that AMD was building a "configurable" GPU. In that article we found that there were significant performance deltas between "hot" and "cold" GPUs; we noticed that doing simple, quick benchmarks would produce certain results that were definitely not real-world in nature. At the default 40% fan speed, Crysis 3 showed 10% variance with the 290X at 2560x1440:
GK110 in all its glory
I bet you didn't realize that October and November were going to become the onslaught of graphics cards it has been. I know I did not and I tend to have a better background on these things than most of our readers. Starting with the release of the AMD Radeon R9 280X, 270X and R7 260X in the first week of October, it has pretty much been a non-stop battle between NVIDIA and AMD for the hearts, minds, and wallets of PC gamers.
Shortly after the Tahiti refresh came NVIDIA's move into display technology with G-Sync, a variable refresh rate feature that will work with upcoming monitors from ASUS and others as long as you have a GeForce Kepler GPU. The technology was damned impressive, but I am still waiting for NVIDIA to send over some panels for extended testing.
Later in October we were hit with the R9 290X, the Hawaii GPU that brought AMD back in the world of ultra-class single GPU card performance. It has produced stellar benchmarks and undercut the prices (then at least) of the GTX 780 and GTX TITAN. We tested it in both single and multi-GPU configurations and found that AMD had made some impressive progress in fixing its frame pacing issues, even with Eyefinity and 4K tiled displays.
NVIDIA dropped a driver release with ShadowPlay that allows gamers to record playback locally without a hit on performance. I posted a roundup of R9 280X cards which showed alternative coolers and performance ranges. We investigated the R9 290X Hawaii GPU and the claims that performance is variable and configurable based on fan speeds. Finally, the R9 290 (non-X model) was released this week to more fanfare than the 290X thanks to its nearly identical performance and $399 price tag.
And today, yet another release. NVIDIA's GeForce GTX 780 Ti takes the performance of the GK110 and fully unlocks it. The GTX TITAN uses one fewer SMX and the GTX 780 has three fewer SMX units so you can expect the GTX 780 Ti to, at the very least, become the fastest NVIDIA GPU available. But can it hold its lead over the R9 290X and validate its $699 price tag?
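To put the SMX differences above into numbers, here is a minimal sketch of the arithmetic. It assumes figures not stated in this excerpt: that a full GK110 has 15 SMX units and that each SMX holds 192 CUDA cores. The `cuda_cores` helper is a hypothetical name used only for illustration.

```python
# Assumptions (not from the article text): a fully enabled GK110 has
# 15 SMX units, and each SMX contains 192 CUDA cores.
FULL_GK110_SMX = 15
CORES_PER_SMX = 192

# SMX units disabled relative to the full chip, as described above.
DISABLED_SMX = {"GTX 780 Ti": 0, "GTX TITAN": 1, "GTX 780": 3}


def cuda_cores(card: str) -> int:
    """Return the enabled CUDA core count for a GK110-based card."""
    return (FULL_GK110_SMX - DISABLED_SMX[card]) * CORES_PER_SMX


for card in DISABLED_SMX:
    print(f"{card}: {cuda_cores(card)} CUDA cores")
```

Core count alone does not decide the ranking, since clock speeds and memory configuration differ between the cards as well.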
More of the same for a lot less cash
The week before Halloween, AMD unleashed a trick on the GPU world under the guise of the Radeon R9 290X and it was the fastest single GPU graphics card we had tested to date. With a surprising price point of $549, it was able to outperform the GeForce GTX 780 (and GTX TITAN in most cases) while undercutting the competition's price by $100. Not too bad!
Today's release might be more surprising (and somewhat confusing). The AMD Radeon R9 290 4GB card is based on the same Hawaii GPU with a few fewer compute units (CUs) enabled and an even more aggressive price and performance placement. Seriously, has AMD lost its mind?
Can a card with a $399 price tag cut into the same performance levels as the JUST DROPPED price of $499 for the GeForce GTX 780?? And, if so, what sacrifices are being made by users that adopt it? Why do so many of our introduction sentences end in question marks?
The R9 290 GPU - Hawaii loses a small island
If you are new to the Hawaii GPU and you missed our first review of the Radeon R9 290X from last month, you should probably start back there. The architecture is very similar to that of the HD 7000-series Tahiti GPUs, with some modest changes to improve efficiency. The biggest jump is in raw primitive rate, which goes from 2 per clock to 4 per clock.
The R9 290 is based on Hawaii though it has four fewer compute units (CUs) than the R9 290X. When I asked AMD if that meant there was one fewer CU per Shader Engine or if they were all removed from a single Engine, they refused to really answer. Instead, several "I'm not allowed to comment on the specific configuration" lines were given. This seems pretty odd as NVIDIA has been upfront about the dual options for its derivative GPU models. Oh well.
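The two possibilities raised in that question can be sketched as follows. The sketch assumes figures that are not in this excerpt: that the full Hawaii GPU has 4 Shader Engines with 11 CUs each, and 64 stream processors per CU. The layout names and the `stream_processors` helper are hypothetical.

```python
# Assumptions (not from the article text): full Hawaii has 4 Shader Engines
# with 11 CUs each, and every CU contains 64 stream processors.
SHADER_ENGINES = 4
CUS_PER_ENGINE = 11
STREAM_PROCESSORS_PER_CU = 64
DISABLED_CUS = 4  # the R9 290 has four fewer CUs than the R9 290X

full = [CUS_PER_ENGINE] * SHADER_ENGINES

# Option 1: one CU disabled in every Shader Engine.
balanced = [cus - DISABLED_CUS // SHADER_ENGINES for cus in full]

# Option 2: all four CUs disabled in a single Shader Engine.
lopsided = [CUS_PER_ENGINE - DISABLED_CUS] + full[1:]


def stream_processors(layout):
    """Total stream processors for a per-engine CU layout."""
    return sum(layout) * STREAM_PROCESSORS_PER_CU


for name, layout in (("290X", full), ("290 balanced", balanced),
                     ("290 lopsided", lopsided)):
    print(f"{name}: {layout} -> {stream_processors(layout)} stream processors")
```

Both layouts yield the same total, so the spec sheet alone cannot tell them apart.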
When AMD released the Radeon R9 290X last month, I came away from the review very impressed with the performance and price point the new flagship graphics card was presented with. My review showed that the 290X was clearly faster than the NVIDIA GeForce GTX 780 and (at that time) was considerably less expensive as well - a win-win for AMD without a doubt.
But there were concerns over a couple of aspects of the card's design. First was the temperature and, specifically, how AMD was okay with this rather large silicon hitting 95C sustained. Another concern: AMD has also included a switch at the top of the R9 290X to change fan profiles. This switch essentially creates two reference defaults and makes it impossible for us to set a baseline of performance. These different modes only changed the maximum fan speed that the card was allowed to reach. Still, performance changed because of this setting thanks to the newly revised (and updated) AMD PowerTune technology.
We also saw, in our initial review, a large variation in clock speeds both from one game to another as well as over time (after giving the card a chance to heat up). This led me to create the following graph showing average clock speeds 5-7 minutes into a gaming session with the card set to the default, "quiet" state. Each test is over a 60 second span.
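As a rough illustration of how an average like that can be computed from a clock log, here is a minimal sketch. The `average_clock` helper and the sample values are made up for the example; they are not measurements from the review.

```python
from statistics import mean


def average_clock(samples, start_s, duration_s=60):
    """Mean clock (MHz) of samples taken in [start_s, start_s + duration_s)."""
    window = [mhz for t, mhz in samples if start_s <= t < start_s + duration_s]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return mean(window)


# Made-up log of (seconds since launch, clock in MHz), one sample every 10 s:
# the card holds 1000 MHz while cool, then settles at 850 MHz once warm.
samples = [(t, 1000 if t < 120 else 850) for t in range(0, 420, 10)]

cold = average_clock(samples, start_s=0)    # first minute of the session
warm = average_clock(samples, start_s=300)  # a 60 second span, 5 minutes in
drop_pct = (cold - warm) / cold * 100

print(f"cold: {cold} MHz, warm: {warm} MHz, drop: {drop_pct:.1f}%")
```

Comparing a window from the start of a session with one taken after warm-up is what separates a quick benchmark run from a result that reflects extended play.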
Clearly there is variance here, which led us to more questions about AMD's stance. Remember when the Kepler GPUs launched? AMD was very clear that variance from card to card, silicon to silicon, was bad for the consumer as it created random performance deltas between cards with otherwise identical specifications.
When it comes to the R9 290X, though, AMD claims both the GPU (and card itself) are a customizable graphics solution. The customization is based around the maximum fan speed which is a setting the user can adjust inside the Catalyst Control Center. This setting will allow you to lower the fan speed if you are a gamer desiring a quieter gaming configuration while still having great gaming performance. If you are comfortable with a louder fan, because headphones are magic, then you have the option to simply turn up the maximum fan speed and gain additional performance (a higher average clock rate) without any actual overclocking.
ASUS R9 280X DirectCU II TOP
Earlier this month AMD took the wraps off of a revamped and restyled family of GPUs under the Radeon R9 and R7 brands. When I reviewed the R9 280X, essentially a lower cost version of the Radeon HD 7970 GHz Edition, I came away impressed with the package AMD was able to put together. Though there was no new hardware to really discuss with the R9 280X, the price drop placed the cards in a very aggressive position adjacent to the NVIDIA GeForce line-up (including the GeForce GTX 770 and the GTX 760).
As a result, I fully expect the R9 280X to be a great selling GPU for those gamers with a mid-range budget of $300.
But another of the benefits of using an existing GPU architecture is the ability for board partners to very quickly release custom built versions of the R9 280X. Companies like ASUS, MSI, and Sapphire are able to have overclocked and custom-cooled alternatives to the 3GB $300 card, almost immediately, by simply adapting the HD 7970 PCB.
Today we are going to be reviewing a set of three different R9 280X cards: the ASUS DirectCU II, MSI Twin Frozr Gaming, and the Sapphire TOXIC.
ARM is Serious About Graphics
Ask most computer users from 10 years ago who ARM is, and very few would give the correct answer. Some well informed people might mention “Intel” and “StrongARM” or “XScale”, but ARM remained a shadowy presence until we saw the rise of the Smartphone. Since then, ARM has built up their brand, much to the chagrin of companies like Intel and AMD. Partners such as Samsung, Apple, Qualcomm, MediaTek, Rockchip, and NVIDIA have all worked with ARM to produce chips based on the ARMv7 architecture, with Apple being the first to release ARMv8 (64-bit) SOCs. The multitude of ARM architectures are likely the most shipped chips in the world, going from very basic processors to the very latest Apple A7 SOC.
The ARMv7 and ARMv8 architectures are very power efficient, yet provide enough performance to handle the vast majority of tasks utilized on smartphones and tablets (as well as a handful of laptops). With the growth of visual computing, ARM also dedicated itself towards designing competent graphics portions of their chips. The Mali architecture is aimed at being an affordable option for those without access to their own graphics design groups (NVIDIA, Qualcomm), but competitive with others that are willing to license their IP out (Imagination Technologies).
ARM was in fact one of the first to license out the very latest graphics technology to partners in the form of the Mali-T600 series of products. These modules were among the first to support OpenGL ES 3.0 (compatible with 2.0 and 1.1) and DirectX 11. The T600 architecture is very comparable to Imagination Technologies’ Series 6 and the Qualcomm Adreno 300 series of products. Currently NVIDIA does not have a unified mobile architecture in production that supports OpenGL ES 3.0/DX11, but they are adapting the Kepler architecture to mobile and will be licensing it to interested parties. Qualcomm does not license out Adreno after buying that group from AMD (Adreno is an anagram of Radeon).
ShadowPlay is NVIDIA's latest addition to their GeForce Experience platform. This feature allows their GPUs, starting with Kepler, to record game footage either locally or stream it online through Twitch.tv (in a later update). It requires Kepler GPUs because it is accelerated by that hardware. The goal is to constantly record game footage without any noticeable impact to performance; that way, the player can keep it running forever and have the opportunity to save moments after they happen.
Also, it is free.
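The "save moments after they happen" idea described above can be pictured as a fixed-size rolling buffer that always holds the most recent stretch of footage. The sketch below is only an illustration of that general technique, not NVIDIA's implementation: the `ReplayBuffer` class is hypothetical, and a real recorder would hold hardware-encoded video rather than raw frame objects.

```python
from collections import deque


class ReplayBuffer:
    """Keep only the most recent `seconds` worth of frames."""

    def __init__(self, seconds: int, fps: int):
        # A deque with maxlen silently drops the oldest item when full.
        self._frames = deque(maxlen=seconds * fps)

    def push(self, frame):
        """Record one frame, discarding the oldest if the buffer is full."""
        self._frames.append(frame)

    def save(self):
        """Return a copy of the buffered footage, oldest frame first."""
        return list(self._frames)


# Tiny example: keep 2 seconds at 3 frames per second (6 frames total).
buf = ReplayBuffer(seconds=2, fps=3)
for frame_number in range(10):
    buf.push(frame_number)

clip = buf.save()
print(clip)
```

Because the buffer never grows past its fixed size, recording can be left on indefinitely and only the last few moments are written out when the player asks for them.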
I know that I have several gaming memories which come unannounced and leave undocumented. A solution like this is very exciting to me. Of course, a feature on paper is not the same as functional software in the real world. Thankfully, at least in my limited usage, ShadowPlay mostly lives up to its claims. I do not feel its impact on gaming performance. I am comfortable leaving it on at all times. There are issues, however, that I will get to soon.
This first impression is based on my main system running the 331.65 (Beta) GeForce drivers recommended for ShadowPlay.
- Intel Core i7-3770, 3.4 GHz
- NVIDIA GeForce GTX 670
- 16 GB DDR3 RAM
- Windows 7 Professional
- 1920 x 1080 @ 120Hz
- 3 TB USB3.0 HDD (~50MB/s file clone)
The two games tested are Starcraft II: Heart of the Swarm and Battlefield 3.