Competition is a Great Thing

We spent some time testing the Star Swarm benchmark with mainstream graphics cards on both a high end and budget platform. The results are surprising!

While doing some testing with the AMD Athlon 5350 Kabini APU to determine its flexibility as a low cost gaming platform, we decided to run a handful of tests to measure something else that is getting a lot of attention right now: AMD Mantle and NVIDIA's 337.50 driver.

Earlier this week I posted a story that looked at the performance scaling of NVIDIA's new 337.50 beta driver compared to the previous 335.23 WHQL release. The goal was to assess the DX11 efficiency improvements that the company said it had been working on and had implemented in this latest beta driver. In the end, we found some instances where games scaled by as much as 35% and 26%, but other cases where there was little to no gain with the new driver. That testing covered both single GPU and multi-GPU scenarios, though mostly on high end CPU hardware.

Earlier in April I posted an article looking at Mantle, AMD's low level API that is unique to its ecosystem, and how it scaled on various pieces of hardware in Battlefield 4. This was the first major game to implement Mantle and it remains the biggest name in the field. While we definitely saw some improvements in gaming experience with Mantle, there was still work to be done on multi-GPU scaling and frame pacing.

Both parties in this debate were showing promise but obviously both were far from perfect.

While we were benchmarking the new AMD Athlon 5350 Kabini based APU, an incredibly low cost processor that Josh reviewed in April, it made sense to put both Mantle and NVIDIA's 337.50 driver through an interesting side by side comparison.

Here is the setup. Using a GeForce GTX 750 Ti and a Radeon R7 260X graphics card, somewhat equivalent in terms of pricing and performance, we ran the Star Swarm stress test benchmark. This application, built originally to demonstrate the performance abilities of AMD's Mantle API, was also used in some of NVIDIA's slides to demonstrate the performance improvement in its latest beta driver stack. To add some interest to the test, we ran these on both the AMD Athlon 5350 APU (one of the lowest performing platforms you'll find with an x16 PCIe slot) as well as the high-end Core i7-3960X 6-core Sandy Bridge-E platform we use for our normal GPU test bed.

Test System Setup
CPU: Intel Core i7-3960X Sandy Bridge-E / AMD Athlon 5350 Kabini APU
Motherboard: ASUS P9X79 Deluxe / Gigabyte AM1M-S2H
Memory: Corsair Dominator DDR3-1600 16GB
Hard Drive: OCZ Agility 4 256GB SSD
Sound Card: On-board
Graphics Cards: AMD Radeon R7 260X 2GB / NVIDIA GeForce GTX 750 Ti 2GB
Graphics Drivers: NVIDIA 335.23 WHQL and 337.50 Beta / AMD Catalyst 14.3 Beta
Power Supply: Corsair AX1200i
Operating System: Windows 8 Pro x64
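For anyone following along at home, the full test matrix works out to eight runs: two platforms, two graphics cards, and a pre- and post-optimization software stack for each card. Below is a minimal Python sketch of that matrix; the run_star_swarm() helper is purely a placeholder of our own for whatever benchmark automation you prefer, not anything Star Swarm itself exposes.

    from itertools import product

    platforms = ["Core i7-3960X", "Athlon 5350"]
    gpu_stacks = {
        "GeForce GTX 750 Ti": ["335.23 WHQL (DX11)", "337.50 Beta (DX11)"],
        "Radeon R7 260X": ["Catalyst 14.3 (DX11)", "Catalyst 14.3 (Mantle)"],
    }

    def run_star_swarm(platform: str, gpu: str, software: str) -> float:
        """Placeholder: launch the Star Swarm stress test on the given
        configuration and return the reported average frame rate."""
        return 0.0  # stub value, replace with real benchmark automation

    # Enumerate all eight platform/GPU/software combinations.
    results = {
        (platform, gpu, software): run_star_swarm(platform, gpu, software)
        for platform, (gpu, stacks) in product(platforms, gpu_stacks.items())
        for software in stacks
    }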

I think you'll find the results are quite interesting. Let's take a look.

Our first set of results looks at a pre-optimized software stack. That means we are looking at the GeForce GTX 750 Ti with the 335.23 driver and the Radeon R7 260X with DirectX on the 14.3 driver.

If you look at the results from the Core i7-3960X platform, the GeForce GTX 750 Ti has the advantage in average frame rate by 52%, a strong margin. On the Athlon 5350, the much slower processor in our set of tests, that performance lead for NVIDIA's GTX 750 Ti shrinks to 18%. Clearly the GTX 750 Ti's hardware and driver are better able to take advantage of the available CPU headroom under DirectX than what AMD is doing with the 260X and its DX11 implementation.

Moving from the slower CPU to the much faster one, NVIDIA's hardware improves by 116%. AMD, on the other hand, only improves by 68%, indicating that either the R7 260X itself or its driver isn't able to take full advantage of the extra CPU headroom.
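For clarity, every scaling percentage in this article is computed the same way: the difference between two average frame rates divided by the baseline rate. A quick sketch of that arithmetic, using made-up frame rates purely for illustration rather than our measured results:

    def percent_gain(baseline_fps: float, improved_fps: float) -> float:
        """Relative gain of one average frame rate over another, in percent."""
        return (improved_fps - baseline_fps) / baseline_fps * 100.0

    # Made-up numbers to show the math: a card averaging 25 FPS on the slow CPU
    # and 54 FPS on the fast one shows the same 116% scaling quoted above.
    print(f"{percent_gain(25.0, 54.0):.0f}%")  # 116%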

Now, let's look at these same sets of results but using a post-optimized software stack: Mantle enabled for the R7 260X on the Catalyst 14.3 driver and the GTX 750 Ti moved to the NVIDIA 337.50 driver.

Things look a little bit different this time! At the top end of the graph on the results run on the Core i7-3960X processor, NVIDIA's GeForce GTX 750 Ti maintains a lead, though slightly smaller, at 45%. Keep in mind that this is WITH Mantle enabled on the 260X and with the new DX11 changes made in NVIDIA's 337.50 driver. 

On the Athlon 5350 platform, AMD's R7 260X is able to take the lead away from NVIDIA's GTX 750 Ti by 8%. This indicates that the performance advantage of Mantle on the lower end platform is larger for AMD than the DX11 changes in NVIDIA's 337.50 driver are for the GTX 750 Ti.

If we compare the 260X's performance on both platforms though, clearly something is holding it back. Moving from the Athlon 5350 to the Core i7-3960X nets only a 12% performance improvement, while the GeForce GTX 750 Ti's average frame rate increases by more than 75%.

Finally, let's see all these results on the same graphic for a different view.

On this image you can see how each platform with each graphics card is able to scale with the software changes made with Mantle and with the 337.50 driver. The red bars show results from the 335.23 NVIDIA driver and the 14.3 Catalyst driver running in DirectX mode, while the blue bars represent the Mantle and 337.50 scores.

Impressively, the Radeon R7 260X sees a performance improvement of 91% by enabling the Mantle version of Star Swarm on the lower end Athlon 5350 APU. On that same system, NVIDIA's GeForce GTX 750 Ti is able to achieve a 49% frame rate increase while continuing to use DirectX 11 and improving driver efficiency. On the high end platform with the Core i7-3960X, the R7 260X GPU improves by just 27% from enabling Mantle; the GTX 750 Ti scales by 21%.

Both GPUs and both software changes (Mantle and 337.50) see larger performance advantages when running on the Athlon 5350 APU system than on the Core i7-3960X system, which obviously makes a lot of sense. Of the two solutions shown here, and based on only two processors and a single game, AMD Mantle appears to have the bigger potential performance advantage for games that are CPU bound, or for the CPU-bound sections of a game.

But what NVIDIA has done with the 337.50 driver is impressive considering the company is staying within the bounds of the existing, well entrenched and well known DirectX 11 API. A 49% gain is nothing to sneeze at, even though Mantle saw a 91% advantage under the same conditions.

Many users will argue (and have in our comment sections) that what NVIDIA has done with 337.50 is really just game-specific optimization and isn't anything different than what these two GPU companies have been doing for the past decade. Even though NVIDIA has clearly stated that is not the case, you can make up your own mind about whether to believe them, but I would posit that it doesn't really matter. Isn't a game-specific Mantle integration the same thing, and possibly even more troublesome, since it requires the developer, rather than the hardware vendor, to take the brunt of the work?

Clearly we would like to have more data on these graphs with more graphics cards, and we might be able to do some of that testing starting next week. I think seeing how a GeForce GTX 780 or Radeon R9 290X scales on these two platforms would be just as compelling.

These tests above aren't meant to be conclusive evidence for either vendor's current direction but are simply there to add more data points to our discussion moving forward. AMD Mantle has huge potential upside but requires a lot of commitment from game developers and from AMD's own software team to keep up. NVIDIA's continued DirectX improvements seem to have a lower ceiling but can be applied to a much wider array of games without the need for developers to commit to something long term.

I'm almost afraid to ask but…please leave me your thoughts on this debate and the future of both Mantle and DX11 in the comments below!