3DMark API Overhead Feature Test - Early DX12 Performance

Author: Ryan Shrout
Manufacturer: Futuremark

Our first DX12 Performance Results

Late last week, Microsoft approached me to see if I would be interested in working with them and with Futuremark on the release of the new 3DMark API Overhead Feature Test. Of course I jumped at the chance, with DirectX 12 being one of the hottest discussion topics among gamers, PC enthusiasts and developers in recent history. Microsoft set us up with the latest iteration of 3DMark and the latest DX12-ready drivers from AMD, NVIDIA and Intel. From there, off we went.

First we need to discuss exactly what the 3DMark API Overhead Feature Test is (and also what it is not). The feature test will be a part of the next revision of 3DMark, which will likely ship alongside the full Windows 10 release. Futuremark claims it is the "world's first independent" test that allows you to compare the performance of three different APIs: DX12, DX11 and even Mantle.

It was almost one year ago that Microsoft officially unveiled the plans for DirectX 12: a move to a more efficient API that can better utilize the CPU and platform capabilities of future, and most importantly current, systems. Josh wrote up a solid editorial on what we believe DX12 means for the future of gaming, and in particular for PC gaming, that you should check out if you want more background on the direction DX12 has set.

One of the keys to DX12's improved efficiency is the ability for developers to get closer to the metal, a phrase indicating that game and engine coders can access more of the system's power (CPU and GPU) without having their hands held by the API itself. The most direct benefit of this, as we saw with AMD's Mantle implementation over the past couple of years, is an increase in the number of draw calls that a given hardware system can utilize in a game engine.

A draw call is, put concisely, a request from the CPU (and the game engine running on it) to draw and render an object. A modern game typically issues thousands of draw calls every frame, and each of those requests adds a level of overhead to the system, limiting performance in some extreme cases. As the draw call count rises, game engines can become limited by that API overhead. New APIs like Mantle and DX12 reduce that overhead by giving developers more control. The effect was clearly shown by Stardock and the Oxide engine - removing draw call overhead limits can immediately, and drastically, change how a game functions and how a developer can create new and exciting experiences.
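
To make that overhead concrete, here is a minimal sketch of a per-object draw loop as it might look in a typical D3D11 renderer. The SceneObject struct and RenderScene function are hypothetical names used only for illustration; the point is that every iteration ends in a draw call the runtime and driver must validate and translate on the CPU, which is where the cost accumulates.

#include <d3d11.h>
#include <vector>

// Hypothetical per-object state; a real engine would track far more.
struct SceneObject {
    ID3D11Buffer* vertexBuffer;
    ID3D11Buffer* indexBuffer;
    ID3D11Buffer* constants;   // per-object transform, material, etc.
    UINT          indexCount;
    UINT          stride;
};

// Each loop iteration ends in one draw call. In DX11 the runtime and driver
// validate and translate state on the calling thread for every call, so CPU
// cost grows roughly linearly with the number of objects drawn per frame.
void RenderScene(ID3D11DeviceContext* ctx, const std::vector<SceneObject>& objects)
{
    const UINT offset = 0;
    for (const SceneObject& obj : objects) {
        ctx->IASetVertexBuffers(0, 1, &obj.vertexBuffer, &obj.stride, &offset);
        ctx->IASetIndexBuffer(obj.indexBuffer, DXGI_FORMAT_R16_UINT, 0);
        ctx->VSSetConstantBuffers(0, 1, &obj.constants);

        // The draw call itself: one CPU-side "render this object" request.
        ctx->DrawIndexed(obj.indexCount, 0, 0);
    }
}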

This new feature test from Futuremark, which will be integrated into an upcoming 3DMark release, measures API performance by looking at the balance between frame rates and draw calls. The goal: find out how many draw calls a PC can handle with each API before the frame rate drops below 30 FPS.

At a high level, here is how the test works: starting with a small number of draw calls per frame, the test increases the number of calls in steps every 20 frames until the frame rate drops below 30 FPS. Once that occurs, it keeps that draw call count and measures frame rates for 3 seconds. It then computes the draw calls per second (frame rate multiplied by draw calls per frame) and the result is displayed for the user.
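
The sketch below is an illustrative reconstruction of that control loop and is not Futuremark's actual code; RenderFrameWithDrawCalls is a toy stand-in that models a fixed frame cost plus a per-call CPU cost so the loop has something to run against.

#include <cstdio>

// Toy stand-in for the real renderer: a fixed base frame cost plus a
// per-draw-call CPU cost, purely so the control loop below can execute.
double RenderFrameWithDrawCalls(int drawCallsPerFrame)
{
    const double baseFrameCost = 0.002;    // seconds of fixed work per frame
    const double costPerCall   = 2.5e-7;   // seconds of CPU overhead per draw call
    return baseFrameCost + costPerCall * drawCallsPerFrame;
}

// Ramp the per-frame draw call count in steps every 20 frames until the frame
// rate falls below 30 FPS, then hold that count for ~3 seconds and report the
// score as frame rate multiplied by draw calls per frame.
double RunApiOverheadTest(int startCalls, int stepCalls)
{
    int drawCallsPerFrame = startCalls;

    while (true) {
        double seconds = 0.0;
        for (int frame = 0; frame < 20; ++frame)
            seconds += RenderFrameWithDrawCalls(drawCallsPerFrame);

        if (20.0 / seconds < 30.0)
            break;                          // the API overhead is now the limiter
        drawCallsPerFrame += stepCalls;     // otherwise, step up the load
    }

    double measured = 0.0;
    int frames = 0;
    while (measured < 3.0) {
        measured += RenderFrameWithDrawCalls(drawCallsPerFrame);
        ++frames;
    }

    double drawCallsPerSecond = (frames / measured) * drawCallsPerFrame;
    std::printf("%.0f draw calls per second\n", drawCallsPerSecond);
    return drawCallsPerSecond;
}

int main()
{
    RunApiOverheadTest(100000, 50000);   // arbitrary starting point and step size
    return 0;
}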

To ensure that the API is the bottleneck in this test, the scene is built procedurally from unique geometries, each an indexed mesh of 112-127 triangles. There is no post-processing and the shaders are very simple, to make sure the GPU is not a primary bottleneck.
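
As a rough illustration of that scene-construction idea (not Futuremark's actual generator), the sketch below builds a small indexed mesh with a randomized triangle count in the 112-127 range, so every object is distinct geometry the driver cannot trivially batch or instance away.

#include <cstdint>
#include <random>
#include <vector>

struct Vertex { float x, y, z; };

struct Mesh {
    std::vector<Vertex>   vertices;
    std::vector<uint16_t> indices;   // three indices per triangle
};

// Build a unique little indexed mesh with 112-127 triangles. Illustrative only.
Mesh MakeUniqueTestMesh(std::mt19937& rng)
{
    std::uniform_int_distribution<int>    triCount(112, 127);
    std::uniform_real_distribution<float> coord(-1.0f, 1.0f);

    Mesh mesh;
    const int triangles = triCount(rng);
    for (int t = 0; t < triangles; ++t) {
        for (int corner = 0; corner < 3; ++corner) {
            mesh.indices.push_back(static_cast<uint16_t>(mesh.vertices.size()));
            mesh.vertices.push_back({coord(rng), coord(rng), coord(rng)});
        }
    }
    return mesh;
}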

There are three primary tests the application runs through on all hardware, and a fourth if you have Mantle-capable AMD hardware. First, a DirectX 11 pass is done in a single-threaded mode where all draw calls are made from a single thread. Another DX11 pass is made in a multi-threaded mode where the draw calls are divided evenly among a number of threads equal to one less than the number of addressable cores. That balance leaves one dedicated core for the display driver.

The DX12 and Mantle paths in the feature test are, of course, multi-threaded and utilize all available cores, dividing the draw calls evenly among the full thread count.
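
A minimal sketch of that work-splitting scheme is below. SubmitFrame and RecordDrawCalls are hypothetical names and the per-thread recording is just a stub, but the division of draw calls mirrors the description above: the DX11 MT pass would use one fewer worker than the core count, while the DX12 and Mantle passes use them all.

#include <atomic>
#include <thread>
#include <vector>

std::atomic<long long> g_recordedCalls{0};

// Stand-in for recording `count` draw calls into this thread's own command
// list (DX12/Mantle) or deferred context (DX11 MT).
void RecordDrawCalls(unsigned threadIndex, int count)
{
    (void)threadIndex;
    g_recordedCalls += count;
}

// Divide the frame's draw calls as evenly as possible across `workerCount`
// threads, then wait for all of them before the frame would be submitted.
void SubmitFrame(int totalDrawCalls, unsigned workerCount)
{
    std::vector<std::thread> workers;
    const int base     = totalDrawCalls / static_cast<int>(workerCount);
    const int leftover = totalDrawCalls % static_cast<int>(workerCount);

    for (unsigned t = 0; t < workerCount; ++t) {
        int count = base + (static_cast<int>(t) < leftover ? 1 : 0);
        workers.emplace_back(RecordDrawCalls, t, count);
    }
    for (std::thread& w : workers)
        w.join();
    // At this point the recorded command lists would be handed to the GPU queue.
}

In this sketch, the DX11 MT pass would call SubmitFrame with std::thread::hardware_concurrency() - 1 workers, while DX12 and Mantle would use the full hardware_concurrency() count.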

First 3DMark API Overhead Feature Test Results

Our test system was built around the following hardware:

  • Intel Core i7-5960X
  • ASUS X99-Deluxe
  • 16GB Corsair DDR4-2400
  • ADATA SP910 120GB SSD

The GPUs we used for this short feature test are the reference NVIDIA GeForce GTX 980, an ASUS R9 290X DirectCU II, the MSI GeForce GTX 960 100ME and a Sapphire R9 285 Tri-X. Driver revision for NVIDIA hardware was 349.90 and for AMD we used 15.200.1012.2.

For our GTX 980 and R9 290X results, you'll see a number of scores. The Haswell-E processor was run in its stock state (8 cores, HyperThreading on) to get baseline numbers, but we also started disabling cores on the CPU in order to get some idea of the drop off as we reduce the amount of processor horsepower available to DirectX 12. As you'll no doubt see, it appears six cores will be plenty to maximize draw call capability.

Let's digest our results.

First on the bench is the GeForce GTX 980 and the results are immediately impressive. Even in the best case for DirectX 11 multi-threading, our system can only handle 2.62 million draw calls per second, just over 2x the score from the single-threaded DX11 result. DX12, however, sees a substantial increase in efficiency, reaching as high as 15.67M draw calls per second, an increase of nearly 6x! While you should definitely not expect to see 6x improvements in gaming performance when DX12 titles begin to ship late this year, the additional CPU headroom that the new API offers means that developers can begin planning next-generation game engines accordingly.

For our core count reduction, we see that 8 cores with HyperThreading, 8C with no HT and 6C without HT all result in basically the same maximum draw call throughput. Once we drop to 4C, we decrease the peak draw call rate by nearly 24%. A move to a dual-core system falls to 7.22M draw calls per second, resulting in another 74% drop. Finally, at 1-core, the draw calls hit only 4.23M per second. We will still need to test other CPU platforms to see how they handle both CPU core and CPU clock speed scaling but it appears that even high end quad-core rigs will have more than enough performance headroom to stretch DX12's legs.

Our results with the Radeon R9 290X in the same platform look similar. We see a peak draw call rate of 19.12M per second on DX12 but an even better result under Mantle, hitting 20.88M draw calls per second. That shouldn't surprise us: Mantle was written specifically for AMD's GPU architecture and drivers, while DX12 has to be more agnostic in order to function on AMD, Intel and NVIDIA GPU hardware. Clearly the current implementation of AMD's drivers is doing quite well, besting the maximum draw call rate of the GTX 980 by 4M per second or so. That said, comparisons across GPU platforms at this point are less relevant than you might think. More on that later.

DX12 draw call performance remains basically the same across the 8C with HT, 8C and 6C tests, but it drops by about 33% with the move to a quad-core configuration. On Mantle, we do see a small but measurable 11% drop going from 8 cores to 6 cores, but Mantle is also the only result that scales UP when given the full 8 cores of the Core i7-5960X.

Interestingly, AMD shows little to no scaling between the DX11 single threaded and DX11 multi-threaded scores with the API Overhead Feature Test, which gives credence to the idea that AMD's current driver stack is not as optimized for DX11 gaming as it should be. The DX12 results are definitely forward looking and things could shift in that area, but the DX11 results are very important to gamers and enthusiasts today - so these are results worth considering.

I also did some testing with a couple of more mainstream GPUs: the GTX 960 and the R9 285. The results here are more than a bit surprising:

The green bar is the stock performance of our platform with the GTX 980, the blue bar is the stock GTX 960, but the yellow bar in the middle shows the results with a reasonably overclocked GTX 960 card. (We hit 1590 MHz peak GPU clock and a 2000 MHz memory clock.) At stock settings, the GTX 960 shows a 60% drop from the GTX 980 when it comes to peak draw calls; that's not totally unexpected. However, with a modest overclock on the mainstream card, we were able to record a DX12 draw call rate of 15.36M, only 2% slower than the GTX 980!

Now, clearly we do not and will never expect the in-game performance of the GTX 980 and GTX 960 to be within a margin of 2%, even with the latter heavily overclocked. No game available today shows that kind of difference - in fact we would expect the GTX 960 to be about 60-70% slower than the GTX 980 in average frame rates. Exactly why we see this scale so high with the overclocked GPU is still an unknown - we have asked Microsoft and Futuremark for some insight. What it does prove is that the API Overhead Feature Test should not be used to compare the performance of GeForce and Radeon GPUs to any degree; if the differences in performance inside NVIDIA's own GPU stack can't match up with real-world performance, then it is very unlikely that comparisons across competing architectures will fare any better.

Of course we ran the Radeon R9 285 through the same kind of comparison - stock and then overclocked. In this case we did not see the drastic increase in draw call rate with the overclocked R9 285, but we do see the R9 290X and R9 285 posting scores within 5% of one another. Again, these two GPUs definitely have real-world performance metrics that are further apart than 5%, proving the above point once again.

And how could we let a test like this pass us by without testing out an AMD APU?

The DX11 MT test refused to complete in our testing, but we are working with pre-release drivers, a pre-release operating system and an unfinished API, so just this one hiccup is actually a positive outcome. Moving from the DX11 single-threaded results to what you get with both DX12 and Mantle, the A10-7850K APU benefits from a 7.8x increase in draw call handling capability. That should improve game performance for properly written DX12 applications tremendously, and do so on a platform that desperately needs it.

Initial Thoughts

Though minimal in quantity compared to the grand scheme of things we want to test, the results we are showing here today paint a very positive picture about the future of DirectX 12. Since the announcement of Mantle from AMD and its subsequent release in a couple of key titles, a move to an API with less overhead and higher efficiency has been clamored for by enthusiasts, developers and even hardware vendors. Microsoft stepped up to the plate, willing to sacrifice much of what made DirectX a success in the past to pave a new trail with DirectX 12.

Futuremark's new 3DMark API Overhead Feature Test proves that something as fundamental as draw calls can be drastically improved upon with forward thinking and a large dose of effort. We saw improvements in API efficiency as high as 18-19x with the Radeon R9 290X when comparing DX12 and DX11 results, and while we definitely won't see that same kind of outright gaming performance gain with the new API, it gives developers a completely new outlook on engine development and integration. Processor bottlenecks that users didn't even know existed can now be pushed aside to stretch the bounds of what games can accomplish. It might not turn the world on its head on day one, but I truly think that APIs like DX12 and Vulkan (what Mantle has become for Khronos) will alter gaming more than anyone previously thought.

As for the AMD and NVIDIA debate, both Futuremark and Microsoft continue to stress to us that this feature test is not a reasonable test of GPU performance. Based on our overclocked results with the GTX 960 in particular, that is definitely the case. I'm sure you will soon see stories claiming that one party is ahead of the other in terms of DX12 driver development, or that one GPU brand is going to be faster in DX12 than the other, but that is simply not a conclusion you can derive from the data sets provided by this test. Just keep calm and wait - you'll see more DX12 gaming tests in the near future that will paint a better picture of what the gaming landscape will look like in 2016. For now, let's just wait for Windows 10 to roll out so we can get final DX12 feature and comparison information, and to allow Intel, NVIDIA and AMD a little time to tweak drivers.

It's going to be a great year for PC gamers. There is simply no doubt.


March 26, 2015 | 01:21 PM - Posted by Spacebob

I would love to see some more test results using different CPUs. Core i5's are highly recommended when it comes to gaming builds so I would be curious if potential performance is being left on the table by not going with a 6 or 8 core option.

March 26, 2015 | 06:33 PM - Posted by obababoy

I agree. I have a 4770k and I wonder even more if the AMD side of the house will benefit largely because they have the cheaper 8 core CPU's. I would love AMD to get back in the game more with CPU's that are beneficial.

March 27, 2015 | 12:34 PM - Posted by Anonymous (not verified)

While the draw call count seems to scale really well with core count, there's probably a logarithmic relationship between draw count and actual fps - meaning diminishing returns. What this could mean in practical application (FPS) is that all those "slow and cheap" 4-6-8-core AMD APUs people don't buy will suddenly compete very well for gaming against Intel's CPUs - especially at minimum frame rates.

March 27, 2015 | 04:31 PM - Posted by Anonymous (not verified)

I ran this test on my FX-9370 with my R9 290X and achieved 14,506,198 calls on Mantle and 15,113,326 calls on DX12.

April 16, 2015 | 01:21 AM - Posted by Anonymous (not verified)

Do you have turbo boost enabled? Your scores are considerably lower than others are getting with the fx 83xx series, you should be breaking into the '20s...

Something is not configured properly.

July 10, 2015 | 06:36 AM - Posted by Chronicle (not verified)

I ran it on my FX 8350 with R9 270x and I was breaking into the twenties, albeit the low twenties...

April 29, 2015 | 09:28 AM - Posted by Anonymous (not verified)

buuuuuuullshit! I changed from a Phenom II X4 to an i5 2550K and saw no FPS gain with my R9 280X in most games (GTA V, Assassin's Creed IV...). For best quality you need a really fast graphics card; the CPU matters far less :)

March 26, 2015 | 01:35 PM - Posted by collie

"Late last week, Microsoft approached me to see if I would be interested in working with them....."

NICE ONE!!!!! before anyone calls you a M$ shill, and I'm sure someone will, NICE ONE!!

March 26, 2015 | 01:37 PM - Posted by Anonymous (not verified)

Be nice to see something outside of the same test we have seen so many times already.

I hope UT alpha gets DX12 soon

March 26, 2015 | 01:43 PM - Posted by Anonymous (not verified)

edit

I wasn't complaining to PCPer, I'm talking about DX12 in general.

March 26, 2015 | 01:41 PM - Posted by BillDStrong (not verified)

I would also like a similar test for Vulkan when it comes out. I realize it doesn't even have drivers available, but I wonder if it will function closer to Mantle's or DX12's numbers, considering it is based on Mantle.

Also, this test would seem to be more CPU bound, so I second more tests across different hardware, such as gaming AMD builds and non-X99 builds.

March 26, 2015 | 02:04 PM - Posted by Robert_123 (not verified)

Vulkan and DirectX12 are basically both Mantle 1.1. The drivers will be the same and just map the functions to different function names, so the performance will be the same as well. The only difference is that Vulkan will be available on Windows older than 10 as well as Linux and SteamMachines - wonder why anyone would target DX12 then?

March 26, 2015 | 03:04 PM - Posted by Anonymous (not verified)

Yes, and the majority of people will stick with 7 and not go to 8.* or 10 (with its hardware lock-in), which will definitely dictate that more attention be focused on Vulkan, as well as on the HSA Foundation's work with HSAIL/LLVM, since the Vulkan API/LLVM and the SPIR-V IL are essentially heading in the same direction. The overall goals of both Vulkan graphics/GPGPU (OpenGL, OpenCL) compute and the HSAIL equivalent are based on the HSA concept of allowing compute workloads to be performed on CPU/GPU/other processing hardware other than the hardware they were originally intended for.

When testing Steam machines running SteamOS, the Vulkan API will need to have many of the same benchmarks run, and as the Vulkan graphics/GPGPU API is continuously improved and updated by Valve and Khronos, the DX API will have to be continuously improved as well. There will be no more room for anyone to drag their heels on keeping up their respective graphics API's rate of improvement. AMD will surely keep its internal Mantle graphics API for development and improvement purposes, so expect Mantle's latest internal improvements to be made available to Khronos and others, as AMD has an interest in seeing the pace of graphics improvement for its own GPUs and other hardware keep up with changes in the hardware.

March 26, 2015 | 04:08 PM - Posted by renz (not verified)

Why would people not go to Win 10 when it is a free upgrade for people with Win 7?

March 26, 2015 | 07:06 PM - Posted by Anonymous (not verified)

The upgrade is free, but the personal metrics gathering and the hardware lock-in to a single OS is a definite possibility, with the OEMs given the "choice" to disable any user's ability to turn off secure boot. M$ controls the secure boot key signing authority, and whatever back room dealing it takes to get the OEMs to not allow secure boot to be disabled could have adverse consequences for Linux and other OS makers, who will be at the mercy of M$ for the secure keys. So the upgrade to a potential subscription model may be free, but what happens after the "supported lifetime of the device" is NOT spelled out in an actual end user license agreement and is legally undefined at this point in time. Who knows what will happen with Windows 10, as the final version with a legally proper end user license agreement has yet to be released. Be very careful with the oft-quoted "free upgrade to Win 10", as free upgrades have to be paid for at a later time, whether with more personal metrics gathering and ads pushed out in the OS, some form of subscription model, or hardware lock-in to a single OS/application ecosystem. The base OS may be free, but all the functional applications may come at a cost, and if you are locked into a single OS on your laptop/PC hardware then the un-free applications/application ecosystem can become very costly, even if the base OS is always provided free of charge.

March 27, 2015 | 05:19 AM - Posted by SiliconDoc (not verified)

There, look ! Ahoy ! 'Tis Moby Dick on the waters !

Oh, nope, it's just a big fat scary story.

Nighty night Ahab.

March 26, 2015 | 11:30 PM - Posted by godrilla (not verified)

Not to mention that Vulkan is currently more attractive to developers because it will likely run on everything, including the PS4 (which has sold 20+ million units), Xbox Ones, technically all PCs, and Linux,

vs. DX12 only on the Xbox One and Windows 10 users (a 10 million unit gap vs. PS4 units sold).

March 27, 2015 | 05:26 AM - Posted by SiliconDoc (not verified)

Windows sells about 85 million every quarter, and has for a long time, so figure 260 million a year, and the "developers" are going to take that into account versus the puny 20 million in comparison.

Good luck; maybe part of a single year will hold some water concerning what you said, and that's a very big maybe.

March 27, 2015 | 03:32 PM - Posted by lantian (not verified)

God damn, how many times will people mention this BS and have to be corrected? Sony will never use DX or OpenGL/Vulkan; Sony has its own low-level API that is a lot better than what the Xbox One has had, always has had it and always will, and Microsoft will stick with DX12 for the Xbox One. This is the same thing most idiot fanboys said about Mantle, and almost the next day AMD said Mantle will never be used on the consoles. The same thing applies here: Vulkan will be used only on PCs/tablets/phones, and on PC it will also be Linux territory. The number of consoles sold changes nothing, since the only API any of these consoles will use is DX12. You sir made the most uneducated statement ever; please restrain yourself from posting stuff that is plain wrong and has no bearing on reality.

March 26, 2015 | 02:19 PM - Posted by Anonymous (not verified)

[deleted]

Editor: I can honestly say I don't think I've ever had a comment that I had to just edit before...nice. -Ryan

March 26, 2015 | 02:36 PM - Posted by collie

there it is!

March 26, 2015 | 02:46 PM - Posted by Anonymous (not verified)

lol expect this from AMD fanboys since this article poops on AMD again.

Love them bad AMD DX11 drivers, but Mantle was worth it right? :D

March 26, 2015 | 03:11 PM - Posted by Anonymous (not verified)

It's the fanboys in general, from both sides, that are the problem, including the same posters sock-puppeting for the opposing side. That's what WCCF T. is for!

March 26, 2015 | 03:25 PM - Posted by collie

Westchester Community College Federation of Teachers? lol, not being a dick I just have no idea what WCCF T means, please explain.

And I agree, it's fine to choose one side over the other for no logical reason, in hardware or sports or candy or whatever; creating emotional connections to unfeeling things is part of being human, BUT it doesn't help anyone to dismiss anyone/anything without objectively evaluating its value.

March 26, 2015 | 03:35 PM - Posted by collie

OHHHHHHHHHH wccftech.com, of course, never mind....

March 26, 2015 | 06:54 PM - Posted by obababoy

You are what you are describing. This article makes AMD look better. You are lost!

March 27, 2015 | 05:02 PM - Posted by Anonymous (not verified)

You must be the common id1ot who twists facts to his liking? This article is pro AMD if there ever was one on PCPER (lol), since the results speak for themselves.

Sorry but AMD is superior in every way to nvidia in DX12.

March 28, 2015 | 12:07 PM - Posted by collie

Now I wish I could remember what the original comment was, lol. Something about fellatio I think......

March 26, 2015 | 02:29 PM - Posted by Martin Trautvetter

I wish you didn't need to spend 20% of the video explaining how lower performance of Nvidia's high-end card doesn't mean anything.

Cause that still leaves 80% pretending that this test means something, which seems highly questionable to me.

March 26, 2015 | 04:43 PM - Posted by Searching4Sasquatch (not verified)

Did you even read the Tech Guide and info from Futuremark about this test? They say over and over how it is NOT a gaming workload and is NOT supposed to showcase DX12 gaming performance. They'll have a new 3DMark with DX12 gaming benchmarks (not feature tests) later this year.

March 26, 2015 | 07:55 PM - Posted by Martin Trautvetter

That's my point, as if 3DMark tests weren't bad enough already, now here is one that even they insist doesn't tell you much, if anything.

So AMD sucks at MTed DX11 drivers and Nvidia hasn't yet concentrated on "optimizing" their DX12 drivers. Quelle surprise!

March 27, 2015 | 05:05 PM - Posted by Anonymous (not verified)

"So AMD sucks at MTed DX11 drivers and Nvidia sucks at DX12 drivers. Quelle surprise!"

FIXED it for you

March 26, 2015 | 02:51 PM - Posted by Fasic (not verified)

What about fx 8xxx vs i5 or i3 vs fx 6xxx...or even 750k vs dual core vs ix vs fx xxxx...
:-D

March 26, 2015 | 05:13 PM - Posted by ROdNEY

What about DX12 performance per price? It's so weird how every site compares Intel to AMD in very different price segments.

March 27, 2015 | 03:29 AM - Posted by arbiter

They compare the CPUs that are in direct competition with each other, the 8350 vs the 3770K for example.

April 5, 2015 | 09:09 PM - Posted by Fasic (not verified)

Different price? If you're talking about my comparison... In my country the FX 8xxx sits between the i3 and i5... and the 750K is more expensive than the G3258... so I want to see whether AMD is coming back with this, or whether we and they are lost until a new architecture...
:-D

March 26, 2015 | 02:57 PM - Posted by JohnGR

Congrats for the exclusive tests, and thank you very much for both the article and the video.

PS The results on Anandtech and Star Swarm were showing a much faster Maxwell architecture over GCN on DX12, while in this test things look the opposite, if of course we forget the part where it is said that this is an API comparison test and NOT a GPU comparison test.

March 26, 2015 | 03:25 PM - Posted by Ophelos

This isn't an exclusive test; anyone can run it on their own system if they've got the newest version of Windows 10.

March 26, 2015 | 04:33 PM - Posted by JohnGR

If you have the program before everybody else, even if only a single day before everybody else, then it is exclusive.

March 27, 2015 | 03:31 AM - Posted by arbiter

Probably the reason they used Maxwell for the test was that Nvidia has been working with MS for a while on DX12, so it was best to go with what has been working hand in hand. The test wasn't about which GPU was faster so much as it was to showcase DX12 vs DX11.

March 27, 2015 | 05:31 AM - Posted by SiliconDoc (not verified)

What a bunch of hooey !

amd loses badly, then it's not about competition...

PLEASE STOP TORTURING ALL OF US !

March 27, 2015 | 07:31 AM - Posted by JohnGR

What? What are you talking about?
Have you read what I wrote, or did you just want to tell us about Nvidia and MS working together on Mant... sorry, DX12?
And you probably never noticed that both articles are using GPUs from both manufacturers and not only Maxwell?

March 26, 2015 | 03:14 PM - Posted by Martin Trautvetter

"A move to a dual-core system falls to 7.22M draw calls per second, resulting in another 74% drop."

Going from 12.55 to 7.22 is a 42.5% drop, not 74%.

"DX12 draw call performance remains basically the same across 8C with HT on, 8C and 6C testing, but it drops by about 33% with the move to a quad-core configuration."

25%, not 33%.

"At stock settings, the GTX 960 shows a 60% drop from the GTX 980 when it comes to peak draw calls"

~38%, not 60%.

March 27, 2015 | 03:39 AM - Posted by arbiter

It is using an Intel CPU, so it's showing best-case performance. It's very unlikely games will require so many draw calls for a good many years yet; I doubt games will use even more than 4-7 million for the next bunch of years, and that is still a lot of calls. But that might be a bit much, as for the AMD A10-7850K, AMD says they get 2.7 million draw calls from it. (Before AMD fans start ripping on me, that is FROM AMD's OWN press slide.)

March 26, 2015 | 03:19 PM - Posted by Ophelos

Anyone can run this test now on their own Windows 10 systems.

http://www.guru3d.com/news-story/futuremark-updates-3dmark-with-api-over...

March 26, 2015 | 03:27 PM - Posted by Zealotki11ee (not verified)

Does the 290X's poor DX11 MT result have anything to do with AMD cards in general having much worse CPU performance in DX11 than Nvidia?

March 26, 2015 | 04:48 PM - Posted by jts888 (not verified)

AMD's OpenGL/DirectX drivers have never been as polished performance-wise as Nvidia's for whatever combination of reasons.

Mantle and its conceptual successors Vulkan/DX12 were designed to put more state management at the API user's control instead of trying to make a more abstract interface and trying to catch all the possible weird corner cases in the driver.

The R9 290X actually has ~40% more DRAM bandwidth (though with smaller L2 cache) than the 980 and ~20% more texel fill rate, so it should not be a surprise that AMD beats Nvidia in some scenarios where draw call/driver overhead doesn't matter as much as the raw silicon power, like UHD 4k resolution or programs using a bare metal API.

March 26, 2015 | 04:35 PM - Posted by Mandrake

Great article Ryan, thanks. Just for your future consideration though, have you considered using a Kepler GPU like a 780 or 780 Ti in some of these future comparisons? Granted it's no longer the current high-end Nvidia lineup, but there is a very large install base of these cards.

March 26, 2015 | 05:18 PM - Posted by ioio_111 (not verified)

Why are you saying the R9 290X isn't faster? In this test it seems way faster than the Nvidia 980, so please just be fair and stop taking any side in this.

Software optimization has played a huge role in damaging AMD all these years, and it has been done by both Intel and Nvidia.
It is really sad to see benchmark apps, software, games, and even some compilers favor Intel or Nvidia hardware over AMD. TBH, AMD was the least criminal company in the market.

March 27, 2015 | 03:41 AM - Posted by arbiter

This overhead test isn't about Nvidia vs AMD. It's about DX11 vs DX12 overhead.

March 27, 2015 | 05:34 AM - Posted by SiliconDoc (not verified)

ROFL

Yes, Virginia, they do exist.

March 26, 2015 | 05:23 PM - Posted by ioio_111 (not verified)

Thanks for the preview BTW.

March 26, 2015 | 07:40 PM - Posted by BBMan (not verified)

Nice. I'm glad to see this. Must have felt good to have been singled out for this and honestly, they chose a good candidate.

My comment: I DO wonder how long Microsoft has been sitting on this. After reading a while, it also occurred to me that this is coming on the heels of Mantle and the Steam Machine. So I'm in the company of those who see this as a "timely" development that seems to be so ... coincidental. I could see where Linux might have been able to enter the gaming segment with a little thunder behind it.

I think M$ saw something there too and ....

March 27, 2015 | 04:17 AM - Posted by StephanS (not verified)

Nice preview.

Now wondering minds want to know... what CPU does it take to saturate a GTX 980 or R9-290x with Dx12 games?

From those numbers, I can see why Nvidia was in no rush to move toward a modern API. Nvidia's DX11 drivers were tweaked out compared to AMD's, giving them a clear advantage.
But it seems DX12 will level the playing field.

In this test the GTX 980 is 2.6x faster using DX11 over an R9 290X,
but in DX12 mode the R9 290X is now 22% faster in this driver/API limited scenario.

This won't change the power efficiency, but I wouldn't be surprised if the R9 290X gains enough to be on equal ground with a GTX 980 in future titles.

March 27, 2015 | 05:38 AM - Posted by SiliconDoc (not verified)

Wow, that was your brain on drugs...

AMD has mantle, a perf booster, now their comp has it, dx12, so AMD falls behind where they were.

Simple there feller. Now how amd mind warped you to where you were I don't know.

March 27, 2015 | 05:41 AM - Posted by SiliconDoc (not verified)

So, nVidia will soon have DX12, their mantle, EVERYWHERE !

Goodnight AMD, you already burned your wad with mantle.

Let's also notice it appears nVidia is getting HUGE HUGE gains.

Sleep well solemn princess amd, your mirror has broken.

March 31, 2015 | 06:32 PM - Posted by remon (not verified)

What kind of a moron are you? Did you read the charts?

March 27, 2015 | 07:17 AM - Posted by RS84 (not verified)

And nobody says thanks to AMD?

March 27, 2015 | 07:47 AM - Posted by PeterBradshaw (not verified)

It would not surprise me to find out that this API overhead test is saturating the Command Processors on the GPUs.

Overclocking the core of the 960 would likely reduce this bottleneck ;)

March 27, 2015 | 09:06 AM - Posted by Klyde (not verified)

Ryan, there's an uproar on techreport cause you didn't do 4C/HT enabled. Everyone is sitting with their 47x0k wondering how it compares to the 5960x

March 27, 2015 | 11:18 AM - Posted by Anonymous (not verified)

Brought to you by the nVidia advanced commercial force :
wherever nvidia loses, it's not something that matters
wherever they win, it's what's important for today

Ryan Shrout, please rename the site.
It's nothing more than commercials to make ppl buy low end at 200€, middle at 550 and high at 1250.

March 27, 2015 | 03:15 PM - Posted by BBMan (not verified)

Time to feed the troll: RTFA

March 27, 2015 | 01:20 PM - Posted by Anonymous (not verified)

Not a fanboi either way but...

Obvious win for AMD drivers...

Who would have guessed...

Hey, even the R9 285 beats the 980...

Kudos to DICE for wanting it and AMD for building it! Mantle for Windows and Linux... :-)

March 27, 2015 | 01:24 PM - Posted by Anonymous (not verified)

Credit where credit is due!

AMD for the win!

March 27, 2015 | 03:15 PM - Posted by Crazycanukk

This is making me very happy :)

This makes my investment in building my 5960X system feel more and more validated as the right choice.

March 27, 2015 | 03:19 PM - Posted by BBMan (not verified)

I'm delighted with mine. I had not expected that it would make that much difference, but it does. I think the future is bright for DDR4 and 2011- and it's only just begun.

March 27, 2015 | 08:25 PM - Posted by Hz

Very interesting results. It's a shame that there is no DX9 test, since so many games are still using that API.

If anything, this seems to highlight that AMD's cards have been held back by CPU overheads for quite some time now - particularly when it comes to multi-threaded performance.

The main thing that seems to kill performance in current games is draw calls (primarily the view distance setting) regardless of the GPU that you're using. Even a relatively "basic" looking game like Borderlands 2 suffered from severe performance issues in certain areas as a result of this.

While I was not enamored with the game, I went back and tested it when I upgraded from a 570 to a 970 (but kept my 4.5GHz 2500K) and though the maximum/average framerates shot up, the minimum framerate in these areas was unchanged - and well below 60 FPS.

So I wonder what this means for AMD and DX9/DX11 games. I have been considering a switch once the 390X is released, but if AMD are only going to focus on moving forward with DX12/Vulkan, it seems as though I might be better off sticking with NVIDIA cards and getting a Titan instead.

I don't care about whether one card reaches 300 FPS vs another at 250 at the upper limit, I am more concerned about my minimum never dropping below 60.

Adding more GPU power doesn't seem to help that when the API/drivers are the bottleneck.

And this has me hoping that Intel actually push the single-threaded performance forward soon, rather than being more efficient and adding more cores, because that still seems like it's going to be a limiting factor until everything is running on DX12/Vulkan. And even then, DX12's usefulness seems limited to only 6 cores.

March 27, 2015 | 10:40 PM - Posted by Anonymous (not verified)

"Late last week, Microsoft approached me to see if I would be interested in working with them.."

I never did care for how Ryan phrases some of his articles. Makes it seem like it was a pcper exclusive. Which it is not.

March 29, 2015 | 10:46 PM - Posted by Allyn Malventano

It pretty much comes across exactly as it happened, which was that MS approached Ryan to see if he would be interested in working with them. We had the software early so we could prepare a review in conjunction with the launch. It's how these things work. 

March 28, 2015 | 03:02 AM - Posted by Anonymous (not verified)

This test is certainly not a way to compare specific graphics cards against one another.
BUT it is certainly a benchmark metric to compare software or hardware architectures. Maybe both.

Looks like AMD wins this round in DX12.

March 28, 2015 | 06:02 PM - Posted by OneB1t (not verified)

CPU: FX-8320
GPU: R9 290X 1090/1550
FSB: 200Mhz
CPU-NB: 2600Mhz
HT: 2600Mhz
PCI-E: 16x150Mhz
RAM: 4x2GB 1333Mhz 9-9-9-18-1T

1500Mhz(200x7.5) 6 753 922
2000Mhz(200x10) 8 596 821
2500Mhz(200x12.5) 10 050 496
3000Mhz(200x15) 11 292 631
3500Mhz(200x17.5) 11 914 490
4000Mhz(200x20) 12 197 349
4500Mhz(200x22.5) 13 238 318
4750Mhz(256x18.5) 14 422 377 (RAM 1364mhz, CPU-NB 2560Mhz, HT 3072Mhz)

core scaling
4 (FX-43xx) 9 190 216
6 (FX-63xx) 12 126 372
8 (FX-83xx) 14 422 377

March 28, 2015 | 06:02 PM - Posted by OneB1t (not verified)

forgot to add mantle results with 15.3 on win 10041

March 28, 2015 | 06:31 PM - Posted by Keven Harvey (not verified)

The fact that jumps out the most to me is that a single core in DX12 beats, by a significant margin, anything in DX11.

March 30, 2015 | 10:29 PM - Posted by Brett from Australia (not verified)

Excellent in-depth review Ryan. We get an early and interesting preview of what's possible and achievable with DX12, and a nice mix of GPUs which should make for another fascinating set of numbers once Windows 10 goes RTM.

May 18, 2015 | 03:29 PM - Posted by drbaltazar (not verified)

Why does a GPU need interrupts (IRQ or MSI/X)? Me, I found the bottleneck was the default timer used by today's systems (LAPIC for MSI/X). Did MS default the GPU interrupt MSI/X timer to the invariant time stamp counter instead of LAPIC, or is the GPU doing it internally now? Last I checked the GPU had external interrupts (via MSI/X, which uses LAPIC as its default timer).
