Things are about to get…complicated

The latest Ashes DX12 benchmark brought much more to light than just performance on GPUs.

Earlier this week, the team behind Ashes of the Singularity released an updated version of its early access game, which updated its features and capabilities. With support for DirectX 11 and DirectX 12, and adding in multiple graphics card support, the game featured a benchmark mode that got quite a lot of attention. We saw stories based on that software posted by Anandtech, Guru3D and ExtremeTech, all of which had varying views on the advantages of one GPU or another.

That isn’t the focus of my editorial here today, though.

Shortly after the initial release, a discussion began around results from the Guru3D story that measured frame time consistency and smoothness with FCAT, a capture-based testing methodology much like the Frame Rating process we have here at PC Perspective. In his post on ExtremeTech, Joel Hruska claims that the results and conclusion from Guru3D are wrong because the FCAT capture methods assume that the captured output matches what the user actually experiences. Maybe everyone is wrong?

First, a bit of background: I have been working with Oxide and the Ashes of the Singularity benchmark for a couple of weeks, hoping to get a story that I was happy with and felt was complete, before having to head out the door to Barcelona for the Mobile World Congress. That didn’t happen – such is life with an 8-month-old. But, in my time with the benchmark, I found a couple of things that were very interesting, even concerning, that I was working through with the developers.

FCAT overlay as part of the Ashes benchmark

First, the initial implementation of the FCAT overlay, which Oxide should be PRAISED for including since we don’t have, and likely won’t have, a universal DX12 variant of it, was implemented incorrectly, with duplication of color swatches that made the results from capture-based testing inaccurate. I don’t know if Guru3D used that version to do its FCAT testing, but I was able to get some updated EXEs of the game through the developer in order to get the overlay working correctly. Once that was corrected, I found yet another problem: an issue of frame presentation order on NVIDIA GPUs that likely has to do with asynchronous shaders. Whether that issue is on the NVIDIA driver side or the game engine side is still being investigated by Oxide, but it’s interesting to note that this problem couldn’t have been found without a proper FCAT implementation.

With all of that under the bridge, I set out to benchmark this latest version of Ashes and DX12 to measure performance across a range of AMD and NVIDIA hardware. The data showed some abnormalities, though. Some results just didn’t make sense in the context of what I was seeing in the game and what the overlay results were indicating. It appeared that Vsync (vertical sync) was working differently than I had seen with any other game on the PC.

For the NVIDIA platform, tested using a GTX 980 Ti, the game seemingly randomly starts up with Vsync on or off, with no clear indicator of what was causing it, despite the in-game settings being set how I wanted them. But the Frame Rating capture data was still working as I expected – just because Vsync is enabled doesn’t mean you can’t look at the results in capture formats. I have written stories on what Vsync enabled captured data looks like and what it means as far back as April 2013. Obviously, to get the best and most relevant data from Frame Rating, setting vertical sync off is ideal. Running into more frustration than answers, I moved over to an AMD platform.

Testing the Radeon R9 Fury X proved even more confusing. Try as I might, I could not get Vsync to turn off with Ashes of the Singularity, regardless of what the settings menu claimed I had asked for. I have been quite good at knowing at a glance whether or not Vsync is on or off on a game with the overlay enabled, and capturing video and playing it back frame by frame showed there was never any of the expected horizontal tearing associated with Vsync being disabled.

So what is going on? Is AMD screwing things up? Is FCAT simply an outdated tool that is not properly measuring what it is supposed to? As it turns out, neither of those assertions is true.

What we are seeing are the first implications of a new pipeline for graphics and compositing. WDDM 2.0 (Windows Display Driver Model) is a very big shift from what existed previously with WDDM 1.3. The days of exclusive fullscreen gaming may be on the way out as Microsoft shifts developers and hardware vendors into a standard path through the OS compositor rather than bypassing it. Implications for this change are only beginning to be understood, but let’s see how it affects Ashes of the Singularity today.

Even though Ashes of the Singularity is not a Windows Store application, the behavior we are seeing is part of the push that Microsoft is making to sell games through that store with a unified platform. The question of app store based games versus free-standing and open gaming has been debated in the community since MS first started discussing it – we just happen to have a real-world implication of it in front of us today.

Back on topic to our specific testing scenario, the AMD platform is more closely emulating what Microsoft would like to see done as DX12 gaming progresses. Microsoft never wants applications to enter into anything that resembles a “Vsync off” state, with front buffer flips that lead to horizontal tearing. Instead, the “Vsync” option in the Ashes settings switches the game engine between two states:

  • Vsync on: Render rate is capped at the refresh rate of the monitor (60Hz is where we’ll discuss this at today) and thus the maximum benchmarked result is 60 FPS.
     
  • Vsync off: Render rate is uncapped, able to go as high as the hardware will allow. The draw rate to the monitor, however, is capped at the maximum refresh rate of the monitor. Only the most recent frame rendered is shown at the Vsync interval – all other frames are dropped from the pipeline.

In fact, this is exactly what Frame Rating and FCAT told us was happening; we just didn’t know why at the time.
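To make the “Vsync off” state a little more concrete, here is a minimal Python sketch of the behavior as described above. It is my own toy model, not anything taken from Oxide’s engine or from Windows: the function name simulate_one_second is hypothetical, and it assumes a perfectly constant render rate and that a frame finishing exactly on a refresh boundary is available for that refresh.

```python
def simulate_one_second(render_fps, refresh_hz=60):
    """Toy model: uncapped rendering, display flips only at each Vsync.

    Assumes frame i (1-based) finishes at time i / render_fps and that
    refresh k happens at time k / refresh_hz. At each refresh the most
    recently finished frame is shown; anything older that never made it
    to the screen counts as dropped.
    Returns (rendered, displayed, dropped) frame counts for one second.
    """
    shown = set()
    for k in range(1, refresh_hz + 1):
        # Index of the newest frame finished by refresh k (integer math
        # avoids floating-point rounding at exact boundaries).
        latest = (k * render_fps) // refresh_hz
        if latest >= 1:
            shown.add(latest)
    rendered = render_fps
    displayed = len(shown)
    return rendered, displayed, rendered - displayed


if __name__ == "__main__":
    for fps in (45, 60, 90, 120):
        rendered, displayed, dropped = simulate_one_second(fps)
        print(f"render {fps} FPS: rendered={rendered} "
              f"displayed={displayed} dropped={dropped}")
```

In this model, a render rate above the refresh rate never puts more than 60 unique frames on a 60Hz screen in a second; the extra frames are simply never shown. A render rate below 60 drops nothing, but some frames stay on screen for two refresh intervals.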

AMD Fury X Capped / Vsync On

This result shows Vsync enabled Ashes testing where the frame times displayed are either 16.6ms or 33.3ms, which equate to 60 FPS or 30 FPS, respectively.

AMD Fury X Uncapped / Vsync Off

This graph looks very similar to the graph above, though we see frame times hit the 0ms mark. At those points, frames are missing from the FCAT overlay pattern, which indicates that a frame was dropped from the output queue after the GPU had rendered it. Manually stepping through the recorded output verifies this assertion and, as I learned this week, this is the expected behavior for a game running under DX12 with these modes enabled.
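For readers who want to see how frame times like these translate into numbers, here is a rough Python sketch. It is not the FCAT or Frame Rating analysis code; summarize_capture is a hypothetical helper, and the sample list is made up purely for illustration. It assumes each entry is the on-screen time of one rendered frame in milliseconds, with 0 meaning the frame never reached the display.

```python
def summarize_capture(frame_times_ms, refresh_hz=60):
    """Summarize on-screen frame times (ms) from a capture-based test.

    A frame time of 0 ms is treated as a frame that was rendered but
    never displayed (a dropped frame).
    """
    interval_ms = 1000.0 / refresh_hz
    shown = [t for t in frame_times_ms if t > 0]
    total_ms = sum(shown)
    if total_ms == 0:
        raise ValueError("no displayed frames in capture")
    return {
        "dropped": len(frame_times_ms) - len(shown),
        # Refresh intervals each displayed frame stayed on screen:
        # 1 is roughly 16.6 ms at 60 Hz, 2 is roughly 33.3 ms.
        "held_refreshes": [round(t / interval_ms) for t in shown],
        # Frames the viewer actually saw per second.
        "displayed_fps": 1000.0 * len(shown) / total_ms,
        # Frames the GPU rendered per second, dropped ones included.
        "rendered_fps": 1000.0 * len(frame_times_ms) / total_ms,
    }


if __name__ == "__main__":
    # Illustrative values only, not measured data.
    sample = [16.6, 0, 16.6, 33.3, 0, 16.6]
    print(summarize_capture(sample))
```

The gap between rendered_fps and displayed_fps in a summary like this is the same gap between what a game can self-report and what actually shows up on the monitor.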

Ashes result showing FPS higher than 60, despite running with Vsync

This process of rendering at an unthrottled rate but tossing out any frames that are rendered unnecessarily is how Ashes of the Singularity can self-report frame rates higher than 60 FPS (or the maximum refresh of your screen) even though what is being shown on the screen is actually only running at 60 FPS. It should be noted that this method still introduces judder into the animation, just as you would see with a capped, Vsync enabled scenario.

NVIDIA GeForce GTX 980 Ti Uncapped / Vsync Off

NVIDIA, on the other hand, doesn’t enter into this state at all and, with the latest version of Ashes, behaves much more like previous gaming titles on the PC have in the past. When you set the game to Vsync on, you get a capped 60 FPS result without dropping any frames as the backpressure from Vsync forces the game engine into the proper stepping. If you set Vsync to off, the game enters into a fullscreen mode that has horizontal tearing and an uncapped render rate, and it displays at that rate with all the standard pros and cons that go along with it. (Side note: if you Alt-tab out of the NVIDIA Vsync off scenario the game will actually switch into the uncapped with Vsync on state, unable to revert until you restart the game.)

(Updated note: I quickly tested this Ashes of the Singularity benchmark with Skylake integrated graphics on the Core i7-6700K and it behaves like the NVIDIA platform, with horizontal tearing and an exclusive fullscreen mode with Vsync disabled.)

Despite an uncapped frame rate, AMD's Fury X always runs with a synced frame

The obvious question is…why? Why does the AMD platform behave this way with Ashes, and why do the NVIDIA and Intel platforms not? The answers are so complex at this point that I have had 4 conversations with different parties from different companies and I’m still not sure I have the full story.

Here’s what I know so far.

Microsoft is pushing DX12 games (and maybe not just those sold in the app store) to render through a standardized pipeline that uses the Windows compositing engine. In fact, from what I can tell, any game that is sold through the Windows App Store will be required to do so. Rendering through the Windows compositing engine is very similar to running in the borderless windowed mode that we have today (in that tearing is impossible but uncapped frame rates are also difficult to deal with). Microsoft has several reasons for this, most of which involve support for the various overlays and integrations that the company would like to bring to Windows games. MS wants to have an Action bar, a recording bar, on-screen keyboard support for running games on tablets and 2-in-1s and more, all of which require overlay support. Pushing games through the Windows compositing engine will allow them to do that in a standardized way.

Down the road, it appears that Microsoft thinks that running all games through the compositing engine will allow for unique features and additions to PC games, including multi-plane overlays. Multi-plane overlays allow two different render surfaces, one with the 3D game and another with the UI, for example, to be rendered at different resolutions or even updated at different rates, merging together through the Windows engine. Pushing games through the MS Windows engine will also help to improve power efficiency, a trait that is more important as PCs move into the realm of mobile devices. It is laudable that MS wants to improve the PC gaming experience and bring some unique features from the Xbox to the PC – we just have questions on how it will be done and whether they will be sacrificing some of what makes the PC "the PC" to get it done.

But, if that is the direction MS is going, why are we seeing the AMD and NVIDIA platforms behaving differently in Ashes of the Singularity? As it turns out, depending on who you ask, you are likely to get a different answer. No one wants to go against the wishes of Microsoft and no one wants to speak for them, but I have reached out to many people in the industry to try to figure out what’s going on. (Everyone wanted to remain anonymous in these discussions, FYI.)

One person told me that the reason NVIDIA’s results show a standard horizontal tearing behavior when Vsync is turned off in the game options is that the NVIDIA driver enumerates a DirectX feature called FlipEx, which refers to exclusive fullscreen. It is part of DX12 but was introduced prior to it; you can find background reading on it in Microsoft’s API documentation. Based on this person’s information, though, the AMD drivers do not enumerate support for that capability in DirectX 12. If they did, the behavior of the AMD hardware would match that of the NVIDIA hardware with Ashes of the Singularity.

Another viewpoint suggests a different direction. This person suggests that AMD’s driver is behaving as Microsoft has laid out DX12 to work in general, not just for universal apps, and that NVIDIA is implementing a workaround of sorts, to get Vsync off status to function as it has in the past, claiming exclusive fullscreen status in DX12.

What is 100% clear is that we are seeing the confusion surrounding a brand new API with very little specific direction to developers (or the media/community) on it. It also doesn’t help that everything we have been discussing has been a moving target since the days of the Windows 10 RTM. In fact, as I was told by several people this week, had we run this test with Ashes of the Singularity prior to the November 10th update to Windows 10, this entire situation would have been handled differently, with no ability for the game engine to run at uncapped frame rates. And it doesn’t appear to be finished yet.

Other Concerns and Observations

In my discussions with various people about DirectX, I also learned several interesting tidbits that don’t necessarily overlap with the debate about refresh rates and vertical sync (as we have discussed above).

First, it should be noted that most game developers actually support the kind of moves that Microsoft is making, at least when it comes to improving the experience and image quality of the games they are building. Tearing looks bad, no one is denying that, and removing it from PC games is definitely a goal to strive for. Talking with a handful of people, off the record, on what Microsoft is attempting to do with DX12 and unified games, the intent to improve the ecosystem seems legitimate, though the implementation and messaging seem to be half-baked at best.

The Windows Store on Windows 10

Benchmarking is likely going to see a dramatic shift with the move to these app-style games on Windows 10, as the sandboxed nature will keep anything from “hooking” into the executable as we have seen in the past. This means that overlays, like Fraps, EVGA Precision X, MSI Afterburner and even the FCAT overlay we would like to use for our capture-based Frame Rating testing, are kind of at a standstill. Measuring the performance of each game will necessitate the game developer writing an in-game benchmark mode that exports the kind of information that we want to see measured, and we will have to trust them to do it correctly and to properly represent the experience the user sees. To its credit, the team at Oxide has done an excellent job of this with the Ashes benchmark, though I still have concerns over how the in-game data output matches up with the experience of watching the benchmark play, thanks to those uncapped frame rates and dropped frames we detailed.

It also means NVIDIA is in a tight spot with GeForce Experience – as of now I know of no way that NVIDIA could circumvent the Microsoft App Store system and get GFE to offer the same kind of experiences that it does today with PC games. That includes setting in-game settings, doing gaming captures and integrating Twitch live streaming. Much of the advantage that NVIDIA has over AMD on the software side comes through GeForce Experience and the cohesiveness of the total software package, which could be lost if games distributed in this method really take hold. Obviously, the same restrictions would take place on AMD’s Gaming Evolved Software program.

Rise of the Tomb Raider, a DX11 game, already has exclusive fullscreen issues from the Windows Store.

How this works for variable refresh rate monitors, including those using AMD’s FreeSync and NVIDIA’s G-Sync, is also still a question mark. Based on a couple of talks I have had, it seems that Microsoft would like to see that capability “owned” by the operating system as well, which would make sense if games use the unified compositing pipeline that MS would like them to. Chances are that AMD would eat this up – they have continued to push FreeSync as the open standard for VRR and the company could benefit from being the only VRR technology supported on MS app store games. NVIDIA, on the other hand, very much likes to keep its technologies to itself, particularly G-Sync. The company has often teased upcoming additional features for G-Sync iterations down the road, which may not be possible or beneficial to develop if NVIDIA has to share the technology with its primary competitor. As for today, do FreeSync and G-Sync work? Well, let’s find out tomorrow, shall we?

Also, though maybe not as apparent, multi-GPU technologies like SLI and CrossFire will not work the same way they do today with MS app store games, even if they are not using DX12. Because the executable files are being sandboxed, much of the work that goes into properly doing AFR, including the many game-specific tricks from each company, will be unusable. We knew that this new version of DirectX would require game developers to integrate their own multi-GPU workloads, but it seems that even if a game is using DX11 and is sold through the app store, the same requirement will apply.

This post over at the PC Master Race subreddit gives even more examples of things that are going to change for games that are released through Microsoft’s Store implementation. No modding, no custom mouse bindings, no controller support outside of Xbox controllers; clearly this is going to shake up our lives as PC gamers. I also don’t think that you’ll be able to live stream out your games through XSplit and OBS either.

Closing Thoughts

This is clearly a discussion that is just at its beginning. My gut tells me that Ashes of the Singularity is just the tip of the iceberg, even if the AMD exclusive fullscreen issue gets ironed out with another driver update or game patch. Starting this week, you’ll see games hitting the Microsoft Store that are not going to be available anywhere else, giving gamers no option other than diving into this storm headfirst should they want to get their Gears on. At least for now, we still have Steam, Origin and dare I say it, Uplay, to help us create a more open PC gaming ecosystem.