Civilization: Beyond Earth Performance: Maxwell vs. Hawaii, DX11 vs. Mantle
A Civ for a New Generation
Turn-based strategy games have long been defined by the Civilization series. Civ 5 took up hours and hours of the PC Perspective team's free time (and likely some working hours too), and it looks like the new Civilization: Beyond Earth has the chance to do the same. Early reviews of the game from Gamespot, IGN, and Polygon are quite positive, and that's great news for a PC-only release; such games can sometimes get overlooked in the games media.
For us, the game offers an interesting opportunity to discuss performance. Beyond Earth is definitely going to be more CPU-bound than the other games that we tend to use in our benchmark suite, but the fact that this game is new, shiny, and even has a Mantle implementation (AMD's custom API) makes it interesting enough for at least a look at the current state of performance. Both NVIDIA and AMD have released drivers with specific optimizations for Beyond Earth as well. This game is likely to be popular and it deserves the attention it gets.
Civilization: Beyond Earth, a turn-based strategy game that can take a very long time to complete, ships with an integrated benchmark mode to help users and the industry test performance under different settings and hardware configurations. To enable it, you simply add "-benchmark results.csv" to the Steam game launch options and then start up the game normally. Rather than taking you to the main menu, you'll be transported into a view of a map that represents a somewhat typical game state for a long-term session. The benchmark measures your system's performance using the settings from the last time you ran the game without the modified launch options, so be sure to configure those before you benchmark.
The output of this is the "results.csv" file, saved to your Steam game install root folder. In there, you'll find a list of numbers, separated by commas, representing the frame time for each frame rendered during the run. You don't get averages, a minimum, or a maximum without doing a little work. Fire up Excel or Google Docs and remember the formula:
1000 / Average (All Frame Times) = Avg FPS
It's a crude measurement that doesn't take into account any errors, spikes, or other interesting statistical data, but at least you'll have something to compare with your friends.
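If you would rather skip the spreadsheet, the same formula can be sketched in a few lines of Python. This is a minimal sketch, not an official tool: it assumes the file holds frame times in milliseconds (which the 1000 / average formula implies), the function names `load_frame_times` and `summarize` are made up for illustration, and the 99th-percentile frame time is an extra added here to expose the spikes a plain average hides.

```python
import csv
import io
import math
import statistics


def load_frame_times(text):
    """Parse comma-separated frame times, skipping any non-numeric cells.

    Assumption: values are frame times in milliseconds.
    """
    times = []
    for row in csv.reader(io.StringIO(text)):
        for cell in row:
            try:
                value = float(cell.strip())
            except ValueError:
                continue  # header text or empty cell
            if math.isfinite(value) and value > 0:
                times.append(value)
    return times


def summarize(frame_times_ms):
    """Return average/min/max FPS plus the 99th-percentile frame time."""
    ordered = sorted(frame_times_ms)
    avg_ms = statistics.fmean(ordered)
    # Nearest-rank 99th percentile: the frame time that 99% of frames beat.
    p99_index = max(0, math.ceil(0.99 * len(ordered)) - 1)
    return {
        "avg_fps": 1000.0 / avg_ms,       # 1000 / Average(All Frame Times)
        "min_fps": 1000.0 / ordered[-1],  # slowest frame
        "max_fps": 1000.0 / ordered[0],   # fastest frame
        "p99_frame_ms": ordered[p99_index],
    }
```

To use it on a real run, read the file into a string and pass it through both functions, for example `summarize(load_frame_times(open("results.csv").read()))`.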
Our testing settings
Just as I have done in recent weeks with Shadow of Mordor and Sniper Elite 3, I ran some graphics cards through the testing process with Civilization: Beyond Earth. This time that means only the GeForce GTX 980 and the Radeon R9 290X, along with SLI and CrossFire configurations of each. The R9 290X was run in both DX11 and Mantle.
- Core i7-3960X
- ASUS Rampage IV Extreme X79
- 16GB DDR3-1600
- GeForce GTX 980 Reference (344.48)
- ASUS R9 290X DirectCU II (14.9.2 Beta)
Mantle Additions and Improvements
AMD is proud of this release as it introduces a few interesting things alongside the inclusion of the Mantle API.
- Enhanced-quality Anti-Aliasing (EQAA): Improves anti-aliasing quality by doubling the coverage samples (vs. MSAA) at each AA level. This is automatically enabled for AMD users when AA is enabled in the game.
- Multi-threaded command buffering: Utilizing Mantle allows a game developer to queue a much wider flow of information between the graphics card and the CPU. This communication channel is especially good for multi-core CPUs, which have historically gone underutilized in higher-level APIs. You’ll see in your testing that Mantle makes a notable difference in smoothness and performance in high-draw-call late-game testing.
- Split-frame rendering: Mantle empowers a game developer with total control of multi-GPU systems. That “total control” allows them to design an mGPU renderer that best matches the design of their game. In the case of Civilization: Beyond Earth, Firaxis has selected a split-frame rendering (SFR) subsystem. SFR eliminates the latency penalties typically encountered by AFR configurations.
EQAA is an interesting feature as it improves on the quality of MSAA (somewhat) by doubling the coverage sample count while maintaining the same color sample count as MSAA. So 4xEQAA will have 4 color samples and 8 coverage samples while 4xMSAA would have 4 of each. Interestingly, Firaxis has decided that EQAA will be enabled in Beyond Earth any time a Radeon card is detected (running in Mantle or DX11) and AA is enabled at all. So even though in the menus you might see 4xMSAA enabled, you are actually running at 4xEQAA. For NVIDIA users, 4xMSAA means 4xMSAA. Performance differences should be negligible though, according to AMD (who would actually be "hurt" by this decision if it brought down FPS).
The added performance capability of the multi-threaded command buffer allows the game to take better advantage of the CPU cores. This was the key tenet of the Mantle API to begin with, but we have previously focused on lower-end processors as the main beneficiary. In this case, AMD claims that R9 290X cards running in CrossFire might see as much as a 20% increase when going from a 6-core to an 8-core processor.
A reintroduction of split-frame rendering for CrossFire is perhaps the most interesting of these three Mantle additions. Even though all current SLI and CrossFire implementations use AFR (alternate frame rendering), that wasn't always the case. When NVIDIA and ATI were bringing back the world of multi-GPU gaming, SFR and AFR were competing solutions but, as game engines got more complex, splitting the per-frame workload became problematic and we settled on AFR. Since Mantle's release though, we have known that multi-GPU processing would require a lot more work from the developer, as it needs to be implemented on a per-game, per-engine basis. That can be problematic for some games and game engines, but Firaxis used this as an opportunity to offer SFR again for a specific goal. AMD's Robert Hallock explains:
Mantle empowers game developers with full control of a multi-GPU array and the ability to create or implement unique mGPU solutions that fit the needs of the game engine. In Civilization: Beyond Earth, Firaxis designed a “split-frame rendering” (SFR) subsystem. SFR divides each frame of a scene into proportional sections, and assigns a rendering slice to each GPU in AMD CrossFire™ configuration. The “master” GPU quickly receives the work of each GPU and composites the final scene for the user to see on his or her monitor.
If you don’t see 70-100% GPU scaling, that is working as intended, according to Firaxis. Civilization: Beyond Earth’s GPU-oriented workloads are not as demanding as other recent PC titles. However, Beyond Earth’s design generates a considerable amount of work in the producer thread. The producer thread tracks API calls from the game and lines them up, through the CPU, for the GPU’s consumer thread to do graphics work. This producer thread vs. consumer thread workload balance is what establishes Civilization as a CPU-sensitive title (vs. a GPU-sensitive one).
Because the game emphasizes CPU performance, the rendering workloads may not fully utilize the capacity of a high-end GPU. In essence, there is no work leftover for the second GPU. However, in cases where the GPU workload is high and a frame might take a while to render (affecting user input latency), the decision to use SFR cuts input latency in half, because there is no long AFR queue to work through. The queue is essentially one frame, each GPU handling a half. This will keep the game smooth and responsive, emphasizing playability, vs. raw frame rates.
Let me provide an example. Let’s say a frame takes 60 milliseconds to render, and you have an AFR queue depth of two frames. That means the user will experience 120ms of lag between the time they move the map and that movement is reflected on-screen. Firaxis’ decision to use SFR halves the queue down to one frame, reducing the input latency to 60ms. And because each GPU is working on half the frame, the queue is reduced by half again to just 30ms.
In this way the game will feel very smooth and responsive, because raw frame rate scaling was not the goal of this title. Smooth, playable performance was the goal. This is one of the unique approaches to mGPU that AMD has been extolling in the era of Mantle and other similar APIs.
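The arithmetic in Hallock's example can be sketched in Python. This is an idealized model built only from the numbers in the quote: it assumes input latency equals frame time multiplied by queue depth, that SFR splits the frame evenly across GPUs, and that compositing costs nothing. The function names are hypothetical and do not correspond to anything in Mantle or the game.

```python
def afr_input_latency_ms(frame_time_ms, queue_depth):
    """AFR: each queued frame adds one full frame time of lag."""
    return frame_time_ms * queue_depth


def sfr_input_latency_ms(frame_time_ms, gpu_count):
    """SFR: a one-frame queue, with the frame split evenly across GPUs.

    Assumption: perfect split and zero compositing overhead.
    """
    return frame_time_ms / gpu_count


# Hallock's example: a 60 ms frame on two GPUs.
afr = afr_input_latency_ms(60, queue_depth=2)  # 120 ms with a two-frame queue
sfr = sfr_input_latency_ms(60, gpu_count=2)    # 30 ms with one split frame
print(f"AFR: {afr} ms, SFR: {sfr} ms")
```

Real scaling will fall short of this ideal, since the two halves of a frame rarely carry equal work and the master GPU still has to composite the result.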
Interesting stuff. The staff at PC Perspective is always interested in ways game developers and hardware vendors can improve game smoothness and how it feels. Firaxis had similar commentary on its blog about the release:
What does this have to do with multi-GPU? Current multi-GPU solutions are implemented in the driver, without knowledge of, or help from, the game rendering engine. With the limited information available drivers are almost forced to implement AFR, or Alternate Frame Rendering, which is an approach where individual frames are rendered entirely on a single GPU. By alternating the GPU used each frame, rendering for a given frame can be overlapped with rendering of previous frames, resulting in higher overall frame rates. The cost, however, is an extra frame of latency for each GPU past the first one. This means that AFR multi-GPU solutions have worse response time than a single GPU capable of similar frame rates.
Rather than trying to maximize frame rates while lowering quality [with AFR], we asked ourselves a question: How fast can we get a dual-GPU solution without lowering quality at all? In order to answer this question, we implemented a split-screen (SFR) multi-GPU solution for the Mantle version of the game. Unlike AFR, SFR breaks a single frame into multiple parts, one per GPU, and processes the parts in parallel, gathering them into the final image at the end of the frame. As you might expect, SFR has very different characteristics than AFR, and our choice was heavily motivated by our design of the Civilization rendering engine, which fits the more demanding requirements of SFR well. Playing the game with SFR enabled will provide exactly the same quality of experience as playing with a single, more powerful GPU.
This feature is only available when running Beyond Earth on a Mantle-capable graphics card with the game in Mantle mode. One note that found its way to us somewhat late: ensure you are testing Mantle mGPU by setting “Enable MGPU” to 1 in the graphics initialization file. That file is found in the My Documents/My Games folder. Apparently, enabling CrossFire in the driver control panel isn't quite enough to get the job done.