Asynchronous Atomic Gandhi coming in Civ VI

Subject: General Tech | July 13, 2016 - 02:23 PM |
Tagged: gaming, dx12, civilization VI, asynchronous compute, amd

AMD, 2K and Firaxis Games have been working together to bring the newest DX12 features to Civilization VI, and today they announced their success.  The new game will incorporate Asynchronous Compute in the engine as well as support for Explicit Multi-Adapter for those with multiple GPUs.  This should give AMD cards a significant performance boost when running the game, at least until NVIDIA can catch up with their support for the new technologies ... HairWorks is not going to have as much effect on your units as Async Compute will.
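
For anyone wondering what asynchronous compute actually buys you, here is a toy timing model in Python. It is only a sketch under made-up assumptions, not how the Civilization VI engine or any driver works: the function names, the `idle_fraction` parameter and the millisecond figures are all hypothetical. The idea is that compute work can be slotted into the moments where graphics work leaves shader units idle, instead of queuing up behind it.

```python
def serial_frame_ms(graphics_ms: float, compute_ms: float) -> float:
    """Frame time when compute work waits for graphics work to finish."""
    return graphics_ms + compute_ms


def overlapped_frame_ms(graphics_ms: float, compute_ms: float,
                        idle_fraction: float) -> float:
    """Frame time when compute work can fill idle gaps in the graphics work.

    idle_fraction is a hypothetical knob: the share of graphics time during
    which shader units would otherwise sit idle (0.0 = none, 1.0 = all).
    """
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    # Only as much compute as fits into the idle gaps is hidden.
    hidden_ms = min(compute_ms, graphics_ms * idle_fraction)
    return graphics_ms + compute_ms - hidden_ms


if __name__ == "__main__":
    # Made-up workload: 14 ms of graphics, 2 ms of compute, 10% idle gaps.
    serial = serial_frame_ms(14.0, 2.0)
    overlapped = overlapped_frame_ms(14.0, 2.0, 0.1)
    print(f"serial: {serial:.1f} ms, overlapped: {overlapped:.1f} ms")
    print(f"speedup: {serial / overlapped:.2f}x")
```

In this model the gain is capped by how much idle time the graphics workload leaves behind, so a GPU that is already kept busy has little to win from the overlap.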

If you haven't been keeping an eye out, we have seen the video of Egypt's leader, which also talks about terrain adjacency bonuses, and the video of England's leader and civilization-specific units and buildings.

"Complete with support for advanced DirectX 12 features like asynchronous compute and explicit multi-adapter, PC gamers the world over will be treated to a high-performance and highly-parallelized game engine perfectly suited to sprawling, complex civilizations."

July 13, 2016 | 04:15 PM - Posted by Butthurt Beluga

Basically AMD is doing with DX12 and Vulkan what they should have been doing with their own API, Mantle, from the start.
Now their cards will be showing their full potential from the get go, rather than three years down the line.

With basically every AAA and even AA game supporting DX12/Vulkan, as well as 3DMark having a DX12 Bench... oh man, things are not looking good for Nvidia, especially after the bloodbath that was Doom running Vulkan.

I think AMD is finally showing their teeth.

But then again, supposedly there is nothing between Polaris and Vega, and with Vega set to launch in 2017, AMD not only doesn't have a 390X/Fury/Fury X replacement but also doesn't have anything to compete with the GTX 1070/1080 in non-DX12/Vulkan titles.

July 13, 2016 | 04:26 PM - Posted by JohnGR

They don't have a Fury X replacement, but they do have the Fury X. There is one Fury X model at Newegg selling for $400 (others are priced higher), and in the latest Doom benchmarks with Vulkan, the Fury X is far ahead of the 1070.

AMD just needs 1-2 more titles where Fury cards score at least as well as a 1070, plus a price cut, to keep selling those cards until Vega. They are not going to be saved by doing something like this, but in their financial position they don't have many choices, and they definitely don't have a GPU between Polaris 10 and Vega to show.

July 13, 2016 | 08:41 PM - Posted by patrickjp93 (not verified)

No, the Fury X is only ahead with AA at 1440p and 4K, and that's because AA invokes a 32-queue Asynchronous Compute kernel, where Pascal only gains performance up to 16 and then goes back to no gains up to 32 before nose-diving beyond that. The only reason the Fury X is doing well in those benchmarks is the AA based on Asynchronous Compute.

July 14, 2016 | 02:20 AM - Posted by Anonymous (not verified)

That's like saying, "The only reason the 980/780/680/480/whatever is doing well in those benchmarks is the x64 tessellation."

But something tells me you used to think that argument was a joke.

July 17, 2016 | 03:38 PM - Posted by Anonymous Nvidia User (not verified)

I'll show you another thing the Furyx is by far ahead of 1070. It's power consumption. 348 watts with it under stress test. Yikes.,4196-7.html

1070 FE tops out at around 180 stressed.

Used same source but we all know that watt for watt the 1070 is way beyond a Furyx no matter which one is used.

July 13, 2016 | 04:28 PM - Posted by RushLimbaughisyourdaddy (not verified)

Maybe the rumors of Vega launching early are partially true. If vega IS both the 490 and flagship, then it is possible that 490 is released at or about the rumored early vega release date, and then the flagship vega is released later on.

July 13, 2016 | 04:42 PM - Posted by Shambles (not verified)

Why make a DX11 and DX12 branch when you can just make a single Vulkan branch? There are many people out there who aren't running Windows 10 or don't have DX12 GPUs, myself being one of them.

Granted they have lots of time to work on it. I don't even bother playing new releases of CIV now until all the expansions have been completed.

July 13, 2016 | 04:58 PM - Posted by Anonymous (not verified)

Vulkan will get the majority of development investment because Vulkan will be on the majority of devices, from phones/tablets to PCs/laptops and above. Vulkan support will become much more important than any DX12/other DX version, as Vulkan will be on many more billions of devices than DX12's current install base of Windows 10, which is around 300 million and going nowhere fast.

July 13, 2016 | 09:41 PM - Posted by remc86007

That same argument has been made about OpenGL vs DirectX for years...and yet 97% of the games I play run DirectX.

July 15, 2016 | 06:11 AM - Posted by Stefem (not verified)

True, but one of the main reasons for that was that one of the major graphics vendors always refused to make a good driver implementation, at the expense of pain for developers who liked to use OGL.

July 13, 2016 | 09:52 PM - Posted by DerekJpDev (not verified)

I feel like I heard this before with Betamax vs. VHS and again with Blu-Ray vs. HD-DVD. Who really knows at this point in time? Both APIs have their strong and weak points, and they both have a large install base.

Right now there is a vibrant DirectX development community as well as a large mobile contingent all on OpenGL ES and Metal. I think they are all viable. Ultimately time will tell.

July 13, 2016 | 04:46 PM - Posted by Anonymous (not verified)

Nvidia can try to implement async compute in software/middleware, but the latency is going to be much worse for Nvidia until they can get full support for async compute into the GPU’s hardware. AMD is ahead with their GCN async compute fully implemented in AMD's ACE units and hardware schedulers. Nvidia cannot expect that any GPU thread scheduling/dispatch implemented in software will ever be as responsive as GPU thread scheduling/dispatch managed by the GPU’s hardware, as is done with AMD’s GCN SKUs.

Nvidia will have to get the full hardware support for all async compute into their Volta consumer SKUs and hope that their software implementation is enough in Pascal to at least not leave Nvidia too far behind while Nvidia frantically gets their full hardware support for async compute online for Volta.

AMD is now starting to enjoy the fruits of their long labor, with AMD’s Mantle project leading to both the Vulkan and DX12 close-to-the-metal graphics API designs, and even AMD’s older GCN SKUs are showing a marked improvement under Vulkan and DX12 over DX11 and OpenGL. AMD is already 4 generations in with GCN and async compute, and Vega will have more tweaks in its hardware over and above what Polaris offers, so AMD will be much further along in their refinement processes for full hardware-based async compute on the GPU.

AMD will be going after the GPU accelerator market also, along with its Zen server SKUs and its Zen/Vega HPC/workstation APUs on an interposer module SKUs. So AMD is in line to have some very price/performance competitive Zen/Vega server/HPC APU SKUs that have some very wide coherent Zen CPU die to Big Fat Vega GPU die connection fabrics etched out on the silicon interposer that will host the Zen cores die, the large Vega GPU die, and the HBM2 die stacks. So look for some HBM-like transfer speeds/high effective bandwidth between the Zen cores die and the fat Vega GPU accelerator die, in addition to whatever the HBM2 will offer, and some of these server/HPC SKUs will be getting 6 HBM2 stacks for even more effective bandwidth. The JEDEC standard only specifies in its HBM/HBM2 standard what is related to operating one HBM stack, so users of HBM can have from one to as many HBM/HBM2 stacks as will fit on an interposer and can be handled by the processor's memory controller.
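
A back-of-the-envelope check on the multi-stack bandwidth idea above, as a small Python sketch. It assumes the JEDEC HBM2 peak figures of a 1024-bit interface per stack at up to 2 Gbit/s per pin; the function and constant names are made up for illustration, the six-stack configuration is speculation rather than an announced product, and shipping parts may well run lower pin rates.

```python
# Assumed peak figures for one HBM2 stack (per the JEDEC HBM2 spec).
HBM2_BUS_WIDTH_BITS = 1024   # interface width per stack
HBM2_PIN_RATE_GBPS = 2.0     # peak data rate per pin, in Gbit/s


def stack_bandwidth_gbs(bus_width_bits: int = HBM2_BUS_WIDTH_BITS,
                        pin_rate_gbps: float = HBM2_PIN_RATE_GBPS) -> float:
    """Peak bandwidth of one stack in GB/s (bits * Gbit/s per pin / 8)."""
    return bus_width_bits * pin_rate_gbps / 8


def total_bandwidth_gbs(stacks: int) -> float:
    """Peak aggregate bandwidth for a number of stacks, assuming it scales linearly."""
    if stacks < 1:
        raise ValueError("need at least one stack")
    return stacks * stack_bandwidth_gbs()


if __name__ == "__main__":
    for n in (1, 4, 6):
        print(f"{n} stack(s): {total_bandwidth_gbs(n):.0f} GB/s peak")
```

These are theoretical peaks only; what a real memory controller sustains would be lower.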

July 17, 2016 | 04:04 PM - Posted by Anonymous Nvidia User (not verified)

Yeah that latency help of async on Steam VR really benefits AMD. Not.

Best for last

Wow. Note how much the fidelity drops and picks up on the graphs. Ideal is a straight line like on the Titan X, which means no drop in visual fidelity. Which brand of cards is likely to give a better VR experience? Hint: it's the ones without a lick of async, the Maxwells.

The Pascals of course get perfect 11 with straight line across too. But Maxwells being better without async is criminal right.

July 13, 2016 | 04:51 PM - Posted by Anonymous (not verified)

Could anyone explain why AMD benefits from low-level APIs while Nvidia does not? I personally see three options:
1. Nvidia DX11 drivers are so good their GPUs are fully utilized, so there is no hidden potential DX12 might access. AMD drivers, on the contrary, are terrible, hence the performance boost in DX12, where the job of the driver team is mostly handled by the game developer.
2. All Nvidia architectures including Pascal are poorly designed and can't really handle DX12. I find that hard to believe. Also, don't forget Async Compute boosts performance by 5-10% MAX, confirmed by Hitman devs
3. Game developers optimize games only for AMD hardware. Unlikely since we've already seen at least four games from different teams (Doom, Hitman, AoC, Rise of Tomb Raider) that perform better on AMD at DX12

Sorry for mistakes, English is not my first language.

July 13, 2016 | 05:16 PM - Posted by Paul EFT (not verified)

I think this guy has a decent enough answer.

Source: Reddit (Alarchy)

July 13, 2016 | 05:22 PM - Posted by Anonymous (not verified)

Game developers optimize for the graphics APIs (DX12 and Vulkan), and AMD's GPU hardware has full hardware support for async compute while Nvidia does not have the hardware support in its GPUs for async compute. So any processor (CPU/GPU/other) that does not have full in-hardware support for async compute and hardware-based processor thread dispatch/scheduling is going to be at a disadvantage relative to a processor that does.

The new graphics APIs and the simpler GPU hardware drivers that go along with using DX12/Vulkan allow for games developers to have much lower level control over the GPU's hardware and if the GPU's hardware lacks the hardware support for async compute and hardware based processor thread dispatch/scheduling then that GPU's driver has to call on a software emulation layer to emulate in software what the GPU lacks in its hardware, and software emulation is always slower and inefficient relative to GPUs with full hardware support for async compute and hardware based processor thread dispatch/scheduling.

Hardware based async compute and hardware based processor thread dispatch/scheduling in any CPU/GPU/Other processor’s hardware is always going to be faster and more efficient than any software based solution.

July 13, 2016 | 07:28 PM - Posted by Anonymous (not verified)

That reddit link explains it better imo

July 13, 2016 | 11:36 PM - Posted by Anonymous (not verified)

In the Reddit post Alarchy states: "Async shaders are a FEATURE of GCN, but they are not necessary to do async compute"

But he also admits that "So yes, Nvidia can "do" async compute, no they don't have async shaders"

So Nvidia has no async-compute Shaders in its hardware, so Nvidia has to emulate async-compute shaders in software!

That's software based async-compute for Pascal, managed by software and not as quick to respond to asynchronous events as AMD's GCN asynchronous compute shaders. Asynchronous compute shaders fully implemented in hardware is what is giving AMD's GCN based GPUs that sizable improvement in DX12 and Vulkan, and what will give AMD's shaders the low latency advantage over any software managed "async-compute".

Alarchy (favors Nvidia) is good at making async-compute appear like splitting hairs, but having asynchronous shaders implemented fully in hardware is a big difference; just look at AMD's GCN GPUs' improvements with DX12 and Vulkan. The VR games designers are looking for the low latency that asynchronous shaders in hardware will provide.

July 14, 2016 | 12:20 PM - Posted by Stefem (not verified)

"So Nvidia has no async-compute Shaders in its hardware, so Nvidia has to emulate async-compute shaders in software!"

Hit the brakes, that is just your own conclusion...

Alarchy also adds:
"People that don't understand will say things like "hardware is ALWAYS faster" or "software is EMULATING" (implying it's bad)"

People need to clear their minds, otherwise they will end up distorting reality to match their own ideas.

July 14, 2016 | 12:54 PM - Posted by Anonymous (not verified)

Yes it's bad if any processor management CODE implemented in software has to be itself fetched, decoded and executed on a processor to do its work, and there is no way in hell that software can be used to manage a processor's hardware execution components that themselves can execute many instructions in the time it takes to even fetch from cache memory one software based processor management instruction.

Does Intel implement their version of SMT (HyperThreading) in software and not in the CPU’s hardware? Oh what a train wreck that would be if Intel would have to attempt to implement any of its SMT capabilities on its CPU cores in software. And a CPU’s usage of fully in hardware SMT is a very in hardware async-compute form of CPU processor thread asynchronous management, and GPU/processors are no different than CPUs with respect to having their own GPU processor thread management scheduling/dispatch functionality fully implemented in the GPU/processor’s hardware.

July 14, 2016 | 07:19 AM - Posted by Anonymous (not verified)

It's mostly 1 with a little bit of 2. Devs are going to profile on nVidia hardware almost exclusively; 3 is completely an afterthought. The API of choice really comes down to ease of development, performance, and audience; DirectX wins overall.

July 14, 2016 | 12:31 PM - Posted by Anonymous (not verified)

DirectX/DX12 is on 300 million devices, and Vulkan will be on billions of devices, so which API is going to get the most R&D investment? I'm sure seeing a whole lot of mobile gaming ads on TV, relative to any PC/laptop only gaming titles. And Vulkan is the winner for device installs, from phones to supercomputers. DX12 is a restricted API (locked down to Windows 10), and DX11 is on its way out. Vulkan will be on all OSs across all device markets, and that's a whole lot more devices getting support and R&D from their OEMs, who are and will continue to be members of the Khronos Group's many development committees, both the Vulkan and other Khronos development committees!

Vulkan will be the API with the most development/investment across the most devices, there is no way for M$ to get the numbers all by itself, Vulkan will be on most all devices big and small. Tremble in fear ye lords of Redmond, the Vulkan API is erupting and spreading world wide!

July 13, 2016 | 07:24 PM - Posted by Anonymous (not verified)

So new Civ VI DX12 will have the SUPER QUICK turns like Civ. BE with MANTLE, even lategame?

OH YEEEEEEAH! No more waiting! Bring it AMD!

July 14, 2016 | 06:15 AM - Posted by Anonymous (not verified)

uh just read the article and was confused why Civ IV would get the DX12 treatment.
