… But Is the Timing Right?

What we’re waiting for could have been done for years… but wasn’t. Why?

Windows 10 is about to launch and, with it, DirectX 12. Apart from the massive increase in draw call throughput, Explicit Multiadapter, both Linked and Unlinked, has stirred up a few pockets of excitement here and there. I am a bit concerned, though. People seem to treat this as a new, novel concept that gives game developers tools they have never had before. It really isn't. Depending on what you want to do with secondary GPUs, game developers could have used them for years. Years!

Before we get to the cross-platform examples, we should talk about Mantle. It is the closest analog to DirectX 12 and Vulkan that we have. It served as the base specification for Vulkan, which the Khronos Group modified to use SPIR-V instead of HLSL, among other changes. Some claim that it was also the foundation of DirectX 12, which would not surprise me given what I've seen online and in the SDK. Allow me to show you how the API works.

Mantle is an interface that mixes Graphics, Compute, and DMA (memory access) into queues of commands. Commands can easily be generated in parallel, as each thread can record them on its own, which is great for multi-core processors. Each queue, which is a list of commands on its way to the GPU, can be handled independently, too. An interesting side effect is that, since every device works with standard data formats, such as IEEE 754 floating-point numbers, no one cares where these queues end up as long as the work gets done quickly enough.
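To make the idea concrete, here is a rough sketch of the same model as it appears in DirectX 12, where graphics, compute, and copy (DMA) work each get their own queue type. Treat it as an illustration rather than production code; error handling and device creation are omitted.

```cpp
// Minimal D3D12 sketch: one queue per engine type (graphics, compute, copy/DMA).
// Device creation and error handling are left out for brevity.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void CreateEngineQueues(ID3D12Device* device,
                        ComPtr<ID3D12CommandQueue>& graphics,
                        ComPtr<ID3D12CommandQueue>& compute,
                        ComPtr<ID3D12CommandQueue>& copy)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};

    desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // graphics queue (can also run compute/copy)
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&graphics));

    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute-only queue
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&compute));

    desc.Type = D3D12_COMMAND_LIST_TYPE_COPY;     // copy (DMA) queue
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&copy));
}
```

Command lists can then be recorded on any number of threads and submitted to whichever of these queues makes sense, which is where the multi-core benefit comes from.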

Since each queue is independent, an application can choose to manage many of them. None of these lists really needs to know what is happening to any of the others. As such, they can be pointed at multiple, even wildly different, graphics devices. GPUs of different models and capabilities can work together, as long as they all support the core of Mantle.
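As a sketch of how that looks with unlinked adapters in DirectX 12, the snippet below walks every adapter the system reports and gives each one its own device and queue. The feature level and the decision to keep every adapter that has a D3D12 driver are my own assumptions for the example.

```cpp
// Sketch: give every adapter in the system its own D3D12 device and queue.
// Mixed vendors are fine as long as each adapter has a D3D12 driver.
#include <d3d12.h>
#include <dxgi1_4.h>
#include <wrl/client.h>
#include <vector>
using Microsoft::WRL::ComPtr;

struct GpuContext {
    ComPtr<ID3D12Device>       device;
    ComPtr<ID3D12CommandQueue> queue;
};

std::vector<GpuContext> EnumerateAllGpus()
{
    std::vector<GpuContext> gpus;
    ComPtr<IDXGIFactory4> factory;
    CreateDXGIFactory1(IID_PPV_ARGS(&factory));

    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
        GpuContext ctx;
        // Skip adapters that cannot create a D3D12 device (assumed feature level 11_0).
        if (FAILED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                     IID_PPV_ARGS(&ctx.device))))
            continue;

        D3D12_COMMAND_QUEUE_DESC desc = {};
        desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
        ctx.device->CreateCommandQueue(&desc, IID_PPV_ARGS(&ctx.queue));
        gpus.push_back(std::move(ctx));
    }
    return gpus;
}
```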

DirectX 12 and Vulkan adopted this metaphor so that their developers could use the same functionality across vendors. Mantle did not invent the concept, however. What Mantle did was expose this architecture to graphics workloads, which can take advantage of all the fixed-function hardware that is unique to GPUs. GPU compute APIs were already designed this way before AMD brought it to graphics. Game developers could have spun up an OpenCL workload to process physics, audio, pathfinding, visibility, or even lighting and post-processing effects… on a secondary GPU, even one from a completely different vendor.
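To show what was already possible, here is a rough OpenCL sketch that picks a second GPU, whatever the vendor, and hands it a kernel while the first card keeps rendering. The "second enumerated device is the spare one" heuristic and the simulate kernel name are assumptions for illustration, not how any shipping game did it.

```cpp
// Sketch: dispatch an OpenCL job to a secondary GPU while the primary renders.
// Assumes at least two GPU devices are visible across all installed platforms.
#include <CL/cl.h>
#include <vector>

cl_device_id FindSecondaryGpu()
{
    cl_uint numPlatforms = 0;
    clGetPlatformIDs(0, nullptr, &numPlatforms);
    std::vector<cl_platform_id> platforms(numPlatforms);
    clGetPlatformIDs(numPlatforms, platforms.data(), nullptr);

    std::vector<cl_device_id> gpus;
    for (cl_platform_id p : platforms) {
        cl_uint numDevices = 0;
        if (clGetDeviceIDs(p, CL_DEVICE_TYPE_GPU, 0, nullptr, &numDevices) != CL_SUCCESS)
            continue;
        std::vector<cl_device_id> devices(numDevices);
        clGetDeviceIDs(p, CL_DEVICE_TYPE_GPU, numDevices, devices.data(), nullptr);
        gpus.insert(gpus.end(), devices.begin(), devices.end());
    }
    // Assumption: the first enumerated GPU drives the display, so hand the
    // physics/audio/AI work to the second one if it exists.
    return gpus.size() > 1 ? gpus[1] : nullptr;
}

void RunOffloadJob(cl_device_id device, const char* kernelSource)
{
    // OpenCL 1.x entry points, which fit the era being discussed.
    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, nullptr);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, nullptr);

    cl_program program = clCreateProgramWithSource(ctx, 1, &kernelSource, nullptr, nullptr);
    clBuildProgram(program, 1, &device, nullptr, nullptr, nullptr);
    cl_kernel kernel = clCreateKernel(program, "simulate", nullptr);  // hypothetical kernel name

    size_t globalSize = 4096;
    clEnqueueNDRangeKernel(queue, kernel, 1, nullptr, &globalSize, nullptr, 0, nullptr, nullptr);
    clFinish(queue);

    clReleaseKernel(kernel);
    clReleaseProgram(program);
    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
}
```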

Vista's multi-GPU bug might get in the way, but it was possible in Windows 7 and, I believe, Windows XP too.

Game developers didn't do this, however. It was a hassle to develop, and I assume QA would have been a nightmare too, had anyone bothered. I believe id Software used a secondary GPU for CUDA texture processing in RAGE, but that is the only example I know of. The practice was more popular outside of gaming, such as among the people who attach a half-dozen GPUs, which may or may not be the same model or even vendor, to mine as many Bitcoins as possible.

These new APIs are arriving at a better time, though.

Back then, it was unlikely that a gaming PC would have a second, unmatched graphics card available to tap. NVIDIA gave it a whirl with PhysX offloading, where users could get a boost on heavy physics loads by leaving an old GeForce card installed. It did not catch on much, although I was one of the ones who tried it.

On-processor graphics is more common these days, though. For the last couple of years, it has been difficult to purchase a new consumer CPU without getting a GPU in the same package. Windows did not expose this hardware as a compute device by default, however: it would not appear in Device Manager unless you enabled it in your BIOS and a monitor was detected on it. Those who did enable it would have no problem accessing its OpenCL driver, if one existed. The integrated GPU could then serve as a secondary compute device while the primary GPU handled graphics. As far as I can tell, Windows 10 enables on-processor graphics all the time, even without a display attached.
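If you wanted to target that on-processor GPU specifically, one rough OpenCL 1.1-era heuristic is to check whether a GPU device reports unified memory with the host. It is an assumption for illustration, not a guaranteed test for an integrated part.

```cpp
// Sketch: guess whether an OpenCL GPU device is the on-processor (integrated) one.
// CL_DEVICE_HOST_UNIFIED_MEMORY is a rough heuristic, not a guarantee.
#include <CL/cl.h>

bool IsProbablyIntegratedGpu(cl_device_id device)
{
    cl_bool unified = CL_FALSE;
    clGetDeviceInfo(device, CL_DEVICE_HOST_UNIFIED_MEMORY,
                    sizeof(unified), &unified, nullptr);
    return unified == CL_TRUE;
}
```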

Beyond the small available market, a second problem arose: the consoles.

Neither the Xbox 360 nor the PlayStation 3 had a graphics processor capable of OpenCL. The first manufacturer to support Compute Shaders in a console at all was Nintendo, with the Wii U. There were third-party efforts to run OpenCL on the PlayStation 3's Cell processor, but I believe those only applied to the few Linux developers that Sony wasn't able to completely chase off of the PS3. For titles ported from those platforms, taking advantage of a secondary graphics card as a compute device would have been a significant burden. Even pure PC developers, of general software as well as games, tend to avoid OpenCL. They like compute shaders, but they don't like accessing them through OpenCL.

Hey, it'sa U… for once. Image Credit: LoFi Gaming

That is where Vulkan and DirectX 12 could shine. They grab much of the performance and flexibility of OpenCL and wrap it in a graphics API that developers already want to use. Their existence might lead to more variety in how AI is calculated or how lighting is performed. Any modern GPU in your system can be enumerated and attached to a stream of commands, regardless of whatever else is installed and active.
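As a sketch of what that enumeration might look like on the Vulkan side, the snippet below asks for every physical device and pulls one queue from each. Queue family selection, extensions, and cleanup are glossed over, and the choice of family 0 is an assumption for the example.

```cpp
// Sketch: enumerate every Vulkan-capable GPU and grab one queue from each.
// Extensions, validation layers, proper queue family selection, and cleanup omitted.
#include <vulkan/vulkan.h>
#include <vector>

std::vector<VkQueue> QueueFromEveryGpu()
{
    VkApplicationInfo appInfo = { VK_STRUCTURE_TYPE_APPLICATION_INFO };
    appInfo.apiVersion = VK_API_VERSION_1_0;

    VkInstanceCreateInfo instInfo = { VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO };
    instInfo.pApplicationInfo = &appInfo;

    VkInstance instance = VK_NULL_HANDLE;
    vkCreateInstance(&instInfo, nullptr, &instance);

    uint32_t gpuCount = 0;
    vkEnumeratePhysicalDevices(instance, &gpuCount, nullptr);
    std::vector<VkPhysicalDevice> gpus(gpuCount);
    vkEnumeratePhysicalDevices(instance, &gpuCount, gpus.data());

    std::vector<VkQueue> queues;
    for (VkPhysicalDevice gpu : gpus) {
        float priority = 1.0f;
        VkDeviceQueueCreateInfo queueInfo = { VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO };
        queueInfo.queueFamilyIndex = 0;  // assumption: family 0 suffices for this sketch
        queueInfo.queueCount = 1;
        queueInfo.pQueuePriorities = &priority;

        VkDeviceCreateInfo deviceInfo = { VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO };
        deviceInfo.queueCreateInfoCount = 1;
        deviceInfo.pQueueCreateInfos = &queueInfo;

        VkDevice device = VK_NULL_HANDLE;
        vkCreateDevice(gpu, &deviceInfo, nullptr, &device);

        VkQueue queue = VK_NULL_HANDLE;
        vkGetDeviceQueue(device, 0, 0, &queue);
        queues.push_back(queue);
    }
    return queues;
}
```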

But they didn't do it first.