A New Perspective on Multi-GPU Memory Management

Manufacturer: PC Perspective

Why Two 4GB GPUs Isn't Necessarily 8GB

We're trying something new here at PC Perspective. Some topics are fairly difficult to explain cleanly without accompanying images. We also like to go fairly deep into specific topics, so we're hoping that we can provide educational cartoons that explain these issues.

This pilot episode is about load-balancing and memory management in multi-GPU configurations. There seems to be a lot of confusion around what was (and was not) possible with DirectX 11 and OpenGL, and even more confusion about what DirectX 12, Mantle, and Vulkan allow developers to do. It highlights three different load-balancing algorithms, and even briefly mentions what LucidLogix was attempting to accomplish almost ten years ago.


If you like it, and want to see more, please share and support us on Patreon. We're putting this out not knowing if it's popular enough to be sustainable. The best way to see more of this is to share!



Crossfire and SLI allow games to load-balance across multiple GPUs. It is basically impossible to do this in OpenGL and DirectX 11 otherwise. Vulkan and DirectX 12 provide game developers with the tools to implement it themselves, but they do not address every limitation. Trade-offs always exist.

In the older APIs, OpenGL and DirectX 11, games and other applications attach geometry buffers, textures, materials, and compute tasks to the API's one, global interface. Afterward, a draw function is called to submit that request to the primary graphics driver. This means that work can only be split from within the driver, and only to the devices that driver controls, which prevents cross-vendor compatibility. LucidLogix created software and hardware that pretended to be the primary graphics driver, loading GPUs from mismatched vendors behind the scenes. It never took off.

With Vulkan and DirectX 12, rather than binding data and tasks to a global state, applications assemble commands and push them onto lists. Not only does this allow multiple CPU threads to create work independently, but these lists can also point to any GPU. This is how OpenCL and other compute APIs are modeled, but Mantle was the first to extend it to graphics. Developers can load-balance GPUs by managing multiple lists with different destinations. This also means that the developer can control what each GPU stores in its memory, and ignore the data it doesn't need.
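As a rough illustration of the command-list model described above, here is a minimal Python sketch. It is not the Vulkan or DirectX 12 API; the `CommandList` class, the `submit` function, and the device names are all hypothetical stand-ins. It only shows the idea that each list is recorded independently and carries its own destination GPU.

```python
from collections import defaultdict


class CommandList:
    """Hypothetical stand-in for a recorded list of GPU commands."""

    def __init__(self, device):
        self.device = device      # which GPU this list is destined for
        self.commands = []

    def record(self, command):
        # Recording only touches this list, so separate CPU threads
        # could each fill their own list without a shared global state.
        self.commands.append(command)


def submit(command_lists):
    """Group the recorded commands by their destination device."""
    queues = defaultdict(list)
    for command_list in command_lists:
        queues[command_list.device].extend(command_list.commands)
    return dict(queues)


background = CommandList(device="gpu0")
background.record("draw skybox")

characters = CommandList(device="gpu1")
characters.record("draw fighter A")
characters.record("draw fighter B")

queues = submit([background, characters])
```

In this sketch, load-balancing is just a matter of deciding which `device` each list is created with, which is the decision the newer APIs hand to the developer.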

That said, even though the game developer has full control over tasks and memory, it doesn't mean that it will be any more efficient than SLI and Crossfire. To load-balance, some algorithm must be chosen that can split work between multiple GPUs, and successfully combine the results. The Alternate Frame Rendering algorithm, or AFR, separates draw calls by the frames they affect. If you have three GPUs, then you can just draw ahead three frames at a time. It's easy to implement, and performance scales very well when you add a nearly identical card (provided the extra frames add to the experience).
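The frame-to-GPU assignment in AFR can be sketched in a couple of lines. This is a simplified illustration, assuming identical GPUs and a plain round-robin order; the function name `afr_assign` is made up for this example.

```python
def afr_assign(frame_index, gpu_count):
    """Round-robin: frame N goes to GPU (N mod gpu_count)."""
    return frame_index % gpu_count


# With three GPUs, frames 0, 1, and 2 are in flight at the same time,
# then the cycle repeats.
schedule = [afr_assign(frame, 3) for frame in range(6)]
print(schedule)  # [0, 1, 2, 0, 1, 2]
```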

Memory, on the other hand, does not scale well. Neighboring frames will likely draw the exact same list of objects, just with slightly adjusted data, such as camera and object positions. As a result, each GPU will need its own copy of this data in its own memory pool. If you have two four-gigabyte cards, they will each store roughly the same four gigabytes of data. This is a characteristic of the algorithm itself, not just the limited information that Crossfire and SLI needed to deal with on OpenGL and DirectX 11.
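The arithmetic behind the headline can be written out as a small sketch. It assumes the simplest case, where every GPU mirrors the entire working set, so the smallest card sets the limit; the function name and the returned field names are hypothetical.

```python
def afr_memory_summary(card_memories_gb):
    """Compare installed memory with what AFR can use for unique data.

    Assumes each GPU holds a full copy of the same working set, so the
    smallest card bounds the amount of unique data.
    """
    installed = sum(card_memories_gb)
    usable = min(card_memories_gb)
    return {
        "installed_gb": installed,
        "usable_gb": usable,
        "duplicated_gb": installed - usable,
    }


summary = afr_memory_summary([4, 4])
print(summary)  # {'installed_gb': 8, 'usable_gb': 4, 'duplicated_gb': 4}
```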

Other algorithms exist, however. For comparison, imagine a fighting game or a side-scroller. In these titles, objects are often separated into layers by depth, such as the background and the play area. If these layers are rendered by different GPUs into separate images, they could be combined later with transparency or z-sorting. In terms of memory, each GPU would only need to store its fraction of the scene's objects (and a few other things, like the layer it draws). A second benefit is that work does not need to be split evenly between the processors. Non-identical pairings, such as an integrated GPU with a discrete GPU, or an old GPU with a new GPU, could also work together, unlike AFR. I say could, because the difference in performance would need to be known before the tasks are split. To compensate, the engine could vary each layer's resolution, complexity, quality settings, and even refresh rate, depending on what the user can notice. This would be similar to what RAGE did to maintain 60 FPS, and it would likely be a QA disaster outside of special cases. Who wouldn't want to dedicate a Titan graphics card to drawing Street Fighter characters, though?
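One way to picture an uneven, layer-based split is a greedy assignment that hands each layer to whichever GPU would finish it soonest. This is only a sketch under stated assumptions: the layer costs and relative GPU speeds are invented numbers that an engine would have to measure ahead of time, and `assign_layers` is a hypothetical helper, not something any engine or API is known to provide.

```python
def assign_layers(layer_costs, gpu_speeds):
    """Greedily give each layer to the GPU that would finish it soonest.

    layer_costs: layer name -> estimated work (arbitrary units)
    gpu_speeds:  GPU name -> relative throughput (work units per ms)
    Both are assumed to be known before the frame is split.
    """
    finish_time = {gpu: 0.0 for gpu in gpu_speeds}
    assignment = {gpu: [] for gpu in gpu_speeds}
    # Place the most expensive layers first.
    ordered = sorted(layer_costs.items(), key=lambda item: item[1], reverse=True)
    for layer, cost in ordered:
        best = min(gpu_speeds, key=lambda g: finish_time[g] + cost / gpu_speeds[g])
        assignment[best].append(layer)
        finish_time[best] += cost / gpu_speeds[best]
    return assignment


layers = {"background": 2.0, "play_area": 8.0, "hud": 1.0}
gpus = {"integrated": 1.0, "discrete": 4.0}
assignment = assign_layers(layers, gpus)
print(assignment)
# {'integrated': ['background'], 'discrete': ['play_area', 'hud']}
```

Each GPU then only needs the assets for the layers it was given, which is where the memory savings over AFR would come from.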

Then again, video memory is large these days. It might be better, for quality and/or performance, to waste RAM in exchange for other benefits. AFR is very balanced for multiple, identical GPUs, and it's easy to implement; unfortunately, it could also introduce latency and stutter, and it is inefficient with video memory. Layer-based methods, on the other hand, are complicated to implement, especially for objects that mutually overlap, but they allow for more control in how tasks and memory are divided. VR and stereoscopic 3D could benefit from another algorithm, where two similar GPUs render separate eyes. Like AFR, this is inefficient with memory, because both eyes will see roughly the same things, but it will load-balance almost perfectly for two identical GPUs. Unlike AFR, it doesn't introduce latency or stutter, but it is useless beyond two nearly identical GPUs. Other GPUs will either idle, or be used for something else in the system, like physics or post-processing.

In any case, the developer knows what their game needs to render. They can now choose the best algorithm for themselves.

August 23, 2016 | 10:35 AM - Posted by BrightCandle (not verified)

So far games companies haven't done much to support SLI/Crossfire on average. Much of the advancement has come from Nvidia/AMD creating profiles to work around game issues to support more than 1 card for AFR. With DX12 and games this year we are seeing a lot less support than we have done in previous years.

So while DX12 could be the lift-off dual cards have always needed for a great experience, in practice it's such a small part of the market that I can't see it being catered to. I suspect DX12 is the end of dual cards.

For VR we already have a lot of games and 99.9% of them do not support dual cards, despite the clear lack of GPU rendering performance we have today. The Nvidia funhouse is a technical demo showing off SMP and SLI but no one else is using these speed up technologies.

I just don't see where the incentive is for games developers and publishers to expend resources developing the technology especially after how long it took AoS to get dual vendor working right and how little it aided the experience in the end. The GPU manufacturers have a clear incentive as it sells more GPUs but games developers aren't going to sell more games to support dual card PC users. So without the games having support and subsequently less people buying more than 1 card and even Nvidia writing off anything other than dual SLI for games well its not looking good.

August 23, 2016 | 12:49 PM - Posted by renz (not verified)

this. and yet some people touting DX12 will spark more games to use multi gpu. multi gpu is gpu maker interest so they can sell more gpu. for game developer it did not makes their sales better in fact they only creating more problem for themselves because multi gpu often brings in issues that did not exist in single gpu only operation.

August 23, 2016 | 01:15 PM - Posted by Scott Michaud

The Oxide Engine is actually quite interesting. Their rendering algorithm allows another way to split tasks between multiple GPUs. They did it in Mantle, but have not yet implemented it in DirectX 12. I've been told it's coming to a future engine version, though. Hopefully, it will be back-ported to the game, too. I'm planning on doing a follow-up animation to deep-dive their algorithm.

August 23, 2016 | 04:06 PM - Posted by Anonymous (not verified)

Do it in Vulkan and it will work across many different OSs and device markets and not be as limited as DX12 is to mostly some windows 10(Serf) PC/Laptops.

August 23, 2016 | 01:56 PM - Posted by Jann5s

For me the most interesting use case would be to put the now idling iGPU which is in 90% of the gaming desktops to work and have it render some clouds or whatever. If there is no iGPU you can always offload the work to the dGPU or any extra CPU cores.

August 24, 2016 | 04:56 AM - Posted by Anonymous (not verified)

That may have a negative impact on your CPU cores. If the die gets too hot, you may find your CPU being throttled.

August 24, 2016 | 12:17 PM - Posted by Jann5s


August 23, 2016 | 09:30 PM - Posted by Anonymous (not verified)

I disagree. DX12/Vulkan is the BEGINNING of multi-GPU, not the end.

Changing how things work is not simple, and certainly not fast. The game engine, such as Unreal 4 Engine, will add in support for things like SFR to make it easier for game developers.

SFR will trickle into a few games, and get more support every year. It's even conceivable a PS5 and "XBox 2" will have a multi-GPU setup since by the time they are released (if they are) software should have pretty good support for this.

It's CHEAPER to use multiple, smaller cheaps, provided the software supports this.

Again though, saying DX12 is the "end" because you don't see results already is shortsighted. It's similar to assuming SteamOS is dead in the water because it's not popular yet.

Both have long term strategies, though multi-GPU is the only certainty.

August 23, 2016 | 09:31 PM - Posted by Anonymous (not verified)

"chips" (not cheaps)

August 24, 2016 | 02:55 AM - Posted by renz (not verified)

why did many people thinking that SFR will solve multi gpu issue? SFR is not new. they exist as long as AFR did. looking at CIV BE result from using SFR in Mantle it is obvious that game developer nor gpu maker have solved the issue surrounding SFR. that issue alone makes going multi gpu are not worth it with SFR.

August 23, 2016 | 11:09 AM - Posted by Anonymous (not verified)

I think that DX12 and Vulkan GPU load balancing has the most potential for innovation with the larger gaming industry/developers contributing to development of GPU load balancing algorithms and other software/API SDKs and software/middleware solutions to help automate the load balancing on multi-GPU based PC/Laptop/other systems. The CF/SLI proprietary solutions have always had limited development resources utilized towards the development of better gaming support for the CF/SLI optimized games, while with Vulkan/DX12 and multi-GPU adapter, games developers will have control over that aspect of the GPUs via a more standardized graphics API solution from Vulkan/DX12.

So it should not take much time for the games industry/developers to create a gaming engine/middleware/standard graphics API solutions to profile all the GPUs in use and develop solutions to allow for the most efficient utilization of any and all GPUs in a standardized and simple to implement way, if all the gaming industry, games developers/gaming engine developers, and API makers (Vulkan, DX12) pool their resources to make the management of multi-GPU load balancing more standardized across the entire games/graphics software industries.

PCIe 4.0 is going to offer more inter-GPU bandwidth for making better use of the VRAM across more than one GPU, so that and maybe some latency improvements can be had. Maybe there needs to be some more work towards standardizing the way that hardware/APIs and GPUs can communicate with each other via PCIe that can be brought into the Vulkan/DX12 API and let the games developers get at that functionality via the graphics APIs also. AMD is using XDMA over PCIe and Nvidia is using its SLI bridge.

August 23, 2016 | 12:53 PM - Posted by renz (not verified)

except most game developer are not interested to deal with multi gpu which only a very small subset of pc gamer.

August 23, 2016 | 01:56 PM - Posted by BillDStrong

With the caveat that Vulkan does not currently support multi GPU memory sharing, so it is dead on arrival. They are working on implementing it now, but no games can be made with support for multi GPU until that work is done.

August 23, 2016 | 03:56 PM - Posted by Scott Michaud

Yeah. The Khronos Group knows they're lacking in a few areas of multi-GPU support. It's a top priority for Vulkan Next.

August 23, 2016 | 04:12 PM - Posted by Anonymous (not verified)

It's in Vulkan but it needs to be improved, and it will be, as many will not be moving to windows 10. And Vulkan will have a much larger install base across many more devices. The money will be there for any Vulkan Multi-GPU, with Valve and VR gaming supporting plenty of development for Vulkan and Multi-GPU.

August 23, 2016 | 12:25 PM - Posted by Anonymous (not verified)

I have a 1200 watt corsair power supply, whenever i enable sli and play bf4 my system shuts down. When sli is disabled bf4 works perfect. Tried a few games but only bf4 does this and of course its my main game so sli is always disabled.
2x 780 watercooled temps are 24-26c idle and load 45c. I've tried many different drivers same result. Would the power supply be the cause of the instant shut downs? Cpu is overclocked but ive tried at default same thing.

August 23, 2016 | 12:51 PM - Posted by Morry Teitelman

Sounds like it could be a power issue, but those types of issues are very tricky to track down. Does your board have an aux PCIe power connector and do you have a power lead from your PSU connected to it?

You could try a new PSU and see if it helps. If the problem still occurs, it may point to a board power-related issue as well.

Good luck...

August 23, 2016 | 03:17 PM - Posted by Anonymous (not verified)

Thanks morry, I have an evga power booster in now. X99 asus deluxe board no pcie power. Same problem. I thought maybe an outlet problem switch to a different room same problem. reseated the cards same problem. Switched power connectors to different gpu's same problem. If i have power surge enabled in bios thats the error i get so i disabled the power surge option in bios and instead of an error it just shuts down. Really weird problem.

August 23, 2016 | 01:30 PM - Posted by BlackDove (not verified)

Excellent visuals and descriptions. Please do more videos like this.

August 23, 2016 | 01:36 PM - Posted by John H (not verified)

+1 - Can't say it better.

August 23, 2016 | 01:36 PM - Posted by Scott Michaud

Thanks! : D Best way to see more is to share. As you can guess, it takes a LOT of effort, so our targets are pretty high. We were swinging for the fences.

August 23, 2016 | 01:52 PM - Posted by Jann5s

+1, amazing stuff,

August 23, 2016 | 04:32 PM - Posted by Anonymous (not verified)

I enjoyed this as well. Keep 'em coming.

August 23, 2016 | 01:44 PM - Posted by remc86007

I loved this! If you guys made these regularly I think you would see your YouTube subs grow exponentially.

August 23, 2016 | 04:01 PM - Posted by Thedarklord

Really enjoyed this format, brings something different.

I also hope you can use animations like this for more in depth tech run downs in the future.

August 23, 2016 | 07:38 PM - Posted by Tim (not verified)

That is our hope! We definitely need everyone's help to get the word out because the animations take a ton of time and effort!

The best way to help us bring you more animations is to share and/or donate to our Patreon :).

August 23, 2016 | 04:17 PM - Posted by Anonymously Anonymous (not verified)

It would be awesome if multi-gpu support across architectures became widespread. I could make use of that 4 yr old GPU on the shelf at home and not save up for a matching GPU that is currently in my gaming pc. It seems to me that having mixed architecture multi-gpu support would drive used GPU sales up and new GPU sales down some.
But that is if it ever takes off.

August 23, 2016 | 07:43 PM - Posted by Anonymous (not verified)

Scott, the graphics animation/explanation is AWESOME! What program/set of programs did you use for that?

And yes, that was excellent content in my opinion. Thank you.

August 23, 2016 | 08:44 PM - Posted by Scott Michaud


It was almost entirely done in Blender, although stitching the frames together and a bit of post processing was done in Adobe After Effects CC. A couple of textures (ie: the PCIe pins) were done in Photoshop CC, and the audio was edited in Audition CC.

But yeah, it was like 90% Blender with Cycles.

August 27, 2016 | 08:34 PM - Posted by Anonymous (not verified)

Thank you so much man.

August 24, 2016 | 02:15 AM - Posted by Anonymous (not verified)

Great Video. Pretty good at explaining what DX12\Vulkan is as well.

August 24, 2016 | 09:23 AM - Posted by Anonymous (not verified)

Personally I think multi GPU is the way forward. Not many people can realistically warrant spending £700+ on a single card purchase. Developers however are constantly pushing the envelope when it comes to how their product looks and what effects are used. That leaves us, the end user, either scaling back the settings so as to make the game playable and missing out on all the latest wizardry or turning all the latest effects up to max and trying to play at sub 30 fps. Hairworks is a prime example. Would love to enable it but doing so makes the game a bit too sluggish. The sweet spot is just where AMD are positioning themselves with the 480. Affordable to most people who actually do game and come Christmas or birthday can add another to significantly increase their enjoyment of the latest titles. We all see the latest releases being shown off by their very proud devs, how stunning they look, so smooth... Problem is the system they highlight it on is way out of reach of your typical non lottery winning PC gamer.

August 24, 2016 | 12:22 PM - Posted by Erich Swafford (not verified)

Fantastic video, please do more!

August 24, 2016 | 02:24 PM - Posted by killurconsole

Great work , hope you get more support.

August 24, 2016 | 07:55 PM - Posted by Scott Michaud

Thanks! We'd want to keep doing this indefinitely if it's sustainable!

August 24, 2016 | 04:59 PM - Posted by TheMonotoner

I feel that this was an excellent way of portraying this message! I would definitely like to see more video shorts like this in the future.

August 24, 2016 | 07:58 PM - Posted by Scott Michaud


August 28, 2016 | 12:03 AM - Posted by Anon (not verified)

My impression is that only the biggest AAA-developers will be able to put resources towards supporting multi-GPU setups in their DX12-/Vulkan-engines in the future - and even then most will only support 2 GPUs at most.

3- or 4-GPU systems aren't widespread enough in the PC gaming-user base to justify the effort and money needed to make it work.

Mid- and small-sized (and esp. indie-)studios will have to rely on pre-made solutions from MS or Khronos in the API itself or support from the big 3 (Unreal Engine, Unity, CryEngine) engine developers.
