Investigating Bandwidth on the SLI X16 Chipset
Our Own Testing
Without access to each of the tests behind the ATI and NVIDIA results we just showed you, I decided I needed to find a way to put ATI's theory to the test. Unfortunately, since we can't run NVIDIA's 7800 GTX 512s in SLI on the RD580 or ATI's X1900 XT in CrossFire on the SLI X16, we have to improvise something different.
Since we can't use dual GPUs in our tests, we had to resort to testing a single GPU in both the primary and the secondary GPU slots. The primary x16 PCIe slot connects straight to the north bridge, and from there to memory and the processor. The same video card in the secondary x16 PCIe slot must send and receive all of its data through the south bridge MCP, across the questioned connection to the north bridge, and only then on to memory and/or the processor. By comparing results from the two slots, we might be able to see the bandwidth limits that ATI indicated to us.
In addition to simply testing primary vs. secondary performance, I also adjusted the north bridge to south bridge settings in the BIOS between runs to get an idea of how performance would differ on a slower, narrower HT connection. I ran tests with both 16-bit and 8-bit connection widths and with HT multiplier settings from 5x down to 2x. The benchmark I chose was 3DMark06 due to its high repeatability and the ease of running multiple instances; the benchmark was run at 2048x1536 to increase stress on the GPU.
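For a sense of scale, here is a rough sketch of the theoretical peak bandwidth of the HT link at each of the six multiplier and width combinations used in these tests. The numbers rest on assumptions that are not taken from our testing: a 200 MHz HyperTransport base clock, double data rate signaling, and 250 MB/s per lane per direction for first-generation PCIe. The function names are purely illustrative, and real-world throughput will be lower than these peaks.

```python
# Rough theoretical HyperTransport link bandwidth for each tested setting.
# Assumptions (not measured here): 200 MHz HT base clock, double data rate,
# and 250 MB/s per lane per direction for first-generation PCIe.

HT_BASE_CLOCK_MHZ = 200      # assumed HT reference clock
PCIE_GEN1_LANE_MB_S = 250    # assumed per-lane, per-direction PCIe 1.x rate


def ht_bandwidth_mb_s(multiplier, width_bits):
    """Peak one-direction bandwidth of an HT link in MB/s."""
    mega_transfers = HT_BASE_CLOCK_MHZ * multiplier * 2  # DDR: two transfers per clock
    return mega_transfers * width_bits // 8              # bits per transfer -> bytes


def pcie_bandwidth_mb_s(lanes):
    """Peak one-direction bandwidth of a PCIe 1.x slot in MB/s."""
    return lanes * PCIE_GEN1_LANE_MB_S


# (HT multiplier, link width in bits) for the six tested combinations
SETTINGS = [(5, 16), (4, 16), (5, 8), (4, 8), (3, 8), (2, 8)]

if __name__ == "__main__":
    x16 = pcie_bandwidth_mb_s(16)
    for mult, width in SETTINGS:
        bw = ht_bandwidth_mb_s(mult, width)
        print(f"{mult}x {width}-bit: {bw} MB/s per direction "
              f"({bw / x16:.0%} of a PCIe x16 slot)")
```

Under those assumptions, the full 5x 16-bit link matches the peak of an x16 slot on paper, while the 2x 8-bit setting offers only a fifth of it. That gap is why the slowest setting makes a useful stress case.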
In addition to the plain 3DMark06 testing, we also ran an identical set of benchmarks with the same settings but with quite a bit of added stress on the south bridge. To load the link between the north and south bridge chips, I wanted the peripherals that share this connection working as well. The on-board NVIDIA Gigabit Ethernet was maxed out using our standard NTTTCP benchmark, our second attached SATA drive was being benchmarked with a full HDTach test, and our external USB/Firewire drive was being slammed with another instance of HDTach. This should effectively max out the feature set on the NVIDIA chipset and also provide a worst-case scenario, as no gamer would actually have all of this running while trying to kill some Nazis in Call of Duty 2. :)
I picked up the most popular SLI X16 AMD motherboard, the Asus A8N32-SLI, which turns out to be the same board ATI and NVIDIA used in their testing, and set it up on the test bench with an Athlon X2 4800+ and 2 GB of system memory. Then I took our X1900 XTX reference graphics card and installed it, booted Windows, installed the drivers and was ready to go.
Results — Primary vs. Secondary x16 PCIe
As I mentioned before, these tests are comparing the primary x16 PCIe slot to the secondary x16 PCIe slot with a single X1900 XTX GPU installed. Our 3DMark06 setup looks like this:
We first tested both slots with the full 5x HT multiplier and a 16-bit wide bus, using the BIOS settings below for verification.
We then stepped the NB to SB frequency and link width through the following settings: 4x 16-bit, 5x 8-bit, 4x 8-bit, 3x 8-bit and 2x 8-bit.
That's a lot of graphs to indicate a single result: the bandwidth between the north and south bridge chips on the nForce4 SLI X16 does not seem to be limiting the performance of the secondary graphics card in any tangible way. The scores at the best settings (5x HT and 16-bit bus width) differ from those at the worst settings (2x HT and 8-bit bus width) by only 1.5%.