NVIDIA GF100 Architecture Preview - Fermi brings DX11 to the desktop
Fashionably Late, but Dressed to the Nines
The last 15 years of 3D graphics have been absolutely incredible. Dusting off my memories of that time, I can see exactly how fast this particular segment of the computer market has grown. The original Voodoo Graphics, which was hands down the most fully featured and fastest offering at the time, consisted of two chips. The first chip was a raster unit that featured around 1 million transistors, and it included a triangle setup engine. The second was a texture chip, also at around the 1 million transistor mark. These two chips ran at 50 MHz (though it was easy to overclock them to a blazing 57 MHz). This was simply gaming nirvana in 1996. Games such as MechWarrior 2 showed off how good true 3D graphics could look, even at 640x480 with 16-bit color.
Fast forward to 2010 and we can look at the most advanced consumer graphics chip on the market: the AMD Cypress. This chip consists of 2.15 BILLION transistors running at a brisk 850 MHz. By transistor count, we are looking at a chip that is over 1000 times more complex than the Voodoo Graphics, running at a clock speed 17 times higher. The AMD HD 5870 is by far the most advanced graphics card on the market at this time, and the first to bring DirectX 11 functionality. While AMD has a virtual stranglehold on DX11 cards with its Evergreen family of chips and cards, NVIDIA has not been standing still.
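For anyone who wants to check the math, here is a quick sketch of the comparison using only the figures quoted above. The Voodoo total assumes roughly 1 million transistors for each of its two chips, so treat the result as a ballpark rather than an exact count.

```python
# Figures as quoted in the article; the Voodoo total is an approximation
# (about 1 million transistors for each of its two chips).
voodoo_transistors = 2 * 1_000_000
voodoo_clock_mhz = 50

cypress_transistors = 2_150_000_000
cypress_clock_mhz = 850

transistor_ratio = cypress_transistors / voodoo_transistors
clock_ratio = cypress_clock_mhz / voodoo_clock_mhz

print(f"Transistor count: {transistor_ratio:.0f}x")  # 1075x
print(f"Clock speed: {clock_ratio:.0f}x")            # 17x
```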
The naked GF100 die. All 3 billion transistors. And just think: in 10 years we will be talking about 300 billion transistors.
The previous DirectX 10 generations of cards were dominated by NVIDIA in terms of features and performance. NVIDIA torpedoed AMD with the excellent G80 chip, which was the basis for the GeForce 8800 GTX/GTS cards. AMD/ATI stumbled mightily with the R600 chip (HD 2900 XT), and only slightly regained ground with the shrunk and slightly improved RV670 (HD 3870). The G80 chips held the high ground for an astounding 1 year and 8 months. During this time we saw a very interesting shift in philosophy between NVIDIA and AMD/ATI. Previously ATI would work hard on making the most feature-complete cards, which were seemingly overengineered relative to what games of the time could use. The X1800 and X1900 series of chips were a lot more advanced in shader design and functionality than the competing GeForce 7800 and 7900 series.
The change occurred with G80. The G80 was a surprise to many in the industry, as NVIDIA had not hinted at having a unified shader architecture for the DX10 generation of parts. The G80 was a tremendous product, and it beat ATI’s R600 to market by a good 8 months. NVIDIA followed this up with the GTX 280/260 parts, which again were performance leaders in both gaming and GPGPU applications. AMD went back to making simpler, more focused products for the gaming market. The RV770 chip that powered the HD 4800 series of parts was not nearly as complex as the GTX 280, but it provided enough performance in games to be comparable to what NVIDIA had.
This is the logical diagram of the GF100. 4 GPCs, 16 SMs, and 512 CUDA cores make up NVIDIA's cutting-edge design.
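The totals in that diagram imply a simple hierarchy. As a rough sketch, assuming the SMs are spread evenly across the GPCs and the CUDA cores evenly across the SMs (the totals are from the caption; the even split is an assumption made here), the per-unit counts work out as follows:

```python
# Totals quoted for the full GF100 design.
gpcs = 4
sms = 16
cuda_cores = 512

# Assumption: units are distributed evenly at every level of the hierarchy.
sms_per_gpc = sms // gpcs
cores_per_sm = cuda_cores // sms
cores_per_gpc = sms_per_gpc * cores_per_sm

print(f"SMs per GPC: {sms_per_gpc}")            # 4
print(f"CUDA cores per SM: {cores_per_sm}")     # 32
print(f"CUDA cores per GPC: {cores_per_gpc}")   # 128
```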
AMD truly did score a significant coup against NVIDIA with the release of its Evergreen family of products. AMD kept with a more focused design that would benefit gaming far more than GPGPU applications, even though there were some nods to improving performance and capabilities in that growing market. The result is the 334 mm² Cypress chip, which is now being produced in full swing after a bit of a rocky start with TSMC’s 40 nm process. Cypress contains 1600 stream units, but the overall architecture is still highly reminiscent of the earlier HD 3870/HD 4870 chips. AMD increased the sizes of the caches to allow for greater throughput of GPGPU functions, texturing, and memory locality for pixel operations. The big additions were full DX11 support, including tessellation, and IEEE 754-2008 support in the stream units. Looking from a high-level point of view, though, the changes are not that extreme from previous parts. This is likely the biggest reason AMD has such a lead over NVIDIA in the DX11 market.
Now we come to the GF100 chip from NVIDIA. This will be the first iteration of the new Fermi architecture, and the changes it brings to the table are, once again, significantly greater than many were imagining. Not only will this be a better performing chip from a gaming perspective, but NVIDIA has added enough functional units, and changed how data is handled in the chip enough, that it could very well be a game changer in both the gaming and GPGPU markets.