Next Gen Graphics and Process Migration: 20 nm and Beyond
The Really Good Times are Over
We really do not realize how good we had it. Sure, we could apply that to budget surpluses and the time before the rise of global terrorism, but in this case I am talking about the predictable advancement of graphics due to both design expertise and improvements in process technology. Moore’s law has been exceptionally kind to graphics. When we look back and plot the course of these graphics companies, they have actually outstripped Moore in terms of transistor density from generation to generation. Most of this is due to better tools and the expertise gained in what is still a fairly new endeavor compared to CPUs (the first true 3D accelerators were released in the 1993/94 timeframe).
The complexity of a modern 3D chip is truly mind-boggling. To get a good idea of where we came from, we must look back at the first generations of products that we could actually purchase. The original 3Dfx Voodoo Graphics consisted of a raster chip and a texture chip, each containing approximately 1 million transistors (give or take) and made on the then-available 0.5 micron process (we shall call it 500 nm from here on out to give a sense of perspective with modern process technology). The chips were clocked between 47 and 50 MHz (though they could often be clocked up to 57 MHz by going into the init file and putting in “SET SST_GRXCLK=57”… btw, SST stood for Sellers/Smith/Tarolli, the founders of 3Dfx). This card, revolutionary for its time, could push out 47 to 50 megapixels per second, had 4 MB of VRAM, and was released at the beginning of 1996.
My first 3D graphics card was the Orchid Righteous 3D. Voodoo Graphics was really the first successful consumer 3D graphics card. Yes, there were others before it, but Voodoo Graphics had the largest impact of them all.
In 1998 3Dfx released the Voodoo 2, and it was a significant jump in complexity from the original. These chips were fabricated on a 350 nm process. There were three chips on each card: one raster chip and two texture chips. At the top end of the product stack were the 12 MB cards. The raster chip had 4 MB of VRAM available to it, while each texture chip had 4 MB of VRAM for texture storage. Not only did this product double performance over the Voodoo Graphics, it was able to run in single card configurations at 800x600 (as compared to the max 640x480 of the Voodoo Graphics). This was around the same time that NVIDIA started to become a very aggressive competitor with the Riva TnT and ATI was about to ship the Rage 128.
Process technology at this time improved in leaps and bounds. Intel was always at or near the lead, with others like IBM and Motorola keeping pace. TSMC was the first pure-play foundry selling line space to third parties, and others such as Chartered and UMC were competitive across all of their lines. TSMC has traditionally been the go-to foundry for the graphics industry, but around this time UMC was a close second. Within one and a half years of the introduction of the Voodoo 2 and TnT class of graphics adapters, TSMC was offering 250 nm lines for willing customers. NVIDIA was one of the first with the TnT 2 products, followed closely by 3Dfx and the Voodoo 3. ATI was a little bit behind with the Rage 128 Pro, but they were making progress in keeping up.
Right after this we were introduced to the half-step for process nodes. TSMC released their 220 nm process for production and NVIDIA jumped on board with the original GeForce 256. We did not see the big jump in power and die size benefits that a full process node can give, but it did provide a quick transition for designers going to the next advanced node. Moving along, we see the introduction of the 180 nm node and the GeForce 2 class of products. The GeForce 2 GTS was a 25 million transistor chip running at 200 MHz. Go back to the 2 million transistor Voodoo Graphics and we see that the GeForce 2 GTS is 12.5x more complex and runs at four times the speed. Only four years separate the Voodoo Graphics and the GeForce 2 GTS.
The NVIDIA Riva TnT was the first serious competitor for 3Dfx's lineup of cards, including the then new Voodoo 2.
The pace did not slow down there. Next up was the 150 nm half node from TSMC and the GeForce 3 series. This chip was a monster for the time. It was one of the first consumer level products that had a transistor count of around 57 million. The GeForce 4, which was released a year after the GeForce 3 and still used the 150 nm process, bumped that count up to around 67 million. Then came the monster from ATI. The R300, which powered the Radeon 9700 Pro, was an astonishing 107 million transistors on the same 150 nm process. In the two years between 2000 and 2002 we see another quadrupling of transistor counts across two process nodes (one of them a half node at that) and another 100 to 150 MHz of speed for a complex GPU.
Around 2004 things started to slow down a bit, but that is a relative term compared to the first eight years in 3D graphics. I had written an article at my old site that covered what I expected to be a problem in the years following. “Slowing Down the Process Migration” discussed the inevitable slowing of process node transitions due to issues in materials, design strategies, and plain old physics. Little did I know that some of the major issues that plagued the 130 nm jump (migrating voids, design rule changes midstream, etc.) would be solved and that we would again return to a very regular cadence of process improvements. 130 nm led to 110, 90, 80, 65, 55, 45, 40, 32, and now 28 nm. Graphics products did not inhabit every node, but they hit all of the major ones (45 and 32 nm were absent from most graphics platforms).
So where are we now? In 2003 the top end product was the Radeon 9800 XT, which ran at 412 MHz and consisted of 117 million transistors on TSMC’s highly optimized 150 nm process. Today we are looking at the GTX TITAN, based on the NVIDIA GK110 processor, which weighs in at 7 billion transistors and around 850 MHz. This represents twice the raw clockspeed and an astonishing 60 times the transistor count (7 billion against 117 million) in the span of ten years. It is absolutely no wonder that we are spoiled by the constant stream of new products that advance the state of the art on a yearly basis, with a major process node improvement every 18 months or so.
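For readers who want to check the arithmetic, here is a rough Python sketch that turns the round figures quoted in this article into growth ratios and an implied doubling period. The `growth` helper and the `eras` table are my own illustration, the transistor counts are the approximate ones given in the text (the Voodoo Graphics figure adds its two chips together), and this compares raw transistor counts rather than density, so treat it as a back-of-the-envelope estimate only.

```python
import math


def growth(start_count, end_count, years):
    """Return (ratio, doubling period in months) for a transistor-count jump.

    Assumes smooth exponential growth between the two endpoints, which is
    a simplification; real products arrive in discrete steps.
    """
    ratio = end_count / start_count
    doubling_months = 12 * years / math.log2(ratio)
    return ratio, doubling_months


# Approximate figures as quoted in the article, in millions of transistors.
# Each entry is (starting count, ending count, years between the products).
eras = {
    "Voodoo Graphics (1996) -> GeForce 2 GTS (2000)": (2, 25, 4),
    "Radeon 9800 XT (2003) -> GTX TITAN (2013)": (117, 7000, 10),
}

for label, (start, end, years) in eras.items():
    ratio, months = growth(start, end, years)
    print(f"{label}: {ratio:.1f}x, doubling every {months:.1f} months")
```

With these inputs the early era works out to a doubling roughly every 13 months, while the last ten years come in at roughly 20 months, which sits inside the commonly quoted 18 to 24 month range for Moore's law.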
With this highly aggressive pace from year to year, why are we in name-only refresh-land for graphics right now? I am starting to see a lot of commenters voicing their displeasure with both NVIDIA and AMD for their lack of a true, next-generation GPU. The GK104 that originally powered the GTX 680 has morphed into a variety of products including the GTX 770 and GTX 760. The GTX TITAN based on GK110 was released last year and it has been repurposed for the GTX 780. AMD refreshed their lineups with last year’s Tahiti and Pitcairn chips, and the top end Hawaii chip (R9 290X) only reaches the complexity of last year’s GK110. These parts are all based on TSMC’s 28 nm process. Where exactly are the new chips and why aren’t we at 20 nm yet?