Next Gen Graphics and Process Migration: 20 nm and Beyond

Subject: Editorial

The Really Good Times are Over

We really do not realize how good we had it.  Sure, we could apply that to budget surpluses and the time before the rise of global terrorism, but in this case I am talking about the predictable advancement of graphics due to both design expertise and improvements in process technology.  Moore’s law has been exceptionally kind to graphics.  When we look back and plot the course of these graphics companies, we see that they have actually outstripped Moore in terms of transistor density from generation to generation.  Most of this is due to better tools and the expertise gained in what is still a fairly new endeavor as compared to CPUs (the first true 3D accelerators were released in the 1993/94 timeframe).

The complexity of a modern 3D chip is truly mind-boggling.  To get a good idea of where we came from, we must look back at the first generations of products that we could actually purchase.  The original 3Dfx Voodoo Graphics consisted of a raster chip and a texture chip.  Each contained approximately 1 million transistors (give or take) and was made on the then-available 0.5 micron process (we shall call it 500 nm from here on out to give a sense of perspective with modern process technology).  The chips were clocked between 47 and 50 MHz (though they could often be clocked up to 57 MHz by going into the init file and putting in “SET SST_GRXCLK=57”… btw, SST stood for Sellers/Smith/Tarolli, the founders of 3Dfx).  This graphics card, revolutionary at the time, could push out 47 to 50 megapixels per second, had 4 MB of VRAM, and was released in the beginning of 1996.

My first 3D graphics card was the Orchid Righteous 3D.  Voodoo Graphics was really the first successful consumer 3D graphics card.  Yes, there were others before it, but Voodoo Graphics had the largest impact of them all.

In 1998 3Dfx released the Voodoo 2, and it was a significant jump in complexity from the original.  These chips were fabricated on a 350 nm process.  There were three chips on each card: one raster chip and two texture chips.  At the top end of the product stack were the 12 MB cards.  The raster chip had 4 MB of VRAM available to it while each texture chip had 4 MB of VRAM for texture storage.  Not only did this product double performance from the Voodoo Graphics, it was able to run in single card configurations at 800x600 (as compared to the max 640x480 of the Voodoo Graphics).  This was around the same time that NVIDIA started to become a very aggressive competitor with the Riva TnT and ATI was about to ship the Rage 128.

Process technology at this time improved in leaps and bounds.  Intel was always at or near the lead with others like IBM and Motorola keeping pace.  TSMC was the first Pure-Play foundry selling line space to 3rd parties and others such as Chartered and UMC were competitive across all of their lines.  TSMC has traditionally been the go-to foundry for the graphics industry, but around this time UMC was a close second.  Within one and a half years from the introduction of the Voodoo 2 and TnT class of graphics adapters, TSMC was offering 250 nm lines for willing customers.  NVIDIA was one of the first with the TnT 2 products, followed closely by 3dfx and the Voodoo 3.  ATI was a little bit behind with the Rage 128 Pro, but they were making progress in keeping up.

Right after this we were introduced to the half-step for process nodes.  TSMC released their 220 nm process for production and NVIDIA jumped on board with the original GeForce 256.  We did not see the big jump in power and die size benefits that a full process node can give, but it did provide a quick transition for designers going to the next advanced node.  Moving along, we see the introduction of the 180 nm node and the GeForce 2 class of products.  The GeForce 2 GTS was a 25 million transistor chip running at 200 MHz.  Go back to the 2 million transistor Voodoo Graphics and we see that the GeForce 2 GTS is 12.5x more complex and runs at four times the speed.  Only four years separate the Voodoo Graphics and the GeForce 2 GTS.
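
For anyone who wants to check that comparison, here is a quick back-of-the-envelope sketch in Python.  It uses only the approximate figures quoted above (roughly 2 million transistors across the two Voodoo Graphics chips at about 50 MHz, and 25 million transistors at 200 MHz for the GeForce 2 GTS).  The function and variable names are made up for illustration.

```python
def scaling(old, new):
    """Return how many times larger `new` is than `old`."""
    return new / old

# Approximate figures quoted in the text, not official spec-sheet numbers.
voodoo_transistors = 2_000_000   # raster chip + texture chip, ~1 million each
voodoo_clock_mhz = 50
gts_transistors = 25_000_000
gts_clock_mhz = 200

complexity_ratio = scaling(voodoo_transistors, gts_transistors)
clock_ratio = scaling(voodoo_clock_mhz, gts_clock_mhz)

print(f"Transistor count: {complexity_ratio:.1f}x")  # 12.5x
print(f"Clock speed: {clock_ratio:.1f}x")            # 4.0x
```

Using 47 MHz instead of 50 MHz for the Voodoo Graphics nudges the clock ratio slightly above four, which is why these numbers are best read as ballpark figures.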

The NVIDIA Riva TnT was the first serious competitor for 3Dfx's lineup of cards, including the then new Voodoo 2.

The pace did not slow down there.  Next up was the 150 nm half node from TSMC and the GeForce 3 series.  This chip was a monster for the time.  It was one of the first consumer level products that had a transistor count of around 57 million.  The GeForce 4, which was released a year after the GeForce 3 and still used the 150 nm process, bumped that count up to around 67 million.  Then came the monster from ATI.  The R300, which powered the Radeon 9700 Pro, was an astonishing 107 million transistors on the same 150 nm process.  In the two years between 2000 and 2002 we see another quadrupling of transistor counts between two process nodes (and a half node at that) and another 100 to 150 MHz of speed for a complex GPU.

Around 2004 things started to slow down a bit, but that is a relative term as compared to the first eight years in 3D graphics.  I had written an article at my old site that covered what I had expected to be a problem in the years following.  “Slowing Down the Process Migration” discussed the inevitable slowing of process node transitions due to issues in materials, design strategies, and plain old physics.  Little did I know that some of the major issues that plagued the 130 nm jump (migrating voids, design rule changes midstream, etc.) would be solved and that we would again return to a very regular cadence of process improvements.  130 nm led to 110, 90, 80, 65, 55, 45, 40, 32, and now 28 nm.  Graphics products did not inhabit every node, but they hit all of the major ones (45 and 32 nm were absent from most graphics platforms).

So where are we now?  In 2003 the top end product was the Radeon 9800 XT, which ran at 412 MHz and consisted of 117 million transistors built on TSMC’s highly optimized 150 nm process.  Today we are looking at the GTX TITAN based on the NVIDIA GK110 processor that weighs in at 7 billion transistors and around 850 MHz.  This represents twice the raw clockspeed and, going by those figures, roughly 60 times the transistor count in the span of ten years.  It is absolutely no wonder that we are spoiled by the constant stream of new products that advance the state of the art on a yearly basis with a major process node improvement every 18 months or so.
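
As a rough sanity check on that ten-year comparison, the short Python sketch below works out the ratios from the rounded figures quoted above and then asks how often the transistor count would have had to double to get there.  The steady exponential growth is a simplifying assumption made only for illustration; real product cadences were lumpier.

```python
import math

# Rounded figures quoted in the text; treat them as approximations.
r9800xt_transistors = 117e6   # Radeon 9800 XT, 2003
gk110_transistors = 7e9       # GTX TITAN (GK110), 2013
r9800xt_clock_mhz = 412
gk110_clock_mhz = 850
years = 10

transistor_ratio = gk110_transistors / r9800xt_transistors
clock_ratio = gk110_clock_mhz / r9800xt_clock_mhz

# Assumption: growth is a steady exponential over the whole decade.
doublings = math.log2(transistor_ratio)
months_per_doubling = years * 12 / doublings

print(f"Transistor count: about {transistor_ratio:.0f}x")
print(f"Clock speed: about {clock_ratio:.1f}x")
print(f"Implied doubling period: about {months_per_doubling:.0f} months")
```

On these numbers the transistor count grows by roughly 60x and the clock by roughly 2x, which implies a doubling about every 20 months, in the same ballpark as a major node improvement every 18 months or so.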

With this highly aggressive pace from year to year, why are we in name-only refresh-land for graphics right now?  I am starting to see a lot of commenters discussing their displeasure at both NVIDIA and AMD for their lack of a true, next-generation GPU.  The GK104 that originally powered the GTX 680 has morphed into a variety of products including the GTX 770 and GTX 760.  The GTX TITAN based on GK110 was released earlier this year and it has been repurposed for the GTX 780.  AMD refreshed their lineups with last year’s Tahiti and Pitcairn chips, and the top end Hawaii chip (R9 290X) only reaches the complexity of last year’s GK110.  These parts are all based on TSMC’s 28 nm process.  Where exactly are the new chips and why aren’t we at 20 nm yet?

October 22, 2013 | 03:12 PM - Posted by snook

thanks Josh. Let's hope there is that one guy who says "how about trying this?", and he changes everything.

October 22, 2013 | 06:50 PM - Posted by Josh Walrath

Make no mistake, there is a lot of research in a LOT of different areas to overcome the issues that the industry is running into.  The challenges have always been there (breaking the 1 micron barrier was seemingly huge), but now the challenges are just bigger, more complex, and more expensive.

October 25, 2013 | 08:56 AM - Posted by Anonymous (not verified)

Or maybe one of the foundries will have a happy accident Bob Ross style.

They'll come in one morning to find their equipment had slipped around a new nm during the night and everything is a little out of whack. They're about to toss out the batch when someone grabs a wafer for the fun of it and runs a test and BAM! breakthrough!

Guy can dream, can't he.

October 22, 2013 | 03:26 PM - Posted by Evo01

Thanks Josh. Fantastic article.

October 22, 2013 | 04:46 PM - Posted by Fishbait

Very awesome and informative article Josh. What implications could this have with Moore’s law? Does this effectively stop it before the theoretical quantum limit in 2036? These will be an interesting few years for pure-play foundries and their clients indeed.

October 22, 2013 | 06:52 PM - Posted by Josh Walrath

Well, things will be necessarily slowing down.  There simply are hurdles that need lots of time and lots of money to solve.  10 nm shouldn't be that bad, 7 nm is hitting some interesting limits, and sub 7 nm is going to be really rough.  Litho, materials, and electrical characteristics at that size will be sorta crazy.

October 22, 2013 | 05:26 PM - Posted by Anonymous (not verified)

Just to add a few points to this excellent article:

- The 14nm/16nm nodes for GloFo and TSMC, respectively, are going to be utilizing a 20nm back-end-of-line. This means that while density won't increase, they'll improve power characteristics (these are the two FinFET nodes).

- The time-to-market for the above two nodes from both foundries should be more painless than if they were to attempt a shrink + FinFETs. As a result, if I were to guess I'd say we see the 14nm/16nm nodes a bit earlier than some had anticipated. Early tape-outs for 14nm and 20nm have been close together so that definitely adds some credence to that line of thought. Though not certainty ;P

- These node names (e.g., 14nm) don't actually accurately describe the half-pitch. Unless I'm recalling incorrectly, the current tools would only allow something like 18nm(?). Intel's current 22nm FinFETs have been described in papers as 26nm. Whether that's true or not, I have no idea, but the point is that the half-pitch is only a single detail in a long list of attributes that defines a new "node." The takeaway is that you shouldn't get too caught up in the XXnm numbers and remember that it's the power, leakage, density, and performance of the node that actually matter.

October 22, 2013 | 05:43 PM - Posted by Josh Walrath

Thanks for the comments.

About the node names... Intel's 22 nm describes the smallest feature, but you are correct in that a certain other feature (I think it has to do with SRAM) is 26 nm.  There was some thought that AMD with GF's 28 nm would be able to get fairly close to the transistor density of Intel's 22 nm in certain aspects due to this size variance.

October 22, 2013 | 06:29 PM - Posted by Krewell (not verified)

Nice summary Josh. As the person above noted, the node numbers are not strictly related to feature size (e.g. TSMC 16nm is FinFET transistors on 20nm backend).

Nvidia likes to talk about how GPUs have better than Moore's Law scaling, but with die sizes already at 550mm2 (GK110), that will not be true going forward - die sizes are already close to the limits of fab reticles (~600mm2).

I just had this same conversation with AMD's Raja Koduri. Raja's response is that it will take new architectures to improve performance, not just process shrinks and die area growth. It's going to take improvements in architecture efficiency and effectiveness. It also means that the GPU designers need to work closer with game engine developers to find efficiency improvements - Mantle is one example.

October 22, 2013 | 06:45 PM - Posted by Josh Walrath

If you have some spare minutes, you should read that old article I linked.  Some interesting stuff there (considering it was written in 2004 and issues at 130 nm were just being solved).

Thanks for reading!  The next few years are going to be very interesting considering the challenges ahead!

October 22, 2013 | 10:32 PM - Posted by Fiberton (not verified)

I found it interesting that last year AMD replaced CPU architects who were the creators of the Athlon. We all really need AMD to do well to drive pricing down for everyone and to also push technology forward.

October 22, 2013 | 11:03 PM - Posted by Fiberton (not verified)

I wish I could edit :) They replaced the Bulldozer architect. They have hired people who worked on the Athlon projects. I am sorry :).

October 23, 2013 | 10:09 AM - Posted by Josh Walrath

Yeah, some of the old guys came back.  Jim Keller is the big name.  Raja Koduri on the GPU side is back.  There is a lot of uplift in what they are trying to do, and I think overall they are heading in the right direction.  I like Dirk Meyer, but while he was a great CPU architect, he almost missed the major mobile transition that his product stack would not be able to address.

October 22, 2013 | 06:32 PM - Posted by Whayne (not verified)

Wonderful and informative post Josh. As a theoretical physicist with some background in solid state physics, I've been aware of a few of the issues facing the industry, especially the lithography. I cannot even begin to imagine how hard it's going to be to get 7-5nm process nodes operational. I expect quantum effects to come in earlier; maybe even 10nm will be very tough. Quantum tunnelling will no doubt be a huge issue when line traces are so small.

Interesting times ahead. It looks like either an R9 290x or a GTX 780 Ti will be my friend until well into 2015, but that's okay, as they are still going to be pretty darn good cards.

October 22, 2013 | 06:42 PM - Posted by Josh Walrath

I find it interesting that pretty much the entire industry is heavily invested in EUV... and from what I understand, there is still a very high risk that it will not work out at all.

October 24, 2013 | 11:30 AM - Posted by Frenchy2k1 (not verified)

The industry has been working on EUV for over 10 years already and still seems quite far from its target (which has been moving during that time too).

Those are interesting times indeed at the process level.

The question you have not broached is the economics of it. We have been seeing a lot of consolidation in the semiconductor industry for the past ten years and it is accelerating. Each process node costs exponentially more than the last one and THIS is the reason for pure play foundries: few companies can afford their own fabs anymore. Intel is of course the exception, but even they are starting to open their fabs to other companies (which nobody would have expected just a few years ago). That means that even Intel has too much capacity and cannot fill its fabs anymore.

Semiconductor manufacturing is so far the pinnacle of human ingenuity, taking so much effort from so many people to keep on track and follow Moore's Law. All those people are hard at work on EUV and backup plans (multiple patterning: they are doing dual now, but triple or quadruple is definitely possible; 3D transistors are also coming, first in Flash at Samsung). We have not yet seen the end of semiconductor growth.

October 22, 2013 | 07:12 PM - Posted by Anonymous (not verified)

Looks like someone got influenced by their trip to Montreal.

October 22, 2013 | 07:17 PM - Posted by Josh Walrath

Heh, I didn't go to Montreal with Ryan and Ken.  Oddly enough, I started researching and writing this before that event.  I was sorta cranky when Carmack started talking about this subject... day late and a dollar short for me (or rather many millions of dollars short).

October 22, 2013 | 10:38 PM - Posted by Fiberton (not verified)

Your article really reminds me of the past, seeing the names of all the cards and players in the market. There was so much more excitement back then. Thanks for the read 0/

October 22, 2013 | 08:18 PM - Posted by Jerald Tapalla (not verified)

This is the first time I have just sat and read a long article with my focus only on it. Very nice article, very informative, especially for me as someone who is new to this kind of stuff. Thanks for this.

October 22, 2013 | 08:38 PM - Posted by Josh Walrath

Thanks for reading the entire thing!  Ryan will thank you as well!

October 22, 2013 | 08:57 PM - Posted by Onehourleft

Great writing Josh. Also it was nice to visit your archives for the first time. I enjoyed both articles.

October 22, 2013 | 10:29 PM - Posted by MeezyATL (not verified)

Some great articles from the PC Per staff this week. Keep up the good work.

October 23, 2013 | 02:06 AM - Posted by thinkbiggar (not verified)

Can you run that thing in SLI? Also did it cause you to lose all your hair?
I bet the prices stayed the same but not with inflation. Don't tell marketing people about inflation. Once they learn about it we are all screwed.

October 23, 2013 | 10:11 AM - Posted by Josh Walrath

I lost my hair because I got married and had kids.

October 23, 2013 | 04:01 AM - Posted by WantT100 (not verified)

This is a stunning article, this is why I visit Pcper every day. Ryan give the man a bonus!

This article is up there with Scott's "The Windows You Love is Gone" stunner a year ago.
http://www.pcper.com/reviews/Editorial/Windows-You-Love-Gone

October 23, 2013 | 05:13 AM - Posted by Anonymous (not verified)

These kinds of delays are to be expected. The smaller you go, the more pronounced the individual effects become. Instead of treating the design as a whole or in smaller but relatively large units, more research needs to be done in examining each and every change occurring within the system. Very time consuming and expensive. I will not be surprised if there are further delays. The break-neck speed of development had to come to an end some time.

Not disappointed about the delays; they were very much expected. Can't keep throwing money at it and expect it to pay dividends immediately. My 2 cents.

October 23, 2013 | 09:46 AM - Posted by Josh Walrath

Yup, you are likely correct.  What we often don't hear about is how closely the fab engineers work with the designers.  The amount of back and forth work and information they do is pretty staggering, especially with these next generation nodes.  This simply isn't a "we are finished with the design, send it to the Fab guys and they can figure out the rest!" situation anymore.

We are also seeing the pure-play guys working to amortize their investments in current process nodes... because the next gen stuff is so expensive.  Gotta pay those bills.  They only hope that Intel will slow down, cause those guys don't clear $3 billion a quarter like Intel does.

October 23, 2013 | 07:25 AM - Posted by blitzio

Amazing article Josh, thank you for an informative read. I feel thoroughly educated.

October 23, 2013 | 07:29 AM - Posted by Max KreFey

Great article Josh! Thank you from cold mother Russia! :D
