28HPCU: Cost Effective and Power Efficient
Have you ever heard a pitch that did not seem very exciting at first, only to find it much more interesting once you dug in? This happened to me with this announcement. At first blush, who really cares that ARM is partnering with UMC at 28 nm? Well, once I was able to chat with the people at ARM, it turned out to be much more interesting than I initially expected.
The new hotness in fabrication is the latest 14 nm and 16 nm processes from Samsung/GF and TSMC, respectively. It has been a good 4+ years since we last had a new process node that performed as expected. The planar 22/20 nm products just were not entirely suitable for mass production. Apple was one of the few to develop a part for TSMC’s 20 nm process that sold in the millions. The main problem was a lack of power and speed scaling as compared to 28 nm processes. Planar was a bad choice, but the 3rd party manufacturers did not have FinFET technologies ready in time for that node.
There is a problem with the latest process generations, though. They are new, expensive, and production constrained. Also, they may not be entirely appropriate for the applications that are being developed. By comparison, 28 nm has several strengths. These are mature processes with an excess of line space. The major fabs are offering very competitive pricing structures for 28 nm as they see space being cleared up on the lines with higher end SOCs, GPUs, and assorted ASICs migrating to the new process nodes.
TSMC has typically been at the forefront of R&D with advanced nodes. UMC is not as aggressive with its development. Instead, it tends to let others do some of the heavy lifting and then integrates the new nodes when they fit its pricing and business models. TSMC is on its third generation of 28 nm. UMC is on its second, but that generation encompasses many of the advanced features of TSMC’s 3rd generation, so it is actually quite competitive.
Process Technology Overview
We have been very spoiled throughout the years. We likely did not realize exactly how spoiled we were until the rate of process technology advances hit a virtual brick wall. Every 18 to 24 months, a new, faster, more efficient process node was opened up to fabless semiconductor firms, and we were treated to a new generation of products that would blow our hair back. Now we have been at a virtual standstill when it comes to new process nodes from the pure-play foundries.
Few expected the 28 nm node to live nearly as long as it has. Some of the first cracks in the façade actually came from Intel. Their 22 nm Tri-Gate (FinFET) process took a little bit longer to get off the ground than expected. We also noticed some interesting electrical features from the products developed on that process. Intel steered away from higher clockspeeds and focused on efficiency and architectural improvements, rather than holding TDPs at generally acceptable levels and leapfrogging the competition on clockspeed alone. Overclockers noticed that the newer parts did not reach the same clockspeed heights as previous products such as the 32 nm based Sandy Bridge processors. Whether this was a deliberate choice by Intel is debatable, but my gut feeling here is that they responded to the technical limitations of their 22 nm process. Yields and bins likely dictated the max clockspeeds attained on these new products. So instead of vaulting over AMD’s products, they just slowly started walking away from them.
Samsung is one of the first foundries to offer a working sub-20 nm FinFET product line. (Photo courtesy of ExtremeTech)
When 28 nm was released, the plans on the books were to transition to 20 nm products based on planar transistors, thereby bypassing the added expense of developing FinFETs. It was widely believed that FinFETs were not necessarily required to address the needs of the market. Sadly, that did not turn out to be the case. There are many other factors as to why 20 nm planar parts are not common, but the limitations of that particular process have made it a relatively niche node that is appropriate for smaller, low power ASICs (like the latest Apple SOCs). The Apple A8 is rumored to be around 90 mm², which is a far cry from the traditional midrange GPU that runs from 250 mm² to 400+ mm².
The essential difficulty of the 20 nm planar node appears to be a lack of power scaling to match the increased transistor density. TSMC and others have successfully packed more transistors into every square mm as compared to 28 nm, but the electrical characteristics did not scale proportionally. Yes, there are improvements per transistor, but when designers pack all those transistors into a large design, TDP and voltage issues start to arise. As power draw increases, so does the heat that has to be dissipated. The GPU guys probably looked at this and figured out that while they can achieve a higher transistor density and a wider design, they will have to downclock the entire GPU to hit reasonable TDP levels. When adding these concerns to yields and bins for the new process, the advantages of going to 20 nm would be slim to none at the end of the day.
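To put some rough numbers on that tradeoff, here is a back-of-the-envelope sketch in Python. It is not anything from TSMC or the GPU vendors. It simply assumes that power per square mm is transistor density times power per transistor, and that at a fixed voltage power scales linearly with clockspeed (leakage is ignored). The function names and the input figures are hypothetical placeholders chosen for illustration.

```python
# Back-of-the-envelope model only. All inputs are hypothetical
# placeholders, not measured figures for any real process node.


def power_density_ratio(density_gain, power_per_transistor_ratio):
    """Power per square mm on the new node relative to the old node.

    density_gain: transistors per square mm, new node / old node.
    power_per_transistor_ratio: power per transistor at the same
        clock, new node / old node (below 1.0 means an improvement).
    """
    return density_gain * power_per_transistor_ratio


def clock_ratio_for_same_power(density_gain, power_per_transistor_ratio):
    """Clock scale factor that keeps power per square mm unchanged.

    Assumes power is linear in clock at a fixed voltage. Returns 1.0
    when no downclock is needed.
    """
    ratio = power_density_ratio(density_gain, power_per_transistor_ratio)
    return min(1.0, 1.0 / ratio)


# Hypothetical shrink: twice the transistors per square mm, but each
# transistor only draws 25% less power at the same clock.
density_gain = 2.0
power_per_transistor_ratio = 0.75

print(f"Power density vs. old node: "
      f"{power_density_ratio(density_gain, power_per_transistor_ratio):.2f}x")
print(f"Clock needed to hold power flat: "
      f"{clock_ratio_for_same_power(density_gain, power_per_transistor_ratio):.2f}x")
```

With those made-up inputs, a die of the same area would draw 1.5 times the power, and the clock would have to drop to roughly two thirds to stay inside the old power budget. If power per transistor had instead halved along with the doubling in density, the ratio would be 1.0 and no downclock would be needed, which is the kind of scaling we had been spoiled by on earlier nodes.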