Subject: General Tech | November 12, 2015 - 02:47 PM | Ken Addison
Tagged: podcast, video, qualcomm, snapdragon 820, Lenovo, yoga 900, be quiet!, amd, r9 380x, GLOBALFOUNDRIES, 14nm, FinFET, nvidia, asus, Maximus VIII Extreme, Thrustmaster, T300
PC Perspective Podcast #375 - 11/12/2015
Join us this week as we discuss the Snapdragon 820, Lenovo Yoga 900, R9 380X and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, and Sebastian Peak
Program length: 1:22:11
Week in Review:
0:29:30 This week’s podcast is brought to you by Casper. Use code PCPER at checkout for $50 towards your order!
News item of interest:
Hardware/Software Picks of the Week:
GPU Enthusiasts Are Throwing a FET
NVIDIA is rumored to launch Pascal in early (~April-ish) 2016, although some are skeptical that it will even appear before the summer. The design was finalized months ago, and unconfirmed shipping information claims that chips are being stockpiled, which is typical when preparing to launch a product. It is expected to compete against AMD's rumored Arctic Islands architecture, which will, according to its also rumored numbers, be very similar to Pascal.
This architecture is a big one for several reasons.
Image Credit: WCCFTech
First, it will jump two full process nodes. Current desktop GPUs are manufactured at 28nm, which was first introduced with the GeForce GTX 680 all the way back in early 2012, but Pascal will be manufactured on TSMC's 16nm FinFET+ technology. Smaller features have several advantages, but a huge one for GPUs is the ability to fit more complex circuitry in the same die area. This means that you can include more copies of elements, such as shader cores, and do more in fixed-function hardware, like video encode and decode.
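To put "two full process nodes" in rough numbers, a common rule of thumb treats each full node as about a 0.7x linear shrink, which roughly doubles density. The sketch below applies that idealized scaling to the node names themselves. Node names are marketing labels rather than measured feature sizes, so this is an illustration under that assumption, not a prediction for Pascal.

```python
def ideal_density_gain(old_nm, new_nm):
    """Idealized density gain if every feature shrank exactly with the node name."""
    return (old_nm / new_nm) ** 2

# 28nm -> 20nm is one full node; 28nm -> 16nm is the jump described above.
for old, new in [(28, 20), (28, 16)]:
    print(f"{old}nm -> {new}nm: {ideal_density_gain(old, new):.2f}x transistors per area")
```

Under this idealized model the 28nm to 16nm jump works out to about 3x the transistors in the same die area; real-world gains depend on the actual cell libraries and design rules.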
That said, we got a lot more life out of 28nm than we really should have. Chips like GM200 and Fiji are huge, relatively power-hungry, and complex, which makes them a terrible idea to produce when yields are low. I asked Josh Walrath, who is our go-to for analysis of fab processes, and he believes that FinFET+ is probably even more complicated today than 28nm was in the 2012 timeframe, which was when it launched for GPUs.
It's two full steps forward from where we started, but we've been tiptoeing since then.
Image Credit: WCCFTech
Second, Pascal will introduce HBM 2.0 to NVIDIA hardware. HBM 1.0 was introduced with AMD's Radeon Fury X, and it helped in numerous ways -- from smaller card size to a triple-digit percentage increase in memory bandwidth. The 980 Ti can talk to its memory at about 300GB/s, while Pascal is rumored to push that to 1TB/s. Capacity won't be sacrificed, either. The top-end card is expected to contain 16GB of global memory, which is twice what any console has. This means less streaming, higher resolution textures, and probably even left-over scratch space for the GPU to generate content in with compute shaders. Also, according to AMD, HBM is an easier architecture to communicate with than GDDR, which should mean a savings in die space that could be used for other things.
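As a rough check on those bandwidth figures, peak memory bandwidth is simply bus width times per-pin data rate. The sketch below uses the publicly listed bus widths and pin rates for the 980 Ti and Fury X. The HBM 2.0 entry (4096-bit at 2 Gbps per pin) is an assumption chosen to match the rumored 1TB/s, not a confirmed Pascal specification.

```python
def peak_bandwidth_gbs(bus_width_bits, gbps_per_pin):
    """Peak memory bandwidth in GB/s: pins * Gb/s per pin / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8

cards = {
    "GTX 980 Ti (GDDR5)": (384, 7.0),
    "Fury X (HBM 1.0)": (4096, 1.0),
    "Pascal (HBM 2.0, assumed config)": (4096, 2.0),
}
for name, (width, rate) in cards.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.0f} GB/s")
```

The point of HBM shows up in the inputs: the pins run far slower than GDDR5, but there are roughly ten times as many of them.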
Third, the architecture includes native support for three levels of floating point precision. Maxwell, due to how limited 28nm was, saved on complexity by reducing 64-bit IEEE 754 floating-point performance to 1/32nd of 32-bit numbers, because FP64 values are rarely used in video games. This saved transistors, but was a huge, order-of-magnitude step back from the 1/3rd ratio found on the Kepler-based GK110. While it probably won't be back to the 1/2 ratio that was found in Fermi, Pascal should be much better suited for GPU compute.
Image Credit: WCCFTech
Mixed precision could help video games too, though. Remember how I said it supports three levels? The third one is 16-bit, which is half the width of the 32-bit format that is commonly used in video games. Sometimes, that is sufficient. If so, Pascal is said to do these calculations at twice the rate of 32-bit. We'll need to see whether enough games (and other applications) are willing to drop down in precision to justify the die space that these dedicated circuits require, but it should double the performance of anything that does.
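The precision ratios quoted above are easier to compare side by side. This is a minimal sketch: the FP64 ratios and the 2x FP16 rate come from the text, while the 6.0 TFLOPS FP32 baseline is a made-up round number used only for illustration.

```python
# FP64 throughput as a fraction of FP32, per the ratios quoted in the text.
FP64_DIVISOR = {"Fermi": 2, "Kepler GK110": 3, "Maxwell": 32}

def fp64_tflops(fp32_tflops, arch):
    """FP64 throughput implied by an architecture's FP64:FP32 ratio."""
    return fp32_tflops / FP64_DIVISOR[arch]

def fp16_tflops(fp32_tflops):
    """FP16 throughput if half precision runs at twice the FP32 rate (rumored for Pascal)."""
    return fp32_tflops * 2

baseline = 6.0  # hypothetical FP32 TFLOPS, not a real product figure
for arch in FP64_DIVISOR:
    print(f"{arch}: {fp64_tflops(baseline, arch):.4f} FP64 TFLOPS")
print(f"FP16 at 2x rate: {fp16_tflops(baseline):.1f} TFLOPS")
```

Going from a 1/3 ratio to 1/32 is a drop of a little more than 10x at the same FP32 rate, which is where the "order-of-magnitude" description comes from.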
So basically, this generation should provide the massive jump in performance that enthusiasts have been waiting for. GPU memory bandwidth and the number of features that can be printed into the die are two major bottlenecks for most modern games and GPU-accelerated software, and both are set to increase. We'll need to wait for benchmarks to see how the theoretical maps to practical, but it's a good sign.
Subject: Processors | September 30, 2015 - 09:55 PM | Josh Walrath
Tagged: TSMC, Samsung, FinFET, apple, A9, 16 nm, 14 nm
So the other day the nice folks over at Chipworks got word that Apple was in fact sourcing their A9 SOC from both TSMC and Samsung. This is really interesting news on multiple fronts. From the information gleaned, the two parts are the APL0898 (Samsung fabbed) and the APL1022 (TSMC).
These process technologies have been in the news quite a bit. As we well know, it has been a hard time for any foundry to go under 28 nm in an effective way if your name is not Intel. Even Intel has had some pretty hefty issues with their march to sub 32 nm parts, but they have the resources and financial ability to push through a lot of these hurdles. One of the bigger problems that affected the foundries was the idea that they could push back FinFETs beyond what they were initially planning. The idea was to hit 22/20 nm and use planar transistors and push development back to 16/14 nm for FinFET technology.
The Chipworks graphic that explains the differences between Samsung's and TSMC's A9 products.
There were many reasons why this did not work in an effective way for the majority of products that the foundries were looking to service with a 22/20 nm planar process. Yes, there were many parts that were fabricated using these nodes, but none of them were higher power/higher performance parts that typically garner headlines. No CPUs, no GPUs, and only a handful of lower power SOCs (most notably Apple's A8, which was around 89 mm squared and consumed 5 to 10 watts at maximum). The node just did not scale power very effectively. It provided a smaller die size, but it did not increase power efficiency and switching performance significantly as compared to 28 nm high performance nodes.
The information Chipworks has provided also verifies that Samsung's 14 nm FF process is more size optimized than TSMC's 16 nm FF. There was originally some talk about both nodes being very similar in overall transistor size and density, but Samsung has a slightly tighter design. Neither of them is smaller than Intel's latest 14 nm, which is going into its second generation form. Intel still has a significant performance and size advantage over everyone else in the field. Going back to size, we see the Samsung chip is around 96 mm square while the TSMC chip is 104.5 mm square. This is not huge, but it does show that the Samsung process is a little tighter and can squeeze more transistors per square mm than TSMC.
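For a sense of scale, the gap between those two die sizes can be worked out directly. The sketch below uses the two areas quoted above; reading the result as a density difference assumes both chips carry roughly the same transistor count, which is an assumption on my part rather than something Chipworks reported.

```python
SAMSUNG_A9_MM2 = 96.0   # APL0898, area quoted above
TSMC_A9_MM2 = 104.5     # APL1022, area quoted above

def relative_increase(smaller, larger):
    """Fractional size increase of the larger die over the smaller one."""
    return (larger - smaller) / smaller

extra = relative_increase(SAMSUNG_A9_MM2, TSMC_A9_MM2)
print(f"TSMC die is {extra:.1%} larger than the Samsung die")
```

That comes out to a bit under 9 percent, which fits the "not huge, but a little tighter" description.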
In terms of actual power consumption and clock scaling we have nothing to go on here. The chips are both represented in the 6S and 6S+. Testing so far has not shown there to be significant differences between the two SOCs. In theory one could be performing better than the other, but in reality we have not tested these chips at a low enough level to discern any major performance or power issue. My gut feeling here is that Samsung's process is more mature and running slightly better than TSMC's, but the differences are going to be minimal at best.
The next piece of info that we can glean from this is that there just isn't enough line space for all of the chip companies who want to fabricate their parts with either Samsung or TSMC. From a chip standpoint a lot of work has to be done to port a design to two different process nodes. While 14 and 16 are similar in overall size and the usage of FinFETs, the standard cells and design libraries for both Samsung and TSMC are going to be very different. It is not a simple thing to port over a design. A lot of work has to be done in the design stage to make a chip work with both nodes. I can tell you that there is no way that both chips are identical in layout. It is not going to be a "dumb port" where they just adjust the optics with the same masks and magically make these chips work right off the bat. Different mask sets for each fab, verification of both designs, and troubleshooting the yields by metal layer changes will be different for each manufacturer.
In the end this means that there just simply was not enough space at either TSMC or Samsung to handle the demand that Apple was expecting. Because Apple has deep pockets they contracted with both TSMC and Samsung to produce two very similar, but still different parts. Apple also likely outbid everyone else and locked down whatever wafer-processing availability Samsung and TSMC have, much to the dismay of other major chip firms. I have no idea what is going on in the background with people like NVIDIA and AMD when it comes to line space for manufacturing their next generation parts. At least for AMD it seems that their partnership with GLOBALFOUNDRIES and their version of 14 nm FF is having a hard time taking off. Eventually more space will be made in production and yields and bins will improve. Apple will stop taking up so much space and we can get other products rolling off the line. In the meantime, enjoy that cutting edge iPhone 6S/+ with the latest 14/16 nm FF chips.
Process Technology Overview
We have been very spoiled throughout the years. We likely did not realize exactly how spoiled we were until it became very obvious that the rate of process technology advances had hit a virtual brick wall. Every 18 to 24 months we were treated to a new, faster, more efficient process node that was opened up to fabless semiconductor firms and we were treated to a new generation of products that would blow our hair back. Now we have been at a virtual standstill when it comes to new process nodes from the pure-play foundries.
Few expected the 28 nm node to live nearly as long as it has. Some of the first cracks in the façade actually came from Intel. Their 22 nm Tri-Gate (FinFET) process took a little bit longer to get off the ground than expected. We also noticed some interesting electrical features from the products developed on that process. Intel skewed away from higher clockspeeds and focused on efficiency and architectural improvements rather than staying at generally acceptable TDPs and leapfrogging the competition by clockspeed alone. Overclockers noticed that the newer parts did not reach the same clockspeed heights as previous products such as the 32 nm based Sandy Bridge processors. Whether this decision was intentional from Intel or not is debatable, but my gut feeling here is that they responded to the technical limitations of their 22 nm process. Yields and bins likely dictated the max clockspeeds attained on these new products. So instead of vaulting over AMD’s products, they just slowly started walking away from them.
Samsung is one of the first pure-play foundries to offer a working sub-20 nm FinFET product line. (Photo courtesy of ExtremeTech)
When 28 nm was released the plans on the books were to transition to 20 nm products based on planar transistors, thereby bypassing the added expense of developing FinFETs. It was widely expected that FinFETs were not necessarily required to address the needs of the market. Sadly, that did not turn out to be the case. There are many other factors as to why 20 nm planar parts are not common, but the limitations of that particular process node have made it a relatively niche node that is appropriate for smaller, low power ASICs (like the latest Apple SOCs). The Apple A8 is rumored to be around 90 mm square, which is a far cry from the traditional midrange GPU that goes from 250 mm sq. to 400+ mm sq.
The essential difficulty of the 20 nm planar node appears to be a lack of power scaling to match the increased transistor density. TSMC and others have successfully packed more transistors into every square mm as compared to 28 nm, but the electrical characteristics did not scale proportionally well. Yes, there are improvements there per transistor, but when designers pack all those transistors into a large design, TDP and voltage issues start to arise. As TDP increases, it takes more power to drive the processor, which then leads to more heat. The GPU guys probably looked at this and figured out that while they can achieve a higher transistor density and a wider design, they will have to downclock the entire GPU to hit reasonable TDP levels. When adding these concerns to yields and bins for the new process, the advantages of going to 20 nm would be slim to none at the end of the day.
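The downclocking argument can be sketched with the textbook dynamic power relation, P = a * C * V^2 * f (activity factor, switched capacitance, voltage, frequency). Everything below is in normalized units with made-up inputs; the doubling of switched capacitance and the unchanged voltage are assumptions meant to mimic "more transistors, little electrical improvement", not measured 20 nm figures. Leakage is ignored.

```python
def dynamic_power(cap, volt, freq, activity=1.0):
    """Textbook dynamic switching power: P = a * C * V^2 * f (normalized units)."""
    return activity * cap * volt ** 2 * freq

def clock_for_budget(power_budget, cap, volt, activity=1.0):
    """Highest clock that keeps dynamic power within the given budget."""
    return power_budget / (activity * cap * volt ** 2)

# Baseline 28 nm design, normalized so that power = 1.0 at clock = 1.0.
budget = dynamic_power(cap=1.0, volt=1.0, freq=1.0)

# Hypothetical 20 nm design: twice the switched capacitance, same voltage.
wider_clock = clock_for_budget(budget, cap=2.0, volt=1.0)
print(f"Clock allowed at the same TDP: {wider_clock:.2f}x of baseline")
```

With those inputs, the wider design has to run at half the clock to stay inside the same power budget, which wipes out most of the benefit of the extra transistors. A meaningful voltage drop would change the picture quickly, since voltage enters squared.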
Subject: General Tech | October 13, 2014 - 11:57 PM | Scott Michaud
Tagged: processors, microprocessor, FinFET, fab
Ah, Solid State Physics. Semiconductors are heavily based on this branch, because it explains the physical (mechanical, electrical, thermal, etc.) properties of solids based on how their atoms are organized. These properties lead into how transistors function, and why.
Put it back, Allyn.
Anandtech has published a seven-page article that digs into physics and builds upon itself. It starts with a brief explanation of conductivity and what makes up the difference between a conductor, an insulator, and a semiconductor. It uses that to build a simple transistor. From there it explains logic gates, wafers, and lithography. It works up to FinFETs and then keeps going into the future. It is definitely not an article for beginners, but it can be worked through from start to finish given enough effort on the part of the reader.
While this was not mentioned in the article, at least not that I found, you can derive the number of atoms per "feature" by dividing its size by the lattice-distance of the material. For silicon, that is about half of a nanometer at room temperature. For instance, 14nm means that we are manufacturing features that are defined by less than 30 atoms (up to rounding error). The article speculates a bit about what will happen after the era of silicon. This is quite interesting to me, particularly since I did my undergraduate thesis (just an undergrad thesis) on photonic crystals, which route optical light across manufactured defects in an otherwise opaque solid to make an optical integrated circuit. It has the benefit of, with a mixture of red, orange, and maybe green lasers, being able to "go plaid".
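The atoms-per-feature estimate is a one-line division. The sketch below follows the approach described above, using 0.543 nm as the silicon lattice constant at room temperature (the "about half of a nanometer" figure). Treating the node name as a literal feature width is the same simplification the paragraph makes.

```python
SILICON_LATTICE_NM = 0.543  # silicon lattice constant at room temperature

def atoms_across(feature_nm):
    """Rough count of lattice spacings spanning a feature of the given width."""
    return feature_nm / SILICON_LATTICE_NM

for node_nm in (28, 14, 10):
    print(f"{node_nm}nm feature: about {atoms_across(node_nm):.0f} lattice spacings")
```

For 14nm this gives roughly 26, consistent with the "less than 30 atoms" figure.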
If you are interested, be sure to read the article. It is a bit daunting, but much more manageable than most sources. Congratulations to Joshua Ho and anyone else who might have been involved.
Subject: General Tech | October 6, 2014 - 12:30 PM | Jeremy Hellstrom
Tagged: arm, TSMC, 10nm, FinFET, armv8-a
ARM and TSMC are moving ahead at an impressive pace, now predicting 10nm FinFET designs taping out possibly in the fourth quarter of 2015. That could even be possible considering how quickly they incorporated FinFET to move from 20nm SoC to 16nm. The ARMv8-A processor architecture will have fewer transistors than a high-end CPU, which does help their process adoption move more quickly than AMD or Intel, but with AMD partnering up with ARM there is the possibility of seeing this new ARM architecture in AMD chips in the not too distant future. As DigiTimes points out, there are many benefits that have come from this partnership between ARM and TSMC.
"ARM and Taiwan Semiconductor Manufacturing Company (TSMC) have announced a new multi-year agreement that will deliver ARMv8-A processor IP optimized for TSMC 10nm FinFET process technology. Because of the success in scaling from 20nm SoC to 16nm FinFET, ARM and TSMC have decided to collaborate again for 10FinFET."
Here is some more Tech News from around the web:
- Desktop, schmesktop: Microsoft reveals next WINDOWS SERVER @ The Register
- Apple updates malware definitions to protect OS X users from iWorm Botnet @ The Inquirer
- IBM goes gunning for Intel with Nvidia GPU-charged Power8 servers @ The Register
- Android Wear can now boot Windows 95 @ The Inquirer
- A Look at Adobe’s Creative Cloud Fall 2014 Update @ Techgage
- Tech ARP 2014 Mega Giveaway Contest
Subject: General Tech | September 16, 2014 - 02:38 PM | Jeremy Hellstrom
Tagged: FinFET, flexible
We've seen a few examples of OLEDs being used to create flexible displays, but they are much slower than their unbending silicon rivals. With conductive ink and thread it is possible to make wearable technology, but again the silicon components remain solid and immobile. Researchers in Saudi Arabia have been working on flexible technology which retains the speed of silicon transistors but is able to flex up to 0.5 mm, which may sound large until you remember the size of a transistor. They have created these FinFETs by putting a thin layer of a polymer on top of the material they will be etching the transistors into and gently removing the polymer once the process has completed. This results in a FinFET which retains the power saving and performance attributes common to the 3D transistor but with the ability to bend. This won't be marketed for a while yet, but in the meantime read all about it on Nanotechweb.
"Researchers at the King Abdullah University of Science and Technology in Saudia Arabia are continuing with their experiments to transform traditional rigid electronic wafers made from silicon into mechanically flexible and transparent ones."
Here is some more Tech News from around the web:
- Tunnelling electrons make new type of transistor @ Nanotechweb
- IBM brings Watson Analytics to all with freemium model @ The Inquirer
- Seagate's triple-headed Cerberus could SAVE the DISK WORLD @ The Register
- Amazon Kindle vulnerability lets hackers take over your account @ The Inquirer
- be quiet! Straight Power 10 competition @ Kitguru
Coming in 2014: Intel Core M
The era of Broadwell begins in late 2014 and based on what Intel has disclosed to us today, the processor architecture appears to be impressive in nearly every aspect. Coming off the success of the Haswell design in 2013 built on 22nm, the Broadwell-Y architecture will not only be the first to market with a new microarchitecture, but will be the flagship product on Intel’s new 14nm tri-gate process technology.
The Intel Core M processor, as Broadwell-Y has been dubbed, includes impressive technological improvements over previous low power Intel processors that result in lower power, thinner form factors, and longer battery life designs. Broadwell-Y will stretch into even lower TDPs, enabling 9mm or smaller fanless designs that maintain current battery lifespans. A new 2nd generation FIVR with modified power delivery design allows for even thinner packaging and a wider range of dynamic frequencies than before. And of course, along with the shift comes an updated converged core design and improved graphics performance.
All of these changes are in service to what Intel claims is a re-invention of the notebook. Compared to 2010 when the company introduced the original Intel Core processor, thus changing Intel’s direction almost completely, Intel Core M and the Broadwell-Y changes will allow for some dramatic platform changes.
Notebook thickness will go from 26mm (~1.02 inches) down to as small as 7mm (~0.28 inches) as Intel has proven with its Llama Mountain reference platform. Reductions in total thermal dissipation of 4x while improving core performance by 2x and graphics performance by 7x are something no other company has been able to do over the same time span. And in the end, one of the most important features for the consumer is getting double the useful battery life with a smaller (and lighter) battery required for it.
But these kinds of advancements just don’t happen by chance – ask any other semiconductor company that is either trying to keep ahead of or catch up to Intel. It takes countless engineers and endless hours to build a platform like this. Today Intel is sharing some key details on how it was able to make this jump including the move to a 14nm FinFET / tri-gate transistor technology and impressive packaging and core design changes to the Broadwell architecture.
Intel 14nm Technology Advancement
Intel consistently creates and builds the most impressive manufacturing and production processes in the world and it has helped it maintain a market leadership over rivals in the CPU space. It is also one of the key tenets that Intel hopes will help them deliver on the world of mobile including tablets and smartphones. At the 22nm node Intel was the first to offer 3D transistors, what they called tri-gate and others refer to as FinFET. By focusing on power consumption rather than top level performance Intel was able to build the Haswell design (as well as Silvermont for the Atom line) with impressive performance and power scaling, allowing thinner and less power hungry designs than with previous generations. Some enthusiasts might think that Intel has done this at the expense of high performance components, and there is some truth to that. But Intel believes that by committing to this space it builds the best future for the company.
The Really Good Times are Over
We really do not realize how good we had it. Sure, we could apply that to budget surpluses and the time before the rise of global terrorism, but in this case I am talking about the predictable advancement of graphics due to both design expertise and improvements in process technology. Moore’s law has been exceptionally kind to graphics. We can look back and when we plot the course of these graphics companies, they have actually outstripped Moore in terms of transistor density from generation to generation. Most of this is due to better tools and the expertise gained in what is still a fairly new endeavor as compared to CPUs (the first true 3D accelerators were released in the 1993/94 timeframe).
The complexity of a modern 3D chip is truly mind-boggling. To get a good idea of where we came from, we must look back at the first generations of products that we could actually purchase. The original 3Dfx Voodoo Graphics was comprised of a raster chip and a texture chip; each contained approximately 1 million transistors (give or take) and was made on a then-available .5 micron process (we shall call it 500 nm from here on out to give a sense of perspective with modern process technology). The chips were clocked between 47 and 50 MHz (though often could be clocked up to 57 MHz by going into the init file and putting in “SET SST_GRXCLK=57”… btw, SST stood for Sellers/Smith/Tarolli, the founders of 3Dfx). This revolutionary graphics card at the time could push out 47 to 50 megapixels per second, had 4 MB of VRAM, and was released in the beginning of 1996.
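The fill rate quoted above tracks the clock speed one for one, which is what you get from a single pixel pipeline. A minimal sketch, assuming one pixel written per clock (consistent with the 47 to 50 figures in the text):

```python
def fill_rate_mpixels(clock_mhz, pixels_per_clock=1):
    """Peak fill rate in megapixels per second for a simple fixed-function pipeline."""
    return clock_mhz * pixels_per_clock

# Stock clocks from the text, plus the SST_GRXCLK=57 overclock.
for clock in (47, 50, 57):
    print(f"{clock} MHz -> {fill_rate_mpixels(clock)} Mpixels/s")
```

The same formula is why the init-file overclock mattered: every extra MHz was a direct, linear gain in pixel throughput.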
My first 3D graphics card was the Orchid Righteous 3D. Voodoo Graphics was really the first successful consumer 3D graphics card. Yes, there were others before it, but Voodoo Graphics had the largest impact of them all.
In 1998 3Dfx released the Voodoo 2, and it was a significant jump in complexity from the original. These chips were fabricated on a 350 nm process. There were three chips to each card, one of which was the raster chip and the other two were texture chips. At the top end of the product stack were the 12 MB cards. The raster chip had 4 MB of VRAM available to it while each texture chip had 4 MB of VRAM for texture storage. Not only did this product double performance from the Voodoo Graphics, it was able to run in single card configurations at 800x600 (as compared to the max 640x480 of the Voodoo Graphics). This is the same time as when NVIDIA started to become a very aggressive competitor with the Riva TNT and ATI was about to ship the Rage 128.
Taking a Fresh Look at GLOBALFOUNDRIES
It has been a while since we last talked about GLOBALFOUNDRIES, and it is high time to do so. So why the long wait between updates? Well, I think the long and short of it is a lack of execution on their stated roadmaps from around 2009 on. When GF first came on the scene they had a very aggressive roadmap about where their process technology would be and how it would be implemented. I believe that GF first mentioned a working 28 nm process in an early 2011 timeframe. There was a lot of excitement in some corners as people expected next generation GPUs to be available around then using that process node.
Fab 1 is the facility where all 32 nm SOI and most 28 nm HKMG are produced.
Obviously GF did not get that particular process up and running as expected. In fact, they had some real issues getting 32 nm SOI running in a timely manner. Llano was the first product GF produced on that particular node, as well as plenty of test wafers of Bulldozer parts. Both were delayed from when they were initially expected to hit, and both had fabrication issues. Time and money can fix most things when it comes to process technology, and eventually GF was able to solve what issues they had on their end. 32 nm SOI/HKMG is producing like gangbusters. AMD has improved their designs on their end to make things a bit easier as well at GF.
While shoring up the 32 nm process was of extreme importance to GF, it seemingly took resources away from further developing 28 nm and below processes. While work was still being done on these products, the roadmap was far too aggressive for what they were able to accomplish. The hits just kept coming though. AMD cut back on 32nm orders, which had a financial impact on both companies. It was cheaper for AMD to renegotiate the contract and take a penalty rather than order chips that it simply could not sell. GF then had lots of line space open on 32 nm SOI (Dresden) that could not be filled. AMD then voided another contract in which they suffered a larger penalty by opting to potentially utilize a second source for 28 nm HKMG production of their CPUs and APUs. AMD obviously was very uncomfortable about where GF was with their 28 nm process.
During all of this time GF was working to get their Luther Forest FAB 8 up and running. Building a new FAB is no small task. This is a multi-billion dollar endeavor and any new FAB design will have complications. Happily for GF, the development of this FAB has gone along seemingly according to plan. The FAB has achieved every major milestone in construction and deployment. Still, the risks involved with a FAB that could reach around $8 billion+ are immense.
2012 was not exactly the year that GF expected, or hoped for. It was tough on them and their partners. They also had more expenses such as acquiring Chartered back in 2009 and then acquiring the rather significant stake that AMD had in the company in the first place. During this time ATIC has been pumping money into GF to keep it afloat, along with its aspirations of being a major player in the fabrication industry.