Subject: Processors | February 9, 2017 - 02:38 AM | Josh Walrath
Tagged: Zen, Skylake, Samsung, ryzen, kaby lake, ISSCC, Intel, GLOBALFOUNDRIES, amd, AM4, 14 nm FinFET
Yesterday EE Times posted some interesting information that they had gleaned at ISSCC. AMD released a paper describing the design process and advances they were able to achieve with the Zen architecture manufactured on Samsung’s/GF’s 14nm FinFETT process. AMD went over some of the basic measurements at the transistor scale and how it compares to what Intel currently has on their latest 14nm process.
The first thing that jumps out is that AMD claimes that their 4 core/8 thread x86 core is about 10% smaller than what Intel has with one of their latest CPUs. We assume it is either Kaby Lake or Skylake. AMD did not exactly go over exactly what they were counting when looking at the cores because there are some significant differences between the two architectures. We are not sure if that 44mm sq. figure includes the L3 cache or the L2 caches. My guess is that it probably includes L2 cache but not L3. I could be easily wrong here.
Going down the table we see that AMD and Samsung/GF are able to get their SRAM sizes down smaller than what Intel is able to do. AMD has double the amount of L2 cache per core, but it is only about 60% larger than Intel’s 256 KB L2. AMD also has a much smaller L3 cache as well than Intel. Both are 8 MB units but AMD comes in at 16 mm sq. while Intel is at 19.1 mm sq. There will be differences in how AMD and Intel set up these caches, and until we see L3 performance comparisons we cannot assume too much.
(Image courtesy of ISSCC)
In some of the basic measurements of the different processes we see that Intel has advantages throughout. This is not surprising as Intel has been well known to push process technology beyond what others are able to do. In theory their products will have denser logic throughout, including the SRAM cells. When looking at this information we wonder how AMD has been able to make their cores and caches smaller. Part of that is due to the likely setup of cache control and access.
One of the most likely culprits of this smaller size is that the less advanced FPU/SSE/AVX units that AMD has in Zen. They support AVX-256, but it has to be done in double the cycles. They can do single cycle AVX-128, but Intel’s throughput is much higher than what AMD can achieve. AVX is not the end-all, be-all but it is gaining in importance in high performance computing and editing applications. David Kanter in his article covering the architecture explicitly said that AMD made this decision to lower the die size and power constraints for this product.
Ryzen will undoubtedly be a pretty large chip overall once both modules and 16 MB of L3 cache are put together. My guess would be in the 220 mm sq. range, but again that is only a guess once all is said and done (northbridge, southbridge, PCI-E controllers, etc.). What is perhaps most interesting of it all is that AMD has a part that on the surface is very close to the Broadwell-E based Intel i7 chips. The i7-6900K runs at 3.2 to 3.7 GHz, features 8 cores and 16 threads, and around 20 MB of L2/L3 cache. AMD’s top end looks to run at 3.6 GHz, features the same number of cores and threads, and has 20 MB of L2/L3 cache. The Intel part is rated at 140 watts TDP while the AMD part will have a max of 95 watts TDP.
If Ryzen is truly competitive in this top end space (with a price to undercut Intel, yet not destroy their own margins) then AMD is going to be in a good position for the rest of this year. We will find out exactly what is coming our way next month, but all indications point to Ryzen being competitive in overall performance while being able to undercut Intel in TDPs for comparable cores/threads. We are counting down the days...
Subject: Processors | September 13, 2016 - 10:51 PM | Tim Verry
Tagged: GLOBALFOUNDRIES, FD-SOI, 12FDX, process technology
In addition to the company’s efforts to get its own next generation FinFET process technology up and running, GlobalFoundries announced that will continue to pursue FD-SOI process technology with the addition of a 12nm FD-SOI (FDX in GlobalFoundries parlance) node to its roadmap with a slated release of 2019 at the earliest.
FD-SOI stands for Fully Depleted Silicon On Insulator and is a planar process technology that uses a thin insulator on top of the base silicon which is then covered by a very thin layer of silicon that is used as the transistor channel. The promise of FD-SOI is that it offers the performance of a FinFET node with lower power consumption and cost than other bulk processes. While the substrate is more expensive with FD-SOI, it uses 50% of the lithography layers and companies can take advantage of reportedly easy-to-implement body biasing to design a single chip that can fulfill multiple products and roles. For example, in the case of 22FDX – which should start rolling out towards the end of this year – GlobalFoundries claims that it offers the performance of 14 FinFET at the 28nm bulk pricing. 22FDX is actually a 14nm front end (FEOL) and 28nm back end of line (BEOL) combined. Notably, it purportedly uses 70% lower power than 28nm HKMG.
A GloFo 22nm FD-SOI "22FDX" transistor.
The FD-SOI design offers lower static leakage and allows chip makers to use body biasing (where substrate is polarized) to balance performance and leakage. Forward Body Biasing allows the transistor to switch faster and/or operate at much lower voltages. On the other hand, Reverse Body Biasing further reduces leakage and frequency to improves energy efficiency. Dynamic Body Biasing (video link) allows for things like turbo modes whereby increasing voltage to the back gate can increase transistor switching speed or reducing voltage can reduce switching speeds and leakage. For a process technology that is aimed at battery powered wearables, mobile devices, and various Internet of Things products, energy efficiency and being able to balance performance and power depending on what is needed is important.
22FDX offers body biasing.
While the process node numbers are not as interesting as the news that FD-SOI will continue itself (thanks to marketing mucking up things heh), GlobalFoundries did share that 12FDX (12nm FD-SOI) will be a true full node shrink that will offer the performance of 10nm FinFET (presumably its own future FinFET tech though they do not specify) with better power characteristics and lower cost than 16nm FinFET. I am not sure if GlobalFoundries is using theoretical numbers or compared it to TSMC’s process here since they do not have their own 16nm FinFET process. Further, 12FDX will feature 15% higher performance and up to 50% lower power consumption that today’s FinFET technologies. The future process is aimed at the “cost sensitive mobile market” that includes IoT, automotive (entertainment and AI), mobile, and networking. FD-SOI is reportedly well suited for processors that combine both digital and analog (RF) elements as well.
Following the roll out of 22FDX GlobalFoundries will be preparing its Fab 1 facility in Dresden, Germany for the 12nm FD-SOI (12FDX) process. The new process is slated to begin tapping out products in early 2019 which should mean products using chips will hit the market in 2020.
The news is interesting because it indicates that there is still interest and research/development being made on FD-SOI and GlobalFoundries is the first company to talk about next generation process plans. Samsung and STMicroelectronics also support FD-SOI but have not announced their future plans yet.
If I had to guess, Samsung will be the next company to talk about future FD-SOI as the company continues to offer both FinFET and FD-SOI to its customers though they certainly do not talk as much about the latter. What are your thoughts on FD-SOI and its place in the market?
Also read: FD-SOI Expands, But Is It Disruptive? @ EETimes
Clean Sheet and New Focus
It is no secret that AMD has been struggling for some time. The company has had success through the years, but it seems that the last decade has been somewhat bleak in terms of competitive advantages. The company has certainly made an impact in throughout the decades with their 486 products, K6, the original Athlon, and the industry changing Athlon 64. Since that time we have had a couple of bright spots with the Phenom II being far more competitive than expected, and the introduction of very solid graphics performance in their APUs.
Sadly for AMD their investment in the “Bulldozer” architecture was misplaced for where the industry was heading. While we certainly see far more software support for multi-threaded CPUs, IPC is still extremely important for most workloads. The original Bulldozer was somewhat rushed to market and was not fully optimized, while the “Piledriver” based Vishera products fixed many of these issues we have not seen the non-APU products updated to the latest Steamroller and Excavator architectures. The non-APU desktop market has been served for the past four years with 32nm PD-SOI based parts that utilize a rebranded chipset base that has not changed since 2010.
Four years ago AMD decided to change course entirely with their desktop and server CPUs. Instead of evolving the “Bulldozer” style architecture featuring CMT (Core Multi-Threading) they were going to do a clean sheet design that focused on efficiency, IPC, and scalability. While Bulldozer certainly could scale the thread count fairly effectively, the overall performance targets and clockspeeds needed to compete with Intel were just not feasible considering the challenges of process technology. AMD brought back Jim Keller to lead this effort, an industry veteran with a huge amount of experience across multiple architectures. Zen was born.
Hot Chips 28
This year’s Hot Chips is the first deep dive that we have received about the features of the Zen architecture. Mike Clark is taking us through all of the changes and advances that we can expect with the upcoming Zen products.
Zen is a clean sheet design that borrows very little from previous architectures. This is not to say that concepts that worked well in previous architectures were not revisited and optimized, but the overall floorplan has changed dramatically from what we have seen in the past. AMD did not stand still with their Bulldozer products, and the latest Excavator core does improve upon the power consumption and performance of the original. This evolution was simply not enough considering market pressures and Intel’s steady improvement of their core architecture year upon year. Zen was designed to significantly improve IPC and AMD claims that this product has a whopping 40% increase in IPC (instructions per clock) from the latest Excavator core.
AMD also has focused on scaling the Zen architecture from low power envelopes up to server level TDPs. The company looks to have pushed down the top end power envelope of Zen from the 125+ watts of Bulldozer/Vishera into the more acceptable 95 to 100 watt range. This also has allowed them to scale Zen down to the 15 to 25 watt TDP levels without sacrificing performance or overall efficiency. Most architectures have sweet spots where they tend to perform best. Vishera for example could scale nicely from 95 to 220 watts, but the design did not translate well into sub-65 watt envelopes. Excavator based “Carrizo” products on the other hand could scale from 15 watts to 65 watts without real problems, but became terribly inefficient above 65 watts with increased clockspeeds. Zen looks to address these differences by being able to scale from sub-25 watt TDPs up to 95 or 100. In theory this should allow AMD to simplify their product stack by offering a common architecture across multiple platforms.
GlobalFoundries Will Allegedly Skip 10nm and Jump to Developing 7nm Process Technology In House (Updated)
Subject: Processors | August 20, 2016 - 07:06 PM | Tim Verry
Tagged: Semiconductor, lithography, GLOBALFOUNDRIES, global foundries, euv, 7nm, 10nm
UPDATE (August 22nd, 11:11pm ET): I reached out to GlobalFoundries over the weekend for a comment and the company had this to say:
"We would like to confirm that GF is transitioning directly from 14nm to 7nm. We consider 10nm as more a half node in scaling, due to its limited performance adder over 14nm for most applications. For most customers in most of the markets, 7nm appears to be a more favorable financial equation. It offers a much larger economic benefit, as well as performance and power advantages, that in most cases balances the design cost a customer would have to spend to move to the next node.
As you stated in your article, we will be leveraging our presence at SUNY Polytechnic in Albany, the talent and know-how gained from the acquisition of IBM Microelectronics, and the world-class R&D pipeline from the IBM Research Alliance—which last year produced the industry’s first 7nm test chip with working transistors."
An unexpected bit of news popped up today via TPU that alleges GlobalFoundries is not only developing 7nm technology (expected), but that the company will skip production of the 10nm node altogether in favor of jumping straight from the 14nm FinFET technology (which it licensed from Samsung) to 7nm manufacturing based on its own in house design process.
Reportedly, the move to 7nm would offer 60% smaller chips at three times the design cost of 14nm which is to say that this would be both an expensive and impressive endeavor. Aided by Extreme Ultraviolet (EUV) lithography, GlobalFoundries expects to be able to hit 7nm production sometime in 2020 with prototyping and small usage of EUV in the year or so leading up to it. The in house process tech is likely thanks to the research being done at the APPC (Advanced Patterning and Productivity Center) in Albany New York along with the expertise of engineers and design patents and technology (e.g. ASML NXE 3300 and 3300B EUV) purchased from IBM when it acquired IBM Microelectronics. The APPC is reportedly working simultaneously on research and development of manufacturing methods (especially EUV where extremely small wavelengths of ultraviolet light (14nm and smaller) are used to etch patterns into silicon) and supporting production of chips at GlobalFoundries' "Malta" fab in New York.
Advanced Patterning and Productivity Center in Albany, NY where Global Foundries, SUNY Poly, IBM Engineers, and other partners are forging a path to 7nm and beyond semiconductor manufacturing. Photo by Lori Van Buren for Times Union.
Intel's Custom Foundry Group will start pumping out ARM chips in early 2017 followed by Intel's own 10nm Cannon Lake processors in 2018 and Samsung will be offering up its own 10nm node as soon as next year. Meanwhile, TSMC has reportedly already tapped out 10nm wafers and will being prodction in late 2016/early 2017 and claims that it will hit 5nm by 2020. With its rivals all expecting production of 10nm chips as soon as Q1 2017, GlobalFoundries will be at a distinct disadvantage for a few years and will have only its 14nm FinFET (from Samsung) and possibly its own 14nm tech to offer until it gets the 7nm production up and running (hopefully!).
Previously, GlobalFoundries has stated that:
“GLOBALFOUNDRIES is committed to an aggressive research roadmap that continually pushes the limits of semiconductor technology. With the recent acquisition of IBM Microelectronics, GLOBALFOUNDRIES has gained direct access to IBM’s continued investment in world-class semiconductor research and has significantly enhanced its ability to develop leading-edge technologies,” said Dr. Gary Patton, CTO and Senior Vice President of R&D at GLOBALFOUNDRIES. “Together with SUNY Poly, the new center will improve our capabilities and position us to advance our process geometries at 7nm and beyond.”
If this news turns out to be correct, this is an interesting move and it is certainly a gamble. However, I think that it is a gamble that GlobalFoundries needs to take to be competitive. I am curious how this will affect AMD though. While I had expected AMD to stick with 14nm for awhile, especially for Zen/CPUs, will this mean that AMD will have to go to TSMC for its future GPUs or will contract limitations (if any? I think they have a minimum amount they need to order from GlobalFoundries) mean that GPUs will remain at 14nm until GlobalFoundries can offer its own 7nm? I would guess that Vega will still be 14nm, but Navi in 2018/2019? I guess we will just have to wait and see!
- To 7nm And Beyond (Interview @ Semiconductor Engineering)
- GloFo Looks For 7nm Leadership @ Electronics Weekly
- GlobalFoundries develops 7nm and 10nm technologies in-house @ KitGuru
- SUNY Poly and GLOBALFOUNDRIES Announce New $500M R&D Program in Albany To Accelerate Next Generation Chip Technology @ GlobalFoundries (PR)
- AMD GPU Roadmap: Capsaicin Names Upcoming Architectures @ PC Perspective
- Next Gen Graphics and Process Migration: 20 nm and Beyond @ PC Perspective
Bristol Ridge Takes on Mobile: E2 Through FX
It is no secret that AMD has faced an uphill battle since the release of the original Core 2 processors from Intel. While stayed mostly competitive through the Phenom II years, they hit some major performance issues when moving to the Bulldozer architecture. While on paper the idea of Chip Multi-Threading sounded fantastic, AMD was never able to get the per thread performance up to expectations. While their CPUs performed well in heavily multi-threaded applications, they just were never seen in as positive of a light as the competing Intel products.
The other part of the performance equation that has hammered AMD is the lack of a new process node that would allow it to more adequately compete with Intel. When AMD was at 32 nm PD-SOI, Intel had introduced its 22nm TriGate/FinFET. AMD then transitioned to a 28nm HKMG planar process that was more size optimized than 32nm, but did not drastically improve upon power and transistor switching performance.
So AMD had a double whammy on their hands with an underperforming architecture and limitted to no access to advanced process nodes that would actually improve their power and speed situation. They could not force their foundry partners to spend billions on a crash course in FinFET technology to bring that to market faster, so they had to iterate and innovate on their designs.
Bristol Ridge is the fruit of that particular labor. It is also the end point to the architecture that was introduced with Bulldozer way back in 2011.
Subject: Processors | April 26, 2016 - 09:32 PM | Jeremy Hellstrom
Tagged: Wraith, Godavari, GLOBALFOUNDRIES, FM2+, amd, X4 880K
Remember that FM2+ refresh which Josh informed you about back in March? The APUs have started arriving on test benches and can be benchmarked independently to see what this ~$100 processor and the Wraith cooler are capable of. Neoseeker compares the new 880K against the older FX-4350 in a long series of benchmarks which show the 880K to be the better part in most cases. There are some interesting exceptions to this, in which the FX-4350's slightly higher frequency allows it to pull ahead by a small margin so there are cases where the less expensive chip would make sense. Read the full review to see which chip makes more sense for you.
"Today we take a look at the AMD Athlon X4 880K, a quad-core FM2+ processor with 4.0/4.2GHz base/Turbo clocks and unlocked multiplier priced at under $100 USD. It's designed for enthusiasts on a budget looking for the fastest multi-core Athlon processor yet without any integrated GPU to add to the cost. It even shares the 95W TDP of AMD's higher-end APUs for optimized power consumption that further leads to more overclocking headroom."
Here are some more Processor articles from around the web:
- The AMD Athlon X4 880K Review @ Hardware Canucks
- A10-7870K vs. Core i3-6100 CPU @ Hardware Secrets
- £150 Gaming CPU: AMD FX 8370 (w/ Wraith) vs Intel Core i5-6400 @ Kitguru
Subject: General Tech | April 7, 2016 - 07:43 PM | Jeremy Hellstrom
Tagged: GLOBALFOUNDRIES, IBM, power9
IBM's Power9 processor is scheduled to appear on the scene just over a year from now and finally we have some details about what it will be. Firstly the core count is to be two higher than Intel, 24 cores and is optimized for use in two socket servers. The chips are 14nm FinFETs fabbed by GLOBALFOUNDRIES which will be compatible with modern industry standards including DDR4, PCIe 4.0 and NVLink 2.0 so you can even take advantage of Jen-Hsun's latest products.
The list of customers is quite impressive, Google has moved to Power8 already and described changing to the infrastructure as simple as flipping a switch, the US Department of Energy will build their next HPCs using Power9 and Rackspace is currently working with Google to develop Power9 server blueprints for the Open Compute Project.
Several Chinese companies will take advantage of those OpenPower blueprints to develop their own 'partner chips', Power8 and 9 architecture which will be using 10nm gates in 2018 to 2020. This is somewhat amusing considering the shipping of Xeon processors to China has been banned by the US Government. Check out more of the slides from IBM's presentation at The Register.
"IBM's Power9 processor, due to arrive in the second half of next year, will have 24 cores, double that of today's Power8 chips, it emerged today.
Meanwhile, Google has gone public with its Power work – confirming it has ported many of its big-name web services to the architecture, and that rebuilding its stack for non-Intel gear is a simple switch flip."
Here is some more Tech News from around the web:
- Intel restructures financial reporting to 'provide visibility' @ The Register
- Torvalds Hasn't Given Up On Linux Desktop Domination, Will 'Wear Them Down' @ Slashdot
- Dude makes Raspberry Pi Zero Nintendo Game Boy emulator @ The Inquirer
- FBI Telling Congress How It Hacked iPhone @ Slashdot
- Windows 10: first Insider Build with Bash for Linux arrives @ The Inquirer
- Error Correction of 3D Printers @ Hack a Day
Clockspeed Jump and More!
On March 1st AMD announced the availability of two new processors as well as more information on the A10 7860 APU.
The two new units are the A10-7890K and the Athlon X4 880K. These are both Kaveri based parts, but of course the Athlon has the GPU portion disabled. Product refreshes for the past several years have followed a far different schedule than the days of yore. Remember back in time when the Phenom II series and the competing Core 2 series would have clockspeed updates that were expected yearly, if not every half year with a slightly faster top end performer to garner top dollar from consumers?
Things have changed, for better or worse. We have so far seen two clockspeed bumps for the Kaveri /Godavari based APU. Kaveri was first introduced over two years ago with the A10-7850K and the lower end derivatives. The 7850K has a clockspeed that ranges from 3.7 GHz to the max 4 GHz with boost. The GPU portion is clocked at 720 MHz. This is a 95 watt TDP part that is one of the introductory units from GLOBALFOUNDRIES 28 nm HKMG process.
Today the new top end A10-7890K is clocked at 4.1 GHz to 4.3 GHz max. The GPU receives a significant boost in performance with a clockspeed of 866 MHz. The combination of CPU and GPU clockspeed increases push the total performance of the part exceeding 1 TFLOPs. It features the same dual module/quad core Godavari design as well as the 8 GCN Units. The interesting part here is that the APU does not exceed the 95 watt TDP that it shares with the older and slower 7850K. It is also a boost in performance from last year’s refresh of the A10-7870K which is clocked 200 MHz slower on the CPU portion but retains the 866 MHz speed of the GPU. This APU is fully unlocked so a user can easily overclock both the CPU and GPU cores.
The Athlon X4 880K is still based on the Godavari family rather than the Carizzo update that the X4 845 uses. This part is clocked from 4.0 to 4.2 GHz. It again retains the 95 watt TDP rating of the previous Athlon X4 CPUs. Previously the X4 860K was the highest clocked unit at 3.7 GHz to 4.0, but the 880K raises that to 4 to 4.2 GHz. A 300 MHz gain in base clock is pretty significant as well as stretching that ceiling to 4.2 GHz. The Godavari modules retain their full amount of L2 cache so the 880K has 4 MB available to it. These parts are very popular with budget enthusiasts and gaming builds as they are extremely inexpensive and perform at an acceptable level with free overclocking thrown in.
AMD Polaris Architecture Coming Mid-2016
In early December, I was able to spend some time with members of the newly formed Radeon Technologies Group (RTG), which is a revitalized and compartmentalized section of AMD that is taking over all graphics work. During those meetings, I was able to learn quite a bit about the plans for RTG going forward, including changes for AMD FreeSync and implementation of HDR display technology, and their plans for the GPUOpen open-sourced game development platform. Perhaps most intriguing of all: we received some information about the next-generation GPU architecture, targeted for 2016.
Codenamed Polaris, this new architecture will be the 4th generation of GCN (Graphics Core Next), and it will be the first AMD GPU that is built on FinFET process technology. These two changes combined promise to offer the biggest improvement in performance per watt, generation to generation, in AMD’s history.
Though the amount of information provided about the Polaris architecture is light, RTG does promise some changes to the 4th iteration of its GCN design. Those include primitive discard acceleration, an improved hardware scheduler, better pre-fetch, increased shader efficiency, and stronger memory compression. We have already discussed in a previous story that the new GPUs will include HDMI 2.0a and DisplayPort 1.3 display interfaces, which offer some impressive new features and bandwidth. From a multimedia perspective, Polaris will be the first GPU to include support for h.265 4K decode and encode acceleration.
This slide shows us quite a few changes, most of which were never discussed specifically that we can report, coming to Polaris. Geometry processing and the memory controller stand out as potentially interesting to me – AMD’s Fiji design continues to lag behind NVIDIA’s Maxwell in terms of tessellation performance and we would love to see that shift. I am also very curious to see how the memory controller is configured on the entire Polaris lineup of GPUs – we saw the introduction of HBM (high bandwidth memory) with the Fury line of cards.
Podcast #377 - AMD Radeon Software Crimson, our Holiday Gift Guide, Scott Wasson moving to AMD and more!
Subject: General Tech | December 3, 2015 - 08:40 PM | Ken Addison
Tagged: podcast, video, amd, radeon software, crimson, holiday gift guide, ATIC, GLOBALFOUNDRIES, raspberry pi zero, scott wasson, tech report, Thinkpad, yoga p40
PC Perspective Podcast #377 - 12/03/2015
Join us this week as we discuss AMD Radeon Software Crimson, our Holiday Gift Guide, Scott Wasson moving to AMD and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Jeremy Hellstrom, Josh Walrath, and Sebastian Peak
Program length: 0:54:20
Week in Review:
News item of interest:
Hardware/Software Picks of the Week:
Ryan: Emergency Underwear
Sebastian: Quad-core CPU for $70!!