IBM Prepares Power9 CPUs to Power Servers and Supercomputers In 2018

Subject: Processors | September 2, 2016 - 01:39 AM |
Tagged: IBM, power9, power 3.0, 14nm, global foundries, hot chips

Earlier this month at the Hot Chips symposium, IBM revealed details on its upcoming Power9 processors and architecture. The new chips are aimed squarely at the data center and will be used for massive number crunching in big data and scientific applications in servers and supercomputer nodes.

Power9 is a big play from Big Blue, and will help the company expand its presence in the Intel-ruled datacenter market. Power9 processors are due out in 2018 and will be fabricated at Global Foundries on a 14nm HP FinFET process. The chips feature eight billion transistors and utilize an “execution slice microarchitecture” that lets IBM combine “slices” of fixed, floating point, and SIMD hardware into cores that support various levels of threading. Specifically, 2 slices make an SMT4 core and 4 slices make an SMT8 core. IBM will have Power9 processors with 24 SMT4 cores or 12 SMT8 cores (more on that later). Further, Power9 is IBM’s first processor to support its Power ISA 3.0 instruction set.

IBM Power9.jpg

According to IBM, its Power9 processors are between 50% and 125% faster than the previous generation Power8 CPUs depending on the application tested. The performance improvement is thanks to a doubling of the number of cores as well as a number of other smaller improvements including:

  • A 5 cycle shorter pipeline versus Power8
  • A single instruction random number generator (RNG)
  • Hardware assisted garbage collection for interpreted languages (e.g. Java)
  • New interrupt architecture
  • 128-bit quad precision floating point and decimal math support
    • Important for finance and security markets, massive databases and money math.
    • IEEE 754
  • CAPI 2.0 and NVLink support
  • Hardware accelerators for encryption and compression

The Power9 processor features 120 MB of direct attached eDRAM that acts as an L3 cache (256 GB/s). The chips offer up to 7 TB/s of aggregate fabric bandwidth, which certainly sounds impressive, but that is a number with everything added together. With that said, there is a lot going on under the hood. Power9 supports 48 lanes of PCI-E 4.0 (2 GB/s per lane per direction) and 48 proprietary 25Gbps accelerator lanes, which will be used for NVLink 2.0 to connect to NVIDIA GPUs as well as to connect to FPGAs, ASICs, and other accelerators or new memory technologies using CAPI 2.0 (Coherent Accelerator Processor Interface). It also has four 16Gbps SMP links (NUMA) used to combine four quad socket Power9 boards into a single 16 socket “cluster.”
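To put the per-lane figures above in perspective, here is a rough sketch of the raw I/O bandwidth they imply. It assumes the 25Gbps rate applies per lane and per direction, and it ignores encoding and protocol overhead, so treat the results as upper bounds rather than IBM's own numbers. The function names are made up for this illustration.

```python
BITS_PER_BYTE = 8

def pcie_gb_per_s(lanes, gb_per_s_per_lane):
    """Raw one-direction bandwidth in GB/s for lanes rated in GB/s."""
    return lanes * gb_per_s_per_lane

def serial_link_gb_per_s(lanes, gbit_per_s_per_lane):
    """Raw one-direction bandwidth in GB/s for lanes rated in Gbps.

    Assumption: no encoding overhead is subtracted.
    """
    return lanes * gbit_per_s_per_lane / BITS_PER_BYTE

pcie = pcie_gb_per_s(48, 2)            # 48 lanes of PCI-E 4.0 at 2 GB/s each
accel = serial_link_gb_per_s(48, 25)   # 48 accelerator lanes at 25 Gbps each

print(f"PCI-E 4.0:         {pcie} GB/s per direction")
print(f"Accelerator lanes: {accel:.0f} GB/s per direction")
```

Under these assumptions the PCI-E lanes come to 96 GB/s per direction and the accelerator lanes to 150 GB/s per direction, both far below the 7 TB/s aggregate figure, which folds in the on-chip fabric as well.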

These are processors that are built to scale and tackle the big data problems. In fact, not only is Google interested in Power9 to power its services, but the US Department of Energy will be building two supercomputers using IBM’s Power9 CPUs and NVIDIA’s Volta GPUs. Summit and Sierra will offer between 100 and 300 Petaflops of compute power and will be installed at Oak Ridge National Laboratory and Lawrence Livermore National Laboratory respectively. There, some of the projects they will tackle include enabling researchers to visualize the internals of a virtual light water reactor, researching methods to improve fuel economy, and delving further into bioinformatics research.

The Power9 processors will be available in four variants that differ in the number of cores and number of threads each core supports. The chips are broken down into Power9 SO (Scale Out) and Power9 SU (Scale Up) and each group has two processors depending on whether you need a greater number of weaker cores or a smaller number of more powerful cores. Power9 SO chips are intended for multi-core systems and will be used in servers with one or two sockets, while Power9 SU chips are for multi-processor systems with up to four sockets per board and up to 16 total sockets per cluster when four four-socket boards are linked together. Power9 SO uses DDR4 memory and supports a theoretical maximum of 4TB of memory (1TB with today’s 64GB DIMMs) and 120 GB/s of bandwidth, while Power9 SU uses IBM’s buffered “Centaur” memory scheme that allows the systems to address a theoretical maximum of 8TB of memory (2TB with 64GB DIMMs) at 230 GB/s. In other words, the SU series is Big Blue’s “big guns.”
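The memory figures above can be cross-checked with some quick arithmetic. The sketch below assumes binary units (1TB = 1024GB) and infers a DIMM count from the "with 64GB DIMMs" capacities. That count is my inference, not something IBM stated, and the function names are made up for this illustration.

```python
GB_PER_TB = 1024  # assumption: binary units

def implied_dimm_count(capacity_tb, dimm_gb):
    """Number of DIMMs needed to reach a capacity with a given DIMM size."""
    return capacity_tb * GB_PER_TB // dimm_gb

def implied_dimm_size_gb(max_capacity_tb, dimm_count):
    """DIMM size needed to reach the theoretical maximum with that many DIMMs."""
    return max_capacity_tb * GB_PER_TB // dimm_count

# (capacity with 64GB DIMMs in TB, theoretical maximum in TB), from the article
variants = {"Power9 SO": (1, 4), "Power9 SU": (2, 8)}

for name, (today_tb, max_tb) in variants.items():
    dimms = implied_dimm_count(today_tb, 64)
    size = implied_dimm_size_gb(max_tb, dimms)
    print(f"{name}: {dimms} DIMMs implied, {size}GB each for {max_tb}TB")
```

Under these assumptions the SO numbers work out to 16 DIMMs and the SU numbers to 32, with both theoretical maximums requiring 256GB modules.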

Power9 SO Die Shot Photo.jpg

A photo of the 24 core SMT4 Power9 SO die.

Here is where it gets a bit muddy. The processors are further broken down into SMT4 and SMT8 versions, and both Power9 SO and Power9 SU have both options. There are Power9 CPUs with 24 SMT4 cores and there are CPUs with 12 SMT8 cores. IBM indicated that SMT4 (four threads per core) was suited to systems running Linux and virtualization with emphasis on high core counts. Meanwhile SMT8 (eight threads per core) is a better option for large logical partitions (one big system versus partitioning out the compute cluster into smaller VMs as above) and running IBM’s Hypervisor. In either case (24 SMT4 or 12 SMT8) there is the same number of total threads, but you are able to choose whether you want fewer “stronger” threads on each core or more (albeit weaker) threads per core depending on which your workloads are optimized for.
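The "same number of total threads" point is easy to verify from the figures in this article: 2 slices per SMT4 core, 4 slices per SMT8 core, and 24 or 12 cores respectively. The short sketch below is only an illustration, and the dictionary layout and function name are my own.

```python
# Core configurations as described in the article.
CONFIGS = {
    "SMT4": {"cores": 24, "threads_per_core": 4, "slices_per_core": 2},
    "SMT8": {"cores": 12, "threads_per_core": 8, "slices_per_core": 4},
}

def totals(config):
    """Return (total threads, total execution slices) for one chip."""
    threads = config["cores"] * config["threads_per_core"]
    slices = config["cores"] * config["slices_per_core"]
    return threads, slices

for name, config in CONFIGS.items():
    threads, slices = totals(config)
    print(f"{name}: {threads} threads, {slices} slices")
```

Both configurations come out to 96 threads and 48 execution slices per chip, which fits the idea that the two variants are the same pool of slices grouped into cores differently.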

Servers supporting Power9 are already under development by Google and Rackspace and blueprints are even available from the OpenPower Foundation. Currently, it appears that Power9 SO will emerge as soon as the second half of next year (2H 2017) with Power9 SU following in 2018 which would line up with the expected date for the Summit and Sierra supercomputer launches.

This is not a chip that will be showing up in your desktop any time soon, but it is an interesting high performance processor! I will be keeping an eye on updates from Oak Ridge lab hehe.

GlobalFoundries Will Allegedly Skip 10nm and Jump to Developing 7nm Process Technology In House (Updated)

Subject: Processors | August 20, 2016 - 03:06 PM |
Tagged: Semiconductor, lithography, GLOBALFOUNDRIES, global foundries, euv, 7nm, 10nm

UPDATE (August 22nd, 11:11pm ET): I reached out to GlobalFoundries over the weekend for a comment and the company had this to say:

"We would like to confirm that GF is transitioning directly from 14nm to 7nm. We consider 10nm as more a half node in scaling, due to its limited performance adder over 14nm for most applications. For most customers in most of the markets, 7nm appears to be a more favorable financial equation. It offers a much larger economic benefit, as well as performance and power advantages, that in most cases balances the design cost a customer would have to spend to move to the next node.

As you stated in your article, we will be leveraging our presence at SUNY Polytechnic in Albany, the talent and know-how gained from the acquisition of IBM Microelectronics, and the world-class R&D pipeline from the IBM Research Alliance—which last year produced the industry’s first 7nm test chip with working transistors."

An unexpected bit of news popped up today via TPU that alleges GlobalFoundries is not only developing 7nm technology (expected), but that the company will skip production of the 10nm node altogether in favor of jumping straight from the 14nm FinFET technology (which it licensed from Samsung) to 7nm manufacturing based on its own in-house design process.

Reportedly, the move to 7nm would offer 60% smaller chips at three times the design cost of 14nm, which is to say that this would be both an expensive and impressive endeavor. Aided by Extreme Ultraviolet (EUV) lithography, GlobalFoundries expects to be able to hit 7nm production sometime in 2020 with prototyping and small usage of EUV in the year or so leading up to it. The in-house process tech is likely thanks to the research being done at the APPC (Advanced Patterning and Productivity Center) in Albany, New York along with the expertise of engineers and design patents and technology (e.g. ASML NXE 3300 and 3300B EUV) purchased from IBM when GlobalFoundries acquired IBM Microelectronics. The APPC is reportedly working simultaneously on research and development of manufacturing methods (especially EUV, where extremely short wavelengths of ultraviolet light (around 13.5nm) are used to pattern silicon) and supporting production of chips at GlobalFoundries' "Malta" fab in New York.

APPC in Albany NY.jpg

Advanced Patterning and Productivity Center in Albany, NY where Global Foundries, SUNY Poly, IBM Engineers, and other partners are forging a path to 7nm and beyond semiconductor manufacturing. Photo by Lori Van Buren for Times Union.

Intel's Custom Foundry Group will start pumping out ARM chips in early 2017 followed by Intel's own 10nm Cannon Lake processors in 2018, and Samsung will be offering up its own 10nm node as soon as next year. Meanwhile, TSMC has reportedly already taped out 10nm wafers, will begin production in late 2016/early 2017, and claims that it will hit 5nm by 2020. With its rivals all expecting production of 10nm chips as soon as Q1 2017, GlobalFoundries will be at a distinct disadvantage for a few years and will have only its 14nm FinFET (from Samsung) and possibly its own 14nm tech to offer until it gets the 7nm production up and running (hopefully!).

Previously, GlobalFoundries has stated that:

“GLOBALFOUNDRIES is committed to an aggressive research roadmap that continually pushes the limits of semiconductor technology. With the recent acquisition of IBM Microelectronics, GLOBALFOUNDRIES has gained direct access to IBM’s continued investment in world-class semiconductor research and has significantly enhanced its ability to develop leading-edge technologies,” said Dr. Gary Patton, CTO and Senior Vice President of R&D at GLOBALFOUNDRIES. “Together with SUNY Poly, the new center will improve our capabilities and position us to advance our process geometries at 7nm and beyond.” 

If this news turns out to be correct, this is an interesting move and it is certainly a gamble. However, I think that it is a gamble that GlobalFoundries needs to take to be competitive. I am curious how this will affect AMD though. While I had expected AMD to stick with 14nm for a while, especially for Zen/CPUs, will this mean that AMD will have to go to TSMC for its future GPUs, or will contract limitations (if any? I think they have a minimum amount they need to order from GlobalFoundries) mean that GPUs will remain at 14nm until GlobalFoundries can offer its own 7nm? I would guess that Vega will still be 14nm, but Navi in 2018/2019? I guess we will just have to wait and see!

Source: TechPowerUp

GLOBALFOUNDRIES Achieves 14nm FinFET - Coming to New AMD Products

Subject: Processors | November 6, 2015 - 10:09 AM |
Tagged: tape out, processors, GLOBALFOUNDRIES, global foundries, APU, amd, 14 nm FinFET

GlobalFoundries has today officially announced their success with sample 14 nm FinFET production for upcoming AMD products.


(Image credit: KitGuru)

GlobalFoundries licensed 14 nm LPE and LPP technology from Samsung in 2014, and were producing wafers as early as April of this year. At the time a GF company spokesperson was quoted in this report at KitGuru, stating "the early version (14LPE) is qualified in our fab and our lead product is yielding in double digits. Since 2014, we have taped multiple products and testchips and are seeing rapid progress, in yield and maturity, for volume shipments in 2015." Now they have moved past LPE (Low Power Early) to LPP (Low Power Plus), with new products based on the technology slated for 2016:

"AMD has taped out multiple products using GLOBALFOUNDRIES’ 14nm Low Power Plus (14LPP) process technology and is currently conducting validation work on 14LPP production samples.  Today’s announcement represents another significant milestone towards reaching full production readiness of GLOBALFOUNDRIES’ 14LPP process technology, which will reach high-volume production in 2016."

GlobalFoundries was originally the manufacturing arm of AMD, and has continued to produce the company's processors since the spin-off in 2009. AMD's current desktop FX-8350 CPU was manufactured on 32 nm SOI, and more recently APUs such as the A10-7850K have been produced at 28 nm - both at GlobalFoundries. Intel's latest offerings such as the flagship 6700K desktop CPU are produced with Intel's 14nm process, and the success of the 14LPP production at GlobalFoundries has the potential to bring AMD's new processors closer to parity with Intel (at least from a lithography standpoint).

TSMC gets AMD's 28nm APU business

Subject: General Tech | June 17, 2011 - 02:24 PM |
Tagged: TSMC, southern islands, northern islands, llano, global foundries, arm, amd, 40nm, 32nm, 28nm

Back in April there was a kerfuffle in the news about a deal penned between AMD, Global Foundries and TSMC. It is not worth repeating completely, as you can follow the story by using the previous link; suffice it to say that it did not indicate problems with the relationship between AMD and Global Foundries.

The previous post was specifically about 40nm and 32nm process chips; however, today we hear from DigiTimes that TSMC has scored a deal with AMD for the 28nm Southern Islands GPUs, of which we have seen much recently, as well as the 28nm Krishna and Wichita APUs. The 40nm Northern Islands GPUs will also be produced by TSMC. That leaves a lot of production capabilities free at Global Foundries to work on ARM processors.


"AMD reportedly has completed the tape-out of its next-generation GPU, codenamed Southern Islands, on Taiwan Semiconductor Manufacturing Company's (TSMC) 28nm process with High-k Metal Gate (HKMG) technology, according to a Chinese-language Commercial Times report. The chip is expected to enter mass production at the end of 2011.

TSMC will also be AMD's major foundry partner for the 28nm Krishna and Wichita accelerated processing units (APUs), with volume production set to begin in the first half of 2012, the report said.

TSMC reportedly contract manufactures the Ontario, Zacate and Desna APUs for AMD as well as the Northern Island family of GPUs. All of these use the foundry's 40nm process technology.

TSMC was quoted as saying in previous reports that it had begun equipment move-in for the phase one facility of a new 12-inch fab (Fab 15) with volume production of 28nm technology products slated for the fourth quarter of 2011. The foundry previously said it would begin moving equipment into the facility in June, with volume production expected to kick off in the first quarter of 2012."

Source: DigiTimes

Visiting Global Foundries Fab 8

Subject: General Tech | April 20, 2011 - 12:09 PM |
Tagged: video, global foundries, foundry, fab 8

[H]ard|OCP were gifted with a chance to visit Global Foundries' Fab 8 facility, which is being built in central New York state. [H] took along a video camera to let you see some of the facilities as they stand now. Not only is this impressive because of the size of the Fab, it is also one of the biggest construction jobs, period.


"GLOBALFOUNDRIES was kind enough to let HardOCP into its new Fab 8 facility in Malta, New York. While far from finished, this 28/20nm plant will be ramping to full production in 2012. Check out the video to understand the sheer scale of the largest construction project currently underway in the United States."

Source: [H]ard|OCP

Much ado about nothing: AMD and Global Foundries supposed tiff

Subject: General Tech | April 4, 2011 - 11:29 AM |
Tagged: TSMC, global foundries, amd

Over the weekend conspiracy theorists perked up their ears about an announced change in the way AMD will purchase 32nm chips from Global Foundries. What seemed to be odd was the inclusion of the term "paying per good chip", something that is not done in the industry, even with horrible yields such as we saw with TSMC's 40nm process. A call this morning filled in the missing details and SemiAccurate was there to report on it. The long and short of it has nothing to do with yields, as they are still looking good. Instead it seems like a way for AMD to ensure they have a good supply of 32nm chips no matter how the actual production plays out, and are not stuck paying for unusable chips, while at the same time giving Global Foundries a way to get some money out of AMD if yields and sales are high. This is very good news for companies like ATIC and Mubadala which have a stake in both AMD and Global Foundries.

"The AMD (AMD) and Global Foundries Wafer Purchase Agreement (WPA) that was released yesterday made little to no sense. On a conference call today, AMD’s Interim CEO Thomas Seifert filled in the missing pieces, it all makes sense now.

Few things are more beloved by journalists than a 5:30am PST financial conference call, but this one was worth it, especially in light of the questions left hanging by yesterday’s announcement. We stated that on the surface, it sure sounded like AMD was tearing Global Foundries a new reticle for use in debugging their 32nm process. That however contradicted the facts we had heard on the ground, as of late last year, there simply were not 32nm yield problems. So why was the press release written the way it was, and what is really going on?"

Source: SemiAccurate