IBM Prepares Power9 CPUs to Power Servers and Supercomputers In 2018

Subject: Processors | September 2, 2016 - 01:39 AM |
Tagged: IBM, power9, power 3.0, 14nm, global foundries, hot chips

Earlier this month at the Hot Chips symposium, IBM revealed details on its upcoming Power9 processors and architecture. The new chips are aimed squarely at the data center and will be used for massive number crunching in big data and scientific applications in servers and supercomputer nodes.

Power9 is a big play from Big Blue, and will help the company expand its precense in the Intel-ruled datacenter market. Power9 processors are due out in 2018 and will be fabricated at Global Foundries on a 14nm HP FinFET process. The chips feature eight billion transistors and utilize an “execution slice microarchitecture” that lets IBM combine “slices” of fixed, floating point, and SIMD hardware into cores that support various levels of threading. Specifically, 2 slices make an SMT4 core and 4 slices make an SMT8 core. IBM will have Power9 processors with 24 SMT4 cores or 12 SMT8 cores (more on that later). Further, Power9 is IBM’s first processor to support its Power 3.0 instruction set.

IBM Power9.jpg

According to IBM, its Power9 processors are between 50% to 125% faster than the previous generation Power8 CPUs depending on the application tested. The performance improvement is thanks to a doubling of the number of cores as well as a number of other smaller improvements including:

  • A 5 cycle shorter pipeline versus Power8
  • A single instruction random number generator (RNG)
  • Hardware assisted garbage collection for interpreted languages (e.g. Java)
  • New interrupt architecture
  • 128-bit quad precision floating point and decimal math support
    • Important for finance and security markets, massive databases and money math.
    • IEEE 754
  • CAPI 2.0 and NVLink support
  • Hardware accelerators for encryption and compression

The Power9 processor features 120 MB of direct attached eDRAM that acts as an L3 cache (256 GB/s). The chips offer up 7TB/s of aggregate fabric bandwidth which certainly sounds impressive but that is a number with everything added together. With that said, there is a lot going on under the hood. Power9 supports 48 lanes of PCI-E 4.0 (2 GB/s per lane per direction), 48 lanes of proprietary 25Gbps accelerator lanes – these will be used for NVLink 2.0 to connect to NVIDIA GPUs as well as to connect to FPGAs, ASICs, and other accelerators or new memory technologies using CAPI 2.0 (Coherent Accelerator Processor Interface) – , and four 16Gbps SMP links (NUMA) used to combine four quad socket Power9 boards into a single 16 socket “cluster.”

These are processors that are built to scale and tackle the big data problems. In fact, not only is Google interested in Power9 to power its services, but the US Department of Energy will be building two supercomputers using IBM’s Power9 CPUs and NVIDI’s Volta GPUs. Summit and Sierra will offer between 100 to 300 Petaflops of computer power and will be installed at Oak Ridge National Laboratory and Lawrence Livermore National Laboratory respectively. There, some of the projects they will tackle is enabling the researchers to visualize the internals of a virtual light water reactor, research methods to improve fuel economy, and delve further into bioinformatics research.

The Power9 processors will be available in four variants that differ in the number of cores and number of threads each core supports. The chips are broken down into Power9 SO (Scale Out) and Power9 SU (Scale Up) and each group has two processors depending on whether you need a greater number of weaker cores or a smaller number of more powerful cores. Power9 SO chips are intended for multi-core systems and will be used in servers with one or two sockets while Power9 SU chips are for multi-processor systems with up to four sockets per board and up to 16 total sockets per cluster when four four socket boards are linked together. Power9 SO uses DDR4 memory and supports a theoretical maximum 4TB of memory (1TB with today’s 64GB DIMMS) and 120 GB/s of bandwidth while Power9 SU uses IBM’s buffered “Centaur” memory scheme that allows the systems to address a theoretical maximum of 8TB of memory (2TB with 64GB DIMMS) at 230 GB/s. In other words, the SU series is Big Blue’s “big guns.”

Power9 SO Die Shot Photo.jpg

A photo of the 24 core SMT4 Power9 SO die.

Here is where it gets a bit muddy. The processors are further broken down by an SMT4 or SMT8 and both Power9 SO and Power9 SU have both options. There are Power9 CPUs with 24 SMT4 cores and there are CPUs with 12 SMT8 cores. IBM indicated that SMT4 (four threads per core) was suited to systems running Linux and virtualization with emphasis on high core counts. Meanwhile SMT8 (eight threads per core) is a better option for large logical partitions (one big system versus partitioning out the compute cluster into smaller VMs as above) and running IBM’s Hypervisor. In either case (24 SMT4 or 12 SMT8) there is the same number of total threads, but you are able to choose whether you want fewer “stronger” threads on each core or more (albeit weaker) threads per core depending on which you workloads are optimized for.

Servers supporting Power9 are already under development by Google and Rackspace and blueprints are even available from the OpenPower Foundation. Currently, it appears that Power9 SO will emerge as soon as the second half of next year (2H 2017) with Power9 SU following in 2018 which would line up with the expected date for the Summit and Sierra supercomputer launches.

This is not a chip that will be showing up in your desktop any time soon, but it is an interesting high performance processor! I will be keeping an eye on updates from Oak Ridge lab hehe.

Intel Will Release 14nm Coffee Lake To Succeed Kaby Lake In 2018

Subject: Processors | July 28, 2016 - 02:47 PM |
Tagged: kaby lake, Intel, gt3e, coffee lake, 14nm

Intel will allegedly be releasing another 14nm processor following Kaby Lake (which is itself a 14nm successor to Skylake) in 2018. The new processors are code named "Coffee Lake" and will be released alongside low power runs of 10nm Cannon Lake chips. 

Intel Coffe Lake to Coexist With Cannon Lake.jpg

Not much information is known about Coffee Lake outside of leaked slides and rumors, but the first processors slated to launch in 2018 will be mainstream mobile chips that will come in U and HQ mobile flavors which are 15W to 28W and 35W to 45W TDP chips respectively. Of course, these processors will be built on a very mature 14nm process with the usual small performance and efficiency gains beyond Skylake and Kaby Lake. The chips should have a better graphics unit, but perhaps more interesting is that the slides suggest that Coffee Lake will be the first architecture where Intel will bring "hexacore" (6 core) processors into mainstream consumer chips! The HQ-class Coffee Lake processors will reportedly come in two, four, and six core variants with Intel GT3e class GPUs. Meanwhile the lower power U-class chips top out at dual cores with GT3e class graphics. This is interesting because Intel has previous held back the six core CPUs for its more expensive and higher margin HEDT and Xeon platforms.

Of course 2018 is also the year for Cannon Lake which would have been the "tock" in Intel's old tick-tock schedule (which is no more) as the chips will move to a smaller process node and then Intel would improve on the 10nm process from there in future architectures. Cannon Lake is supposed to be built on the tiny 10nm node, and it appears that the first chips on this node will be ultra low power versions for laptops and tablets. Occupying the ULV platform's U-class (15W) and Y-class (4.5W), Cannon Lake CPUs will be dual cores with GT2 graphics. These chips should sip power while giving comparable performance to Kaby and Coffee Lake perhaps even matching the performance of the Coffee Lake U processors!

Stay tuned to PC Perspective for more information!

Qualcomm Announces X16 Modem Featuring Gigabit LTE

Subject: Mobile | February 12, 2016 - 04:26 PM |
Tagged: X16 modem, qualcomm, mu-mimo, modem, LTE, Gigabit LTE, FinFET, Carrier Aggregation, 14nm

Qualcomm’s new X16 LTE Modem is the industry's first Gigabit LTE chipset to be announced, achieving speeds of up to 1 Gbps using 4x Carrier Aggregation. The X16 succeeds the recently announced X12 modem, improving on the X12's 3x Carrier Aggregation and moving from LTE CAT 12 to CAT 16 on the downlink, while retaining CAT 13 on the uplink.

qc_x16_image.png

"In order to make a Gigabit Class LTE modem a reality, Qualcomm added a suite of enhancements – built on a foundation of commercially-proven Carrier Aggregation technology. The Snapdragon X16 LTE modem employs sophisticated digital signal processing to pack more bits per transmission with 256-QAM, receives data on four antennas through 4x4 MIMO, and supports for up to 4x Carrier Aggregation — all of which come together to achieve unprecedented download speeds."

Gigabit speeds are only possible if multiple data streams are connected to the device simultaneously, and with the new X16 modem such aggregation is performed using LTE-U and LAA.

X16_slide.png

(Image via EE Times)

What does all of this mean? Aggregation is a term you'll see a lot as we progress into the next generation of cellular data technology, and with the X16 Qualcomm is emphasizing carrier over link aggregation. Essentially Carrier Aggregation works by combining the carrier LTE data signal (licensed, high transmit power) with a shorter-range, shared spectrum (unlicensed, low transmit power) LTE signal. When the signals are combined at the device (i.e. your smartphone), significantly better throughput is possible with this larger (aggregated) data ‘pipe’.

qc5.png

Qualcomm lists the four main options for unlicensed LTE deployment as follows:

  • LTE-U: Based on 3GPP Rel. 12, LTE-U targets early mobile operators deployments in USA, Korea and India, with coexistence tests defined by LTE-U forum
  • LAA: Defined in 3GPP Rel. 13, LAA (Licensed Assisted Access) targets deployments in Europe, Japan, & beyond.
  • LWA: Defined in 3GPP Rel. 13, LWA (LTE - Wi-Fi link aggregation) targets deployments where the operators already has carrier Wi-Fi deployments.
  • MulteFire: Broadens the LTE ecosystem to new deployment opportunities by operating solely in unlicensed spectrum without a licensed anchor channel

The X16 is also Qualcomm’s first modem to be built on 14nm FinFet process, which Qualcomm says is highly scalable and will enable the company to evolve the modem product line “to address an even wider range of product, all the way down to power-efficient connectivity for IoT devices.”

Qualcomm has already begun sampling the X16, and expects the first commercial products in the second half of 2016.

Source: Qualcomm

CES 2016 Podcast Day 1 - Lenovo, NVIDIA Press Conference, new AMD GPUs and more!

Subject: General Tech | January 5, 2016 - 04:40 AM |
Tagged: podcast, video, CES, CES 2016, Lenovo, Thinkpad, x1 carbon, x1 yoga, nvidia, pascal, amd, Polaris, FinFET, 14nm

CES 2016 Podcast Day 1 - 01/05/16

CES is just beginning. Join us for announcements from Lenovo, NVIDIA Press Conference, new AMD GPUs and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Josh Walrath, Allyn Malventano, Ken Addison and Sebastian Peak

Program length: 1:11:05

Be sure to subscribe to the PC Perspective YouTube channel!!

Author:
Manufacturer: AMD

AMD Polaris Architecture Coming Mid-2016

In early December, I was able to spend some time with members of the newly formed Radeon Technologies Group (RTG), which is a revitalized and compartmentalized section of AMD that is taking over all graphics work. During those meetings, I was able to learn quite a bit about the plans for RTG going forward, including changes for AMD FreeSync and implementation of HDR display technology, and their plans for the GPUOpen open-sourced game development platform.  Perhaps most intriguing of all: we received some information about the next-generation GPU architecture, targeted for 2016.

Codenamed Polaris, this new architecture will be the 4th generation of GCN (Graphics Core Next), and it will be the first AMD GPU that is built on FinFET process technology. These two changes combined promise to offer the biggest improvement in performance per watt, generation to generation, in AMD’s history.

polaris-5.jpg

Though the amount of information provided about the Polaris architecture is light, RTG does promise some changes to the 4th iteration of its GCN design. Those include primitive discard acceleration, an improved hardware scheduler, better pre-fetch, increased shader efficiency, and stronger memory compression. We have already discussed in a previous story that the new GPUs will include HDMI 2.0a and DisplayPort 1.3 display interfaces, which offer some impressive new features and bandwidth. From a multimedia perspective, Polaris will be the first GPU to include support for h.265 4K decode and encode acceleration.

polaris-15.jpg

This slide shows us quite a few changes, most of which were never discussed specifically that we can report, coming to Polaris. Geometry processing and the memory controller stand out as potentially interesting to me – AMD’s Fiji design continues to lag behind NVIDIA’s Maxwell in terms of tessellation performance and we would love to see that shift. I am also very curious to see how the memory controller is configured on the entire Polaris lineup of GPUs – we saw the introduction of HBM (high bandwidth memory) with the Fury line of cards.

Continue reading our overview of the AMD Polaris announcement!!

Samsung adding AMD to their customers?

Subject: General Tech | December 22, 2015 - 02:07 PM |
Tagged: amd, Samsung, 14nm, rumour

The talk around the watercooler includes a rumour that AMD may use Samsung to produce at least some of their 14nm chips in the coming year.  If true this has been a huge year for Samsung who produce NVIDIA chips as well as recently picking up a contract with Apple to produce some of their A9 SoCs.  The rumour still includes GLOBALFOUNDRIES as a source for APUs and GPUs so this would make Samsung a second source for working silicon, which we can hope will alleviate some of AMD's difficulty in maintaining supplies of products.  This could also help fund Samsung's development of their 10nm FinFET node which the claim should be in production by the end of 2016.  As always, take the rumour for what it is but if you want to learn more about what is being said you can pop over to The Inquirer.

Samsung_10_nm_Graphic_Wide.jpg

"A report in South Korea's Electronic Times, which cited unknown sources, said that Samsung Electronics will start making new chips for AMD sometime next year."

Here is some more Tech News from around the web:

Tech Talk

Source: The Inquirer

Apollo Lake is coming next summer

Subject: General Tech | December 1, 2015 - 04:20 PM |
Tagged: Intel, apollo lake, 14nm, rumours

DigiTimes has heard rumours that Intel will be refreshing their processor lineup with Apollo Lake processors in June and August 2016, with devices powered by the new processors in October.  This is rather good news considering how slowly new PC sales have been growing over the past year, it is nice to see that we will still have some new CPUs in the coming year.  Details are rather scarce, the 14nm chips will come in dual and quad-core options and use the new Gen9 GPU which will support Ultra HD output. You can expect 6-10W TDP, these are very much mobile oriented chips.

index.jpg

"Seeing the trend, Intel is scheduled to mass produce its next-generation Apollo Lake-based processors in June-August 2016 with related entry-level PC products becoming available in the market in October 2016, according to sources from the upstream supply chain."

Here is some more Tech News from around the web:

Tech Talk

Source: DigiTimes

Podcast #375 - Snapdragon 820, Lenovo Yoga 900, R9 380X and more!

Subject: General Tech | November 12, 2015 - 02:47 PM |
Tagged: podcast, video, qualcomm, snapdragon 820, Lenovo, yoga 900, be quiet!, amd, r9 380x, GLOBALFOUNDRIES, 14nm, FinFET, nvidia, asus, Maximus VIII Extreme, Thrustmaster, T300

PC Perspective Podcast #375 - 11/12/2015

Join us this week as we discuss the Snapdragon 820, Lenovo Yoga 900, R9 380X and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, and Sebastian Peak

Subscribe to the PC Perspective YouTube Channel for more videos, reviews and podcasts!!

 

Podcast #369 - Fable Legends DX12 Benchmark, Apple A9 SoC, Intel P3608 SSD, and more!

Subject: General Tech | October 1, 2015 - 02:17 PM |
Tagged: podcast, video, fable legends, dx12, apple, A9, TSMC, Samsung, 14nm, 16nm, Intel, P3608, NVMe, logitech, g410, TKL, nvidia, geforce now, qualcomm, snapdragon 820

PC Perspective Podcast #369 - 10/01/2015

Join us this week as we discuss the Fable Legends DX12 Benchmark, Apple A9 SoC, Intel P3608 SSD, and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Josh Walrath, Jeremy Hellstrom, and Allyn Malventano

Program length: 1:42:35

  1. Week in Review:
  2. 0:54:10 This episode of PC Perspective is brought to you by…Zumper, the quick and easy way to find your next apartment or home rental. To get started and to find your new home go to http://zumper.com/PCP
  3. News item of interest:
  4. Hardware/Software Picks of the Week:
  5. Closing/outro

Subscribe to the PC Perspective YouTube Channel for more videos, reviews and podcasts!!

Check out the architecture at Skylake and Sunrise Point

Subject: Processors | August 5, 2015 - 03:20 PM |
Tagged: sunrise point, Skylake, Intel, ddr4, Core i7-6700K, core i7, 6700k, 14nm

By now you have read through Ryan's review of the new i7-6700 and the ASUS Z170-A as well as the related videos and testing, if not we will wait for you to flog yourself in punishment and finish reading the source material.  Now that you are ready, take a look at what some of the other sites thought about the new Skylake chip and Sunrise Point chipset.  For instance [H]ard|OCP managed to beat Ryan's best overclock, hitting 4.7GHz/3600MHz at 1.32v vCore with some toasty but acceptable CPU temperatures.  The full review is worth looking for and if some of the rumours going around are true you should take H's advice, if you think you want one buy it now.

1438184048QCHM79YbJA_1_6_l.jpg

"Today we finally get to share with you our Intel Skylake experiences. As we like to, we are going to focus on Instructions Per Clock / IPC and overclocking this new CPU architecture. We hope to give our readers a definitive answer to whether or not it is time to make the jump to a new desktop PC platform."

Here are some more Processor articles from around the web:

Processors

Source: [H]ard|OCP