Banning Talos worship might be worth it, POWER9 still lags behind

Subject: Processors | July 6, 2018 - 03:16 PM |
Tagged: IBM, power9, talos 2, EPYC, xeon

Phoronix were recently given access to three servers running three different POWER9 Talos II configurations and compared them to EPYC and Xeon. On paper these systems look amazing, thanks to the architecture supporting four threads per core. They tested a dual 4-core Talos II system, a Talos II Lite with a single 22-core CPU and a Talos II with dual 18-core processors, with thread counts of 32, 88, and 144 respectively.
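
The thread counts quoted above follow directly from four hardware threads per core. A minimal Python sketch of the arithmetic is below; the function and dictionary names are illustrative only and are not taken from Phoronix's test setup.

```python
# Hardware threads = sockets x cores per socket x threads per core.
# The article states that the architecture supports four threads per core.
SMT_THREADS_PER_CORE = 4

def hardware_threads(sockets, cores_per_socket, smt=SMT_THREADS_PER_CORE):
    """Return the total number of hardware threads in a system."""
    return sockets * cores_per_socket * smt

# (sockets, cores per socket) for each configuration named in the article.
# The labels are illustrative, not official product names.
talos_configs = {
    "Talos II, dual 4-core": (2, 4),
    "Talos II Lite, single 22-core": (1, 22),
    "Talos II, dual 18-core": (2, 18),
}

thread_counts = {
    name: hardware_threads(sockets, cores)
    for name, (sockets, cores) in talos_configs.items()
}

for name, threads in thread_counts.items():
    print(f"{name}: {threads} threads")
```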

There were certainly usage scenarios where the dual 18-core system could outpace even the EPYC 7601, but it could not surpass the dual Xeon Gold 6138 system. The review covers a fair number of benchmarks and configurations, though it doesn't begin to scratch the surface of the wide variety of server configurations you need to consider before abandoning POWER9 altogether. Still, the key metric, performance per dollar, shows this architecture solidly in the middle of the pack.

powerpc-1-thumb.jpg

"Back in April we were able to run some IBM POWER9 benchmarks with remote access to the open-source friendly Talos II systems by Raptor Computer Systems. We were recently allowed remote access again to a few different configurations of this libre hardware with three different POWER9 processor combinations. Here are those latest benchmarks compared to Intel Xeon and AMD EPYC server processors."

Source: Phoronix

Google tests its Power and takes a shot at Intel

Subject: General Tech | October 17, 2016 - 12:48 PM |
Tagged: google, Intel, power9, zaius

Not too long ago Google revealed it had updated the code that runs behind its popular web-based services to make it more hardware agnostic. With a trivial tweak to the code, its software can switch between running on Intel x86, IBM Power or 64-bit ARM cores. On Friday Google Cloud's technical program manager, John Zipfel, provided more information on the OpenCAPI compliant Zaius P9 server that is in development and revealed it will use an IBM Power9 chip. As it will be OpenCAPI compliant, it will use interconnects such as NVIDIA's NVLink or AMD's as yet unnamed fabric interconnect but will not leverage Intel's. The Register has a bit more information on Google's plans and the Zaius.

maurice-evans-as-dr-zaius-in-planet-of-the.jpg

"Google has gently increased pressure on Intel – its main source for data-center processors – by saying it is "looking forward" to using chips from IBM and other semiconductor rivals."

Source: The Register

Podcast #416 - Intel SSD 600p, Leaked Zen Performance, new iPhone and PS4 and more!

Subject: General Tech | September 8, 2016 - 01:28 PM |
Tagged: Zen, VR, video, ssd, sony, qualcomm, ps4 pro, ps4, prodigy, power9, podcast, phanteks, logitech, iPhone 7, Intel, IBM, gtx 1050, geekbench, Enthoo, corsair, carbide, amd, a10, 600p

PC Perspective Podcast #416 - 09/08/16

Join us this week as we discuss the Intel SSD 600p, Leaked Zen Performance, new iPhone and PS4 and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

Hosts:  Ryan Shrout, Allyn Malventano, Josh Walrath and Jeremy Hellstrom

Program length: 1:48:53
  1. Week in Review:
  2. News items of interest:
    1. Razer PAX 2016
  3. Hardware/Software Picks of the Week
  4. Closing/outro

IBM Prepares Power9 CPUs to Power Servers and Supercomputers In 2018

Subject: Processors | September 2, 2016 - 01:39 AM |
Tagged: IBM, power9, power 3.0, 14nm, global foundries, hot chips

Earlier this month at the Hot Chips symposium, IBM revealed details on its upcoming Power9 processors and architecture. The new chips are aimed squarely at the data center and will be used for massive number crunching in big data and scientific applications in servers and supercomputer nodes.

Power9 is a big play from Big Blue, and will help the company expand its presence in the Intel-ruled datacenter market. Power9 processors are due out in 2018 and will be fabricated at Global Foundries on a 14nm HP FinFET process. The chips feature eight billion transistors and utilize an “execution slice microarchitecture” that lets IBM combine “slices” of fixed, floating point, and SIMD hardware into cores that support various levels of threading. Specifically, 2 slices make an SMT4 core and 4 slices make an SMT8 core. IBM will have Power9 processors with 24 SMT4 cores or 12 SMT8 cores (more on that later). Further, Power9 is IBM’s first processor to support its Power ISA 3.0 instruction set.

IBM Power9.jpg

According to IBM, its Power9 processors are between 50% and 125% faster than the previous generation Power8 CPUs, depending on the application tested. The performance improvement is thanks to a doubling of the number of cores as well as a number of other smaller improvements including:

  • A 5 cycle shorter pipeline versus Power8
  • A single instruction random number generator (RNG)
  • Hardware assisted garbage collection for interpreted languages (e.g. Java)
  • New interrupt architecture
  • 128-bit quad precision floating point and decimal math support
    • Important for finance and security markets, massive databases and money math.
    • IEEE 754
  • CAPI 2.0 and NVLink support
  • Hardware accelerators for encryption and compression

The Power9 processor features 120 MB of direct attached eDRAM that acts as an L3 cache (256 GB/s). The chips offer up to 7TB/s of aggregate fabric bandwidth, which certainly sounds impressive, but that is a number with everything added together. With that said, there is a lot going on under the hood. Power9 supports 48 lanes of PCI-E 4.0 (2 GB/s per lane per direction); 48 proprietary 25Gbps accelerator lanes, which will be used for NVLink 2.0 to connect to NVIDIA GPUs as well as to connect to FPGAs, ASICs, and other accelerators or new memory technologies using CAPI 2.0 (Coherent Accelerator Processor Interface); and four 16Gbps SMP links (NUMA) used to combine four quad socket Power9 boards into a single 16 socket “cluster.”
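
To put the per-lane figures above in context, here is a rough Python sketch that multiplies them out. It assumes the 25Gbps figure is a raw per-lane, per-direction signaling rate and ignores encoding and protocol overhead, so treat the results as upper bounds rather than measured throughput. All names are illustrative.

```python
# Figures quoted in the article.
PCIE4_LANES = 48
PCIE4_GB_S_PER_LANE = 2        # GB/s per lane, per direction
ACCEL_LANES = 48
ACCEL_GBIT_S_PER_LANE = 25     # Gbps per lane (assumed raw, per direction)

BITS_PER_BYTE = 8

def pcie_gb_s_per_direction():
    """Aggregate PCI-E 4.0 bandwidth in GB/s, one direction."""
    return PCIE4_LANES * PCIE4_GB_S_PER_LANE

def accel_gb_s_per_direction():
    """Aggregate accelerator-lane bandwidth in GB/s, one direction.

    Assumption: raw line rate only; encoding and protocol overhead ignored.
    """
    return ACCEL_LANES * ACCEL_GBIT_S_PER_LANE / BITS_PER_BYTE

pcie = pcie_gb_s_per_direction()
accel = accel_gb_s_per_direction()
print(f"PCI-E 4.0: {pcie} GB/s per direction")
print(f"25Gbps accelerator lanes: {accel:.0f} GB/s per direction (raw)")
```

Under those assumptions the two external interfaces come to 96 GB/s and 150 GB/s per direction, well short of the 7TB/s headline, which fits the description of that figure as everything added together.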

These are processors that are built to scale and tackle big data problems. In fact, not only is Google interested in Power9 to power its services, but the US Department of Energy will be building two supercomputers using IBM’s Power9 CPUs and NVIDIA’s Volta GPUs. Summit and Sierra will offer between 100 and 300 petaflops of compute power and will be installed at Oak Ridge National Laboratory and Lawrence Livermore National Laboratory respectively. There, some of the projects they will tackle include enabling researchers to visualize the internals of a virtual light water reactor, researching methods to improve fuel economy, and delving further into bioinformatics research.

The Power9 processors will be available in four variants that differ in the number of cores and the number of threads each core supports. The chips are broken down into Power9 SO (Scale Out) and Power9 SU (Scale Up), and each group has two processors depending on whether you need a greater number of weaker cores or a smaller number of more powerful cores. Power9 SO chips are intended for multi-core systems and will be used in servers with one or two sockets, while Power9 SU chips are for multi-processor systems with up to four sockets per board and up to 16 total sockets per cluster when four four-socket boards are linked together. Power9 SO uses DDR4 memory and supports a theoretical maximum of 4TB of memory (1TB with today’s 64GB DIMMs) and 120 GB/s of bandwidth, while Power9 SU uses IBM’s buffered “Centaur” memory scheme that allows the systems to address a theoretical maximum of 8TB of memory (2TB with 64GB DIMMs) at 230 GB/s. In other words, the SU series is Big Blue’s “big guns.”
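
The memory figures above imply a DIMM count and a future DIMM size, which the short Python sketch below works out. It assumes 1TB = 1024GB and that capacity is simply slots times DIMM size; the slot counts are inferred from the quoted capacities, not stated by IBM, and the function names are illustrative.

```python
GB_PER_TB = 1024  # assumption: binary terabytes

def dimm_slots(capacity_tb, dimm_gb):
    """Number of DIMMs needed to reach a capacity with a given DIMM size."""
    return capacity_tb * GB_PER_TB // dimm_gb

def dimm_size_needed_gb(max_tb, slots):
    """DIMM size required to hit the theoretical maximum with a fixed slot count."""
    return max_tb * GB_PER_TB // slots

# Capacities with 64GB DIMMs, as quoted in the article.
so_slots = dimm_slots(1, 64)   # Power9 SO: 1TB with 64GB DIMMs
su_slots = dimm_slots(2, 64)   # Power9 SU: 2TB with 64GB DIMMs

print(f"Power9 SO: {so_slots} DIMMs, needs "
      f"{dimm_size_needed_gb(4, so_slots)}GB DIMMs to reach 4TB")
print(f"Power9 SU: {su_slots} DIMMs, needs "
      f"{dimm_size_needed_gb(8, su_slots)}GB DIMMs to reach 8TB")
```

In both cases the theoretical maximum works out to four times the DIMM size available at the time of writing.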

Power9 SO Die Shot Photo.jpg

A photo of the 24 core SMT4 Power9 SO die.

Here is where it gets a bit muddy. The processors are further broken down into SMT4 and SMT8 versions, and both Power9 SO and Power9 SU have both options. There are Power9 CPUs with 24 SMT4 cores and there are CPUs with 12 SMT8 cores. IBM indicated that SMT4 (four threads per core) is suited to systems running Linux and virtualization with an emphasis on high core counts. Meanwhile SMT8 (eight threads per core) is a better option for large logical partitions (one big system versus partitioning out the compute cluster into smaller VMs as above) and running IBM’s Hypervisor. In either case (24 SMT4 or 12 SMT8) there is the same number of total threads, but you are able to choose whether you want fewer “stronger” threads on each core or more (albeit weaker) threads per core depending on which your workloads are optimized for.
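
A quick check of the claim that both layouts expose the same number of threads, as a small Python sketch. The tuple layout and names are illustrative only.

```python
# (cores, threads per core) for the two die layouts described in the article.
power9_layouts = {
    "SMT4": (24, 4),
    "SMT8": (12, 8),
}

# Total hardware threads per chip for each layout.
total_threads = {
    name: cores * threads_per_core
    for name, (cores, threads_per_core) in power9_layouts.items()
}

for name, threads in total_threads.items():
    cores, per_core = power9_layouts[name]
    print(f"{name}: {cores} cores x {per_core} threads = {threads} threads")
```

Both layouts come to 96 threads per chip, so the choice is about how those threads are grouped into cores rather than how many there are.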

Servers supporting Power9 are already under development by Google and Rackspace and blueprints are even available from the OpenPower Foundation. Currently, it appears that Power9 SO will emerge as soon as the second half of next year (2H 2017) with Power9 SU following in 2018 which would line up with the expected date for the Summit and Sierra supercomputer launches.

This is not a chip that will be showing up in your desktop any time soon, but it is an interesting high performance processor! I will be keeping an eye on updates from Oak Ridge lab hehe.

IBM is feeling Powerful in the Core Wars, details on the Power9 architecture have arrived

Subject: General Tech | April 7, 2016 - 03:43 PM |
Tagged: GLOBALFOUNDRIES, IBM, power9

IBM's Power9 processor is scheduled to appear on the scene just over a year from now, and finally we have some details about what it will be. Firstly, the core count is to be 24, two higher than Intel's, and the chip is optimized for use in two socket servers. The chips are 14nm FinFETs fabbed by GLOBALFOUNDRIES and will be compatible with modern industry standards including DDR4, PCIe 4.0 and NVLink 2.0, so you can even take advantage of Jen-Hsun's latest products.

The list of customers is quite impressive: Google has already moved to Power8 and described changing over its infrastructure as being as simple as flipping a switch, the US Department of Energy will build its next HPCs using Power9, and Rackspace is currently working with Google to develop Power9 server blueprints for the Open Compute Project.

Several Chinese companies will take advantage of those OpenPower blueprints to develop their own 'partner chips' based on the Power8 and Power9 architectures, which will be using 10nm gates in 2018 to 2020. This is somewhat amusing considering the shipping of Xeon processors to China has been banned by the US Government. Check out more of the slides from IBM's presentation at The Register.

power_roadmap_crop.jpg

"IBM's Power9 processor, due to arrive in the second half of next year, will have 24 cores, double that of today's Power8 chips, it emerged today.

Meanwhile, Google has gone public with its Power work – confirming it has ported many of its big-name web services to the architecture, and that rebuilding its stack for non-Intel gear is a simple switch flip."

Source: The Register

Oak Ridge National Laboratory Chooses IBM and NVIDIA for Two Supercomputers, Summit and Sierra

Subject: General Tech, Systems | November 27, 2014 - 08:53 PM |
Tagged: nvidia, IBM, power9, Volta

The Oak Ridge National Laboratory has been interested in a successor for their Titan Supercomputer. Sponsored by the US Department of Energy, the new computer will be based on NVIDIA's Volta (GPU) and IBM's POWER9 (CPU) architectures. Its official name will be “Summit”, and it will have a little sibling, “Sierra”. Sierra, also based on Volta and POWER9, will be installed at the Lawrence Livermore National Laboratory.

nvidia-ibm-coral_summit_sierra_supercomputers.png

Image Credit: NVIDIA

The main feature of these supercomputers is expected to be “NVLink”, which is said to allow unified memory between CPU and GPU. This means that, if you have a workload that alternates rapidly between serial and parallel tasks, you can avoid the lag of transferring memory at each switch. One example of this would be a series of for-each loops on a large data set with a bit of logic, checks, and conditional branches in between. Without unified memory, memory management adds a lag between each chunk of work, especially across two banks of memory attached by a slow bus.
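
The effect described above can be illustrated with a toy cost model in Python. This is only a sketch: the phase durations and the transfer cost are hypothetical numbers chosen for illustration, not measurements of NVLink or of any real system, and the function name is made up.

```python
def total_time(phases, transfer_cost):
    """Total run time for a sequence of (device, compute_seconds) phases.

    transfer_cost is paid every time the work moves from one device to the
    other, modelling a copy between separate CPU and GPU memory banks.
    """
    elapsed = 0.0
    previous_device = None
    for device, compute_seconds in phases:
        if previous_device is not None and device != previous_device:
            elapsed += transfer_cost
        elapsed += compute_seconds
        previous_device = device
    return elapsed

# Hypothetical workload: serial logic on the CPU alternating with
# parallel for-each loops on the GPU, four times over.
phases = [("cpu", 1.0), ("gpu", 2.0)] * 4

separate = total_time(phases, transfer_cost=0.5)  # copy at every switch
unified = total_time(phases, transfer_cost=0.0)   # idealised unified memory

print(f"separate memory: {separate} s, unified memory: {unified} s")
```

The more often a workload switches between devices, the larger the share of time spent on transfers, which is why rapidly alternating workloads stand to gain the most.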

Summit and Sierra are both built by IBM, while Titan, Oak Ridge's previous supercomputer, was developed by Cray. Not much is known about the specifics of Sierra, but Summit will be about 5x-10x faster (peak computational throughput) than its predecessor with fewer than a fifth of the nodes. Despite the fewer nodes, it will suck down more total power (~10MW, up from Titan's ~9MW).
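
Taking the approximate figures above at face value, the implied gain in performance per watt can be worked out with a few lines of Python. This is back-of-the-envelope arithmetic on the quoted estimates only; the names are illustrative.

```python
# Approximate figures quoted in the article.
TITAN_POWER_MW = 9
SUMMIT_POWER_MW = 10
SPEEDUP_RANGE = (5, 10)   # Summit's peak throughput relative to Titan

def efficiency_gain(speedup, old_power_mw, new_power_mw):
    """Relative performance per watt: speedup divided by the power increase."""
    return speedup * old_power_mw / new_power_mw

low, high = (
    efficiency_gain(s, TITAN_POWER_MW, SUMMIT_POWER_MW) for s in SPEEDUP_RANGE
)
print(f"Performance per watt improves by roughly {low}x to {high}x")
```

So even with the higher total power draw, the quoted estimates put Summit at roughly 4.5x to 9x Titan's performance per watt.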

These two supercomputers are worth $325 million USD (combined). They are expected to go online in 2017. According to Reuters, an additional $100 million USD will go toward research into "extreme" supercomputing.

Source: Anandtech