Fee PHI fo fum; Intel changes the smell of a Pentium

Subject: General Tech | September 5, 2012 - 12:49 PM |
Tagged: Xeon Phi, xeon, larrabee, knights corner, Intel, hot chips

The Register is back with more information from Hot Chips about Intel's Xeon Phi coprocessor, which seems to be much more than just a GPU in drag.  Inside the shell you will find at least 50 cores and at least 8GB of GDDR5 graphics memory, with the cores being very heavily modified Pentium P54C designs built on the 22-nanometer Tri-Gate process and clocked somewhere between 1.2 and 1.6GHz.  There is a brand new Vector Processing Unit which processes 512-bit SIMD instructions and sports an Extended Math Unit to handle calculations in hardware rather than software.  Read on for more details about the high-speed ring interconnects that allow these cores to communicate among themselves and with the Xeon server the card will be a part of.

ElReg_intel_xeon_phi_block_diagram.jpg

"Intel has been showing off the performance of the "Knights Corner" x86-based coprocessor for so long that it's easy to forget that it is not yet a product you can actually buy. Back in June, Knights Corner was branded as the "Xeon Phi", making it clear that Phi was a Xeon coprocessor even if it does not bear a lot of resemblance to the Xeon processors at the heart of the vast majority of the world's servers."

Here is some more Tech News from around the web:

Tech Talk

Source: The Register

Intel Introduces Xeon Phi: Larrabee Unleashed

Subject: Processors | June 19, 2012 - 08:46 AM |
Tagged: Xeon Phi, xeon e5, nvidia, larrabee, knights corner, Intel, HPC, gpgpu, amd

Intel does not respond well when asked about Larrabee.  Though Intel has received a lot of bad press from the gaming community about what they were trying to do, that does not necessarily mean that Intel was wrong about how they set up the architecture.  The problem with Larrabee was that it was being considered as a consumer level product with an eye for breaking into the HPC/GPGPU market.  For the consumer level, Larrabee would have been a disaster.  Intel simply would not have been able to compete with AMD and NVIDIA for gamers’ hearts.
 
The problem with Larrabee and the consumer space was a matter of focus, process decisions, and die size.  Larrabee is unique in that it is almost fully programmable and features really only one fixed function unit.  In this case, that fixed function unit was all about texturing.    Everything else relied upon the large array of x86 processors and their attached vector units.  This turns out to be very inefficient when it comes to rendering games, which is the majority of work for the consumer market in graphics cards.  While no outlet was able to get a hold of a Larrabee sample and run benchmarks on it, the general feeling was that Intel would easily be a generation behind in performance.  When considering how large the die size would have to be to even get to that point, it was simply not economical for Intel to produce these cards.
 
phi_01.jpg
 
Xeon Phi is essentially an advanced part based on the original Larrabee architecture.
 
This is not to say that Larrabee does not have a place in the industry.  The actual design lends itself very nicely towards HPC applications.  With each chip hosting many x86 processors with powerful vector units attached, these products can provide tremendous performance in HPC applications which can leverage these particular units.  Because Intel utilized x86 processors instead of the more homogeneous designs that AMD and NVIDIA use (lots of stream units doing vector and scalar, but no x86 units or a more traditional networking fabric to connect them), it has a leg up on the competition when it comes to programming.  While GPGPU applications are working with products like OpenCL, C++ AMP, and NVIDIA’s CUDA, Intel is able to rely on many current programming languages which can utilize x86.  With the addition of wide vector units on each x86 core, it is relatively simple to make adjustments to utilize these new features as compared to porting something over to OpenCL.
 
So this leads us to the Intel Xeon Phi.  This is the first commercially available product based on an updated version of the Larrabee technology.  The exact code name is Knights Corner.  This is a new MIC (Many Integrated Core) product based on Intel’s latest 22 nm Tri-Gate process technology.  The details are scarce on how many cores this product actually contains, but it looks to be more than 50 of a very basic “Pentium” style core: essentially low die space, in-order, and all connected by a robust networking fabric that allows fast data transfer between the memory interface, PCI-E interface, and the cores.
 
intelphi.jpg
 
Each Xeon Phi promises more than 1 TFLOP of performance (as measured by Linpack).  When combined with the new Xeon E5 series of processors, these products can provide a huge amount of computing power.  Furthermore, with the addition of the Cray interconnect technology that Intel acquired this year, clusters of these systems could provide for some of the fastest supercomputers on the market.  While it will take until the end of this year at least to integrate these products into a massive cluster, it will happen and Intel expects these products to be at the forefront of driving performance from the Petascale to the Exascale.
 
phi_02.jpg
 
These are the building blocks that Intel hopes to utilize to corner the HPC market.  Providing powerful CPUs and dozens if not hundreds of MIC units per cluster, the potential computing power should bring us to the Exascale that much sooner.
 
Time will of course tell if Intel will be successful with Xeon Phi and Knights Corner.  The idea behind this product seems sound, and the addition of powerful vector units being attached to simple x86 cores should make the software migration to massively parallel computing just a wee bit easier than what we are seeing now with GPU based products from AMD and NVIDIA.  The areas where those other manufacturers have advantages over Intel are their many years of work with educational institutions (research), software developers (gaming, GPGPU, and HPC), and industry standards groups (Khronos).  Xeon Phi has a ways to go before being fully embraced by these other organizations, and its future is certainly not set in stone.  We have yet to see 3rd party groups get a hold of these products and put them to the test.  While Intel CPUs are certainly class leading, we still do not know the full potential of these MIC products as compared to what is currently available in the market.
 

The one positive thing for Intel’s competitors is that it seems their enthusiasm for massively parallel computing is justified.  Intel just entered that ring with a unique architecture that will certainly help push high performance computing more towards true heterogeneous computing. 

Source: Intel

Intel Processors Power The Majority of Top 500 Supercomputers, Looking to Expand With MIC Solutions

Subject: General Tech, Processors | November 25, 2011 - 05:45 PM |
Tagged: xeon, SC11, mic, many integrated core, knights corner, Intel

This year saw the 40th anniversary of (the availability of) the world’s first microprocessor, the Intel 4004, and Intel is as strong as ever. On the supercomputing and HPC (High Performance Computing) front, Intel processors are powering the majority of the Top 500 supercomputers, and at this year’s supercomputing conference (SC11) the company talked about its current and future high performance silicon. Mainly, Intel talked about its new Intel Xeon E5 family of processors and Knights Corner, the new Many Integrated Core successor to Larrabee.

220px-Intel_xeon_e7.jpg

The Intel Xeon E5 is available now.

The new Xeon chips are launching now and should be widely available within the first half of 2012. Several (lucky) supercomputing centers have already gotten their hands on the new chips, which now power 10 systems on the Top 500 list, where 20,000 Xeon E5 CPUs are delivering a combined 3.4 Petaflops.
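To put the combined figure in per-chip terms, a quick division is enough. The sketch below is only a rough average: it assumes the quoted 3.4 Petaflops comes entirely from the 20,000 Xeon E5 CPUs and is spread evenly across them, which the announcement does not state. The function name is hypothetical.

```python
def average_flops_per_cpu(total_flops, cpu_count):
    """Average floating point throughput per CPU.

    Assumption (not from the article): every CPU contributes equally and
    nothing else in the systems adds to the total.
    """
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return total_flops / cpu_count


if __name__ == "__main__":
    # 3.4 Petaflops across 20,000 Xeon E5 CPUs, as quoted above
    per_cpu = average_flops_per_cpu(3.4e15, 20_000)
    print(f"{per_cpu / 1e9:.0f} GFLOPS per CPU on average")
```

Under those assumptions the average works out to about 170 GFLOPS per processor.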

According to benchmarks, Intel is expecting a respectable 70% performance increase on HPC workloads versus the previous generation Xeon 5600 CPUs. Further, Intel stated that the new E5 silicon is capable of as much as a 2x increase in raw FLOPS performance, according to Linpack benchmarks.

Intel is reporting that demand for the initial production run chips is “approximately 20 times greater than previous generation processors.” Rajeeb Hazra, the General Manager of Technical Computing of Intel’s Datacenter and Connected Systems Group, stated that “customer acceptance of the Intel Xeon E5 processor has exceeded our expectations and is driving the fastest debut on the TOP 500 list of any processor in Intel’s history.” The company further highlighted several supercomputers that are set to go online soon and will be powered by the new E5 CPUs, including the 10 Petaflops Stampede computer at the Texas Advanced Computing Center and the 1 Petaflops Pleiades expansion for NASA.

Intel Xeon Top 500.png

While Intel processors are powering the majority of the world’s fastest supercomputers, graphics card hardware and GPGPU software have started to make their way into quite a few supercomputers as powerful companion processors that can greatly outperform a similar number of traditional CPUs (assuming the software can take advantage of the GPU hardware of course). In response to this, Intel has been working on its own MIC (Many Integrated Core) solution for a few years now. Starting with Larrabee, then Knights Ferry, and now Knights Corner, Intel has been working on silicon that uses numerous small processing cores running the x86 instruction set to power highly parallel applications. Examples given by Intel of useful applications for their Many Integrated Core hardware include weather modeling, tomography, and protein folding.

Intel Many Integrated Core.png

Knights Corner is the company’s latest iteration of MIC hardware, and is the first hardware that is commercially available. Knights Corner is capable of delivering more than 1 Teraflops of double precision floating point performance. Hazra stated that “having this performance now in a single chip based on Intel MIC architecture is a milestone that will once again be etched into HPC history” much like Intel’s first Teraflop supercomputer that utilized 9,680 Pentium Pro CPUs in 1997.

What’s interesting about Knights Corner lies in the ability of the hardware to run existing applications without porting to alternative programming languages like Nvidia’s CUDA or AMD’s Stream GPU languages. That is not to say that the hardware itself is not interesting, however. Knights Corner will be produced using Intel’s Tri-Gate transistors on a 22nm manufacturing process, and will feature “more than 50 cores.” Unlike current GPGPU solutions, the Knights Corner hardware is fully accessible and can be programmed as if the card is its own HPC node running a Linux based operating system.

More information on the Knights Corner architecture can be found here. I think it will be interesting to see how well Knights Corner will be adopted for high performance workloads versus graphics cards from Nvidia and AMD, especially now that the industry has already begun adopting GPGPU solutions using programming technologies such as CUDA, and graphics cards are becoming more general purpose (or at least less specialized) in hardware design. Is Intel too late for the (supercomputing market adoption) party, or just in time? What do you think?

Source: Intel

Bulldozers at Knights Corner; duelling server chips

Subject: General Tech | November 16, 2011 - 09:36 AM |
Tagged: xeon e5, xeon, servers, opteron, knights ferry, knights corner, interlagos, hp, dell, bulldozer, acer

As you would expect, no sooner does AMD release news on its new line of Bulldozer era Opterons than Intel follows suit with news on their next generation of server chips.  AMD hit the news and the server room first thanks to interest shown by Dell, HP and Acer.   These vendors have based a series of 2U servers on AMD's new chip as well as a family of blade servers.  Dell's PowerEdge C6145 was probably the most ambitious; with 4 sockets you can have 128 cores and 1TB of DDR3 in a 2U rack mount server, and Fusion-io was suggesting the inclusion of their 1.2TB ioDrive Duo card to ensure your storage media can keep up.

Intel also spoke with The Inquirer and other news sites about their new Xeon E5 processor family as well as providing more information about Knights Corner. Intel has reached out to a different set of clients for the new Xeon, focusing on NVIDIA's latest target market of High Performance Computing (that HPC acronym you see hanging around Fermi).  They tout over 10,000 chips sold, some of which are sitting pretty in the TOP500.  Also on display was their Knights Ferry accelerator board, again targeted for the HPC crowd that NVIDIA has been courting.

So this processor generation we have Intel and NVIDIA fighting it out for HPC customers, while AMD seems to be without major competition in high density computing, although ARM has certainly been making inroads into that market.  

dell-2u-quad-socket-opteron-185x185.jpg

"AMD's partners have shown a small but impressive array of Bulldozer Opteron kit. Dell's 2U eight socket beast was arguably the most impressive of the lot on show in Munich, but AMD will know it needs more than just one vendor in its fight against Intel. Thankfully it has the might of HP also showing that its traditional rackmount and blade servers can make use of AMD's Bulldozer silicon."


Source: The Register

IDF 2011: Intel Many Integrated Core (MIC) Knights Corner

Subject: Processors, Shows and Expos | September 15, 2011 - 10:54 AM |
Tagged: idf, idf 2011, knights ferry, knights corner, mic, terascale

During Justin Rattner's closing keynote at the Intel Developer Forum he discussed the pending changes to the Many Integrated Core Architectures (MIC) that we previously knew as the Terascale projects.  We have heard about the Knights Ferry component for some time, and it was basically used as a software development platform for Intel's many-core initiative.

02.jpg

Impressive to see at this stage, the upcoming Knights Corner product will actually be built on the new 22nm tri-gate transistors and with more than 50 cores.  They haven't posted more details on what exactly ">50" refers to but it does mean that Intel continues to progress down this path and is going to be pushing the terascale computing projects into the future. 

Rattner also indicated that not all of the cores on the many-core projects have to be identical and we will soon see designs that combine more than the x86 processors to make for truly heterogeneous computing platforms. 

03.jpg

Research into the program continues including things like stacked and shared memory, new communications protocols like optical interconnects, etc.  We are just as eager to see the fruits of this research as we were for its application to gaming and graphics that eventually failed.

Source: PCPer

Intel Hopes For Exaflop Capable Supercomputers Within 10 Years

Subject: Systems | June 21, 2011 - 12:52 AM |
Tagged: supercomputing, mic, larrabee, knights corner, Intel

Silicon Graphics International and Intel recently announced plans to reach exascale levels of computational power within ten years. Exascale computing amounts to computers that are capable of delivering 1,000+ petaflops (one exaflop is 1,000 petaflops) of computational horsepower, processing quintillions of calculations per second. To put that in perspective, today’s supercomputers are just now breaking into the level of single-digit petaflop performance, with the fastest supercomputer delivering 8.16 petaflops. It is capable of this thanks to many thousands of eight core CPUs, whereas other top 500 supercomputers are starting to utilize a CPU and GPU combination in order to achieve petaflop performance.
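The size of that gap is easy to work out from the numbers above. The following sketch is purely illustrative arithmetic: the function names are made up, and the steady compound growth over ten years is a simplifying assumption rather than anything SGI or Intel has claimed.

```python
PETA = 1e15  # floating point operations per second in one petaflop
EXA = 1e18   # floating point operations per second in one exaflop


def scale_up_factor(current_flops, target_flops=EXA):
    """How many times faster the target is than the current machine."""
    return target_flops / current_flops


def required_annual_growth(current_flops, years, target_flops=EXA):
    """Compound yearly growth rate needed to reach the target in `years`.

    Assumes smooth, constant growth, which real hardware roadmaps rarely follow.
    """
    return scale_up_factor(current_flops, target_flops) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    fastest_today = 8.16 * PETA  # figure quoted above
    print(f"scale-up needed: {scale_up_factor(fastest_today):.1f}x")
    print(f"growth per year over 10 years: {required_annual_growth(fastest_today, 10):.0%}")
```

Starting from 8.16 petaflops, an exaflop is roughly a 120-fold increase, or a little over 60% compound growth per year for a decade.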

The Aubrey Isle Silicon Inside Knights Corner

This partnering of Central Processing Unit (CPU) and GPU (or other accelerator) allows high performance supercomputers to achieve much higher performance than with CPUs alone. Intel CPUs power close to 80% of the top 500 Supercomputers; however, they have begun to realize that specialized accelerators are able to speed up highly parallel computing tasks. Specifically, Intel plans to combine Xeon processors with successors to their Knights Corner Many Integrated Core accelerator to reach exascale performance levels when combined with other data transfer and inter-core communication advancements. Knights Corner is an upcoming successor to the Knights Ferry and Larrabee processors.

Computer World quotes Eng Lim Goh, the CTO of SGI, in stating that “Accelerators such as graphics processors (GPUs) are currently being used with CPUs to execute more calculations per second. While some accelerators achieve desired results, many are not satisfied with the performance related to the time and cost spent porting applications to work with accelerators.”

Knights Corner will be able to run x86 based software and features 50 cores based on a 22nm manufacturing process.  Each core will run four threads at 1.2 GHz, have 8 MB of cache, and will be supported by 512 bit vector processing units.  Its predecessor, Knights Ferry, is based on 32 cores built on a 45nm process, and eight of them contained in a Xeon server are capable of 7.4 teraflops. Intel's MIC chip is aimed directly at NVIDIA’s CUDA and AMD’s OpenCL graphics processors, and is claimed to offer performance in addition to ease of use, as it is capable of running traditional x86 based software.
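Those figures can be turned into a back-of-the-envelope peak throughput estimate. This is a sketch and not Intel's own calculation: it assumes 64-bit doubles (so a 512 bit vector holds eight of them) and that every lane completes one fused multiply-add, counted as two operations, each cycle. The function name and its parameters are hypothetical.

```python
def peak_dp_flops(cores, clock_hz, vector_bits=512, flops_per_lane_per_cycle=2):
    """Theoretical peak double-precision FLOPS for a many-core vector chip.

    Assumptions (not from the article): 64-bit doubles, and one fused
    multiply-add per vector lane per cycle, counted as 2 operations.
    """
    lanes = vector_bits // 64  # number of doubles in one vector register
    return cores * clock_hz * lanes * flops_per_lane_per_cycle


if __name__ == "__main__":
    # 50 cores at 1.2 GHz, the figures quoted above
    estimate = peak_dp_flops(cores=50, clock_hz=1.2e9)
    print(f"{estimate / 1e12:.2f} TFLOPS theoretical peak")
```

With 50 cores at 1.2 GHz this comes to roughly 0.96 teraflops, so a few more cores or a slightly higher clock would pass the 1 teraflop mark under the same assumptions. Sustained performance on real workloads would be lower than any such peak figure.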

It looks like the CPU-only supercomputers will be seeing more competition from GPU and MIC accelerated supercomputers, and will eventually be replaced at the exascale level. AMD and NVIDIA are betting heavily on their OpenCL and CUDA programmable graphics cards while Intel is going with a chip capable of running less specialized but widely used x86 code.  It remains to be seen which platform will be victorious; however, the increased competition should hasten the advancement of high performance computing power.  You can read more about Intel’s plan for Many Integrated Core accelerated supercomputing here.

Larrabee rides again, almost ... meet Knights Corner the new Many Integrated Core design

Subject: General Tech | June 20, 2011 - 09:11 AM |
Tagged: Intel, mic, larrabee, knights corner, 50 GPGPU

Knights Corner is not exactly Larrabee but the ideas behind both are very similar.  A large number of GPGPUs are integrated with a CPU; Intel is using a Xeon core now as opposed to a Pentium, with the GPGPUs hooked up in a similar method to Larrabee's ring of Pentium cores.  The design is proven, as they have sold units of the previous generation Knights Ferry, and it offers a feature that a lot of programmers are going to appreciate: instead of needing to learn a new language like CUDA or OpenCL, standard x86 scalar code is used to program these chips.  This architecture is also expected to scale very well, for, as ARM recently pointed out, only specific multithreaded applications continue to scale well as more cores are added.   Drop by The Inquirer for more information.

KnightsFerry.jpg

They will likely be sold as PCIe cards like the Knights Ferry card pictured above.

"CHIPMAKER Intel has announced its second generation hybrid core technology codenamed 'Knights Corner'.

Knights Corner is Intel's second chip in its Many Integrated Core (MIC) chip line and will feature Xeon X86 cores and more than 50 GPGPU cores loosely based on what was previously known as Larrabee. Knights Corner will be fabricated using Intel's 22nm tri-gate process node beginning in 2012, though the firm would not be drawn on the exact core count at this time."


Source: The Inquirer