Subject: General Tech | June 19, 2014 - 01:19 PM | Jeremy Hellstrom
Tagged: xeon, Intel, FPGA
Intel has just revealed what The Register is aptly referring to as the FrankenChip, a hybrid Xeon E5 and FPGA chip. This will allow large companies to access the power of a Xeon while offloading some work onto an FPGA they can program and optimize themselves. The low power FPGA is actually on the chip, as opposed to Microsoft's recent implementation, which saw FPGAs added to PCIe slots. Intel's solution does not use up a slot and also offers direct access to the Xeon cache hierarchy and system memory via QPI, which will allow for increased performance. Another low power shot has been fired at ARM's attempts to grow its share of the server market, but we shall see if the inherent complexity of programming an FPGA to work with an x86 processor is more or less attractive than switching to ARM.
"Intel has expanded its chip customization business to help it take on the hazy threat posed by some of the world's biggest clouds adopting low-power ARM processors."
Here is some more Tech News from around the web:
- Amazon's new, not-really-3D Fire: Puts Bezos' cash register in YOUR pocket @ The Register
- Amazon Fire Phone will crash and burn @ The Inquirer
- Knitted Circuit Board Lends Flexibility to E-Textiles @ Hack a Day
- 3D Windowing System Developed Using Wayland, Oculus Rift @ Slashdot
- Google Play Store is littered with 'secret keys' @ The Inquirer
- How farsighted is Microsoft's Azure RemoteApp? @ The Register
- Rollei Mini WiFi Camcorder 1 Review @ NikKTech
- The Dell Inspiron 3000 & 5000 Launch Report @ Tech ARP
Subject: General Tech, Graphics Cards | October 16, 2013 - 10:00 PM | Scott Michaud
Tagged: FPGA, Altera
(Update 10/17/2013, 6:13 PM) Apparently I messed up inputting this into the website last night. To compare FPGAs with current hardware, the Altera Stratix 10 is rated at more than 10 TeraFLOPs compared to the Tesla K20X at ~4 TeraFLOPs or the GeForce Titan at ~4.5 TeraFLOPs. All figures are single precision. (end of update)
Field Programmable Gate Arrays (FPGAs) are not general purpose processors; they are not designed to perform any random instruction at any random time. If you have a specific set of instructions that you want performed efficiently, you can spend a couple of hours compiling your function(s) to an FPGA which will then be the hardware embodiment of your code.
This is similar to an Application-Specific Integrated Circuit (ASIC) except that, for an ASIC, it is the factory that bakes your application into the hardware. Many (actually, to my knowledge, almost all) FPGAs can even be reprogrammed if you can spare those few hours to configure them again.
Altera is a manufacturer of FPGAs. It is one of the few companies that have been allowed access to Intel's 14nm fabrication facilities. Rahul Garg of AnandTech recently published a story which discussed compiling OpenCL kernels to FPGAs using Altera's compiler.
Now this is pretty interesting.
The design of OpenCL splits work between "host" and "kernel". The host application is written in some arbitrary language and follows typical programming techniques. Occasionally, the application will run across a large batch of instructions. A particle simulation, for instance, will require position information to be computed. Rather than having the host code loop through every particle and perform some complex calculation, what happens to each particle could be "a kernel" which the host adds to the queue of some accelerator hardware. Normally, this is a GPU with its thousands of cores chunked into groups of usually 32 or 64 (vendor-specific).
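To make the host/kernel split concrete, here is a minimal sketch in plain Python. It is not the OpenCL API: `CommandQueue`, `step_kernel`, and `simulate` are hypothetical names invented for illustration, and the "device" here just runs each work item in a serial loop where real accelerator hardware would run them in parallel.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    x: float  # position
    v: float  # velocity


def step_kernel(gid, particles, dt):
    """The 'kernel': what happens to ONE particle, picked by its global id."""
    p = particles[gid]
    p.x += p.v * dt


class CommandQueue:
    """Hypothetical stand-in for an accelerator's command queue."""

    def __init__(self):
        self._pending = []

    def enqueue(self, kernel, global_size, *args):
        # The host only records the request; it does not loop over particles.
        self._pending.append((kernel, global_size, args))

    def finish(self):
        # Assumption: a serial loop emulates the device. A GPU or FPGA would
        # process many global ids at once.
        for kernel, global_size, args in self._pending:
            for gid in range(global_size):
                kernel(gid, *args)
        self._pending.clear()


def simulate(particles, dt, steps):
    """Host code: queue one kernel launch per time step, then wait."""
    queue = CommandQueue()
    for _ in range(steps):
        queue.enqueue(step_kernel, len(particles), particles, dt)
    queue.finish()
    return particles
```

Calling `simulate([Particle(0.0, 1.0), Particle(10.0, -2.0)], dt=0.5, steps=4)` advances both particles by four steps. The point of the split is that the host never writes the per-particle loop itself; it only describes the work for one particle and says how many there are.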
An FPGA, on the other hand, can lock itself to the specific set of instructions. Within a few hours, it can be configured with some arbitrary number of compute paths and then just churn through each kernel call until it is finished. The compiler knows exactly which operations the hardware will need to perform, while the host code runs on the CPU.
This is obviously designed for enterprise applications, at least as far into the future as we can see. Current models are apparently priced in the thousands of dollars but, as the article points out, have the potential to out-perform a 200W GPU at just a tenth of the power. This could be very interesting for companies, perhaps a film production house, that want to install accelerator cards for sub-d surfaces or ray tracing but would like to develop the software in-house and occasionally update their code after business hours.
Regardless of the potential market, an FPGA-based add-in card simply makes sense for OpenCL and its architecture.