Subject: General Tech | October 31, 2013 - 03:48 PM | Ken Addison
Tagged: podcast, video, R9 290X, amd, radeon, 290x crossfire, 280x, r9 280x, gtx 770, gtx 780, arm, mali, Altera
PC Perspective Podcast #275 - 10/31/2013
Join us this week as we discuss the AMD Radeon R9 290X, ARM TechCon 2013, NVIDIA price drops and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, and Allyn Malventano
Subject: Processors, Mobile | October 29, 2013 - 12:24 PM | Ryan Shrout
Tagged: techcon, Intel, arm techcon, arm, Altera, 14nm
In February of this year Intel and Altera announced that they would be partnering to build Altera FPGAs using the upcoming Intel 14nm tri-gate process technology. The deal was important for the industry as it marked one of the first times Intel had shared its process technology with another processor company. Because that process technology is seen as the company's most valuable asset, the decision to take on outside work in the Intel fabrication facilities could have drastic ramifications for Intel's computing divisions and the industry as a whole. It also seems to back up the speculation that Intel is having a hard time keeping its fabs anywhere near 100% utilization with only in-house designs.
Today though, news is coming out that Altera is going to be including ARM-based processing cores, specifically those based on the ARMv8 64-bit architecture. Starting in 2014, Altera's high-end Stratix 10 FPGA, which uses four ARM Cortex-A53 cores, will be produced by Intel fabs.
The deal may give Intel pause about its outsourcing strategy. To date the chip giant has experimented with offering its leading-edge fab processes as foundry services to a handful of chip designers, Altera being one of its largest planned customers to date.
Altera believes that by combining the ARMv8 A53 cores and Intel's 14nm tri-gate transistors they will be able to provide FPGA performance that is "two times the core performance" of current high-end 28nm options.
While this news might upset some people internally at Intel's architecture divisions, the news couldn't be better for ARM. Intel is universally recognized as being the process technology leader, generally a full process node ahead of the competition from TSMC and GlobalFoundries. I already learned yesterday that many of ARM's partners are skipping the 20nm technology from non-Intel foundries and instead are looking towards the 14/16nm FinFET transitions coming in late 2014.
ARM has been working with essentially every major foundry in the business EXCEPT Intel and many viewed Intel's chances of taking over the mobile/tablet/phone space as dependent on its process technology advantage. But if Intel continues to open up its facilities to the highest bidders, even if those customers are building ARM-based designs, then it could drastically improve the outlook for ARM's many partners.
UPDATE (7:57pm): After further talks with various parties there are a few clarifications that I wanted to make sure were added to our story. First, Altera's FPGAs are primarily focused on the markets of communication, industrial, military, etc. They are not really used as application processors and thus are not going to directly compete with Intel's processors in the phone/tablet space. It remains to be seen if Intel will open its foundries to a directly competing product, but for now this announcement regarding the upcoming Stratix 10 FPGA on Intel's 14nm tri-gate is an interesting progression.
Subject: General Tech, Graphics Cards | October 16, 2013 - 10:00 PM | Scott Michaud
Tagged: FPGA, Altera
(Update 10/17/2013, 6:13 PM) Apparently I messed up inputting this into the website last night. To compare FPGAs with current hardware, the Altera Stratix 10 is rated at more than 10 TeraFLOPs compared to the Tesla K20X at ~4 TeraFLOPs or the GeForce Titan at ~4.5 TeraFLOPs. All figures are single precision. (end of update)
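As a rough sanity check on those ratings, the ratios can be worked out in a few lines of Python. This is only a sketch: the numbers are the approximate single-precision figures quoted in the update, "more than 10" is treated as a floor of 10, and the names `ratings_tflops` and `ratio_vs` are made up for illustration.

```python
# Approximate single-precision ratings quoted in the update, in TeraFLOPs.
ratings_tflops = {
    "Altera Stratix 10": 10.0,  # quoted as "more than 10"; used as a floor
    "Tesla K20X": 4.0,          # quoted as ~4
    "GeForce Titan": 4.5,       # quoted as ~4.5
}


def ratio_vs(baseline, part="Altera Stratix 10"):
    """Return how many times the rated throughput of `baseline` the `part` offers."""
    return ratings_tflops[part] / ratings_tflops[baseline]


if __name__ == "__main__":
    for gpu in ("Tesla K20X", "GeForce Titan"):
        print(f"Stratix 10 vs {gpu}: at least ~{ratio_vs(gpu):.1f}x")
```

On paper that works out to at least roughly 2.5 times the K20X and a little over 2 times the Titan. These are rated peaks, not measured results.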
Field Programmable Gate Arrays (FPGAs) are not general purpose processors; they are not designed to perform any random instruction at any random time. If you have a specific set of instructions that you want performed efficiently, you can spend a couple of hours compiling your function(s) to an FPGA which will then be the hardware embodiment of your code.
This is similar to an Application-Specific Integrated Circuit (ASIC) except that, for an ASIC, it is the factory that bakes your application into the hardware. Many FPGAs (actually, to my knowledge, almost all of them) can even be reprogrammed if you can spare those few hours to configure them again.
Altera is a manufacturer of FPGAs. They are one of the few companies that were allowed access to Intel's 14nm fabrication facilities. Rahul Garg of AnandTech recently published a story which discussed compiling OpenCL kernels to FPGAs using Altera's compiler.
Now this is pretty interesting.
The design of OpenCL splits work between "host" and "kernel". The host application is written in some arbitrary language and follows typical programming techniques. Occasionally, the application will run across a large batch of instructions. A particle simulation, for instance, will require position information to be computed. Rather than having the host code loop through every particle and perform some complex calculation, what happens to each particle could be "a kernel" which the host adds to the queue of some accelerator hardware. Normally, this is a GPU with its thousands of cores chunked into groups of usually 32 or 64 (vendor-specific).
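To make the host/kernel split a little more concrete, here is a minimal sketch in plain Python. It is not real OpenCL and does not use Altera's compiler; it only imitates the shape of the model. The `Particle` class, `step_kernel`, `enqueue`, and `host_main` are all hypothetical names invented for this illustration, and the "queue" simply runs each work-item in turn where real accelerator hardware would run them in parallel.

```python
# Pure-Python imitation of the OpenCL host/kernel split (NOT real OpenCL).
from dataclasses import dataclass


@dataclass
class Particle:
    position: float
    velocity: float


def step_kernel(global_id, particles, dt):
    """Kernel: the work done for ONE particle, selected by its global id."""
    p = particles[global_id]
    p.position += p.velocity * dt


def enqueue(kernel, global_size, *args):
    """Hypothetical stand-in for an accelerator's command queue.

    A real device would execute these work-items in parallel; here they
    run one after another so the sketch stays self-contained.
    """
    for global_id in range(global_size):
        kernel(global_id, *args)


def host_main():
    """Host: ordinary application code that prepares data and queues the kernel."""
    particles = [Particle(position=float(i), velocity=2.0) for i in range(4)]
    # Instead of writing the per-particle loop itself, the host hands the
    # kernel and the problem size to the queue.
    enqueue(step_kernel, len(particles), particles, 0.5)
    return [p.position for p in particles]


if __name__ == "__main__":
    print(host_main())
```

The point of the split is that the host never spells out how the per-particle work is scheduled. It only says which kernel to run and over how many items, which leaves the accelerator free to decide how to execute it.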
An FPGA, on the other hand, can lock itself to the specific set of instructions. Within a few hours, it can be configured with some arbitrary number of compute paths and then just churn through each kernel call until it is finished. The compiler knows exactly what work the hardware will need to perform, while the host code runs on the CPU.
This is obviously designed for enterprise applications, at least as far into the future as we can see. Current models are apparently priced in the thousands of dollars but, as the article points out, have the potential to out-perform a 200W GPU at just a tenth of the power. This could be very interesting for companies, perhaps a film production house, that want to install accelerator cards for sub-d surfaces or ray tracing but would like to develop the software in-house and occasionally update their code after business hours.
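For a sense of scale on that power claim, the arithmetic can be sketched as below. This assumes only the figures quoted above (a 200W GPU and one tenth of the power). The `relative_throughput` parameter is a placeholder for however much faster or slower the FPGA turns out to be on a given workload, and the function names are invented for this example.

```python
def power_budget(gpu_watts=200.0, divisor=10):
    """Power drawn by an accelerator running at 1/divisor of the GPU's power."""
    return gpu_watts / divisor


def perf_per_watt_gain(relative_throughput=1.0, divisor=10):
    """Performance-per-watt advantage over the GPU.

    relative_throughput is accelerator throughput divided by GPU throughput.
    It is an assumed input, not a measured figure; 1.0 means "merely ties".
    """
    return relative_throughput * divisor


if __name__ == "__main__":
    print(f"Power budget: {power_budget():.0f} W")
    print(f"Perf/W gain if throughput only ties: {perf_per_watt_gain():.0f}x")
```

In other words, a card that merely matches the GPU while drawing about 20W would already be ten times as efficient per watt, and anything beyond matching only widens the gap.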
Regardless of the potential market, an FPGA-based add-in card simply makes sense for OpenCL and its architecture.