When Magma Freezes Over...
Intel confirms that they have approached AMD about access to their Mantle API. The discussion, despite being clearly labeled as "an experiment" by an Intel spokesperson, was initiated by them -- not AMD. According to AMD's Gaming Scientist, Richard Huddy, via PCWorld, AMD's response was, "Give us a month or two" and "we'll go into the 1.0 phase sometime this year" which only has about five months left in it. When the API reaches 1.0, anyone who wants to participate (including hardware vendors) will be granted access.
AMD inside Intel Inside???
I do wonder why Intel would care, though. Intel has the fastest per-thread processors, and their GPUs are not known to be workhorses that are held back by API call bottlenecks, either. Of course, that is not to say that I cannot see any reason, however...
Subject: General Tech, Graphics Cards, Processors | July 2, 2014 - 03:55 AM | Scott Michaud
Tagged: Intel, Xeon Phi, xeon, silvermont, 14nm
Anandtech has just published a large editorial detailing Intel's Knights Landing. Mostly, it is stuff that we already knew from previous announcements and leaks, such as one by VR-Zone from last November (which we reported on). Officially, few details were given back then, except that it would be available as either a PCIe-based add-in board or as a socketed, bootable, x86-compatible processor based on the Silvermont architecture. Its many cores, threads, and 512-bit registers are each pretty weak, compared to Haswell, for instance, but combine to about 3 TFLOPs of double precision performance.
Not enough graphs. Could use another 256...
The best way to imagine it is running a PC with a modern, Silvermont-based Atom processor -- only with up to 288 processors listed in your Task Manager (72 actual cores with quad HyperThreading).
The main limitation of GPUs (and similar coprocessors), however, is memory bandwidth. GDDR5 is often the main bottleneck of compute performance and just about the first thing to be optimized. To compensate, Intel is packaging up to 16GB of memory (stacked DRAM) on the chip itself. This RAM is based on "Hybrid Memory Cube" (HMC), developed by Micron Technology, and supported by the Hybrid Memory Cube Consortium (HMCC). While the actual memory used in Knights Landing is derived from HMC, it uses a proprietary interface that is customized for Knights Landing. Its bandwidth is rated at around 500GB/s. For comparison, the NVIDIA GeForce Titan Black has 336.4GB/s of memory bandwidth.
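To put those figures side by side, here is a rough back-of-the-envelope sketch in Python. It only uses the approximate numbers quoted above (72 cores, four threads per core, about 3 TFLOPs, around 500GB/s, 336.4GB/s); the variable names are made up for illustration, and the "reuse" figure assumes 8-byte double precision values and ignores caches entirely.

```python
# Approximate figures quoted in the article; none of these are measured values.
CORES = 72
THREADS_PER_CORE = 4
PEAK_DP_FLOPS = 3e12              # "about 3 TFLOPs" of double precision
KNL_BANDWIDTH = 500e9             # "around 500GB/s" from the on-package memory
TITAN_BLACK_BANDWIDTH = 336.4e9   # GeForce Titan Black, for comparison
BYTES_PER_DOUBLE = 8              # assumption: IEEE 754 double precision

# Logical processors an operating system would list.
logical_processors = CORES * THREADS_PER_CORE

# How much more bandwidth Knights Landing is rated for than the Titan Black.
bandwidth_ratio = KNL_BANDWIDTH / TITAN_BLACK_BANDWIDTH

# Operations that must be done per double fetched to stay at peak throughput.
doubles_per_second = KNL_BANDWIDTH / BYTES_PER_DOUBLE
flops_per_double = PEAK_DP_FLOPS / doubles_per_second

print(logical_processors)
print(round(bandwidth_ratio, 2))
print(round(flops_per_double))
```

Even with the faster memory, the chip would need to do dozens of operations on every value it pulls in to stay busy, which is why bandwidth is usually the first thing to be optimized.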
Intel and Micron have worked together in the past. In 2006, the two companies formed "IM Flash" to produce the NAND flash for Intel and Crucial SSDs. Crucial is Micron's consumer-facing brand.
So the vision for Knights Landing seems to be the bridge between CPU-like architectures and GPU-like ones. For compute tasks, GPUs edge out CPUs by crunching through bundles of similar tasks at the same time, across many (hundreds of, thousands of) computing units. The difference with (at least socketed) Xeon Phi processors is that, unlike most GPUs, Intel does not rely upon APIs, such as OpenCL, and drivers to translate a handful of functions into bundles of GPU-specific machine language. Instead, especially if the Xeon Phi is your system's main processor, it will run standard, x86-based software. The software will just run slowly, unless it is capable of vectorizing itself and splitting across multiple threads. Obviously, OpenCL (and other APIs) would make this parallelization easy, by their host/kernel design, but it is apparently not required.
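As a minimal sketch of what "splitting across multiple threads" means in practice, the Python below breaks one bundle of identical, independent operations into contiguous chunks and hands each chunk to a worker thread. The function names (`saxpy_chunk`, `parallel_saxpy`) are hypothetical and this is not Intel's tooling. CPython's global interpreter lock also means this particular version will not actually run faster; it only shows the shape of the decomposition that compiled, vectorized code on a many-core part would need.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_chunk(a, xs, ys):
    """Compute a*x + y for one chunk; every element is independent of the others."""
    return [a * x + y for x, y in zip(xs, ys)]


def parallel_saxpy(a, xs, ys, workers=4):
    """Split the inputs into contiguous chunks and process each on its own thread."""
    n = len(xs)
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    ranges = [(start, min(start + step, n)) for start in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda r: saxpy_chunk(a, xs[r[0]:r[1]], ys[r[0]:r[1]]), ranges
        )
        # map() yields results in input order, so the chunks join back correctly.
        return [value for part in parts for value in part]


if __name__ == "__main__":
    print(parallel_saxpy(2.0, [1.0, 2.0, 3.0, 4.0, 5.0], [10.0] * 5, workers=2))
```

The inner loop of `saxpy_chunk` is the part a compiler could map onto wide vector registers, while the chunking is the part that maps onto cores and threads. Software that has neither structure is the kind that would "just run slowly" here.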
It is a cool way that Intel arrives at the same goal, based on their background. Especially when you mix-and-match Xeons and Xeon Phis on the same computer, it is a push toward heterogeneous computing -- with a lot of specialized threads backing up a handful of strong ones. I just wonder if providing a more-direct method of programming will really help developers finally adopt massively parallel coding practices.
I mean, without even considering GPU compute, how efficient is most software at splitting into even two threads? Four threads? Eight threads? Can this help drive heterogeneous development? Or will this product simply try to appeal to those who are already considering it?
Subject: Processors | June 23, 2014 - 04:05 PM | Jeremy Hellstrom
Tagged: amd, fx 9590, vishera
Hardware Canucks have just let out AMD's secret about a new take on a Vishera processor: the FX-9590, which will come with a Cooler Master Seidon 120 AIO LCS that adds $40 to the original $320 price tag. The base clock of the 8 cores will still be 4.7GHz, with a 5GHz boost, but with a TDP of 219W the watercooler should allow the boost clock to be maintained longer. If you ever planned on overclocking the FX-9590 but never picked it up because of the challenge of cooling it, then here is your chance.
"It all started with a tweet. AMD teased an unnamed new FX-series chip on Twitter and we've got the inside track. It's a refreshed 5GHz FX-9590 with an included water cooling unit."
Here are some more Processor articles from around the web:
- AMD A10-7850K (Kaveri) @ Bjorn3d
- Intel Core i7 4790K Devil’s Canyon Overclocking @ Kitguru
- Intel Core i7 4790K: Devil's Canyon Benchmarks On Ubuntu Linux @ Phoronix
- Intel Fourth Generation Core i7 4790K Review @ OCC
- Intel Devil's Canyon i7-4790K Performance Review @ Hardware Canucks
- Intel Core i7-4790 (Haswell Refresh) @ techPowerUp
- Intel Core i7-4790K Devil's Canyon Processor Review @ Legit Reviews
- Intel Pentium 20th Anniversary Edition G3258 CPU Review @ Madshrimps
Subject: Processors, Mobile | June 23, 2014 - 01:08 PM | Ryan Shrout
Tagged: snapdragon, qualcomm, gaming, Android, adreno
Today Qualcomm has published a 22-page white paper that keys in on the company's focus around Android gaming and the benefits that Qualcomm SoCs offer. As the dominant SoC vendor in the Android ecosystem of smartphones, tablets and handhelds (shipping more than 32% in Q2 of 2013) QC is able to offer a unique combination of solutions to both developers and gamers that push Android gaming into higher fidelity with more robust game play.
According to the white paper, Android gaming is the fastest growing segment of the gaming market with a 30% compound annual growth rate from 2013 to 2015, as projected by Gartner. Experiences for mobile games have drastically improved since Android was released in 2008 with developers like Epic Games and the Unreal Engine pushing visuals to near-console and near-PC qualities.
Qualcomm is taking a heterogeneous approach to address the requirements of gaming that include AI execution, physics simulation, animation, low latency input and high speed network connectivity in addition to high quality graphics and 3D rendering. Though not directly a part of the HSA standards still in development, the many specialized engines that Qualcomm has developed for its Snapdragon SoC processors including traditional CPUs, GPUs, DSPs, security and connectivity allow the company to create a solution that is built for Android gaming dominance.
In the white paper Qualcomm dives into the advantages that the Krait CPU architecture offers for CPU-based tasks as well as the power of the Adreno 4x series of GPUs that offer both raw performance and the flexibility to support current and future gaming APIs. All of this is done with single-digit wattage draw and a passive, fanless design and points to the huge undertaking that mobile gaming requires from an engineering and implementation perspective.
For developers, the ability to target Snapdragon architectures with a single code path that can address a scalable product stack allows for the least amount of development time and the most return on investment possible. Qualcomm continues to support the development community with tools and assistance to bring out the peak performance of Krait and Adreno to get games running on lower power parts as well as the latest and upcoming generations of SoCs in flagship devices.
It is great to see Qualcomm focus on this aspect of the mobile market, and the challenges it presents require strong dedication from these engineering teams. Being able to create compelling gaming experiences with high quality imagery while maintaining the required power envelope is a task that many other companies have struggled with.
Check out the new landing page over at Qualcomm if you are interested in more technical information as well as direct access to the white paper detailing the work Qualcomm is putting into its Snapdragon line of SoC for gamers.
Subject: Editorial, General Tech, Graphics Cards, Processors, Chipsets | June 13, 2014 - 06:45 PM | Scott Michaud
Tagged: x86, restructure, gpu, arm, APU, amd
According to VR-Zone, AMD has reworked their business, last Thursday, sorting each of their projects into two divisions and moving some executives around. The company is now segmented into the "Enterprise, Embedded, and Semi-Custom Business Group", and the "Computing and Graphics Business Group". The company used to be divided between "Computing Solutions", which handled CPUs, APUs, chipsets, and so forth, "Graphics and Visual Solutions", which is best known for GPUs but also contains console royalties, and "All Other", which was... everything else.
Lisa Su, former general manager of global business, has moved up to Chief Operating Officer (COO), along with other changes.
This restructure makes sense for a couple of reasons. First, it pairs some unprofitable ventures with other, highly profitable ones. AMD's graphics division has been steadily adding profitability to the company while its CPU division has been mostly losing money. Secondly, "All Other" is about as nebulous as a name can get. Instead of having three unbalanced divisions, one of which makes no sense to someone glancing at AMD's quarterly earnings reports, they should now have two, roughly equal segments.
At the very least, it should look better to an uninformed investor. Someone who does not know the company might look at the sheet and assume that, if AMD divested from everything except graphics, the company would be profitable. If, you know, they did not know that console contracts came into their graphics division because their compute division had x86 APUs, and so forth. This setup is now more aligned to customers, not products.
A refresh for Haswell
Intel is not very good at keeping secrets recently. Rumors of a refreshed Haswell line of processors have been circulating for most of 2014. In March, it not only confirmed that release but promised an even more exciting part called Devil's Canyon. The DC parts are still quad-core Haswell processors built on Intel's 22nm process technology, but change a few specific things.
Intel spent some time on the Devil's Canyon Haswell processors to improve the packaging and thermals for overclockers and enthusiasts. The thermal interface material (TIM) that lies in between the die and the heat spreader has been updated to a next-generation polymer TIM (NGPTIM). The change should improve cooling performance of all currently shipping cooling solutions (air or liquid), but it is still a question just HOW MUCH this change will actually matter.
You can also tell from the photo comparison above that Intel has added capacitors to the back of the processor to "smooth" power delivery. This, in combination with the NGPTIM, should enable a bit more headroom for clock speeds with the Core i7-4790K.
In fact, there are two Devil's Canyon processors being launched this month. The Core i7-4790K will sell for $339, the same price as the Core i7-4770K, while the Core i5-4690K will sell for $242. The lower end option is a 3.5 GHz base clock, 3.9 GHz Turbo clock quad-core CPU without HyperThreading. While a nice step over the Core i5-4670K, it's only 100 MHz faster. Clearly the Core i7-4790K is the part everyone is going to be scrambling to buy.
Another interesting change is that both the Core i7-4790K and the Core i5-4690K enable support for both Intel's VT-d virtualization IO technology and Intel's TSX-NI transactional memory instructions. This makes them the first enthusiast-grade unlocked processors from Intel to support them!
As Intel states it, the Core i7-4790K and the Core i5-4690K have been "designed to be used in conjunction with the Z97 chipset." That being said, at least one motherboard manufacturer, ASUS, has released limited firmware updates to support the Devil's Canyon parts on Z87 products. Not all motherboards are going to be capable, and not all vendors are going to spend the time to integrate support, so keep an eye on the support page for your specific motherboard.
The CPU itself looks no different on the top, save for the updated model numbering.
Core i7-4790K on the left, Core i7-4770K on the right
On the back you can see the added capacitors that help with stable overclocking.
The clock speed advantage that the Core i7-4790K provides over the Core i7-4770K should not be overlooked, even before overclocking is taken into consideration. A 500 MHz base clock boost is 14% higher in this case and in those specific CPU-limited tasks, you should see very high scaling.
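For anyone who wants to check that 14% figure, a quick sketch of the arithmetic follows. The 3.5 GHz base clock for the Core i7-4770K is an assumption not stated in the paragraph above (it is the value consistent with a 500 MHz bump working out to about 14%), and `percent_uplift` is just an illustrative helper name.

```python
def percent_uplift(old_ghz, new_ghz):
    """Relative increase from old_ghz to new_ghz, expressed as a percentage."""
    return (new_ghz - old_ghz) / old_ghz * 100.0


BASE_4770K_GHZ = 3.5                    # assumption: Core i7-4770K base clock
BASE_4790K_GHZ = BASE_4770K_GHZ + 0.5   # the 500 MHz boost quoted above

uplift = percent_uplift(BASE_4770K_GHZ, BASE_4790K_GHZ)
print(round(uplift, 1))
```

That percentage is an upper bound on what a purely clock-bound workload could gain; anything waiting on memory, storage, or the GPU will see less.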
Subject: Processors | June 5, 2014 - 06:32 PM | Jeremy Hellstrom
Tagged: baytrail, linux, N2820, ubuntu 14.04, Linux 3.13, Linux 3.15, mesa, nuc
It would seem that installing Linux on your brand new Bay Trail powered NUC will cost you a bit of performance. The testing Phoronix has performed on the Intel NUC DN2820FYKH proves that it can handle running Linux without a hitch; however, you will find that your overall graphical performance will dip a bit. Using Mesa 10.3 and both the current 3.13 kernel and the 3.15 development kernel, Phoronix saw a small delta in performance between Ubuntu 14.04 and Win 8.1 ... until they hit the OpenGL performance. As there is still no full OpenGL 4.0+ support, there were tests that could not be run, and even with the tests that could be, there was a very large performance gap. Do not let this worry you; as they point out in the article, there is a dedicated team working on full compliance and you can expect updated results in the near future.
"A few days ago my benchmarking revealed Windows 8.1 is outperforming Ubuntu Linux with the latest Intel open-source graphics drivers on Haswell hardware. I have since conducted tests on the Celeron N2820 NUC, and sadly, the better OpenGL performance is found with Microsoft's operating system."
Here are some more Processor articles from around the web:
- NVIDIA Tegra K1 Compared To AMD AM1 APUs @ Phoronix
- AMD's New Athlon/Semprons Give Old Phenom CPUs A Big Run For The Money @ Phoronix
- Overclocking The AMD AM1 Athlon & Sempron APUs @ Phoronix
- AMD Athlon 5350 "Kabini" APU Review @ HiTech Legion
- Athlon 5350 and Sempron 3850 Processors (Kabini) and Socket AM1 Platform Review @ X-bit Labs
- AMD A10-7850K @ X-bit Labs
- Intel Haswell Refresh Reviewed: Core i7-4790, i5-4690, i5-4590 and i5-4460 Tested @ Madshrimps
- Intel Core i7-4790, i5-4690, i5-4590, i5-4460, i3-4360, i3-4350 and i3-4150 @ X-bit Labs
Subject: Processors, Mobile | June 4, 2014 - 11:00 AM | Ryan Shrout
Tagged: computex, computex 2014, arm, cavium, thunderx
While much of the news coming from Computex was centered around PC hardware, many of ARM's partners are making waves as well. Take Cavium for example, introducing the ThunderX CN88XX family of processors. With a completely custom ARMv8 architectural core design, the ThunderX processors will range from 24 to 48 cores and are targeted at large volume servers and cloud infrastructure. 48 cores!
The ThunderX family will be the first SoC to scale up to 48 cores and, with a clock speed of 2.5 GHz and 16MB of L2 cache, should offer some truly impressive performance levels. Cavium also claims that it is the first socket-coherent ARM processor, using the Cavium Coherent Processor Interconnect. The I/O capacity stretches into the hundreds of Gigabits, and quad channel DDR3 and DDR4 memory at speeds up to 2.4 GHz keeps the processors fed with work.
Here is the breakdown on the ThunderX families.
ThunderX_CP: Up to 48 highly efficient cores along with integrated virtSOC, dual socket coherency, multiple 10/40 GbE and high memory bandwidth. This family is optimized for private and public cloud web servers, content delivery, web caching, search and social media workloads.
ThunderX_ST: Up to 48 highly efficient cores along with integrated virtSOC, multiple SATAv3 controllers, 10/40 GbE & PCIe Gen3 ports, high memory bandwidth, dual socket coherency, and scalable fabric for east-west as well as north-south traffic connectivity. This family includes hardware accelerators for data protection/integrity/security, user to user efficient data movement (RoCE) and compressed storage. This family is optimized for Hadoop, block & object storage, distributed file storage and hot/warm/cold storage type workloads.
ThunderX_SC: Up to 48 highly efficient cores along with integrated virtSOC, 10/40 GbE connectivity, multiple PCIe Gen3 ports, high memory bandwidth, dual socket coherency, and scalable fabric for east-west as well as north-south traffic connectivity. The hardware accelerators include Cavium’s industry leading, 4th generation NITROX and TurboDPI technology with acceleration for IPSec, SSL, Anti-virus, Anti-malware, firewall and DPI. This family is optimized for Secure Web front-end, security appliances and Cloud RAN type workloads.
ThunderX_NT: Up to 48 highly efficient cores along with integrated virtSOC, 10/40/100 GbE connectivity, multiple PCIe Gen3 ports, high memory bandwidth, dual socket coherency, and scalable fabric with feature rich capabilities for bandwidth provisioning, QoS, traffic shaping and tunnel termination. The hardware accelerators include high packet throughput processing, network virtualization and data monitoring. This family is optimized for media servers, scale-out embedded applications and NFV type workloads.
We spoke with ARM earlier this year about its push into the server market and it is partnerships like these that will begin the ramp up to widespread adoption of ARM-based server infrastructure. The ThunderX family will begin sampling in early Q4 2014 and production should be available by early 2015.
Kaveri Goes Mobile
The processor market is in an interesting place today. At the high end of the market Intel continues to stand pretty much unchallenged, ranging from the Ivy Bridge-E at $1000 to the $300 Haswell parts available for DIY users. The same could really be said for the mobile market - if you want a high performance part the default choice continues to rest with Intel. But AMD has some interesting options that Intel can't match when you start to enter the world of the mainstream notebook. The APU was slow to develop but it has placed AMD in a unique position, separated from the Intel processors with a more or less reversed compute focus. While Intel dominates in the performance on the x86 side of things, the GPU in AMD's latest APUs continues to lead in gaming and compute performance.
The biggest problem for AMD is that the computing software ecosystem still has not caught up with the performance that a GPU can provide. With the exception of games, the GPU in a notebook or desktop remains underutilized. Certain software vendors are making strides - see the changes in video transcoding and image manipulation - but there is still some ground AMD needs to accelerate down.
Today we are looking at the mobile version of Kaveri, AMD's latest entry into the world of APUs. This processor combines the latest AMD processor architecture with a GCN-based graphics design for a pretty advanced part. When the desktop version of this processor was released, we wrote quite a bit about the architecture and the technological advancements made in it, including becoming the first processor that is fully HSA compliant. I won't be diving into the architecture details here since we covered them so completely back in January just after CES.
The mobile version of Kaveri is basically identical in architecture with some changes for better power efficiency. The flagship part will ship with 12 Compute Cores (4 Steamroller x86 cores and 8 GCN cores) and will support all the same features of GCN graphics designs including the new Mantle API.
Early in the spring we heard rumors that the AMD FX brand was going to make a comeback! Immediately enthusiasts were thinking up ways AMD could compete against the desktop Core i7 parts from Intel; could it be with 12 cores? DDR4 integration?? As it turns out...not so much.
Subject: Graphics Cards, Processors | June 3, 2014 - 02:10 PM | Ryan Shrout
Tagged: Intel, amd, richard huddy
Interesting news is crossing the ocean today as we learn that Richard Huddy, who has previously had stints at NVIDIA, ATI, AMD and most recently, Intel, is teaming up with AMD once again. Richard brings with him years of experience and innovation in the world of developer relations and graphics technology. Huddy is often called "the Godfather" of DirectX, and AMD wants to prove to the community that it is taking PC gaming seriously.
The official statement from AMD follows:
AMD is proud to announce the return of the well-respected authority in gaming, Richard Huddy. After three years away from AMD, Richard returns as AMD's Gaming Scientist in the Office of the CTO - he'll be serving as a senior advisor to key technology executives, like Mark Papermaster, Raja Koduri and Joe Macri. AMD is extremely excited to have such an industry visionary back. Having spent his professional career with companies like NVIDIA, Intel and ATI, and having led the worldwide ISV engineering team for over six years at AMD, Mr. Huddy has a truly unique perspective on the PC and Gaming industries.
Mr. Huddy rejoins AMD after a brief stint at Intel, where he had a major impact on their graphics roadmap. During his career Richard has made enormous contributions to the industry, including the development of DirectX and a wide range of visual effects technologies. Mr. Huddy’s contributions in gaming have been so significant that he was immortalized as ‘The Scientist’ in Max Payne (if you’re a gamer, you’ll see the resemblance immediately).
Kitguru has a video from Richard Huddy explaining his reasoning for the move back to AMD.
This move points AMD in a very interesting direction going forward. The creation of the Mantle API and the debate around AMD's developer relations programs are going to be hot topics as we move into the summer and I am curious how quickly Huddy thinks he can have an impact.
I have it on good authority we will find out very soon.