Server and Workstation Upgrades
Today, on the eve of the Intel Developer Forum, the company is taking the wraps off its new server and workstation class high performance processors, Xeon E5-2600 v3. Known previously by the code name Haswell-EP, the release marks the entry of the latest microarchitecture from Intel to multi-socket infrastructure. Though we don't have hardware in-house to offer you benchmarks quite yet, the details Intel shared with me last month in Oregon are simply stunning.
Starting with the E5-2600 v3 processor overview, there are more changes in this product transition than we saw in the move from Sandy Bridge-EP to Ivy Bridge-EP. First and foremost, the v3 Xeons will be available in core counts as high as 18, with HyperThreading allowing for 36 accessible threads in a single CPU socket. A new socket, LGA2011-v3 or R3, allows the Xeon platforms to run a quad-channel DDR4 memory system, very similar to the upgrade we saw with the Haswell-E Core i7-5960X processor we reviewed just last week.
The move to a Haswell-based microarchitecture also means that the Xeon line of processors is getting AVX 2.0, known also as Haswell New Instructions, allowing for 2x the FLOPS per clock per core. It also introduces some interesting changes to Turbo Mode and power delivery we'll discuss in a bit.
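To see where a 2x figure like that can come from, here is a rough back-of-the-envelope sketch of peak double-precision FLOPS per clock per core. The pipe counts are my own assumptions based on public descriptions of the two designs (one 256-bit add pipe plus one 256-bit multiply pipe on Ivy Bridge, two 256-bit fused multiply-add pipes on Haswell), and the function name is purely illustrative.

```python
def peak_flops_per_clock(vector_bits, element_bits, pipes, flops_per_op):
    """Peak floating-point operations per clock for a single core."""
    lanes = vector_bits // element_bits  # elements processed per vector instruction
    return lanes * pipes * flops_per_op

# Assumption: Ivy Bridge can issue one 256-bit add and one 256-bit multiply per clock.
ivy_bridge_dp = peak_flops_per_clock(256, 64, pipes=2, flops_per_op=1)

# Assumption: Haswell can issue two 256-bit fused multiply-adds per clock,
# and each FMA counts as two floating-point operations.
haswell_dp = peak_flops_per_clock(256, 64, pipes=2, flops_per_op=2)

print(ivy_bridge_dp, haswell_dp, haswell_dp / ivy_bridge_dp)
```

These are theoretical peaks only; real workloads have to be written around fused multiply-add to get anywhere near them.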
Maybe the most interesting architectural change to the Haswell-EP design is per core P-states, allowing each of the up to 18 cores running on a single Xeon processor to run at independent voltages and clocks. This is something that the consumer variants of Haswell do not currently support - every core is tied to the same P-state. It turns out that when you have up to 18 cores on a single die, this ability is crucial to supporting maximum performance on a wide array of compute workloads and to maintain power efficiency. This is also the first processor to allow independent uncore frequency scaling, giving Intel the ability to improve performance with available headroom even if the CPU cores aren't the bottleneck.
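A toy model helps show why per core P-states matter at this core count. The sketch below uses the textbook approximation that dynamic power scales with voltage squared times frequency. The two P-states, the number of busy cores, and every function name are hypothetical values I picked for illustration; they are not Intel's real voltage and frequency tables.

```python
def dynamic_power(voltage, freq_ghz, capacitance=1.0):
    """Textbook CMOS dynamic power approximation, P ~ C * V^2 * f (arbitrary units)."""
    return capacitance * voltage ** 2 * freq_ghz

def package_power(states):
    """Sum the dynamic power of every core, given one (voltage, GHz) pair per core."""
    return sum(dynamic_power(v, f) for v, f in states)

# Hypothetical P-states as (voltage, GHz) pairs; illustrative numbers only.
HIGH = (1.0, 3.0)
LOW = (0.8, 1.2)

cores, busy = 18, 4  # assume only 4 of the 18 cores have real work to do

# Shared P-state: every core is dragged up to the state the busy cores need.
shared = package_power([HIGH] * cores)

# Per core P-states: idle cores drop to the low state independently.
per_core = package_power([HIGH] * busy + [LOW] * (cores - busy))

print(f"shared: {shared:.1f}  per-core: {per_core:.1f}  saved: {1 - per_core / shared:.0%}")
```

The model ignores leakage and the uncore entirely, so treat the output as a direction rather than a measurement.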
Pushing the 8 Cores
It seems like yesterday when I last talked about an AMD refresh! Oh wait, it almost was. Some weeks ago I was able to cover the latest AMD APU offerings that helped to flesh out the Kaveri lineup. We thought AMD was done for a while. Color us wrong. AMD pulled out all the stops and set up an AM3+ refresh! There is a little excitement here, I guess. I am trying to contain the tongue-in-cheek lines that I am oh-so-tempted to write.
AMD is refreshing their FX lineup in the waning days of Summer!
Let me explain the situation from my point of view. The FX lineup for AM3+ has not done a whole lot since the initial release of the Piledriver based FX-8350 and family (Vishera). Piledriver was a pretty significant update from Bulldozer as it slightly improved IPC and greatly improved power consumption (all the while helping to improve clockspeed by a small degree). There were two updates before this one, but they did not receive nearly as much coverage. These updates were the FX-6350 and the FX-9000 series. The FX-6350 is quite popular with the budget enthusiast crowd who still had not moved over to the Intel side of the equation. The FX-9000 series was initially OEM only and priced as high as $1000 at the high end. In the time since the original Vishera chips were released, we have seen the Intel Ivy Bridge and Haswell architectures (along with a small Haswell refresh in the 2nd gen products and the latest Socket 2011 units).
Revamped Enthusiast Platform
Join us at 12:30pm PT / 3:30pm ET as Intel's Matt Dunford joins us for a live stream event to discuss the release of Haswell-E and the X99 platform!! Find us at http://www.pcper.com/live!!
Sometimes writing these reviews can be pretty anti-climactic. With all of the official and leaked information released about Haswell-E over the last six to nine months, there isn't much more to divulge that can truly be called revolutionary. Yes, we are looking at the new king of the enthusiast market with an 8-core processor that not only brings a 33% increase in core count over the previous generation Ivy Bridge-E and Sandy Bridge-E platforms, but also includes the adoption of the DDR4 memory specification, which allows for high density and high speed memory subsystems.
And along with the new processor on a modified socket (though still LGA2011) comes a new chipset with some interesting new features. If you were left wanting for USB 3.0 or Thunderbolt on X79, then you are going to love what you see with X99. Did you think you needed some more SATA ports to really liven up your pool of hard drives? Retail boards are going to have you covered.
Again, just like last time, you will find a set of three processors that are coming into the market at the same time. These offerings range from the $999 price point and go down to the much more reasonable cost of $389. But this time there are more interesting decisions to be made based on specification differences in the family. Do the changes that Intel made in the sub-$1000 SKUs make it a better or worse buy for users looking to finally upgrade?
Haswell-E: A New Enthusiast Lineup from Intel
Today's launch of the Intel Core i7-5960X processor continues on the company's path of enthusiast branded parts that are built off of a subset of the workstation and server market. It is no secret that some Xeon branded processors will work in X79 motherboards and the same is true of the upcoming Haswell-EP series (with its X99 platform) launching today. As an enthusiast though, I think we can agree that it doesn't really matter how a processor like this comes about, as long as it continues to occur well into the future.
The Core i7-5960X processor is an 8-core, 16-thread design built on what is essentially the same architecture we saw with the mainstream Haswell parts launched in June of 2013. There are some important differences of course, including the lack of integrated graphics and the move from DDR3 to DDR4 for system memory. The underlying microarchitecture remains unchanged, though. Previously known as the Haswell-E platform, the Core i7-5960X continues Intel's trend of releasing enthusiast/workstation grade platforms that are based on an existing mainstream architecture.
Since the introduction of the Haswell line of CPUs, the Internet has been aflame with how hot the CPUs run. Speculation ran rampant on the cause with theories abounding about the lesser surface area and inferior thermal interface material (TIM) in between the CPU die surface and the underside of the CPU heat spreader. It was later confirmed that Intel had changed the TIM interfacing the CPU die surface to the heat spreader with Haswell, leading to the hotter than expected CPU temperatures. This increase in temperature led to inconsistent core-to-core temperatures as well as vastly inferior overclockability of the Haswell K-series chips over previous generations.
A few of the more adventurous enthusiasts took it upon themselves to use inventive ways to address the heat concerns surrounding the Haswell by delidding the processor. The delidding procedure involves physically removing the heat spreader from the CPU, exposing the CPU die. Some individuals choose to clean the existing TIM from the core die and heat spreader underside, applying superior TIM such as metal or diamond-infused paste or even the Coollaboratory Liquid Ultra metal material and fixing the heat spreader back in place. Others choose a more radical solution, removing the heat spreader from the equation entirely for direct cooling of the naked CPU die. This type of cooling method requires use of a die support plate, such as the MSI Die Guard included with the MSI Z97 XPower motherboard.
Whichever outcome you choose, you must first remove the heat spreader from the CPU's PCB. The heat spreader itself is fixed in place with black RTV-type material ensuring a secure and air-tight seal, protecting the fragile die from outside contaminants and influences. Removal can be done in multiple ways with two of the most popular being the razor blade method and the vise method. With both methods, you are attempting to separate the CPU PCB from the heat spreader without damaging the CPU die or components on the top or bottom sides of the CPU PCB.
Coming in 2014: Intel Core M
The era of Broadwell begins in late 2014 and based on what Intel has disclosed to us today, the processor architecture appears to be impressive in nearly every aspect. Coming off the success of the Haswell design in 2013 built on 22nm, the Broadwell-Y architecture will not only be the first to market with a new microarchitecture, but will be the flagship product on Intel’s new 14nm tri-gate process technology.
The Intel Core M processor, as Broadwell-Y has been dubbed, includes impressive technological improvements over previous low power Intel processors that result in lower power, thinner form factors, and longer battery life designs. Broadwell-Y will stretch into even lower TDPs enabling 9mm or smaller fanless designs that maintain current battery lifespans. A new 2nd generation FIVR with modified power delivery design allows for even thinner packaging and a wider range of dynamic frequencies than before. And of course, along with the shift comes an updated converged core design and improved graphics performance.
All of these changes are in service to what Intel claims is a re-invention of the notebook. Compared to 2010 when the company introduced the original Intel Core processor, thus changing Intel’s direction almost completely, Intel Core M and the Broadwell-Y changes will allow for some dramatic platform changes.
Notebook thickness will go from 26mm (~1.02 inches) down to as small as 7mm (~0.28 inches) as Intel has proven with its Llama Mountain reference platform. Reductions in total thermal dissipation of 4x while improving core performance by 2x and graphics performance by 7x are something no other company has been able to do over the same time span. And in the end, one of the most important features for the consumer is getting double the useful battery life with a smaller (and lighter) battery required for it.
But these kinds of advancements just don’t happen by chance – ask any other semiconductor company that is either trying to keep ahead of or catch up to Intel. It takes countless engineers and endless hours to build a platform like this. Today Intel is sharing some key details on how it was able to make this jump including the move to a 14nm FinFET / tri-gate transistor technology and impressive packaging and core design changes to the Broadwell architecture.
Intel 14nm Technology Advancement
Intel consistently creates and builds the most impressive manufacturing and production processes in the world and it has helped it maintain a market leadership over rivals in the CPU space. It is also one of the key tenets that Intel hopes will help them deliver on the world of mobile including tablets and smartphones. At the 22nm node Intel was the first to offer 3D transistors, what they called tri-gate and others refer to as FinFET. By focusing on power consumption rather than top level performance Intel was able to build the Haswell design (as well as Silvermont for the Atom line) with impressive performance and power scaling, allowing thinner and less power hungry designs than with previous generations. Some enthusiasts might think that Intel has done this at the expense of high performance components, and there is some truth to that. But Intel believes that by committing to this space it builds the best future for the company.
Filling the Product Gaps
In the first several years of my PCPer employment, I typically handled most of the AMD CPU refreshes. These were rather standard affairs that involved small jumps in clockspeed and performance. These happened every 6 to 8 months, with the bigger architectural shifts happening some years apart. We are finally seeing a new refresh of the AMD APU parts after the initial release of Kaveri to the world at the beginning of this year. This update is different. Unlike previous years, there are no faster parts than the already available A10-7850K.
This refresh deals with fleshing out the rest of the Kaveri lineup with products that address different TDPs, markets, and prices. The A10-7850K is still the king when it comes to performance on the FM2+ socket (as long as users do not pay attention to the faster CPU performance of the A10-6800K). The initial launch in January also featured another part that never became available until now; the A8-7600 was supposed to be available some months ago, but is only making it to market now. The 7600 part was unique in that it had a configurable TDP that went from 65 watts down to 45 watts. The 7850K on the other hand was configurable from 95 watts down to 65 watts.
So what are we seeing today? AMD is releasing three parts to address the lower power markets that AMD hopes to expand their reach into. The A8-7600 was again detailed back in January, but never released until recently. The other two parts are brand new. The A10-7800 is a 65 watt TDP part with a cTDP that goes down to 45 watts. The other new chip is the A6-7400K which is unlocked, has a configurable TDP, and looks to compete directly with Intel’s recently released 20th Anniversary Pentium G3258.
When Magma Freezes Over...
Intel confirms that they have approached AMD about access to their Mantle API. The discussion, despite being clearly labeled as "an experiment" by an Intel spokesperson, was initiated by them -- not AMD. According to AMD's Gaming Scientist, Richard Huddy, via PCWorld, AMD's response was, "Give us a month or two" and "we'll go into the 1.0 phase sometime this year" which only has about five months left in it. When the API reaches 1.0, anyone who wants to participate (including hardware vendors) will be granted access.
AMD inside Intel Inside???
I do wonder why Intel would care, though. Intel has the fastest per-thread processors, and their GPUs are not known to be workhorses that are held back by API call bottlenecks, either. Of course, that is not to say that I cannot see any reason, however...
A refresh for Haswell
Intel is not very good at keeping secrets recently. Rumors of a refreshed Haswell line of processors have been circulating for most of 2014. In March, it not only confirmed that release but promised an even more exciting part called Devil's Canyon. The DC parts are still quad-core Haswell processors built on Intel's 22nm process technology, but change a few specific things.
Intel spent some time on the Devil's Canyon Haswell processors to improve the packaging and thermals for overclockers and enthusiasts. The thermal interface material (TIM) that lies in between the die and the heat spreader has been updated to a next-generation polymer TIM (NGPTIM). The change should improve cooling performance of all currently shipping cooling solutions (air or liquid), but it is still a question just HOW MUCH this change will actually matter.
You can also tell from the photo comparison above that Intel has added capacitors to the back of the processor to "smooth" power delivery. This, in combination with the NGPTIM, should enable a bit more headroom for clock speeds with the Core i7-4790K.
In fact, there are two Devil's Canyon processors being launched this month. The Core i7-4790K will sell for $339, the same price as the Core i7-4770K, while the Core i5-4690K will sell for $242. The lower end option is a 3.5 GHz base clock, 3.9 GHz Turbo clock quad-core CPU without HyperThreading. While a nice step over the Core i5-4670K, it's only 100 MHz faster. Clearly the Core i7-4790K is the part everyone is going to be scrambling to buy.
Another interesting change is that both the Core i7-4790K and the Core i5-4690K enable support for both Intel's VT-d virtualization IO technology and Intel's TSX-NI transactional memory instructions. This makes them the first enthusiast-grade unlocked processors from Intel to support them!
As Intel states it, the Core i7-4790K and the Core i5-4690K have been "designed to be used in conjunction with the Z97 chipset." That being said, at least one motherboard manufacturer, ASUS, has released limited firmware updates to support the Devil's Canyon parts on Z87 products. Not all motherboards are going to be capable, and not all vendors are going to spend the time to integrate support, so keep an eye on the support page for your specific motherboard.
The CPU itself looks no different on the top, save for the updated model numbering.
Core i7-4790K on the left, Core i7-4770K on the right
On the back you can see the added capacitors that help with stable overclocking.
The clock speed advantage that the Core i7-4790K provides over the Core i7-4770K should not be overlooked, even before overclocking is taken into consideration. A 500 MHz base clock boost is 14% higher in this case and in those specific CPU-limited tasks, you should see very high scaling.
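For anyone who wants to check the math, the quick sketch below works out the base clock uplift for both Devil's Canyon parts. The 3.5 GHz base clock for the Core i7-4770K and the 3.4 GHz base clock for the Core i5-4670K are my assumptions here (they follow from the 500 MHz and 100 MHz deltas quoted in the text), and the helper name is just for illustration.

```python
def percent_uplift(old_mhz, new_mhz):
    """Relative clock increase from old_mhz to new_mhz, in percent."""
    return (new_mhz - old_mhz) / old_mhz * 100

# Assumed base clocks in MHz, derived from the deltas quoted in the article.
i7_4770k, i7_4790k = 3500, 3500 + 500
i5_4670k, i5_4690k = 3500 - 100, 3500

i7_gain = round(percent_uplift(i7_4770k, i7_4790k), 1)
i5_gain = round(percent_uplift(i5_4670k, i5_4690k), 1)

print(f"Core i7-4790K over Core i7-4770K: {i7_gain}%")
print(f"Core i5-4690K over Core i5-4670K: {i5_gain}%")
```

The gap between the two results is a tidy way of seeing why the Core i7 is the more interesting of the two launches.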
Kaveri Goes Mobile
The processor market is in an interesting place today. At the high end of the market Intel continues to stand pretty much unchallenged, ranging from the Ivy Bridge-E at $1000 to the $300 Haswell parts available for DIY users. The same could really be said for the mobile market - if you want a high performance part the default choice continues to rest with Intel. But AMD has some interesting options that Intel can't match when you start to enter the world of the mainstream notebook. The APU was slow to develop but it has placed AMD in a unique position, separated from the Intel processors with a more or less reversed compute focus. While Intel dominates in the performance on the x86 side of things, the GPU in AMD's latest APUs continues to lead in gaming and compute performance.
The biggest problem for AMD is that the computing software ecosystem still has not caught up with the performance that a GPU can provide. With the exception of games, the GPU in a notebook or desktop remains underutilized. Certain software vendors are making strides - see the changes in video transcoding and image manipulation - but there is still some ground AMD needs to cover.
Today we are looking at the mobile version of Kaveri, AMD's latest entry into the world of APUs. This processor combines the latest AMD processor architecture with a GCN-based graphics design for a pretty advanced part. When the desktop version of this processor was released, we wrote quite a bit about the architecture and the technological advancements made in it, including becoming the first processor that is fully HSA compliant. I won't be diving into the architecture details here since we covered them so completely back in January just after CES.
The mobile version of Kaveri is basically identical in architecture with some changes for better power efficiency. The flagship part will ship with 12 Compute Cores (4 Steamroller x86 cores and 8 GCN cores) and will support all the same features of GCN graphics designs including the new Mantle API.
Early in the spring we heard rumors that the AMD FX brand was going to make a comeback! Immediately enthusiasts were thinking up ways AMD could compete against the desktop Core i7 parts from Intel; could it be with 12 cores? DDR4 integration?? As it turns out...not so much.
Another Boring Presentation...?
In my old age I am turning into a bit of a skeptic. It is hard to really blame a guy; we are surrounded by marketing and hype, both from inside companies and from their fans. When I first started to listen in on AMD’s Core Innovation Update presentation, I was not expecting much. I figured it would be a rehash of the past year, more talk about Mullins/Beema, and some nice words about some of the upcoming Kaveri mobile products.
I was wrong.
AMD decided to give us a pretty interesting look at what they are hoping to accomplish in the next three years. It was not all that long ago that AMD was essentially considered road kill, and there was a lot of pessimism that Rory Read and Co. could turn AMD around. Now after a couple solid years of growth, a laser-like focus on product development based on the IP strengths of the company, and a pretty significant cut of the workforce, we are seeing an AMD that is vastly different from the one that Dirk Meyer was in charge of (or Hector Ruiz for that matter). Their view for the future takes a pretty significant turn from where AMD was even 8 years ago. x86 certainly has a future for AMD, but the full-scale adoption of the ARM architecture looks to be what finally differentiates this company from Intel.
Look, I’m Amphibious!
AMD is not amphibious. They are working on being ambidextrous. Their goal is not only to develop and sell x86 based processors, but also be a prime moving force in the ARM market. AMD has survived against a very large, well funded, and aggressive organization for the past 35 years. They believe their experience here can help them break into, and thrive within, the ARM marketplace. Their goals are not necessarily to be in every smartphone out there, but they are leveraging the ARM architecture to address high growth markets that have a lot of potential.
There are really two dominant architectures in the world with ARM and x86. They power the vast majority of computing devices around the world. Sure, we still have some Power and MIPS implementations, but they are dwarfed by the combined presence of x86 and ARM in modern devices. The flexibility of x86 allows it to scale from the extreme mobile up to the highest performing clusters. ARM also has the ability to scale in performance from handhelds up to the server world, but so far their introduction into servers and HPC solutions has been minimal to non-existent. This is an area that AMD hopes to change, but it will not happen overnight. A lot of infrastructure is needed to get ARM into that particular area. Ask Intel how long it took for x86 to gain a handhold in the lucrative server and workstation markets.
AMD Makes some Lemonade...
I guess we could say that AMD has been rather busy lately. It seems that a significant amount of the content on PC Perspective this month revolved around the AMD AM1 platform. Before that we had the Kaveri products and the R7 265. AMD also reported some fairly solid growth over the past year with their graphics and APU lines. Things are not as grim and dire as they once were for the company. This is good news for consumers as they will continue to be offered competing solutions that will vie for that hard earned dollar.
AMD is continuing their releases for 2014 with the announcement of their latest low-power and mainstream mobile APUs. These are codenamed “Beema” and “Mullins”, but they are based on the year old Kabini chip. This may cause a few people to roll their eyes as AMD has had some fairly unimpressive refreshes in the past. We saw the rather meager improvements in clockspeed and power consumption with Brazos 2.0 a couple of years back, and it looked like this would be the case again for Beema and Mullins.
I was again expecting said meager improvements in power consumption and clockspeeds that we had received all those years ago with Brazos 2.0. Turns out I was wrong. This is a fairly major refresh which does a few things that I did not think were entirely possible, and I’m a rather optimistic person. So why is this release surprising? Let us take a good look under the hood.
AMD Brings Kabini to the Desktop
Perhaps we are performing a study of opposites? Yesterday Ryan posted his R9 295X2 review, which covers the 500 watt, dual GPU monster that will be retailing for $1499. A card that is meant for only the extreme enthusiast who has plenty of room in their case, plenty of knowledge about their power supply, and plenty of electricity and air conditioning to keep this monster at bay. The product that I am reviewing could not be any more different. Inexpensive, cool running, power efficient, and can be fit pretty much anywhere. These products can almost be viewed as polar opposites.
The interesting thing of course is that it shows how flexible AMD’s GCN architecture is. GCN can efficiently and effectively power the highest performing product in AMD’s graphics portfolio, as well as their lowest power offerings in the APU market. The performance scales very linearly when it comes to adding in more GCN compute cores.
The products that I am of course referring to are the latest Athlon and Sempron APUs that are based on the Kabini architecture which fuses Jaguar x86 cores with GCN compute cores. These APUs were announced last month, but we did not have the chance at the time to test them. Since then these products have popped up in a couple of places around the world, but this is the first time that reviewers have officially received product from AMD and their partners.
Low Power and Low Price
Back at CES earlier this year, we came across a couple of interesting motherboards that were neither AM3+ nor FM2+. These small, sparse, and inexpensive boards were actually based on the unannounced AM1 platform. This socket is actually the FS1b socket that is typically reserved for mobile applications which require the use of swappable APUs. The goal here is to provide a low cost, upgradeable platform for emerging markets where price is absolutely key.
AMD has not exactly been living on easy street for the past several years. Their CPU technologies have not been entirely competitive with Intel. This is their bread and butter. Helping to prop the company up though is a very robust and competitive graphics unit. The standalone and integrated graphics technology they offer are not only competitive, but also class leading in some cases. The integration of AMD’s GCN architecture into APUs has been their crowning achievement as of late.
This is not to say that AMD is totally deficient in their CPU designs. Their low power/low cost designs that started with the Bobcat architecture all those years back have always been very competitive in terms of performance, price, and power consumption. The latest iteration is the Kabini APU based on the Jaguar core architecture paired with GCN graphics. Kabini will be the part going into the FS1b socket that powers the AM1 platform.
Kabini is a four core processor (Jaguar) with a 128 unit GCN graphics part (8 GCN cores). These APUs will be rated at 25 watts up and down the stack. Even if they come with half the cores, it will still be a 25 watt part. AMD says that 25 watts is the sweet spot in terms of performance, cooling, and power consumption. Go lower than that and too much performance is sacrificed, and any higher it would make more sense to go with a Trinity/Richland/Kaveri solution. That 25 watt figure also encompasses the primary I/O functionality that typically resides on a standalone motherboard chipset. Kabini features 2 SATA 6G ports, 2 USB 3.0 ports, and 8 USB 2.0 ports. It also features multiple PCI-E lanes as well as a 4x PCI-E connection for external graphics. The chip also supports DisplayPort, HDMI, and VGA outputs. This is a true SOC from AMD that does a whole lot of work for not a whole lot of power.
Hybrid CrossFire that actually works
The road to redemption for AMD and its driver team has been a tough one. Since we first started to reveal the significant issues with AMD's CrossFire technology back in January of 2013 the Catalyst driver team has been hard at work on a fix, though I will freely admit it took longer to convince them that the issue was real than I would have liked. We saw the first steps of the fix released in August of 2013 with the release of the Catalyst 13.8 beta driver. It supported DX11 and DX10 games and resolutions of 2560x1600 and under (no Eyefinity support) but was obviously still less than perfect.
In October with the release of AMD's latest Hawaii GPU the company took another step by reorganizing the internal architecture of CrossFire on the chip level with XDMA. The result was frame pacing that worked on the R9 290X and R9 290 in all resolutions, including Eyefinity, though still left out older DX9 titles.
One thing that had not been addressed, at least not until today, were the issues that surrounded AMD's Hybrid CrossFire technology, now known as Dual Graphics. This is the ability for an AMD APU with integrated Radeon graphics to pair with a low cost discrete GPU to improve graphics performance and gaming experiences. Recently over at Tom's Hardware they discovered that Dual Graphics suffered from the exact same scaling issues as standard CrossFire; frame rates in FRAPS looked good but the actual perceived frame rate was much lower.
A little while ago a new driver made its way into my hands under the name of Catalyst 13.35 Beta X, a driver that promised to enable Dual Graphics frame pacing with Kaveri and R7 graphics cards. As you'll see in the coming pages, the fix definitely is working. And, as I learned after doing some more probing, the 13.35 driver is actually a much more important release than it at first seemed. Not only is Kaveri-based Dual Graphics frame pacing enabled, but Richland and Trinity are included as well. And even better, this driver will apparently fix resolutions higher than 2560x1600 in desktop graphics as well - something you can be sure we are checking on this week!
Just as we saw with the first implementation of Frame Pacing in the Catalyst Control Center, with the 13.35 Beta we are using today you'll find a new set of options in the Gaming section to enable or disable Frame Pacing. The default setting is On; which makes me smile inside every time I see it.
The hardware we are using is the same basic setup we used in my initial review of the AMD Kaveri A8-7600 APU. That includes the A8-7600 APU, an Asrock A88X mini-ITX motherboard, 16GB of DDR3 2133 MHz memory and a Samsung 840 Pro SSD. Of course for our testing this time we needed a discrete card to enable Dual Graphics and we chose the MSI R7 250 OC Edition with 2GB of DDR3 memory. This card will run you an additional $89 or so on Amazon.com. You could use either the DDR3 or GDDR5 versions of the R7 250 as well as the R7 240, but in our talks with AMD they seemed to think the R7 250 DDR3 was the sweet spot for the CrossFire implementation.
Both the R7 250 and the A8-7600 actually share the same number of SIMD units at 384, otherwise known as 384 shader processors or 6 Compute Units based on the new nomenclature that AMD is creating. However, the MSI card is clocked at 1100 MHz while the GPU portions of the A8-7600 APU are running at only 720 MHz.
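To put the clock gap in perspective, here is a small sketch that turns the shader counts and clocks above into theoretical single-precision throughput. It assumes the usual GCN figures of 64 shader processors per Compute Unit and 2 FLOPs per shader per clock (one fused multiply-add); the function name is mine, and none of this accounts for memory bandwidth, which matters a great deal on both parts.

```python
SHADERS_PER_CU = 64  # assumed GCN Compute Unit width

def peak_gflops(shaders, clock_mhz, flops_per_shader_per_clock=2):
    """Theoretical single-precision GFLOPS: shaders * FLOPs per clock * clock."""
    return shaders * flops_per_shader_per_clock * clock_mhz / 1000

shaders = 384
compute_units = shaders // SHADERS_PER_CU

r7_250_gflops = peak_gflops(shaders, 1100)   # MSI R7 250 OC Edition at 1100 MHz
a8_7600_gflops = peak_gflops(shaders, 720)   # A8-7600 integrated GPU at 720 MHz

print(f"Compute Units: {compute_units}")
print(f"R7 250: {r7_250_gflops:.1f} GFLOPS, A8-7600: {a8_7600_gflops:.1f} GFLOPS")
print(f"Clock ratio: {1100 / 720:.2f}x")
```

On paper the discrete card has roughly a 1.5x advantage, which makes the pairing lopsided and is part of why consistent frame pacing between the two GPUs is worth testing.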
So the question is, has AMD truly fixed the issues with frame pacing with Dual Graphics configurations, once again making the budget gamer feature something worth recommending? Let's find out!
The AMD Kaveri Architecture
Kaveri: AMD’s New Flagship Processor
How big is Kaveri? We already know the die size of it, but what kind of impact will it have on the marketplace? Has AMD chosen the right path by focusing on power consumption and HSA? Starting out an article with three questions in a row is a questionable tactic for any writer, but these are the things that first come to mind when considering a product the likes of Kaveri. I am hoping we can answer a few of these questions by the end of this article, but alas it seems as though the market will have the final say as to how successful this new architecture is.
AMD has been pursuing the “Future is Fusion” line for several years, but it can be argued that Kaveri is truly the first “Fusion” product that completes the overall vision for where AMD wants to go. The previous several generations of APUs were initially not all that integrated in a functional sense, but the complexity and completeness of that integration has been improved upon with each iteration. Kaveri takes this integration to the next step, and one which fulfills the promise of a truly heterogeneous computing solution. While AMD has the hardware available, we have yet to see if the software companies are willing to leverage the compute power afforded by a robust and programmable graphics unit powered by AMD’s GCN architecture.
(Editor's Note: The following two pages were written by our own Josh Walrath, discussing the technology and architecture of AMD Kaveri. Testing and performance analysis by Ryan Shrout starts on page 3.)
The first step in understanding Kaveri is taking a look at the process technology that AMD is using for this particular product. Since AMD divested itself of their manufacturing arm, they have had to rely on GLOBALFOUNDRIES to produce nearly all of their current CPUs and APUs. Bulldozer, Piledriver, Llano, Trinity, and Richland based parts were all produced on GF’s 32 nm PD-SOI process. The lower power APUs such as Brazos and Kabini have been produced by TSMC on their 40 nm and 28 nm processes respectively.
Kaveri will take a slightly different approach here. It will be produced by GLOBALFOUNDRIES, but it will forgo the SOI and utilize a bulk silicon process. 28 nm HKMG is very common around the industry, but few pure play foundries were willing to tailor their process to the direct needs of AMD and the Kaveri product. GF was able to do such a thing. APUs are a different kind of animal when it comes to fabrication, primarily because the two disparate units, CPU and GPU, require different characteristics to perform at the highest efficiency. As such, compromises had to be made.
More Details from Lisa Su
The executives at AMD like to break their own NDAs. Then again, they are the ones typically setting these NDA dates, so it isn’t a big deal. It is no secret that Kaveri has been in the pipeline for some time. We knew a lot of the basic details of the product, but there were certainly things that were missing. Lisa Su went up onstage and shared a few new details with us.
Kaveri will be made up of 4 “Steamroller” cores, which are enhanced versions of the previous Bulldozer/Trinity/Vishera families of products. Nearly everything in the processor is doubled. It now has dual decode, more cache, larger TLBs, and a host of other smaller features that all add up to greater single thread performance and better multi-threaded handling and performance. Integer performance will be improved, and the FPU/MMX/SSE unit now features 2 x 128 bit FMAC units which can “fuse” and support AVX 256.
However, there was no mention of the fabled 6 core Kaveri. At this time, it is unlikely that particular product will be launched anytime soon.
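As a quick illustration of what the two 128-bit FMAC units "fusing" for AVX 256 means in terms of vector width, the sketch below simply counts how many values fit in each configuration. The unit count and widths come from the details above; the 32-bit and 64-bit element sizes are the standard single and double precision formats. How the hardware actually schedules these operations is not something we have details on, so treat this as arithmetic only.

```python
# Figures from the article: two FMAC units, each 128 bits wide.
FMAC_UNITS = 2
FMAC_WIDTH_BITS = 128

# Standard IEEE 754 element sizes.
SINGLE_PRECISION_BITS = 32
DOUBLE_PRECISION_BITS = 64


def lanes(width_bits, element_bits):
    """How many elements of a given size fit in a vector of a given width."""
    return width_bits // element_bits


def fused_width_bits():
    """Combined width when both FMAC units work on one operation."""
    return FMAC_UNITS * FMAC_WIDTH_BITS


if __name__ == "__main__":
    print("Lanes per 128-bit unit (single):",
          lanes(FMAC_WIDTH_BITS, SINGLE_PRECISION_BITS))
    print("Fused width in bits:", fused_width_bits())
    print("Lanes when fused (single):",
          lanes(fused_width_bits(), SINGLE_PRECISION_BITS))
    print("Lanes when fused (double):",
          lanes(fused_width_bits(), DOUBLE_PRECISION_BITS))
```

In other words, each unit handles four single precision values on its own, and the pair together covers the eight values in a 256-bit AVX register.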
ARM is Serious About Graphics
Ask most computer users from 10 years ago who ARM is, and very few would give the correct answer. Some well informed people might mention “Intel” and “StrongARM” or “XScale”, but ARM remained a shadowy presence until we saw the rise of the smartphone. Since then, ARM has built up its brand, much to the chagrin of companies like Intel and AMD. Partners such as Samsung, Apple, Qualcomm, MediaTek, Rockchip, and NVIDIA have all worked with ARM to produce chips based on the ARMv7 architecture, with Apple being the first to release ARMv8 (64 bit) SOCs. Chips built on the multitude of ARM architectures are likely the most shipped in the world, ranging from very basic processors to the very latest Apple A7 SOC.
The ARMv7 and ARMv8 architectures are very power efficient, yet provide enough performance to handle the vast majority of tasks utilized on smartphones and tablets (as well as a handful of laptops). With the growth of visual computing, ARM also dedicated itself towards designing competent graphics portions of their chips. The Mali architecture is aimed at being an affordable option for those without access to their own graphics design groups (NVIDIA, Qualcomm), but competitive with others that are willing to license their IP out (Imagination Technologies).
ARM was in fact one of the first to license out the very latest graphics technology to partners in the form of the Mali-T600 series of products. These modules were among the first to support OpenGL ES 3.0 (compatible with 2.0 and 1.1) and DirectX 11. The T600 architecture is very comparable to Imagination Technologies’ Series 6 and the Qualcomm Adreno 300 series of products. Currently NVIDIA does not have a unified mobile architecture in production that supports OpenGL ES 3.0/DX11, but they are adapting the Kepler architecture to mobile and will be licensing it to interested parties. Qualcomm, which bought the Adreno group from AMD, does not license that design out (Adreno is an anagram of Radeon).
Another Next Unit of Computing
Just about a year ago Intel released a new product called the Next Unit of Computing, or NUC for short. The idea was to allow Intel's board and design teams to bring the efficient performance of the ultra low voltage processors to a desktop, and creative, form factor. By taking what is essentially Ultrabook hardware and putting it in a 4-in by 4-in design, Intel is attempting to rethink what the "desktop" computer is and how the industry develops for it.
We reviewed the first NUC last year, based on the Intel Ivy Bridge processor, and came away with a surprising amount of interest in the platform. It was (and is) a bit more expensive than many consumers are going to be willing to spend on such a "small" physical device, but the performance and feature set are compelling.
This time around Intel has updated the 4x4 enclosure a bit and upgraded the hardware from Ivy Bridge to Haswell. That alone should result in a modest increase in CPU performance with quite a bit of increase in the integrated GPU performance courtesy of the Intel HD Graphics 5000. Other changes are on the table too; let's take a look.
The Intel D54250WYK NUC is a bare bones system that will run you about $360. You'll need to buy system memory and an mSATA SSD for storage (wireless is optional) to complete the build.
A Whole New Atom Family
This past spring I spent some time with Intel at its offices in Santa Clara to learn about a brand new architecture called Silvermont. Built for and targeted at low power platforms like tablets and smartphones, Silvermont was not simply another refresh of the aging Atom processors that were all based on Pentium cores from years ago; instead Silvermont was built from the ground up for low power consumption and high efficiency to compete against the juggernaut that is ARM and its partners. My initial preview of the Silvermont architecture had plenty of detail about the change to an out-of-order architecture, the dual-core modules that comprise it and the power optimizations included.
Today, during the annual Intel Developer Forum held in San Francisco, we are finally able to reveal the remaining details about the new Atom processors based on Silvermont, code named Bay Trail. Not only do we have new information about the designs, but we were able to get our hands on some reference tablets integrating Bay Trail and the new Atom Z3000 series of SoCs to benchmark and compare to offerings from Qualcomm, NVIDIA and AMD.
It should be no surprise to anyone that the name “Intel Atom Processor” has had a stigma attached to it almost since its initial release during the netbook craze. It was known for being slow and hastily put together, though it was still a very successful product in terms of sales. With each successive release and update, Diamondville to Pineview to Cedarview, Atom was improved but only marginally so. Even with Medfield and Clover Trail the products were based around that legacy infrastructure and it showed. Tablets and systems based on Clover Trail saw only moderate success and lukewarm reviews.
With Silvermont the Atom brand gets a second chance. Some may consider it a fifth or sixth chance, but Intel is sticking with the name. Silvermont as an architecture is incredibly flexible and will find its way into several Intel products like Avoton, Bay Trail and Merrifield and in segments from the micro-server to smartphones to convertible tablets. Not only that, but Intel is aware that Windows isn’t the only game out there anymore and the company will support the architecture across Linux, Android and Windows environments.
Atom has been in tablets for some time now, starting in September of last year with Clover Trail designs being announced during IDF. In February we saw the initial Android-based options also filter out, again based on Clover Trail. They were okay, but really only stop-gaps to prove that Intel was serious about the space. The real test will be this holiday season with Bay Trail at the helm.
While we always knew these Bay Trail platforms were going to be branded as Atom, we now have the full details on the numbering scheme and productization of the architecture. The Atom Z3700 series will consist of quad-core SoCs with Intel HD graphics (the same design as the Core processor series though with fewer compute units) that will support Windows and Android operating systems. The Atom Z3600 series will be dual-core processors, still with Intel HD graphics, targeted only at the Android market.