Digging into a specific market
A little while ago, I decided to think about processor design as a game. You are given a budget of complexity, which is determined by your process node, power, heat, die size, and so forth, and the objective is to lay out features in the way that suits your goal and workload best. While not the topic of today's post, GPUs are a great example of what I mean. They make the assumption that in a batch of work, nearby tasks are very similar, such as the math behind two neighboring pixels on the screen. This assumption allows GPU manufacturers to save complexity by chaining dozens of cores together into not-quite-independent work groups. The circuit fits the work better, and thus it lets more get done in the same complexity budget.
Carrizo is aiming at a 63 million unit per year market segment.
This article is about Carrizo, though. This is AMD's sixth-generation APU, starting with Llano's release in June 2011. For this launch, Carrizo is targeting the 15W and 35W power envelopes for $400-$700 USD notebook devices. AMD needed to increase efficiency on the same, 28nm process that we have seen in their product stack since Kabini and Temash were released in May of 2013. They tasked their engineers to optimize their APU's design for these constraints, which led to dense architectures and clever features on the same budget of complexity, rather than smaller transistors or a bigger die.
15W was their primary target, and they claim to have exceeded their own expectations.
Backing up for a second. Beep. Beep. Beep. Beep.
When I met with AMD last month, I brought up the Bulldozer architecture with many individuals. I suspected that it was quite a clever design that didn't reach its potential because of external factors. As I said at the start of this editorial, processor design is a game and, if you can save complexity by knowing your workload, you can do more with less.
Bulldozer looked like it wanted to take a shortcut by cutting elements that its designers believed would be redundant going forward. First and foremost, two cores share a single floating point (decimal) unit. While you need some floating point capacity, upcoming workloads could use the GPU for a massive increase in performance, which is right there on the same die. As such, the complexity that is dedicated to every second FPU can be cut and used for something else. You can see this trend throughout various elements of the architecture.
Some Fresh Hope for 2016
EDIT 2015-05-07: A day after the AMD analyst meeting we now know that the roadmaps delivered here are not legitimate. While some of the information is likely correct on the roadmaps, they were not leaked by AMD. There is no FM3 socket, rather AMD is going with AM4. AMD will be providing more information throughout this quarter about their roadmaps, but for now take all of this information as "not legit".
SH SOTN has some eagle eyes and spotted the latest leaked roadmap for AMD. These roadmaps cover both mobile and desktop, from 2015 through 2016. There are obviously quite a few interesting tidbits of information here.
On the mobility roadmap we see the upcoming release of Carrizo, which we have been talking about since before CES. This will be the very first HSA 1.0 compliant part to hit the market, and AMD has done some really interesting things with the design in terms of performance, power efficiency, and die size optimizations. Carrizo will span the market from 15 watts to 35 watts TDP. This is a mobile only part, but indications point to it being pretty competent overall. This is a true SOC that will support all traditional I/O functions of older standalone southbridges. Most believe that this part will be manufactured by GLOBALFOUNDRIES on their 28 nm HKMG process that is more tuned to AMD's APU needs.
Carrizo-L will be based on the Puma+ architecture and will go from 10 watts to 15 watts TDP. This will use the same FP4 BGA connection as the big Carrizo APU. This should make these parts more palatable for OEMs as they do not have to differentiate the motherboard infrastructure. Making things easier for OEMs will give more reasons for these folks to offer products based on Carrizo and Carrizo-L APUs. The other big reason will be the GCN graphics compute units. Puma+ is a very solid processor architecture for low power products, but these parts are still limited to the older 28 nm HKMG process from TSMC.
One interesting addition here is that AMD will be introducing their "Amur" APU for the low power and ultra-low power markets. These will be comprised of four Cortex-A57 CPUs combined with AMD's GCN graphics units. This will be the first time we see this combination, and the first time AMD has integrated with ARM since ATI spun off their mobile graphics to Qualcomm under the "Adreno" branding (anagram for "Radeon"). What is most interesting here is that this APU will be a 20 nm part most likely fabricated by TSMC. This is not to say that Samsung or GLOBALFOUNDRIES could not be producing it, but those companies are expending their energy on the 14 nm FinFET process that will be their bread and butter for years to come. This will be a welcome addition to the mobile market (tablets and handhelds) and could be a nice profit center for AMD if they are able to release this in a timely manner.
2016 is when things get very interesting. The Zen x86 design will dominate the upper 2/3 of the roadmap. I had talked about Zen when we had some new diagram leaks yesterday, but now we get to see the first potential products based off of this architecture. In mobile it will span from 5 watts to 35 watts TDP. The performance and mainstream offerings will be the "Bristol Ridge" APU which will feature 4 Zen cores (or one Zen module) combined with the next gen GCN architecture. This will be a 14nm part, and the assumption is that it will be GLOBALFOUNDRIES using 14nm FinFET LPP (Low Power Plus) that will be more tuned for larger APUs. This will also be a full SOC.
The next APU will be codenamed "Basilisk" that will span the 5 watt to 15 watt range. It will be comprised of 2 Zen cores (1/2 of a Zen module) and likely feature 2 to 4 MB of L3 cache, depending on power requirements. This looks to be the first Skybridge set of APUs that will share the same infrastructure as the ARM based Amur SOC. FT4 BGA is the basis for both the 2015 Amur and 2016 Basilisk SOCs.
Finally we have the first iteration of AMD's first ground up implementation of ARM's ARMv8-A ISA. The "Styx" APU features the new K12 CPU cores that AMD has designed from scratch. It too will feature the next generation GCN units as well as share the same FT4 BGA connection. Many are anxiously watching this space to see if AMD can build a better mousetrap when it comes to licensing the ARM ISA (as have Qualcomm, NVIDIA, and others).
2015 shows no difference in the performance desktop space, as it is still serviced by the now venerable Piledriver based FX parts on AM3+. The only change we expect to see here is that there will be a handful of new motherboard offerings from the usual suspects that will include the new USB 3.1 functionality derived from a 3rd party controller.
Mainstream and Performance will utilize the upcoming Godavari APUs. These are power and speed optimized APUs that are still based on the current Kaveri design. These look to be a simple refresh/rebadge with a slight performance tweak. Not exciting, but needs to happen for OEMs.
Low power will continue to be addressed by Beema based APUs. These are regular Puma based cores (not Puma+). AMD likely does not have the numbers to justify a new product in this rather small market.
2016 is when things get interesting again. We see the release of the FM3 socket (final proof that AM3+ is dead) that will house the latest Zen based APUs. At the top end we see "Summit Ridge" which will be composed of 8 Zen cores (or 2 Zen modules). This will have 4 MB of L2 cache and 16 MB of L3 cache if our other leaks are correct. These will be manufactured on 14nm FinFET LPE (the more appropriate process product for larger, more performance oriented parts). These will not be SOCs. We can expect these to be the basis of new Opterons as well, but there is obviously no confirmation of that on these particular slides. This will be the first new product in some years from AMD that has the chance to compete with higher end desktop SKUs from Intel.
From there we have the lower power Bristol Ridge and Basilisk APUs that we already covered in the mobile discussion. These look to be significant upgrades from the current Kaveri (and upcoming Godavari) APUs. New graphics cores, new CPU cores, and new SOC implementations where necessary.
AMD will really be shaking up the game in 2016. At the very least they will have proven that they can still change up their game and release higher end (and hopefully competitive) products. AMD has enough revenue and cash on hand to survive through 2016 and 2017 at the rate they are going now. We can only hope that this widescale change will allow AMD to make some significant inroads with OEMs on all levels. Otherwise Intel is free to do what they want and charge what price they want across multiple markets.
ARM Releases Cortex-A72 for Licensing
On February 3rd, ARM announced a slew of new designs, including the Cortex A72. Few details were shared with us, but what we learned was that it could potentially redefine power and performance in the ARM ecosystem. Ryan was invited to London to participate in a deep dive of what ARM has done to improve its position against market behemoth Intel in the very competitive mobile space. Intel has a leg up on process technology with their 14nm Tri-Gate process, but they are continuing to work hard in making their x86 based processors more power efficient, while still maintaining good performance. There are certain drawbacks to using an ISA that is focused on high performance computing rather than being designed from scratch to provide good performance with excellent energy efficiency.
ARM has been on a pretty good roll with their Cortex A9, A7, A15, A17, A53, and A57 parts over the past several years. These designs have been utilized in a multitude of products and scenarios, with configurations that have scaled up to 16 cores. While each iteration has improved upon the previous, ARM is facing the specter of Intel’s latest generation, highly efficient x86 SOCs based on the 2nd gen 14nm Tri-Gate process. Several things have fallen into place for ARM to help them stay competitive, but we also cannot ignore the experience and design hours that have led to this product.
(Editor's Note: During my time with ARM last week it became very apparent that it is not standing still, not satisfied with its current status. With competition from Intel, Qualcomm and others ramping up over the next 12 months in both mobile and server markets, ARM will more than ever be dependent on the evolution of core design and GPU design to maintain advantages in performance and efficiency. As Josh will go into more detail here, the Cortex-A72 appears to be an incredibly impressive design, and all indications, along with the conversations I have had with others outside of ARM, suggest that it will be an incredibly successful product.)
Cortex A72: Highest Performance ARM Cortex
ARM has been ubiquitous for mobile applications since it first started selling licenses for their products in the 90s. They were found everywhere it seemed, but most people wouldn’t recognize the name ARM because these chips were fabricated and sold by licensees under their own names. Guys like TI, Qualcomm, Apple, DEC and others all licensed and adopted ARM technology in one form or the other.
ARM’s importance grew dramatically with the introduction of increased complexity cellphones and smartphones. They also gained attention through multimedia devices such as the Microsoft Zune. What was once a fairly niche company with low performance, low power offerings became the 800 pound gorilla in the mobile market. Billions of chips are sold yearly based on ARM technology. To stay in that position ARM has worked aggressively on continually providing excellent power characteristics for their parts, but now they are really focusing on overall performance and capabilities to address, not only the smartphone market, but also the higher performance computing and server spaces that they want a significant presence in.
SoFIA, Cherry Trail Make Debuts
Mobile World Congress is traditionally dominated by Samsung, Qualcomm, HTC, and others yet Intel continues to make in-roads into the mobile market. Though the company has admittedly lost a lot of money during this growing process, Intel pushes forward with today's announcement of a trio of new processor lines that keep the Atom brand. The Atom x3, the Atom x5, and the Atom x7 will be the company's answer in 2015 for a wide range of products, starting at the sub-$75 phone market and stretching up to ~$400 tablets and all-in-ones.
There are some significant differences in these Atom processors, more than the naming scheme might indicate.
Intel Atom x3 SoFIA Processor
For years now we have questioned Intel's capability to develop a processor that could fit inside the thermal envelope that is required for a smartphone while also offering performance comparable to Qualcomm, MediaTek, and others. It seemed that the x86 architecture was a weight around Intel's ankles rather than a float lifting it up. Intel's answer was the development of SoFIA, (S)mart (o)r (F)eature phone with (I)ntel (A)rchitecture. The project started about 2 years ago leading to product announcements finally reaching us today. SoFIA parts are "designed for budget smartphones; SoFIA is set to give Qualcomm and MediaTek a run for their money in this rapidly growing part of the market."
The SoFIA processors are based on the same Silvermont architecture as the current generation of Atom processors, but they are more tuned for power efficiency. Originally planned to be a dual-core only option, Intel has actually built both dual-core and quad-core variants that will pair with varying modem options to create a combination that best fits target price points and markets. Intel has partnered with RockChip for these designs, even though the architecture is completely IA/x86 based. Production will be done on a 28nm process technology at an unnamed vendor, though you can expect that to mean TSMC. This allows RockChip access to the designs, to help accelerate development, and to release them into the key markets that Intel is targeting.
AMD Details Carrizo Further
Some months back AMD introduced us to their “Carrizo” product. Details were slim, but we learned that this would be another 28 nm part that has improved power efficiency over its predecessor. It would be based on the new “Excavator” core that will be the final implementation of the Bulldozer architecture. The graphics will be based on the latest iteration of the GCN architecture as well. Carrizo would be a true SOC in that it integrates the southbridge controller. The final piece of information that we received was that it would be interchangeable with the Carrizo-L SOC, which is an extremely low power APU based on the Puma+ cores.
A few months later we were invited by AMD to their CES meeting rooms to see early Carrizo samples in action. These products were running a variety of applications very smoothly, but we were not informed of speeds and actual power draw. All that we knew is that Carrizo was working and able to run pretty significant workloads like high quality 4K video playback. Details were yet again very scarce other than the expected timeline of release, the TDP ratings of these future parts, and how it was going to be a significant jump in energy efficiency over the previous Kaveri based APUs.
AMD is presenting more information on Carrizo at the ISSCC 2015 conference. This information dives a little deeper into how AMD has made the APU smaller, more power efficient, and faster overall than the previous 15 watt to 35 watt APUs based on Kaveri. AMD claims that they have a product that will increase power efficiency in a way not ever seen before for the company. This is particularly important considering that Carrizo is still a 28 nm product.
Intel Pushes Broadwell to the Next Unit of Computing
Intel continues to invest a significant amount of money into this small form factor product dubbed the Next Unit of Computing, or NUC. When it was initially released in December of 2012, the NUC was built as an evolutionary step of the desktop PC, part of a move for Intel to find new and unique form factors that its processors can exist in. With a 4" x 4" motherboard design the NUC is certainly a differentiating design and several of Intel's partners have adopted it for products of their own, Gigabyte's BRIX line being the most relevant.
But Intel's development team continues to push the NUC platform forward and today we are evaluating the most recent iteration. The Intel NUC5i5RYK is based on the latest 14nm Broadwell processor and offers improved CPU performance, a higher speed GPU and lower power consumption. All of this is packed into a smaller package than any previous NUC on the market and the result is both impressive and totally expected.
A Walk Around the NUC
To most people the latest Intel NUC will look very similar to the previous models based on Ivy Bridge and Haswell. You'd be right of course - the fundamental design is unchanged. But Intel continues to push forward in small ways, nipping and tucking away. But the NUC is still just a box. An incredibly small one with a lot of hardware crammed into it, but a box nonetheless.
While I can appreciate the details including the black and silver colors and rounded edges, I think that Intel needs to find a way to add some more excitement into the NUC product line going forward. Admittedly, it is hard to innovate in that direction with a focus on size and compression.
New Features and Specifications
It is increasingly obvious that in the high end smartphone and tablet market, much like we saw occur over the last several years in the PC space, consumers are becoming more concerned with features and experiences than just raw specifications. There is still plenty to drool over when looking at and talking about 4K screens in the palm of your hand, octa-core processors and mobile SoC GPUs measuring performance in hundreds of GFLOPS, but at the end of the day the vast majority of consumers want something that does something to “wow” them.
As a result, device manufacturers and SoC vendors are shifting priorities for performance, features and how those are presented both to the public and to the media. Take this week’s Qualcomm event in San Diego where a team of VPs, PR personnel and engineers walked me through the new Snapdragon 810 processor. Rather than showing slide after slide of comparative performance numbers to the competition, I was shown room after room of demos. Wi-Fi, LTE, 4K capture and playback, gaming capability, thermals, antennae modifications, etc. The goal is to showcase the experience of the entire platform – something that Qualcomm has been providing for longer than just about anyone in this business, while educating consumers on the need for balance too.
As a 15-year veteran of the hardware space my first reaction here couldn’t have been scripted any more precisely: a company that doesn’t show performance numbers has something to hide. But I was given time with a reference platform featuring the Snapdragon 810 processor in a tablet form-factor and the results show impressive increases over the 801 and 805 processors from the previous family. Rumors of the chip's heat issues seem overblown, but that part will be hard to prove for sure until we get retail hardware in our hands to confirm.
Today’s story will outline the primary feature changes of the Snapdragon 810 SoC, though there was so much detail presented at the event with such a short window of time for writing that I definitely won’t be able to get to it all. I will follow up the gory specification details with performance results compared to a wide array of other tablets and smartphones to provide some context to where 810 stands in the market.
SFF PCs get an upgrade
Ultra compact computers, otherwise known as small form factor PCs, are a rapidly increasing market as consumers realize that, for nearly all purposes other than gaming and video editing, Ultrabook-class hardware is "fast enough". I know that some of our readers will debate that fact, and we welcome the discussion, but as CPU architectures continue to improve in both performance and efficiency, you will be able to combine higher performance into smaller spaces. The Gigabyte BRIX platform is the exact result that you expect to see with that combination.
Previously, we have seen several other Gigabyte BRIX devices including our first desktop interaction with Iris Pro graphics, the BRIX Pro. Unfortunately though, that unit was plagued by noise issues - the small fan spun pretty fast to cool a 65 watt processor. For a small computer that would likely sit on top of your desk, that's a significant drawback.
Intel Ivy Bridge NUC, Gigabyte BRIX S Broadwell, Gigabyte BRIX Pro Haswell
This time around, Gigabyte is using the new Broadwell-U architecture in the Core i7-5500U and its significantly lower, 15 watt TDP. That does come with some specification concessions though, including a dual-core CPU instead of a quad-core CPU and a peak Turbo clock rate that is 900 MHz lower. Comparing the Broadwell BRIX S to the more relevant previous generation based on Haswell, we get essentially the same clock speed, a similar TDP, but also an improved core architecture.
Today we are going to look at the new Gigabyte BRIX S featuring the Core i7-5500U and an NFC chip for some interesting interactions. The "S" designates that this model could support a full size 2.5-in hard drive in addition to the mSATA port.
ARM Releases Top Cortex Design to Partners
ARM has an interesting history of releasing products. The company was once in the shadowy background of the CPU world, but with the explosion of mobile devices and its relevance in that market, ARM has had to adjust how it approaches the public with their technologies. For years ARM has announced products and technology, only to see it ship one to two years down the line. It seems that with the increased competition in the marketplace from Apple, Intel, NVIDIA, and Qualcomm, ARM is now pushing to license out its new IP in a way that will enable their partners to achieve a faster time to market.
The big news this time is the introduction of the Cortex A72. This is a brand new design that will be based on the ARMv8-A instruction set. This is a 64 bit capable processor that is also backwards compatible with 32 bit applications programmed for ARMv7 based processors. ARM does not go into great detail about the product other than it is significantly faster than the previous Cortex-A15 and Cortex-A57.
The previous Cortex-A15 processors were announced several years back and made their first introduction in late 2013/early 2014. These were still 32 bit processors and while they had good performance for the time, they did not stack up well against the latest A8 SOCs from Apple. The A53 and A57 designs were also announced around two years ago. These are the first 64 bit designs from ARM and were meant to compete with the latest custom designs from Apple and Qualcomm’s upcoming 64 bit part. We are only now just seeing these parts make it into production, and even Qualcomm has licensed the A53 and A57 designs to ensure a faster time to market for this latest batch of next-generation mobile devices.
We can look back over the past five years and see that ARM is moving forward in announcing their parts and then having their partners ship them within a much shorter timespan than we were used to seeing. ARM is hoping to accelerate the introduction of its new parts within the next year.
NVIDIA's Tegra X1
NVIDIA seems to like being on a one year cycle with their latest Tegra products. Many years ago we were introduced to the Tegra 2, and the year after that the Tegra 3, and the year after that the Tegra 4. Well, NVIDIA did spice up their naming scheme to get away from the numbers (not to mention the potential stigma of how many of those products actually made an impact in the industry). Last year's entry was the Tegra K1 based on the Kepler graphics technology. These products were interesting due to the use of the very latest, cutting edge graphics technology in a mobile/low power format. The Tegra K1 64 bit variant used two “Denver” cores that were actually designed by NVIDIA.
While technically interesting, the Tegra K1 series have made about the same impact as the previous versions. The Nexus 9 was the biggest win for NVIDIA with these parts, and we have heard of a smattering of automotive companies using Tegra K1 in those applications. NVIDIA uses the Tegra K1 in their latest Shield tablet, but they do not typically release data regarding the number of products sold. The Tegra K1 looks to be the most successful product since the original Tegra 2, but the question of how well they actually sold looms over the entire brand.
So why the history lesson? Well, we have to see where NVIDIA has been to get a good idea of where they are heading next. Today, NVIDIA is introducing the latest Tegra product, and it is going in a slightly different direction than what many had expected.
The reference board with 4 GB of LPDDR4.
Core M 5Y70 Specifications
Back in August of this year, Intel invited me out to Portland, Oregon to talk about the future of processors and process technology. Broadwell is the first microarchitecture to ship on Intel's newest 14nm process technology and the performance and power implications of it are as impressive as they are complex. We finally have the first retail product based on Broadwell-Y in our hands and I am eager to see how this combination of technology is going to be implemented.
If you have not read through my article that dives into the intricacies of the 14nm process and the architectural changes coming with Broadwell, then I would highly recommend that you do so before diving any further into this review. Our Intel Core M Processor: Broadwell Architecture and 14nm Process Reveal story clearly explains the "how" and "why" for many of the decisions that determined the direction the Core M 5Y70 heads in.
As I stated at the time:
"The information provided by Intel about Broadwell-Y today shows me the company is clearly innovating and iterating on its plans set in place years ago with the focus on power efficiency. Broadwell and the 14nm process technology will likely be another substantial leap between Intel and AMD in the x86 tablet space and should make an impact on other tablet markets (like Android) as long as pricing can remain competitive. That 14nm process gives Intel an advantage that no one else in the industry can claim and unless Intel begins fabricating processors for the competition (not completely out of the question), that will remain a house advantage."
With a background on Intel's goals with Broadwell-Y, let's look at the first true implementation.
Core M 5Y70 Early Testing
During a press session today with Intel, I was able to get some early performance results on Broadwell-Y in the form of the upcoming Core M 5Y70 processor.
Testing was done on a reference design platform code named Llama Mountain and at the heart of the system is the Broadwell-Y designed dual-core CPU, the Core M 5Y70, which is due out later this year. Power consumption of this system is low enough that Intel has built it with a fanless design. As we posted last week, this processor has a base frequency of just 1.10 GHz but it can boost as high as 2.6 GHz for extra performance when it's needed.
Before we dive into the actual results, you should keep in mind a couple of things. First, we didn't have time to analyze the systems to check driver revisions, etc., so we are going on Intel's word that these are set up as you would expect to see them in the real world. Next, because of the disjointed nature of the tests we were able to run, the comparisons in our graphs aren't as great as I would like. Still, the results for the Core M 5Y70 are here should you want to compare them to any other scores you like.
First, let's take a look at old faithful: CineBench 11.5.
UPDATE: A previous version of this graph showed the TDP for the Intel Core M 5Y70 as 15 watts, not the 4.5 watts listed here now. The reasons are complicated. Even though the Intel Ark website lists the TDP of the Core M 5Y70 as 4.5 watts, Intel has publicly stated the processor will make very short "spikes" at 15 watts when in its highest Turbo Boost modes. It comes to a discussion of semantics really. The cooling capability of the tablet is only targeted to 4.5-6.0 watts and those very short 15 watt spikes can be dissipated without the need for extra heatsink surface...because they are so short. SDP anyone? END UPDATE
With a score of 2.77, the Core M 5Y70 processor puts up an impressive fight against CPUs with much higher TDP settings. For example, Intel's own Pentium G3258 gets a score of 2.71 in CB11, and did so with a considerably higher thermal envelope. The Core i3-4330 scores 38% higher than the Core M 5Y70 but it requires a TDP 3.6-times larger to do so. Both of AMD's APUs in the 45 watt envelope fail to keep up with Core M.
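The "3.6-times" figure above appears to line up with the 15 watt spike number rather than the 4.5 watt rating used in the updated graph. Here is a quick arithmetic sketch of both ratios. Note the assumption: the 54 watt TDP for the Core i3-4330 comes from Intel's published spec and is not stated in this story.

```python
# TDP figures in watts.
# ASSUMPTION: 54 W for the Core i3-4330 is taken from Intel's spec listing;
# it does not appear in the article text.
CORE_I3_4330_TDP_W = 54.0
CORE_M_RATED_TDP_W = 4.5   # rating shown in the updated graph
CORE_M_SPIKE_W = 15.0      # short Turbo Boost spikes described in the update


def tdp_ratio(other_tdp_w, core_m_tdp_w):
    """Return how many times larger the other chip's TDP is."""
    return other_tdp_w / core_m_tdp_w


ratio_vs_spike = tdp_ratio(CORE_I3_4330_TDP_W, CORE_M_SPIKE_W)
ratio_vs_rated = tdp_ratio(CORE_I3_4330_TDP_W, CORE_M_RATED_TDP_W)

print(f"vs 15 W spike figure: {ratio_vs_spike:.1f}x")
print(f"vs 4.5 W rated TDP:   {ratio_vs_rated:.1f}x")
```

Under that assumption, the gap against the 4.5 watt rating is considerably wider than 3.6 times, which only strengthens the efficiency argument.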
Server and Workstation Upgrades
Today, on the eve of the Intel Developer Forum, the company is taking the wraps off its new server and workstation class high performance processors, Xeon E5-2600 v3. Known previously by the code name Haswell-EP, the release marks the entry of the latest microarchitecture from Intel to multi-socket infrastructure. Though we don't have hardware today to offer you in-house benchmarks quite yet, the details Intel shared with me last month in Oregon are simply stunning.
Starting with the E5-2600 v3 processor overview, there are more changes in this product transition than we saw in the move from Sandy Bridge-EP to Ivy Bridge-EP. First and foremost, the v3 Xeons will be available in core counts as high as 18, with HyperThreading allowing for 36 accessible threads in a single CPU socket. A new socket, LGA2011-v3 or R3, allows the Xeon platforms to run a quad-channel DDR4 memory system, very similar to the upgrade we saw with the Haswell-E Core i7-5960X processor we reviewed just last week.
The move to a Haswell-based microarchitecture also means that the Xeon line of processors is getting AVX 2.0, known also as Haswell New Instructions, allowing for 2x the FLOPS per clock per core. It also introduces some interesting changes to Turbo Mode and power delivery we'll discuss in a bit.
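To put the "2x the FLOPS per clock per core" claim in concrete terms, here is a rough back-of-the-envelope sketch. It assumes the commonly cited layout: 256-bit vectors on both generations, with the older cores issuing one add and one multiply per cycle and Haswell issuing two fused multiply-adds per cycle. The clock speed in the last step is a hypothetical placeholder, not a number from Intel's briefing.

```python
def peak_dp_flops_per_cycle(vector_bits, fp_units, flops_per_lane_per_unit):
    """Theoretical double-precision FLOPs per clock for one core."""
    lanes = vector_bits // 64  # 64-bit doubles per vector register
    return lanes * fp_units * flops_per_lane_per_unit


# ASSUMPTION: one 256-bit add port plus one 256-bit multiply port,
# each producing 1 FLOP per lane per cycle.
ivy_bridge_ep = peak_dp_flops_per_cycle(256, fp_units=2, flops_per_lane_per_unit=1)

# ASSUMPTION: two 256-bit FMA ports; a fused multiply-add counts as 2 FLOPs.
haswell_ep = peak_dp_flops_per_cycle(256, fp_units=2, flops_per_lane_per_unit=2)


def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak for a whole socket, ignoring turbo and AVX clock offsets."""
    return cores * clock_ghz * flops_per_cycle


HYPOTHETICAL_CLOCK_GHZ = 2.0  # placeholder, not an announced frequency
socket_peak = peak_gflops(18, HYPOTHETICAL_CLOCK_GHZ, haswell_ep)

print(f"Ivy Bridge-EP: {ivy_bridge_ep} DP FLOPs/cycle/core")
print(f"Haswell-EP:    {haswell_ep} DP FLOPs/cycle/core")
print(f"18-core socket at {HYPOTHETICAL_CLOCK_GHZ} GHz: {socket_peak:.0f} GFLOPS peak")
```

These are theoretical ceilings only; real workloads rarely keep both vector units fed every cycle.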
Maybe the most interesting architectural change to the Haswell-EP design is per core P-states, allowing each of the up to 18 cores running on a single Xeon processor to run at independent voltages and clocks. This is something that the consumer variants of Haswell do not currently support - every core is tied to the same P-state. It turns out that when you have up to 18 cores on a single die, this ability is crucial to supporting maximum performance on a wide array of compute workloads and to maintain power efficiency. This is also the first processor to allow independent uncore frequency scaling, giving Intel the ability to improve performance with available headroom even if the CPU cores aren't the bottleneck.
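To see why per core P-states matter at 18 cores, consider a toy model. This is only a sketch: the P-state table below is entirely hypothetical, and power is estimated with the textbook dynamic power approximation (proportional to voltage squared times frequency), ignoring leakage and uncore power.

```python
# HYPOTHETICAL P-state table: name -> (frequency in GHz, core voltage in volts).
# These values are illustrative only and do not come from Intel.
P_STATES = {
    "P0": (3.0, 1.10),
    "P1": (2.3, 0.95),
    "P2": (1.2, 0.80),
}


def dynamic_power(freq_ghz, volts):
    """Relative dynamic power, using P ~ C * V^2 * f with C set to 1."""
    return volts ** 2 * freq_ghz


def shared_pstate_power(demands):
    """Every core runs at the fastest P-state that any single core requests."""
    fastest = max(demands, key=lambda state: P_STATES[state][0])
    return len(demands) * dynamic_power(*P_STATES[fastest])


def per_core_pstate_power(demands):
    """Each core runs at exactly the P-state it requested."""
    return sum(dynamic_power(*P_STATES[state]) for state in demands)


# Two busy cores, sixteen lightly loaded ones.
demands = ["P0", "P0"] + ["P2"] * 16

shared = shared_pstate_power(demands)
per_core = per_core_pstate_power(demands)

print(f"shared P-state:   {shared:.2f} (relative units)")
print(f"per core P-state: {per_core:.2f} (relative units)")
```

In this model, the power saved by letting the sixteen lightly loaded cores drop down is headroom that could instead go to the busy cores or the uncore.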
Pushing the 8 Cores
It seems like yesterday when I last talked about an AMD refresh! Oh wait, it almost was. Some weeks ago I was able to cover the latest AMD APU offerings that helped to flesh out the Kaveri lineup. We thought AMD was done for a while. Color us wrong. AMD pulled out all the stops and set up an AM3+ refresh! There is a little excitement here, I guess. I am trying to contain the tongue-in-cheek lines that I am oh-so-tempted to write.
AMD is refreshing their FX lineup in the waning days of Summer!
Let me explain the situation from my point of view. The FX lineup for AM3+ has not done a whole lot since the initial release of the Piledriver based FX-8350 and family (Vishera). Piledriver was a pretty significant update from Bulldozer as it slightly improved IPC and greatly improved power consumption (all the while helping to improve clockspeed by a small degree). There were two updates before this one, but they did not receive nearly as much coverage. These updates were the FX-6350 and the FX-9000 series. The FX-6350 is quite popular with the budget enthusiast crowd who still had not moved over to the Intel side of the equation. The FX-9000 series was OEM only initially, with prices reaching up to $1000 at the high end. In the time since the original Vishera chips were released, we have seen the Intel Ivy Bridge and Haswell architectures (including a small Haswell refresh with the 2nd gen products and the latest Socket 2011 units).
Revamped Enthusiast Platform
Join us at 12:30pm PT / 3:30pm ET as Intel's Matt Dunford joins us for a live stream event to discuss the release of Haswell-E and the X99 platform!! Find us at http://www.pcper.com/live!!
Sometimes writing these reviews can be pretty anti-climactic. With all of the official and leaked information released about Haswell-E over the last six to nine months, there isn't much more to divulge that can truly be called revolutionary. Yes, we are looking at the new king of the enthusiast market with an 8-core processor that not only brings a 33% increase in core count over the previous generation Ivy Bridge-E and Sandy Bridge-E platforms, but also includes the adoption of the DDR4 memory specification, which allows for high density and high speed memory subsystems.
And along with the new processor on a modified socket (though still LGA2011) comes a new chipset with some interesting new features. If you were left wanting for USB 3.0 or Thunderbolt on X79, then you are going to love what you see with X99. Did you think you needed some more SATA ports to really liven up your pool of hard drives? Retail boards are going to have you covered.
Again, just like last time, you will find a set of three processors that are coming into the market at the same time. These offerings range from the $999 price point and go down to the much more reasonable cost of $389. But this time there are more interesting decisions to be made based on specification differences in the family. Do the changes that Intel made in the sub-$1000 SKUs make them a better or worse buy for users looking to finally upgrade?
Haswell-E: A New Enthusiast Lineup from Intel
Today's launch of the Intel Core i7-5960X processor continues on the company's path of enthusiast branded parts that are built off of a subset of the workstation and server market. It is no secret that some Xeon branded processors will work in X79 motherboards and the same is true of the upcoming Haswell-EP series (with its X99 platform) launching today. As an enthusiast though, I think we can agree that it doesn't really matter how a processor like this comes about, as long as it continues to occur well into the future.
The Core i7-5960X processor is an 8-core, 16-thread design built on what is essentially the same architecture we saw with the mainstream Haswell parts released in June of 2013. There are some important differences of course, including the lack of integrated graphics and the move from DDR3 to DDR4 for system memory. The underlying microarchitecture remains unchanged, though. Previously known as the Haswell-E platform, the Core i7-5960X continues Intel's trend of releasing enthusiast/workstation grade platforms that are based on an existing mainstream architecture.
Since the introduction of the Haswell line of CPUs, the Internet has been aflame with how hot the CPUs run. Speculation ran rampant on the cause with theories abounding about the lesser surface area and inferior thermal interface material (TIM) in between the CPU die surface and the underside of the CPU heat spreader. It was later confirmed that Intel had changed the TIM interfacing the CPU die surface to the heat spreader with Haswell, leading to the hotter than expected CPU temperatures. This increase in temperature led to inconsistent core-to-core temperatures as well as vastly inferior overclockability of the Haswell K-series chips over previous generations.
A few of the more adventurous enthusiasts took it upon themselves to address the heat concerns surrounding Haswell by delidding the processor. The delidding procedure involves physically removing the heat spreader from the CPU, exposing the CPU die. Some individuals choose to clean the existing TIM from the core die and heat spreader underside, apply a superior TIM such as metal or diamond-infused paste or even the Coollaboratory Liquid Ultra metal material, and fix the heat spreader back in place. Others choose a more radical solution, removing the heat spreader from the equation entirely for direct cooling of the naked CPU die. This type of cooling method requires use of a die support plate, such as the MSI Die Guard included with the MSI Z97 XPower motherboard.
Whichever outcome you choose, you must first remove the heat spreader from the CPU's PCB. The heat spreader itself is fixed in place with black RTV-type material ensuring a secure and air-tight seal, protecting the fragile die from outside contaminants and influences. Removal can be done in multiple ways with two of the most popular being the razor blade method and the vise method. With both methods, you are attempting to separate the CPU PCB from the heat spreader without damaging the CPU die or components on the top or bottom sides of the CPU PCB.
Coming in 2014: Intel Core M
The era of Broadwell begins in late 2014 and based on what Intel has disclosed to us today, the processor architecture appears to be impressive in nearly every aspect. Coming off the success of the Haswell design in 2013 built on 22nm, the Broadwell-Y architecture will not only be the first to market with a new microarchitecture, but will be the flagship product on Intel’s new 14nm tri-gate process technology.
The Intel Core M processor, as Broadwell-Y has been dubbed, includes impressive technological improvements over previous low power Intel processors that result in lower power, thinner form factors, and longer battery life designs. Broadwell-Y will stretch into even lower TDPs, enabling 9mm or smaller fanless designs that maintain current battery lifespans. A new 2nd generation FIVR with modified power delivery design allows for even thinner packaging and a wider range of dynamic frequencies than before. And of course, along with the shift comes an updated converged core design and improved graphics performance.
All of these changes are in service to what Intel claims is a re-invention of the notebook. Compared to 2010, when the company introduced the original Intel Core processor and almost completely changed its own direction, Intel Core M and the Broadwell-Y changes will allow for some dramatic platform changes.
Notebook thickness will go from 26mm (~1.02 inches) down to as small as 7mm (~0.28 inches), as Intel has proven with its Llama Mountain reference platform. Reductions in total thermal dissipation of 4x while improving core performance by 2x and graphics performance by 7x are something no other company has been able to do over the same time span. And in the end, one of the most important features for the consumer is getting double the useful battery life with a smaller (and lighter) battery required for it.
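Taken together, those ratios imply a large jump in efficiency. As a rough sketch, assuming the 4x reduction in thermal dissipation can be read as a 4x reduction in power for the same class of device (an assumption made here for illustration, not something Intel states), the performance-per-watt gain is just the product of the two ratios. The function name is hypothetical.

```python
def efficiency_gain(perf_gain: float, power_reduction: float) -> float:
    """Performance-per-watt multiplier when performance rises by perf_gain
    and power falls by a factor of power_reduction."""
    return perf_gain * power_reduction

# Ratios quoted above: 2x core performance, 7x graphics performance, 4x less heat.
core = efficiency_gain(2, 4)
graphics = efficiency_gain(7, 4)

print(f"core: {core:g}x, graphics: {graphics:g}x")
```

Peak performance and thermal design figures are rarely measured under the same conditions, so treat the result as an upper bound and not a measured efficiency number.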
But these kinds of advancements don’t just happen by chance – ask any other semiconductor company that is either trying to keep ahead of or catch up to Intel. It takes countless engineers and endless hours to build a platform like this. Today Intel is sharing some key details on how it was able to make this jump, including the move to a 14nm FinFET / tri-gate transistor technology and impressive packaging and core design changes to the Broadwell architecture.
Intel 14nm Technology Advancement
Intel consistently creates and builds the most impressive manufacturing and production processes in the world, and that has helped it maintain market leadership over rivals in the CPU space. It is also one of the key tenets that Intel hopes will help it deliver on the world of mobile, including tablets and smartphones. At the 22nm node Intel was the first to offer 3D transistors, what they called tri-gate and others refer to as FinFET. By focusing on power consumption rather than top level performance, Intel was able to build the Haswell design (as well as Silvermont for the Atom line) with impressive performance and power scaling, allowing thinner and less power hungry designs than with previous generations. Some enthusiasts might think that Intel has done this at the expense of high performance components, and there is some truth to that. But Intel believes that by committing to this space it builds the best future for the company.
Filling the Product Gaps
In the first several years of my PCPer employment, I typically handled most of the AMD CPU refreshes. These were rather standard affairs that involved small jumps in clockspeed and performance. These happened every 6 to 8 months, with the bigger architectural shifts happening some years apart. We are finally seeing a new refresh of the AMD APU parts after the initial release of Kaveri to the world at the beginning of this year. This update is different. Unlike previous years, there are no faster parts than the already available A10-7850K.
This refresh deals with fleshing out the rest of the Kaveri lineup with products that address different TDPs, markets, and prices. The A10-7850K is still the king when it comes to performance on the FM2+ socket (as long as users do not pay attention to the faster CPU performance of the A10-6800K). The initial launch in January also featured another part that never became available until now; the A8-7600 was supposed to be available some months ago, but is only making it to market now. The 7600 part was unique in that it had a configurable TDP that went from 65 watts down to 45 watts. The 7850K on the other hand was configurable from 95 watts down to 65 watts.
So what are we seeing today? AMD is releasing three parts to address the lower power markets that AMD hopes to expand their reach into. The A8-7600, again, was detailed back in January but never released until recently. The other two parts are brand new. The A10-7800 is a 65 watt TDP part with a cTDP that goes down to 45 watts. The other new chip is the A6-7400K which is unlocked, has a configurable TDP, and looks to compete directly with Intel’s recently released 20th Anniversary Pentium G3258.
When Magma Freezes Over...
Intel confirms that they have approached AMD about access to their Mantle API. The discussion, despite being clearly labeled as "an experiment" by an Intel spokesperson, was initiated by them -- not AMD. According to AMD's Gaming Scientist, Richard Huddy, via PCWorld, AMD's response was, "Give us a month or two" and "we'll go into the 1.0 phase sometime this year" which only has about five months left in it. When the API reaches 1.0, anyone who wants to participate (including hardware vendors) will be granted access.
AMD inside Intel Inside???
I do wonder why Intel would care, though. Intel has the fastest per-thread processors, and their GPUs are not known to be workhorses that are held back by API call bottlenecks, either. Of course, that is not to say that I cannot see any reason, however...
A refresh for Haswell
Intel has not been very good at keeping secrets recently. Rumors of a refreshed Haswell line of processors have been circulating for most of 2014. In March, the company not only confirmed that release but promised an even more exciting part called Devil's Canyon. The DC parts are still quad-core Haswell processors built on Intel's 22nm process technology, but change a few specific things.
Intel spent some time on the Devil's Canyon Haswell processors to improve the packaging and thermals for overclockers and enthusiasts. The thermal interface material (TIM) that lies in between the die and the heat spreader has been updated to a next-generation polymer TIM (NGPTIM). The change should improve cooling performance of all currently shipping cooling solutions (air or liquid), but it is still a question just HOW MUCH this change will actually matter.
You can also tell from the photo comparison above that Intel has added capacitors to the back of the processor to "smooth" power delivery. This, in combination with the NGPTIM, should enable a bit more headroom for clock speeds with the Core i7-4790K.
In fact, there are two Devil's Canyon processors being launched this month. The Core i7-4790K will sell for $339, the same price as the Core i7-4770K, while the Core i5-4690K will sell for $242. The lower end option is a 3.5 GHz base clock, 3.9 GHz Turbo clock quad-core CPU without HyperThreading. While a nice step over the Core i5-4670K, it's only 100 MHz faster. Clearly the Core i7-4790K is the part everyone is going to be scrambling to buy.
Another interesting change is that both the Core i7-4790K and the Core i5-4690K enable support for both Intel's VT-d virtualization IO technology and Intel's TSX-NI transactional memory instructions. This makes them the first enthusiast-grade unlocked processors from Intel to support them!
As Intel states it, the Core i7-4790K and the Core i5-4690K have been "designed to be used in conjunction with the Z97 chipset." That being said, at least one motherboard manufacturer, ASUS, has released limited firmware updates to support the Devil's Canyon parts on Z87 products. Not all motherboards are going to be capable, and not all vendors are going to spend the time to integrate support, so keep an eye on the support page for your specific motherboard.
The CPU itself looks no different on the top, save for the updated model numbering.
Core i7-4790K on the left, Core i7-4770K on the right
On the back you can see the added capacitors that help with stable overclocking.
The clock speed advantage that the Core i7-4790K provides over the Core i7-4770K should not be overlooked, even before overclocking is taken into consideration. The 500 MHz base clock boost works out to about 14% in this case, and in CPU-limited tasks you should see performance scale very well with it.
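For reference, the clock math for both Devil's Canyon parts can be checked with a few lines of Python. This sketch assumes base clocks of 3.5 GHz for the Core i7-4770K and 4.0 GHz for the Core i7-4790K (consistent with the 500 MHz gap), and takes the Core i5-4670K at 3.4 GHz since the 3.5 GHz Core i5-4690K is described as 100 MHz faster. The helper name is made up for illustration.

```python
def clock_uplift_percent(old_mhz: float, new_mhz: float) -> float:
    """Percentage increase in clock speed going from old_mhz to new_mhz."""
    return (new_mhz - old_mhz) / old_mhz * 100

# Assumed base clocks in MHz (see the note above).
i7_uplift = clock_uplift_percent(3500, 4000)  # Core i7-4770K -> Core i7-4790K
i5_uplift = clock_uplift_percent(3400, 3500)  # Core i5-4670K -> Core i5-4690K

print(f"Core i7: {i7_uplift:.1f}%  Core i5: {i5_uplift:.1f}%")
```

A workload that is purely CPU-bound could at best approach these percentages; anything limited by the GPU, memory, or storage will see less.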