Author:
Subject: Processors
Manufacturer: AMD

Get your brains ready

Just before the weekend, Josh and I got a chance to speak with David Kanter about the AMD Zen architecture and what it might mean for the Ryzen processor due out in less than a month. For those of you not familiar with David and his work, he is an analyst and consultant on processor architecture and design through Real World Tech while also serving as a writer and analyst for the Microprocessor Report as part of the Linley Group. If you want to see a discussion forum that focuses on architecture at an incredibly detailed level, the Real World Tech forum will have you covered - it's an impressive place to learn.

zenpm-4.jpg

David was kind enough to spend an hour with us to talk about a recently-made-public report he wrote on Zen. It's definitely a discussion that dives into details most articles and stories on Zen don't broach, so be prepared to do some pausing and Googling of phrases and technologies you may not be familiar with. Still, for any technology enthusiast who wants to get an expert's opinion on how Zen compares to Intel Skylake and how Ryzen might fare when it's released this year, you won't want to miss it.

High Bandwidth Cache

Apart from AMD’s other new architecture due out in 2017, its Zen CPU design, no other product has had as much build-up and excitement surrounding it as its Vega GPU architecture. After the world learned that Polaris would be a mainstream-only design that was released as the Radeon RX 480, the focus for enthusiasts came straight to Vega. It’s been on the public facing roadmaps for years and signifies the company’s return to the world of high end GPUs, something they have been missing since the release of the Fury X in mid-2015.

slides-2.jpg

Let’s be clear: today does not mark the release of the Vega GPU or products based on Vega. In reality, we don’t even know enough to make highly educated guesses about the performance without more details on the specific implementations. That being said, the information released by AMD today is interesting and shows that Vega will be much more than simply an increase in shader count over Polaris. It reminds me a lot of the build-up to the Fiji GPU release, when the information and speculation about how HBM would affect power consumption, form factor and performance flourished. What we can hope for, and what AMD’s goal needs to be, is a cleaner and more consistent product release than how the Fury X turned out.

The Design Goals

AMD began its discussion about Vega last month by talking about the changes in the world of GPUs and how the data sets and workloads have evolved over the last decade. No longer are GPUs only worried about games; instead they must address professional workloads, enterprise workloads, and scientific workloads. Even more interestingly, just as we have discussed the growing gap between CPU performance and CPU memory bandwidth, AMD posits that the gap between memory capacity and GPU performance is a significant hurdle and limiter to performance and expansion. Game installs, professional graphics sets, and compute data sets continue to skyrocket. Game installs now are regularly over 50GB but compute workloads can exceed petabytes. Even as we saw GPU memory capacities increase from megabytes to gigabytes, reaching as high as 12GB in high end consumer products, AMD thinks there should be more.

slides-8.jpg

Coming from a company that chose to release a high-end product limited to 4GB of memory in 2015, it’s a noteworthy statement.

slides-11.jpg

The High Bandwidth Cache

Bold enough to claim a direct nomenclature change, Vega 10 will feature an HBM2 based high bandwidth cache (HBC) along with a new memory hierarchy to call it into play. This HBC will be a collection of memory on the GPU package just like we saw on Fiji with the first HBM implementation and will be measured in gigabytes. Why the move to calling it a cache will be covered below. (But can’t we all get behind the removal of the term “frame buffer”?) Interestingly, this HBC doesn’t have to be HBM2 and in fact I was told that you could expect to see other memory systems on lower cost products going forward; cards that integrate this new memory topology with GDDR5X or some equivalent seem assured.

slides-13.jpg

Continue reading our preview of the AMD Vega GPU Architecture!

Author:
Subject: Processors, Mobile
Manufacturer: Qualcomm

Semi-custom CPU

With the new year comes a new push for performance, efficiency and feature leadership from Qualcomm and its Snapdragon line of mobile SoCs. The Snapdragon 835 was officially announced in November of last year when the partnership with Samsung on 10nm process technology was announced, but we now have the freedom to share more of the details on this new part and how it changes Qualcomm’s position in the ultra-device market. Devices with the new 835 part won’t be on the market for several more months, though, with announcements likely coming at CES this year.

slides1-5.jpg

Qualcomm frames the story around the Snapdragon 835 processor with what they call the “five pillars” – five different aspects of mobile processor design that they have addressed with updates and technologies. Qualcomm lists them as battery life (efficiency), immersion (performance), capture, connectivity, and security.

slides1-6.jpg

Starting where they start, on battery life and efficiency, the SD 835 has a unique focus that might surprise many. Rather than talking up the improvements in performance of the new processor cores, or the power of the new Adreno GPU, Qualcomm is firmly planted on looking at Snapdragon through the lens of battery life. Snapdragon 835 uses half of the power of Snapdragon 801.

slides2-2.jpg

The company touts usage claims of 1+ day of talk time, 5+ days of music playback, 11 hours of 4K video playback, 3 hours of 4K video capture and 2+ hours of sustained VR gaming. These sound impressive but, as always in this market, we must wait for consumer devices from Qualcomm partners to really measure how well this platform will do. Going through a typical power user comparison of a device built on the Snapdragon 835 to one using the 820, Qualcomm thinks it could result in 2 or more hours of additional battery life at the end of the day.

We have already discussed the new Quick Charge 4 technology, which can offer 5 hours of use with just 5 minutes of charge time.

Continue reading our preview of the Qualcomm Snapdragon 835 SoC!

Author:
Subject: Processors
Manufacturer: Intel

Architectural Background

It probably doesn't surprise any of our readers that there has been a tepid response to the leaks and reviews that have come out about the new Core i7-7700K CPU ahead of the scheduled launch of Kaby Lake-S from Intel. Replacing the Skylake-based 6700K part as the new "flagship" consumer enthusiast CPU, the 7700K has quite a bit stacked against it. We know that Kaby Lake is the first in the new sequence of tick-tock-optimize, and thus there are few architectural changes to any portion of the chip. However, that does not mean that the 7700K and Kaby Lake in general don't offer new capabilities (HEVC) or performance (clock speed). 

The Core i7-7700K is in an interesting spot as well with regard to motherboards and platforms. Nearly all motherboards that run the Z170 chipset will be able to run the new Kaby Lake parts without requiring an upgrade to the newly released Z270 chipset. However, the likelihood that any user on a Z170 platform today using a Skylake processor will feel the NEED to upgrade to Kaby Lake is minimal, to say the least. The Z270 chipset only offers a couple of new features compared to last generation, so the upgrade path is again somewhat limited in excitement.

Let's start by taking a look at the Core i7-7700K and how it compares to the previous top-end parts from the consumer processor line and then touch on the changes that Kaby Lake brings to the table.

slides06.jpg

With the beginning of CES just days away (as I write this), Intel is taking the wrapping paper off of its first gift of 2017 to the industry. As you can see from the slide above, more than just the Kaby Lake-S consumer socketed processors are launching today, but other components including Iris Plus graphics implementations and quad-core notebook implementations will need to wait for another day.

slides10.jpg

For DIY builders and OEMs, Kaby Lake-S, now known as the 7th Generation Core Processor family, offers some changes and additions. First, we will get a dual-core HyperThreaded processor with an unlocked designation in the Core i3-7350K. Other than the aforementioned Z270 chipset, Kaby Lake will be the first platform compatible with Intel Optane memory. (To be extra clear, I was told that previous processors will NOT be able to utilize Optane in its M.2 form factor.)

slides11.jpg

Though we have already witnessed Lenovo announcing products using Optane, this is the first official Intel discussion about it. Optane memory will be available in M.2 modules that can be installed on Z270 motherboards, improving snappiness and responsiveness. It seems this will be launched later in the quarter as we don't have any performance numbers or benchmarks to point to demonstrating the advantages that Intel touts. I know both Allyn and I are very excited to see how this differs from previous Intel caching technologies.

| | Core i7-7700K | Core i7-6700K | Core i7-5775C | Core i7-4790K | Core i7-4770K | Core i7-3770K |
|---|---|---|---|---|---|---|
| Architecture | Kaby Lake | Skylake | Broadwell | Haswell | Haswell | Ivy Bridge |
| Process Tech | 14nm+ | 14nm | 14nm | 22nm | 22nm | 22nm |
| Socket | LGA 1151 | LGA 1151 | LGA 1150 | LGA 1150 | LGA 1150 | LGA 1155 |
| Cores/Threads | 4/8 | 4/8 | 4/8 | 4/8 | 4/8 | 4/8 |
| Base Clock | 4.2 GHz | 4.0 GHz | 3.3 GHz | 4.0 GHz | 3.5 GHz | 3.5 GHz |
| Max Turbo Clock | 4.5 GHz | 4.2 GHz | 3.7 GHz | 4.4 GHz | 3.9 GHz | 3.9 GHz |
| Memory Tech | DDR4 | DDR4 | DDR3 | DDR3 | DDR3 | DDR3 |
| Memory Speeds | Up to 2400 MHz | Up to 2133 MHz | Up to 1600 MHz | Up to 1600 MHz | Up to 1600 MHz | Up to 1600 MHz |
| Cache (L4 Cache) | 8MB | 8MB | 6MB (128MB) | 8MB | 8MB | 8MB |
| System Bus | DMI3 - 8.0 GT/s | DMI3 - 8.0 GT/s | DMI2 - 6.4 GT/s | DMI2 - 5.0 GT/s | DMI2 - 5.0 GT/s | DMI2 - 5.0 GT/s |
| Graphics | HD Graphics 630 | HD Graphics 530 | Iris Pro 6200 | HD Graphics 4600 | HD Graphics 4600 | HD Graphics 4000 |
| Max Graphics Clock | 1.15 GHz | 1.15 GHz | 1.15 GHz | 1.25 GHz | 1.25 GHz | 1.15 GHz |
| TDP | 91W | 91W | 65W | 88W | 84W | 77W |
| MSRP | $339 | $339 | $366 | $339 | $339 | $332 |
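The clock-speed rows in the table can be turned into a generation-over-generation uplift with a few lines of arithmetic. What follows is a minimal sketch in Python that uses only the base and Turbo figures listed above; the helper name `uplift_pct` and the choice of which parts to compare are illustrative, not anything from Intel.

```python
# Base and max Turbo clocks (GHz), copied from the comparison table above.
CLOCKS_GHZ = {
    "Core i7-7700K": (4.2, 4.5),
    "Core i7-6700K": (4.0, 4.2),
    "Core i7-4790K": (4.0, 4.4),
}


def uplift_pct(new: float, old: float) -> float:
    """Percentage increase of `new` over `old`."""
    return (new / old - 1.0) * 100.0


new_base, new_turbo = CLOCKS_GHZ["Core i7-7700K"]
for name in ("Core i7-6700K", "Core i7-4790K"):
    old_base, old_turbo = CLOCKS_GHZ[name]
    print(
        f"7700K vs {name}: "
        f"base +{uplift_pct(new_base, old_base):.1f}%, "
        f"turbo +{uplift_pct(new_turbo, old_turbo):.1f}%"
    )
```

Against the 6700K this works out to roughly 5% on base clock and 7% on Turbo, which frames how much of any measured gain could come from frequency alone.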

Continue reading our review of the Intel Core i7-7700K Kaby Lake processor!!

Author:
Subject: Processors
Manufacturer: AMD
Tagged: Zen, ryzen, processor, cpu, amd

Ryzen coming in 2017

As much as we might want it to be, today is not the day that AMD launches its new Zen processors to the world. We’ve been teased with it for years now, with trickles of information at event after event…but we are going to have to wait a little bit longer with one more tease at least. Today AMD is announcing the official branding of the consumer processors based on Zen, previously code named Summit Ridge, along with a clock speed data point and a preview of five technologies that will help it be competitive with the Intel Core lineup.

ryzen-22.jpg

The future consumer desktop processor from AMD will now officially be known as Ryzen. That’s pronounced “RISE-IN” not “RIS-IN”, just so we are all on the same page. CEO Lisa Su was on stage during the reveal at a media event last week and claimed that while media, fans and AMD fell in love with the Zen name, it needed a differentiation from the architecture itself. The name is solid – not earth shattering though I foresee a long life of mispronunciation ahead of it.

Now that we have the official branding behind us, let’s get to the rest of the disclosed information we can reveal today.

ryzen-24.jpg

We already knew that Summit Ridge would ship with an 8 core, 16 thread version (with lower core counts at lower prices very likely) but now we know a frequency and a cache size. AMD tells us that there will be a processor (the flagship) that will have a base clock of 3.4 GHz with boost clocks above that. How much above that is still a mystery – AMD is likely still tweaking its implementation of boost to get as much performance as possible for launch. This should help put those clock speed rumors to rest for now.

The 20MB of cache matches the Core i7-6900K, though obviously with some dramatic architecture differences between Broadwell and Zen, the effect and utilization of that cache will be interesting to measure next year.

ryzen-10.jpg

We already knew that Ryzen will be utilizing the AM4 platform, but it’s nice to see it reiterated with a modern feature set and expandability. DDR4 memory, PCI Express Gen3, native USB 3.1 and NVMe support – these are all necessary building blocks for a modern consumer and enthusiast PC. We still need to see how many of these ports the chipset offers and how aggressive motherboard companies like ASUS, MSI and Gigabyte are in their designs. I am hoping there are as many options as we would see for an X99/Z170 platform, including budget boards in the $100 space as well as “anything and everything” options for those types of buyers that want to adopt AMD’s new CPU.

Continue reading our latest preview of AMD Zen, now known as Ryzen!

Author:
Manufacturer: NVIDIA

A Holiday Project

A couple of years ago, I performed an experiment around the GeForce GTX 750 Ti graphics card to see if we could upgrade basic OEM, off-the-shelf computers to become competent gaming PCs. The key to this potential upgrade was that the GTX 750 Ti offered a great amount of GPU horsepower (at the time) without the need for an external power connector. Lower power requirements on the GPU meant that even the most basic of OEM power supplies should be able to do the job.

That story was a success, both in terms of the result in gaming performance and the positive feedback it received. Today, I am attempting to do that same thing but with a new class of GPU and a new class of PC games.

The goal for today’s experiment remains pretty much the same: can a low-cost, low-power GeForce GTX 1050 Ti graphics card that also does not require any external power connector offer enough gaming horsepower to upgrade current shipping OEM PCs to "gaming PC" status?

Our target PCs for today come from Dell and ASUS. I went into my local Best Buy just before the Thanksgiving holiday and looked for two machines that varied in price and relative performance.

01.jpg

| | Dell Inspiron 3650 | ASUS M32CD-B09 |
|---|---|---|
| Processor | Intel Core i3-6100 | Intel Core i7-6700 |
| Motherboard | Custom | Custom |
| Memory | 8GB DDR4 | 12GB DDR4 |
| Graphics Card | Intel HD Graphics 530 | Intel HD Graphics 530 |
| Storage | 1TB HDD | 1TB Hybrid HDD |
| Case | Custom | Custom |
| Power Supply | 240 watt | 350 watt |
| OS | Windows 10 64-bit | Windows 10 64-bit |
| Total Price | $429 (Best Buy) | $749 (Best Buy) |

The specifications of these two machines are relatively modern for OEM computers. The Dell Inspiron 3650 uses a modest dual-core Core i3-6100 processor with a fixed clock speed of 3.7 GHz. It has a 1TB standard hard drive and a 240 watt power supply. The ASUS M32CD-B09 PC has a quad-core HyperThreaded processor with a 4.0 GHz maximum Turbo clock, a 1TB hybrid hard drive and a 350 watt power supply. Both of the CPUs share the same Intel brand of integrated graphics, the HD Graphics 530. You’ll see in our testing that not only is this integrated GPU unqualified for modern PC gaming, but it also performs quite differently based on the CPU it is paired with.
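The reasoning behind the experiment (a card with no external power connector should fit inside even a basic OEM power supply) can be sketched as a rough power budget. The PSU ratings below come from the table above; everything else is an assumption made for illustration: the 75 watt figure is the PCI Express slot limit that a connector-less card has to stay within, the CPU wattages are Intel's published TDPs for these two parts, and the 50 watt allowance for the rest of the system is a guess.

```python
# Rough power-budget check for a slot-powered GPU in an OEM PC.
# PSU ratings are from the spec table; all other wattages are assumptions.
PCIE_SLOT_LIMIT_W = 75   # assumption: no external connector, so the card stays within slot power
REST_OF_SYSTEM_W = 50    # assumption: motherboard, memory, drive and fans

SYSTEMS = {
    # name: (PSU rating in watts, assumed CPU TDP in watts)
    "Dell Inspiron 3650": (240, 51),
    "ASUS M32CD-B09": (350, 65),
}


def headroom_w(psu_w, cpu_w, gpu_w=PCIE_SLOT_LIMIT_W, other_w=REST_OF_SYSTEM_W):
    """Estimated watts left over after CPU, GPU and the rest of the system."""
    return psu_w - (cpu_w + gpu_w + other_w)


for name, (psu, cpu) in SYSTEMS.items():
    print(f"{name}: {headroom_w(psu, cpu)} W of estimated headroom")
```

Under these assumptions even the 240 watt Dell supply keeps some margin, though a real build should also account for PSU efficiency and how the rating is split across rails.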

Continue reading our look at upgrading an OEM machine with the GTX 1050 Ti!!

Author:
Subject: Processors
Manufacturer: Intel

Introduction

In August at the company’s annual developer forum, Intel officially took the lid off its 7th generation of Core processor series, codenamed Kaby Lake. The build up to this release has been an interesting one as we saw the retirement of the “tick tock” cadence of processor releases and instead are moving into a market where Intel can spend more development time on a single architecture design to refine and tweak it as the engineers see fit. With that knowledge in tow, I believed, as I think many still do today, that Kaby Lake would be something along the lines of a simple rebrand of current shipping product. After all, since we know of no major architectural changes from Skylake other than improvements in the video and media side of the GPU, what is left for us to look forward to?

As it turns out, the advantages of the 7th Generation Core processor family and Kaby Lake are more substantial than I expected. I was able to get a hold of two different notebooks from the HP Spectre lineup, as near to identical as I could manage, with the primary difference being the move from the 6th Generation Skylake design to the 7th Generation Kaby Lake. After running both machines through a gamut of tests ranging from productivity to content creation and of course battery life, I can say with authority that Intel’s 7th Gen product deserves more accolades than it is getting.

Architectural Refresher

Before we get into the systems and to our results, I think it’s worth taking some time to quickly go over some of what we know about Kaby Lake from the processor perspective. Most of this content was published back in August just after the Intel Developer Forum, so if you are sure you are caught up, you can jump right along to a pictorial look at the two notebooks being tested today.

At its core, the microarchitecture of Kaby Lake is identical to that of Skylake. Instructions per clock (IPC) remain the same with the exception of dedicated hardware changes in the media engine, so you should not expect any performance differences with Kaby Lake except with improved clock speeds.

Also worth noting is that Intel is still building Kaby Lake on 14nm process technology, the same used on Skylake. The term “same” will be debated as well, as Intel claims that improvements made in the process technology over the last 24 months have allowed them to expand clock speeds and improve on efficiency.

core.jpg

Dubbing this new revision of the process as “14nm+”, Intel tells me that they have improved the fin profile for the 3D transistors as well as channel strain while more tightly integrating the design process with manufacturing. The result is a 12% increase in process performance; that is a sizeable gain in a fairly tight time frame even for Intel.

That process improvement directly results in higher clock speeds for Kaby Lake when compared to Skylake when running at the same target TDPs. In general, we are looking at 300-400 MHz higher peak clock speeds in Turbo Boost situations when compared to similar TDP products in the 6th generation. Sustained clocks will very likely remain voltage / thermally limited but the ability to spike up to higher clocks for even short bursts can improve performance and responsiveness of Kaby Lake when compared to Skylake.

Along with higher fixed clock speeds for Kaby Lake processors, tweaks to Speed Shift will allow these processors to get to peak clock speeds more quickly than previous designs. I extensively tested Speed Shift when the feature was first enabled in Windows 10 and found that the improvement in user experience was striking. Though the move from Skylake to Kaby Lake won’t be as big of a change, Intel was able to improve the behavior.

The graphics architecture and EU (execution unit) layout remains the same from Skylake, but Intel was able to integrate a new video decode unit to improve power efficiency. That new engine can work in parallel with the EUs to improve performance throughput as well, but obviously at the expense of some power efficiency.

peca-8.jpg

Specific additions to the codec lineup include decode support for 10-bit HEVC and 8/10-bit VP9 as well as encode support for 10-bit HEVC and 8-bit VP9. The video engine adds HDR support with tone mapping though it does require EU utilization. Wide Color Gamut (Rec. 2020) is prepped and ready to go according to Intel for when that standard starts rolling out to displays.

Performance levels for these new HEVC encode/decode blocks are set to allow for 4K at 120 Mbps in real time on both the Y-series (4.5 watt) and U-series (15 watt) processors.
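To put that bitrate in more familiar terms, here is a quick sketch of the arithmetic in Python. It assumes a constant bitrate and decimal units (1 GB = 10^9 bytes); the function name `stream_size_gb` is just an illustrative helper.

```python
BITRATE_MBPS = 120  # megabits per second, the figure quoted above


def stream_size_gb(bitrate_mbps: float, seconds: float) -> float:
    """Size of a constant-bitrate stream in decimal gigabytes."""
    bits = bitrate_mbps * 1_000_000 * seconds
    return bits / 8 / 1_000_000_000


print(f"{BITRATE_MBPS / 8:.0f} MB per second")
print(f"{stream_size_gb(BITRATE_MBPS, 3600):.0f} GB per hour of video")
```

That is 15 MB every second, or about 54 GB per hour, being handled in real time by fixed-function hardware in a 4.5 watt part.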

It’s obvious that the changes to Kaby Lake from Skylake are subtle and even I found myself overlooking the benefits that it might offer. While the capabilities it has will be tested on the desktop side at a later date in 2017, for thin and light notebooks, convertibles and even some tablets, the 7th Generation Core processors do in fact take advantage of the process improvements and higher clock speeds to offer an improved user experience.

Continue reading our look at Kaby Lake mobile performance!

Author:
Subject: Processors
Manufacturer: Intel

What's new and what's not

While spending time learning about upcoming products and technologies at the Intel Developer Forum earlier this month, I sat down with the company to learn about the release of Kaby Lake, now known as the 7th Generation Core processor family. We have been seeing and reporting on the details of Kaby Lake for quite some time here on PC Perspective – it became a more important topic when we realized that this would be the product that officially killed off the ‘tick-tock’ design philosophy that Intel had implemented years ago and that was responsible for much of the innovation in the CPU space over the last decade.

Today Intel released new information about the 7th Gen CPU family and Kaby Lake. Let’s dive into this topic with a simple and straightforward look at how it compares to Skylake.

What is the same

Actually, quite a lot. At its core, the microarchitecture of Kaby Lake is identical to that of Skylake. Instructions per clock (IPC) remain the same with the exception of dedicated hardware changes in the media engine, so you should not expect any performance differences with Kaby Lake except with improved clock speeds we’ll discuss in a bit.

core.jpg

Because of this lack of change many people will look down on the Kaby Lake release as Intel’s attempt to repackage an existing product to make sure it meets a financial market required annual product cadence. It is a valid but arguable criticism, but Intel is making changes in other areas that should make KBL an improvement in the thin and light ecosystem.

Also worth noting is that Intel is still building Kaby Lake on 14nm process technology, the same used on Skylake. The term “same” will be debated as well, as Intel claims that improvements made in the process technology over the last 24 months have allowed them to expand clock speeds and improve on efficiency.

What is changed

Dubbing this new revision of the process as “14nm+”, Intel tells me that they have improved the fin profile for the 3D transistors as well as channel strain while more tightly integrating the design process with manufacturing. The result is a 12% increase in process performance; that is a sizeable gain in a fairly tight time frame even for Intel.

That process improvement directly results in higher clock speeds for Kaby Lake when compared to Skylake when running at the same target TDPs. In general, we are looking at 300-400 MHz higher peak clock speeds in Turbo Boost situations when compared to similar TDP products in the 6th generation. Sustained clocks will very likely remain voltage / thermally limited but the ability to spike up to higher clocks for even short bursts can improve performance and responsiveness of Kaby Lake when compared to Skylake.

slide-12.jpg

In these two examples, Intel compares the 15 watt Core i7-6500U (a common part in currently shipping notebooks) and the upcoming 15 watt Core i7-7500U, both with dual-core HyperThreaded configurations. In SYSmark 2014 a 12% score improvement is measured while WebXPRT shows a 19% advantage. Double digit performance increases are pretty astounding for a new generational jump that does not include a new microarchitecture or a new process technology (more or less) though we should temper expectations for other applications and workload profiles like content creation.

Continue reading our overview of Intel's Kaby Lake processors!

Author:
Subject: Processors
Manufacturer: AMD

Clean Sheet and New Focus

It is no secret that AMD has been struggling for some time. The company has had success through the years, but it seems that the last decade has been somewhat bleak in terms of competitive advantages. The company has certainly made an impact throughout the decades with their 486 products, K6, the original Athlon, and the industry changing Athlon 64. Since that time we have had a couple of bright spots with the Phenom II being far more competitive than expected, and the introduction of very solid graphics performance in their APUs.

Sadly for AMD their investment in the “Bulldozer” architecture was misplaced for where the industry was heading. While we certainly see far more software support for multi-threaded CPUs, IPC is still extremely important for most workloads. The original Bulldozer was somewhat rushed to market and was not fully optimized. While the “Piledriver” based Vishera products fixed many of these issues, we have not seen the non-APU products updated to the latest Steamroller and Excavator architectures. The non-APU desktop market has been served for the past four years with 32nm PD-SOI based parts that utilize a rebranded chipset base that has not changed since 2010.

hc_03.png

Four years ago AMD decided to change course entirely with their desktop and server CPUs. Instead of evolving the “Bulldozer” style architecture featuring CMT (Clustered Multi-Threading) they were going to do a clean sheet design that focused on efficiency, IPC, and scalability. While Bulldozer certainly could scale the thread count fairly effectively, the overall performance targets and clockspeeds needed to compete with Intel were just not feasible considering the challenges of process technology. AMD brought back Jim Keller to lead this effort, an industry veteran with a huge amount of experience across multiple architectures. Zen was born.

 

Hot Chips 28

This year’s Hot Chips is the first deep dive that we have received about the features of the Zen architecture.  Mike Clark is taking us through all of the changes and advances that we can expect with the upcoming Zen products.

Zen is a clean sheet design that borrows very little from previous architectures.  This is not to say that concepts that worked well in previous architectures were not revisited and optimized, but the overall floorplan has changed dramatically from what we have seen in the past.  AMD did not stand still with their Bulldozer products, and the latest Excavator core does improve upon the power consumption and performance of the original.  This evolution was simply not enough considering market pressures and Intel’s steady improvement of their core architecture year upon year.  Zen was designed to significantly improve IPC and AMD claims that this product has a whopping 40% increase in IPC (instructions per clock) from the latest Excavator core.
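To see what a 40% IPC claim implies, a common first-order model treats single-thread performance as IPC multiplied by clock speed. The sketch below applies that model in Python; it is a simplification (it ignores memory, turbo behavior and workload differences), and the function and variable names are illustrative.

```python
def relative_performance(ipc_ratio: float, clock_ratio: float) -> float:
    """Performance relative to a baseline, modelled as IPC x clock."""
    return ipc_ratio * clock_ratio


ZEN_IPC_RATIO = 1.40  # AMD's claimed 40% IPC gain over Excavator

# At equal clocks, the whole IPC gain shows up as performance.
print(f"Same clock: {relative_performance(ZEN_IPC_RATIO, 1.0):.2f}x")

# Clock ratio at which Zen would merely match Excavator in this model.
break_even_clock = 1.0 / ZEN_IPC_RATIO
print(f"Break-even clock ratio: {break_even_clock:.2f}")
```

In this model Zen could run at roughly 71% of Excavator's clock and still match it, which is why the final shipping frequencies matter as much as the IPC figure itself.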

hc_04.png

AMD also has focused on scaling the Zen architecture from low power envelopes up to server level TDPs.  The company looks to have pushed down the top end power envelope of Zen from the 125+ watts of Bulldozer/Vishera into the more acceptable 95 to 100 watt range.  This also has allowed them to scale Zen down to the 15 to 25 watt TDP levels without sacrificing performance or overall efficiency.  Most architectures have sweet spots where they tend to perform best.  Vishera for example could scale nicely from 95 to 220 watts, but the design did not translate well into sub-65 watt envelopes.  Excavator based “Carrizo” products on the other hand could scale from 15 watts to 65 watts without real problems, but became terribly inefficient above 65 watts with increased clockspeeds.  Zen looks to address these differences by being able to scale from sub-25 watt TDPs up to 95 or 100.  In theory this should allow AMD to simplify their product stack by offering a common architecture across multiple platforms.
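The scaling argument above is easier to follow when the TDP ranges are laid side by side. The following Python sketch encodes the ranges quoted in the paragraph and checks which designs cover a given power target; the Zen entry is a reading of AMD's stated goals (15 watts at the low end, 100 at the top), not a measured result, and the names are illustrative.

```python
# TDP ranges in watts as described in the paragraph above.
TDP_RANGES_W = {
    "Vishera (Piledriver)": (95, 220),
    "Carrizo (Excavator)": (15, 65),
    "Zen (claimed)": (15, 100),  # assumption: AMD's stated target, not a shipping spec
}


def covers(design: str, target_w: int) -> bool:
    """True if the design's quoted TDP range includes the target wattage."""
    low, high = TDP_RANGES_W[design]
    return low <= target_w <= high


for target in (15, 65, 95):
    fits = [name for name in TDP_RANGES_W if covers(name, target)]
    print(f"{target} W: {', '.join(fits)}")
```

Only the Zen row shows up at every target, which is the product stack simplification AMD is aiming for.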

Click to continue reading about AMD's Zen architecture!

Author:
Subject: Processors
Manufacturer: AMD
Tagged: Zen, amd

Gunning for Broadwell-E

As I walked away from the St. Regis in downtown San Francisco tonight, I found myself wandering through the streets towards my hotel with something unique in tow. It was a smile. I was smiling, thinking about what AMD had just demonstrated and showed at its latest Zen processor reveal. The importance of this product launch can literally not be overstated for a company struggling to find a foothold to hang on to in a market in which it once had a definitive lead. It’s been many years since I left a conference call, or a meeting, or a press conference feeling genuinely hopeful and enthusiastic about what AMD has shown me. Tonight I had that.

AMD’s CEO Lisa Su, and CTO Mark Papermaster, took the stage down the street from the Intel Developer Forum to roll out a handful of new architectural details about the Zen architecture while also showing the first performance results comparing it to competing parts from Intel. The crowd in attendance, a mix of media and analysts, was impressed. The feeling was palpable in the room.

zenicon.jpg

It’s late as I write this, and while there are some interesting architecture details to discuss, I think it is in everyone’s best interest that we touch on them lightly for now, and instead refocus on the deep-dive once the Hot Chips information comes out early next week. What you really want to know is clear: can Zen make Intel work again? Can Zen make that $1700 price tag on the Broadwell-E 6950X seem even more ludicrous? Yes.

The Zen Architecture

Much of what was discussed about the Zen architecture is a re-release of what has been out in recent months. This is a completely new, from the ground up, microarchitecture and not a revamp of the aging Bulldozer design. It integrates SMT (simultaneous multi-threading), a first for an AMD CPU, to make more efficient use of a longer pipeline. Intel has had HyperThreading for a long time now and AMD is finally joining the fold. A high bandwidth and low latency caching system is used to “feed the beast” as Papermaster put it, and utilizing 14nm process technology (starting at Global Foundries) gives efficiency and scaling a significant bump while enabling AMD to scale from notebooks to desktops to servers with the same architecture.

zenpm-10.jpg

By far the most impressive claim from AMD thus far was that of a 40% increase in IPC over previous AMD designs. That’s a HUGE claim and is key to the success or failure of Zen. AMD proved to me today that the claims are real and that we will see the immediate impact of that architecture bump from day one.

zenpm-4.jpg

Press was told of a handful of high level changes to the new architecture as well. Branch prediction gets a complete overhaul. This marks the first AMD processor to have a micro-op cache. Wider execution width with broader instruction schedulers are integrated, all of which adds up to much higher instruction level parallelism to improve single threaded performance.

zenpm-6.jpg

Performance improvements aside, throughput and efficiency go up with Zen as well. AMD has integrated an 8MB L3 cache and improved prefetching for up to 5x the cache bandwidth available per core on the CPU. SMT makes sure the pipeline stays full to prevent “bubbles” that introduce latency and lower efficiency while region-specific power gating means that we’ll see Zen in notebooks as well as enterprise servers in 2017. It truly is an impressive design from AMD.

zenfull-27.jpg

Summit Ridge, the enthusiast platform that will be the first product available with Zen, is based on the AM4 platform and processors will go up to 8-cores and 16-threads. DDR4 memory support is included, PCI Express 3.0 and what AMD calls “next-gen” IO – I would expect a quick leap forward for AMD to catch up on things like NVMe and Thunderbolt.

The Real Deal – Zen Performance

As part of today’s reveal, AMD is showing the first true comparison between Zen and Intel processors. Sure, AMD showed a Zen-powered system with a Fury X running the upcoming Deus Ex at 4K, but the really impressive results were shown when comparing Zen to a Broadwell-E platform.

zenfull-29.jpg

Using Blender to measure the performance of a rendering workload (a Zen CPU mockup of course), AMD ran an 8-core / 16-thread Zen processor at 3.0 GHz against an 8-core / 16-thread Broadwell-E processor at 3.0 GHz (likely a fixed clocked Core i7-6900K). The point of the demonstration was to showcase the IPC improvements of Zen and it worked: the render completed on the Zen platform a second or two faster than it did on the Intel Broadwell-E system.

DSC01490.jpg

Not much to look at, but Zen on the left, Broadwell-E on the right...

Of course there are lots of caveats: we didn’t set up the systems, I don’t know for sure that GPUs weren’t involved, we don’t know the final clocks of the Zen processors releasing in early 2017, etc. But I took two things away from the demonstration that are very important.

  1. The IPC of Zen is on par with or better than Broadwell.
  2. Zen will scale higher than 3.0 GHz in 8-core configurations.
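One rough way to read a demo like this: with both chips at the same clock and the same core and thread count, relative per-clock throughput is approximately the inverse ratio of the render times. The sketch below assumes that simple model (work per second scales as IPC times clock). The function name is my own and the render times are made-up placeholders, since AMD did not publish exact numbers.

```python
def relative_ipc(time_a_s, time_b_s, clock_a_ghz=3.0, clock_b_ghz=3.0):
    """Estimate chip A's per-clock throughput relative to chip B.

    Assumes the same workload and the same core/thread count on both
    systems, so that work per second scales as IPC * clock.
    """
    if min(time_a_s, time_b_s, clock_a_ghz, clock_b_ghz) <= 0:
        raise ValueError("times and clocks must be positive")
    # Throughput is 1 / time; dividing out the clock leaves per-clock work.
    return (time_b_s / time_a_s) * (clock_b_ghz / clock_a_ghz)


# Hypothetical render times in seconds, NOT measured figures from the demo.
zen_time_s = 48.0
broadwell_e_time_s = 49.5
ratio = relative_ipc(zen_time_s, broadwell_e_time_s)
print(f"Zen vs. Broadwell-E per-clock throughput: {ratio:.3f}x")
```

A ratio above 1.0 would favor the first chip. A single Blender render says nothing about other workloads, so treat any number from this as a data point rather than a verdict.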

AMD obviously didn’t state what specific SKUs were going to launch with the Zen architecture, what clock speeds they would run at, or even what TDPs they were targeting. Instead we were left with a vague but understandable remark of “comparable TDPs to Broadwell-E”.

Pricing? Overclocking? We’ll just have to wait a bit longer for that kind of information.

Closing Thoughts

There is clearly a lot more for AMD to share about Zen but the announcement and showcase made this week with the early prototype products have solidified for me the capability and promise of this new microarchitecture. We have asked for, and needed, as an industry, a competitor to Intel in the enthusiast CPU space – something we haven’t legitimately had since the Athlon X2 days. Zen is what we have been pining for, what gamers and consumers have needed.

zenpm-11.jpg

AMD’s processor stars might finally be aligning for a product that combines performance, efficiency and scalability at the right time. I’m ready for it. Are you?

Author:
Subject: Processors
Manufacturer: AMD

Bristol Ridge Takes on Mobile: E2 Through FX

It is no secret that AMD has faced an uphill battle since the release of the original Core 2 processors from Intel.  While AMD stayed mostly competitive through the Phenom II years, they hit some major performance issues when moving to the Bulldozer architecture.  While on paper the idea of Clustered Multi-Threading sounded fantastic, AMD was never able to get the per thread performance up to expectations.  While their CPUs performed well in heavily multi-threaded applications, they just were never seen in as positive a light as the competing Intel products.

br_01.png

The other part of the performance equation that has hammered AMD is the lack of a new process node that would allow it to more adequately compete with Intel.  When AMD was at 32 nm PD-SOI, Intel had introduced its 22nm TriGate/FinFET.  AMD then transitioned to a 28nm HKMG planar process that was more size optimized than 32nm, but did not drastically improve upon power and transistor switching performance.

So AMD had a double whammy on their hands with an underperforming architecture and limited to no access to advanced process nodes that would actually improve their power and speed situation.  They could not force their foundry partners to spend billions on a crash course in FinFET technology to bring that to market faster, so they had to iterate and innovate on their designs.

br_02.png

Bristol Ridge is the fruit of that particular labor.  It is also the end point to the architecture that was introduced with Bulldozer way back in 2011.

Click here to read the entire introduction of AMD's Bristol Ridge lineup!

Author:
Subject: Processors
Manufacturer: Intel

Broadwell-E Platform

It has been nearly two years since the release of the Haswell-E platform, which began with the launch of the Core i7-5960X processor. Back then, the introduction of an 8-core consumer processor was the primary selling point, along with the new X99 chipset and DDR4 memory support. At the time, I heralded the processor as “easily the fastest consumer processor we have ever had in our hands” and “nearly impossible to beat.” So what has changed over the course of 24 months?

01.jpg

Today Intel is launching Broadwell-E, the follow up to Haswell-E, and things look very much the same as they did before. There are definitely a couple of changes worth noting and discussing, including the move to a 10-core processor option as well as Turbo Boost Max Technology 3.0, which is significantly more interesting than its marketing name implies. Intel is sticking with the X99 platform (good for users that might want to upgrade), though the cost of these new processors is more than slightly disappointing based on trends elsewhere in the market.

This review of the new Core i7-6950X 10-core Broadwell-E processor is going to be quick, and to the point: what changes, what is the performance, how does it overclock, and what will it cost you?

Go.

Continue reading our review of the new Core i7-6950X 10-core processor!!

Author:
Subject: Processors
Manufacturer: ARM

10nm Sooner Than Expected?

It seems only yesterday that we had the first major GPU released on 16nm FF+ and now we are talking about ARM receiving its first 10nm FF test chips!  Well, in fact it was yesterday that NVIDIA formally released performance figures on the latest GeForce GTX 1080 which is based on TSMC’s 16nm FF+ process technology.  Currently TSMC is going full bore on their latest process node and producing the fastest current graphics chip around.  It has taken the foundry industry as a whole a lot longer to develop FinFET technology than expected, but now that they have that piece of the puzzle seemingly mastered they are moving to a new process node at an accelerated rate.

arm_td01.png

TSMC’s 10nm FF is not well understood by press and analysts yet, but we gather that it is more of a marketing term than a true drop to 10 nm features.  Intel has yet to get past 14nm and does not expect 10 nm production until well into next year.  TSMC is promising their version in the second half of 2016.  We cannot assume that TSMC’s version will match what Intel will be doing in terms of geometries and electrical characteristics, but we do know that it is a step past TSMC’s 16nm FF products.  Lithography will likely get a boost with triple patterning exposure.  My guess is that the back end will also move away from the “20nm metal” stages that we see with 16nm.  All in all, it should be an improved product from what we see with 16nm, but time will tell if it can match the performance and density of competing lines that bear the 10nm name from Intel, Samsung, and GLOBALFOUNDRIES.

ARM has a history of porting their architectures to new process nodes, but they are being a bit more aggressive here than we have seen in the past.  It used to be that ARM would announce a new core or technology, and it would take up to two years to be introduced into the market.  Now we are seeing technology announcements and actual products hitting the scenes about nine months later.  With the mobile market continuing to grow we expect to see products quicker to market still.

arm_td02.png

The company designed a simplified test chip to tape out and send to TSMC for test production on the aforementioned 10nm FF process.  The chip was taped out in December 2015.  The design was shipped to TSMC for mask production and wafer starts.  ARM is expecting the finished wafers to arrive this month.

Click here to continue reading about ARM's test foray into 10nm!

Author:
Subject: Processors
Manufacturer: AMD

Lower Power, Same Performance

AMD is in a strange position in that there is a lot of excitement about their upcoming Zen architecture, but we are still many months away from that introduction.  AMD obviously needs to keep the dollars flowing in, and part of that means that we get refreshes now and then of current products.  The “Kaveri” products that have been powering the latest APUs from AMD have received one of those refreshes.  AMD has done some redesigning of the chip and tweaked the process technology used to manufacture them.  The resulting product is the “Godavari” refresh that offers slightly higher clockspeeds as well as better overall power efficiency as compared to the previous “Kaveri” products.

7860K_01.jpg

One of the first refreshes was the A8-7670K that hit the ground in November of 2015.  This is a slightly cut down part that features 6 GPU compute units vs. the 8 that a fully enabled Godavari chip has.  This continues to be a FM2+ based chip with a 95 watt TDP.  The clockspeed of this part goes from 3.6 GHz to 3.9 GHz.  The GPU portion runs at the same 757 MHz that the original A10-7850K ran at.  It is interesting to note that it is still a 95 watt TDP part with essentially the same clockspeeds as the 7850K, but with two fewer GPU compute units.

The other product being covered here is a bit more interesting.  The A10-7860K looks to be a larger improvement from the previous 7850K in terms of power and performance.  It shares the same CPU clockspeed range as the 7850K (3.6 GHz to 3.9 GHz), but improves upon the GPU clockspeed by hitting around 800 MHz.  At first this seems underwhelming until we realize that AMD has lowered the TDP from 95 watts down to 65 watts.  Less power consumed and less heat produced for the same performance from the CPU side and improved performance from the GPU seems like a nice advance.

amd_cool_stack.png

AMD continues to utilize GLOBALFOUNDRIES’ 28 nm Bulk/HKMG process for their latest APUs and will continue to do so until Zen is released late this year.  This is not the same 28 nm process that we were introduced to over four years ago.  Over that time changes have been made to improve yields and bins, as well as optimize power and clockspeed.  GF also can adjust the process on a per batch basis to improve certain aspects of a design (higher speed, more leakage, lower power, etc.).  They cannot produce miracles though.  Do not expect 22 nm FinFET performance or density with these latest AMD products.  Those kinds of improvements will show up with Samsung/GF’s 14nm LPP and TSMC’s 16nm FF+ lines.  While AMD will be introducing GPUs on 14nm LPP this summer, the Zen launch in late 2016 will be the first AMD CPU to utilize that advanced process.

Click here to read the entire AMD A10-7860K and A8-7670K Review!

Author:
Subject: Processors
Manufacturer: AMD

Clockspeed Jump and More!

On March 1st AMD announced the availability of two new processors as well as more information on the A10-7860K APU.

The two new units are the A10-7890K and the Athlon X4 880K.  These are both Kaveri based parts, but of course the Athlon has the GPU portion disabled.  Product refreshes for the past several years have followed a far different schedule than the days of yore.  Remember back in time when the Phenom II series and the competing Core 2 series would have clockspeed updates that were expected yearly, if not every half year with a slightly faster top end performer to garner top dollar from consumers?

amd_lineup.png

Things have changed, for better or worse.  We have so far seen two clockspeed bumps for the Kaveri/Godavari based APU.  Kaveri was first introduced over two years ago with the A10-7850K and the lower end derivatives.  The 7850K has a clockspeed that ranges from 3.7 GHz to the max 4 GHz with boost.  The GPU portion is clocked at 720 MHz.  This is a 95 watt TDP part that is one of the introductory units from GLOBALFOUNDRIES’ 28 nm HKMG process.

Today the new top end A10-7890K is clocked at 4.1 GHz to 4.3 GHz max.  The GPU receives a significant boost in performance with a clockspeed of 866 MHz.  The combination of CPU and GPU clockspeed increases pushes the total performance of the part past 1 TFLOPS.  It features the same dual module/quad core Godavari design as well as the 8 GCN compute units.  The interesting part here is that the APU does not exceed the 95 watt TDP that it shares with the older and slower 7850K.  It is also a boost in performance from last year’s refresh of the A10-7870K which is clocked 200 MHz slower on the CPU portion but retains the 866 MHz speed of the GPU.  This APU is fully unlocked so a user can easily overclock both the CPU and GPU cores.
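For a sense of where a figure just over 1 TFLOPS can come from, here is a back-of-the-envelope sketch of theoretical peak single-precision throughput.  The per-clock numbers are my assumptions rather than anything AMD spells out here: 64 shaders per GCN compute unit, 2 FLOPs per shader per clock (a fused multiply-add), and 8 FLOPs per CPU core per clock at the base clockspeed.

```python
def peak_gflops(cpu_cores, cpu_ghz, gpu_cus, gpu_mhz,
                cpu_flops_per_clock=8,   # assumption: per core, per clock
                shaders_per_cu=64,       # assumption: GCN compute unit width
                flops_per_shader=2):     # assumption: one FMA = 2 FLOPs
    """Return (cpu, gpu, total) theoretical peak in GFLOPS."""
    cpu = cpu_cores * cpu_ghz * cpu_flops_per_clock
    gpu = gpu_cus * shaders_per_cu * flops_per_shader * gpu_mhz / 1000.0
    return cpu, gpu, cpu + gpu


# Clockspeeds and unit counts as quoted in the text above.
for name, ghz, mhz in [("A10-7850K", 3.7, 720), ("A10-7890K", 4.1, 866)]:
    cpu, gpu, total = peak_gflops(cpu_cores=4, cpu_ghz=ghz,
                                  gpu_cus=8, gpu_mhz=mhz)
    print(f"{name}: CPU {cpu:.1f} + GPU {gpu:.1f} = {total:.1f} GFLOPS")
```

Under those assumptions the GPU supplies the large majority of the total, which is why the jump from 720 MHz to 866 MHz matters more to the headline number than the CPU clock bump.  These are paper peaks, not measured performance.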

amd_7890k.png

The Athlon X4 880K is still based on the Godavari family rather than the Carrizo update that the X4 845 uses.  This part is clocked from 4.0 to 4.2 GHz.  It again retains the 95 watt TDP rating of the previous Athlon X4 CPUs.  Previously the X4 860K was the highest clocked unit at 3.7 GHz to 4.0 GHz, but the 880K raises that to 4.0 to 4.2 GHz.  A 300 MHz gain in base clock is pretty significant as well as stretching that ceiling to 4.2 GHz.  The Godavari modules retain their full amount of L2 cache so the 880K has 4 MB available to it.  These parts are very popular with budget enthusiasts and gaming builds as they are extremely inexpensive and perform at an acceptable level with free overclocking thrown in.

Click here to read more about AMD's March 2016 Refresh!

Author:
Subject: Processors
Manufacturer: AMD

AMD Keeps Q1 Interesting

CES 2016 was not a watershed moment for AMD.  They showed off their line of current video cards and, perhaps more importantly, showed off working Polaris silicon, which will be their workhorse for 2016 in the graphics department.  They did not show off Zen, a next generation APU, or any AM4 motherboards.  The CPU and APU world was not presented in a way that was revolutionary.  What they did show off, however, hinted at the things to come to help keep AMD relevant in the desktop space.

AMD_NewQ1.jpg

It was odd to see an announcement about the stock cooler that AMD was introducing, but the more we learned about it, the more important it appeared for AMD’s reputation moving forward.  The Wraith cooler is a new unit to help control the noise and temperatures of the latest AMD CPUs and select APUs.  This is a fairly beefy unit with a large, slow moving fan that produces very little noise.  This is a big change from the variable speed fans on previous coolers that could get rather noisy and leave temperatures higher than is comfortable.  There has been some derision aimed at AMD for providing “just a cooler” for their top end products, but it is a push that is making them more user and enthusiast friendly without breaking the bank.

Socket AM3+ is not dead yet.  Though we have been commenting on the health of the platform for some time, AMD and its partners work to improve and iterate upon these products to include technologies such as USB 3.1 and M.2 support.  While these chipsets are limited to PCI-E 2.0 speeds, the four lanes available to most M.2 controllers allow these boards to provide enough bandwidth to fully utilize the latest NVMe based M.2 drives available.  We likely will not see a faster refresh on AM3+, but we will see new SKUs utilizing the Wraith cooler as well as a price break for the processors that exist in this socket.
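To put a number on that four-lane link, the sketch below works out the raw line rate of a PCI-E link from the per-lane transfer rate and encoding overhead of each generation (5 GT/s with 8b/10b encoding for 2.0, 8 GT/s with 128b/130b for 3.0).  The function and table names are my own, and the result is a ceiling before packet and protocol overhead, not a real-world throughput figure.

```python
# Per-lane transfer rate in GT/s and line-encoding efficiency.
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),      # 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 128b/130b encoding
}


def link_bandwidth_gbs(generation, lanes):
    """Raw one-direction link bandwidth in GB/s, before protocol overhead."""
    rate_gt, efficiency = PCIE_GENERATIONS[generation]
    return rate_gt * efficiency * lanes / 8  # bits to bytes


for gen in ("2.0", "3.0"):
    print(f"PCI-E {gen} x4: {link_bandwidth_gbs(gen, 4):.2f} GB/s")
```

By this arithmetic a x4 PCI-E 2.0 link tops out at about 2 GB/s, roughly half of what the same four lanes carry at 3.0 speeds, so any drive with sequential transfers above that would be limited by the link rather than the flash.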

Click here to continue reading about AMD's latest offerings for Q1 2016!

Manufacturer: PC Perspective
Tagged: moores law, gpu, cpu

Are Computers Still Getting Faster?

It looks like CES is starting to wind down, which makes sense because it ended three days ago. Now that we're mostly caught up, I found a new video from The 8-Bit Guy. He doesn't really explain any old technologies in this one. Instead, he poses an open question about computer speed. He was able to have a functional computing experience on a ten-year-old Apple laptop, which made him wonder if the rate of computer advancement is slowing down.

I believe that he (and his guest hosts) made great points, but also missed a few important ones.

One of his main arguments is that software seems to have slowed down relative to hardware. I don't believe that is true, but I believe it's looking in the right area. PCs these days are more than capable of doing just about anything in terms of 2D user interface that we would want to, and do so with a lot of overhead for inefficient platforms and sub-optimal programming (relative to the 80's and 90's at the very least). The areas that require extra horsepower are usually doing large batches of many related tasks. GPUs are key in this area, and they are keeping up as fast as they can, despite some stagnation with fabrication processes and a difficulty (at least before HBM takes hold) in keeping up with memory bandwidth.

For the last five to ten years or so, CPUs have been evolving toward efficiency as GPUs are being adopted for the tasks that need to scale up. I'm guessing that AMD, when they designed the Bulldozer architecture, hoped that GPUs would have been adopted much more aggressively, but even as graphics devices, they now have a huge effect on Web, UI, and media applications.

google-android-opengl-es-extensions.jpg

These are also tasks that can scale well between devices by lowering resolution (and so forth). The primary thing that a main CPU thread needs to do is figure out the system's state and keep the graphics card fed before the frame-train leaves the station. In my experience, that doesn't scale well (although you can sometimes reduce the amount of tracked objects for games and so forth). Moreover, it is easier to add GPU performance, compared to single-threaded CPU, because increasing frequency and single-threaded IPC should be more complicated than planning out more, duplicated blocks of shaders. These factors combine to give lower-end hardware a similar experience in the most noticeable areas.

So, up to this point, we discussed:

  • Software is often scaling in ways that are GPU (and RAM) limited.
  • CPUs are scaling down in power more than up in performance.
  • GPU-limited tasks can often be approximated with smaller workloads.
    • Software gets heavier, but it doesn't need to be "all the way up" (ex: resolution).
    • Some latencies are hard to notice anyway.

Back to the Original Question

This is where “Are computers still getting faster?” can be open to interpretation.

intel-devilscanyon-overview.JPG

Tasks are diverging from one class of processor into two, and both have separate industries, each with their own, multiple goals. As stated, CPUs are mostly progressing in power efficiency, which extends an (assumed to be) sufficient amount of performance downward to multiple types of devices. GPUs are definitely getting faster, but they can't do everything. At the same time, RAM is plentiful but its contribution to performance can be approximated with paging unused chunks to the hard disk or, more recently on Windows, compressing them in-place. Newer computers with extra RAM won't help as long as any single task only uses a manageable amount of it -- unless it's seen from a viewpoint that cares about multi-tasking.

In short, computers are still progressing, but the paths are now forked and winding.

Author:
Manufacturer: AMD

May the Radeon be with You

In celebration of the release of The Force Awakens as well as the new Star Wars Battlefront game from DICE and EA, AMD sent over some hardware for us to use in a system build, targeted at getting users up and running in Battlefront with impressive quality and performance, but still on a reasonable budget. Pairing up an AMD processor, MSI motherboard, Sapphire GPU with a low cost chassis, SSD and more, the combined system includes a FreeSync monitor for around $1,200.

swbf.jpg

Holiday breaks are MADE for Star Wars Battlefront

Though the holiday is already here and you'd be hard pressed to build this system in time for it, I have a feeling that quite a few of our readers and viewers will find themselves with some cash and gift certificates in hand, just ITCHING for a place to invest in a new gaming PC.

The video above includes a list of components, the build process (in brief) and shows us getting our gaming on with Star Wars Battlefront. Interested in building a system similar to the one above on your own? Here's the hardware breakdown.

AMD Powered Star Wars Battlefront System

Processor: AMD FX-8370 - $197
Cooler: Cooler Master Hyper 212 EVO - $29
Motherboard: MSI 990FXA Gaming - $137
Memory: AMD Radeon Memory DDR3-2400 - $79
Graphics Card: Sapphire NITRO Radeon R9 380X - $266
Storage: SanDisk Ultra II 240GB SSD - $79
Case: Corsair Carbide 300R - $68
Power Supply: Seasonic 600 watt 80 Plus - $69
Monitor: AOC G2460PF 1920x1080 144Hz FreeSync - $259
Total Price: Full System (without monitor) - Amazon.com - $924

For under $1,000, plus another $250 or so for the AOC FreeSync capable 1080p monitor, you can have a complete gaming rig for your winter break. Let's detail some of the specific components.
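If you want to swap parts and see where your own total lands, the arithmetic is easy to script. This small sketch uses the prices listed above (which will drift over time); the variable names are my own.

```python
# Prices in US dollars, as listed in the hardware breakdown above.
parts = {
    "Processor: AMD FX-8370": 197,
    "Cooler: Cooler Master Hyper 212 EVO": 29,
    "Motherboard: MSI 990FXA Gaming": 137,
    "Memory: AMD Radeon Memory DDR3-2400": 79,
    "Graphics Card: Sapphire NITRO Radeon R9 380X": 266,
    "Storage: SanDisk Ultra II 240GB SSD": 79,
    "Case: Corsair Carbide 300R": 68,
    "Power Supply: Seasonic 600 watt 80 Plus": 69,
}
monitor = 259  # AOC G2460PF FreeSync monitor

subtotal = sum(parts.values())
print(f"System without monitor: ${subtotal}")
print(f"System with monitor:    ${subtotal + monitor}")
```

Change any value in the dictionary, or add a line for a part of your own, and the totals update accordingly.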

cpu.jpg

AMD sent over the FX-8370 processor for our build, a 4-module / 8-core CPU that runs at 4.0 GHz, more than capable of handling any gaming work load you can toss at it. And if you need to do some transcoding, video work or, heaven forbid, school or productivity work, the FX-8370 has you covered there too.

cooler.jpg

For the motherboard AMD sent over the MSI 990FXA Gaming board, one of the newer AMD platforms that includes support for USB 3.1 so you'll have a good length of usability for future expansion. The Cooler Master Hyper 212 EVO cooler was our selection to keep the FX-8370 running smoothly and 8GB of AMD Radeon DDR3-2133 memory is enough for the system to keep applications and the Windows 10 operating system happy.

Continue reading about our AMD system build for Star Wars Battlefront!!

Author:
Subject: Processors, Mobile
Manufacturer: Intel

Skylake Architecture Comes Through

When Intel finally revealed the details surrounding its latest Skylake architecture design back in August at IDF, we learned for the first time about a new technology called Intel Speed Shift. The feature moves some of the control of CPU clock speed and ramp up away from the operating system and into hardware, giving more control to the processor itself and making it less dependent on Windows (and presumably, in the future, other operating systems). This allows the clock speed of a Skylake processor to get higher, faster, allowing for better user responsiveness.

pro4-3.jpg

It's pretty clear that Intel is targeting this feature addition for tablets and 2-in-1s where the finger/pen to screen interaction is highly reliant on immediate performance to enable improved user experiences. It has long been known that one of the biggest performance deltas between iOS from Apple and Android from Google centers on the ability for the machine to FEEL faster when doing direct interaction, regardless of how fast the background rendering of an application or web browser actually is. Intel has been on a quest to fix this problem for Android for some time, where it has the ability to influence software development, and now they are bringing that emphasis to Windows 10.

With the most recent Windows 10 update, to build v10586, Intel Speed Shift has finally been enabled for Skylake users. And since you cannot disable the feature once it's installed, this is the one and only time we'll be able to measure performance in our test systems. So let's see if Intel's claims of improved user experiences stand up to our scrutiny.

Continue reading our performance evaluation of Intel Speed Shift on the Skylake Architecture!!

Author:
Subject: Processors
Manufacturer: Intel

That is a lotta SKUs!

The slow, gradual release of information about Intel's Skylake-based product portfolio continues forward. We have already tested and benchmarked the desktop variant flagship Core i7-6700K processor and also have a better understanding of the microarchitectural changes the new design brings forth. But today Intel's 6th Generation Core processors get a major reveal, with all the mobile and desktop CPU variants from 4.5 watts up to 91 watts, getting detailed specifications. Not only that, but it also marks the first day that vendors can announce and begin selling Skylake-based notebooks and systems!

All indications are that vendors like Dell, Lenovo and ASUS are still some weeks away from having any product available, but expect to see your feeds and favorite tech sites flooded with new product announcements. And of course with a new Apple event coming up soon...there should be Skylake in the new MacBooks this month.

Since I have already talked about the architecture and the performance changes from Haswell/Broadwell to Skylake in our 6700K story, today's release is just a bucket of specifications and information surrounding 46 different 6th Generation Skylake processors.

Intel's 6th Generation Core Processors

intel6th-6.jpg

At Intel's Developer Forum in August, the media learned quite a bit about the new 6th Generation Core processor family including Intel's stance on how Skylake changes the mobile landscape.

intel6th-7.jpg

Skylake is being broken up into 4 different lines of Intel processors: S-series for desktop DIY users, H-series for mobile gaming machines, U-series for your everyday Ultrabooks and all-in-ones, and Y-series for tablets and 2-in-1 detachables. (Side note: Intel does not reference an "Ultrabook" anymore. Huh.)

intel6th-8.jpg

As you would expect, Intel has some impressive gains to claim with the new 6th Generation processor. However, it is important to put them in context. All of the claims above, including 2.5x performance, 30x graphics improvement and 3x longer battery life, are comparing Skylake-based products to CPUs from 5 years ago. Specifically, Intel is comparing the new Core i5-6200U (a 15 watt part) against the Core i5-520UM (an 18 watt part) from mid-2010.

Continue reading our overview of the 46 new Intel Skylake 6th Generation Core processors!!