Author:
Manufacturer: Intel

An abundance of new processors

During its press conference at Computex 2017, Intel officially announced the upcoming release of an entirely new family of HEDT (high-end desktop) processors, along with a new chipset and platform to power it. Though it has only been a year since Intel launched the Core i7-6950X, a Broadwell-E processor with 10 cores and 20 threads, it feels like it has been much longer than that. At the time Intel was accused of “sitting” on the market, offering only slight performance upgrades and raising prices on the segment with a flagship CPU cost of $1700. With what can only be described as a scathing press circuit, coupled with a revived and aggressive competitor in AMD and its Ryzen product line, Intel and its executive teams have decided it’s time to take the enthusiast and high-end prosumer markets seriously once again.

slides-3.jpg

Though the company doesn’t want to admit to anything publicly, it seems obvious that Intel feels threatened by the release of the Ryzen 7 product line. The Ryzen 7 1800X was launched at $499 and offered 8 cores and 16 threads of processing, competing well in most tests against the likes of the Intel Core i7-6900K that sold for over $1000. Adding to the pressure was the announcement at AMD’s Financial Analyst Day that a new brand of processors called Threadripper would be coming this summer, offering up to 16 cores and 32 threads of processing for that same high-end consumer market. Even without pricing, clocks or availability timeframes, it was clear that AMD was going to come after this HEDT market with a brand shift of its EPYC server processors, just like Intel does with Xeon.

The New Processors

Normally I would jump into the new platform, technologies and features added to the processors, or something like that before giving you the goods on the CPU specifications, but that’s not the mood we are in. Instead, let’s start with the table of nine (9!!) new products and work backwards.

| | Core i9-7980XE | Core i9-7960X | Core i9-7940X | Core i9-7920X | Core i9-7900X | Core i7-7820X | Core i7-7800X | Core i7-7740X | Core i5-7640X |
|---|---|---|---|---|---|---|---|---|---|
| Architecture | Skylake-X | Skylake-X | Skylake-X | Skylake-X | Skylake-X | Skylake-X | Skylake-X | Kaby Lake-X | Kaby Lake-X |
| Process Tech | 14nm+ | 14nm+ | 14nm+ | 14nm+ | 14nm+ | 14nm+ | 14nm+ | 14nm+ | 14nm+ |
| Cores/Threads | 18/36 | 16/32 | 14/28 | 12/24 | 10/20 | 8/16 | 6/12 | 4/8 | 4/4 |
| Base Clock | ? | ? | ? | ? | 3.3 GHz | 3.6 GHz | 3.5 GHz | 4.3 GHz | 4.0 GHz |
| Turbo Boost 2.0 | ? | ? | ? | ? | 4.3 GHz | 4.3 GHz | 4.0 GHz | 4.5 GHz | 4.2 GHz |
| Turbo Boost Max 3.0 | ? | ? | ? | ? | 4.5 GHz | 4.5 GHz | N/A | N/A | N/A |
| Cache | 16.5MB (?) | 16.5MB (?) | 16.5MB (?) | 16.5MB (?) | 13.75MB | 11MB | 8.25MB | 8MB | 6MB |
| Memory Support | ? | ? | ? | ? | DDR4-2666 Quad Channel | DDR4-2666 Quad Channel | DDR4-2666 Quad Channel | DDR4-2666 Dual Channel | DDR4-2666 Dual Channel |
| PCIe Lanes | ? | ? | ? | ? | 44 | 28 | 28 | 16 | 16 |
| TDP | 165 watts (?) | 165 watts (?) | 165 watts (?) | 165 watts (?) | 140 watts | 140 watts | 140 watts | 112 watts | 112 watts |
| Socket | 2066 | 2066 | 2066 | 2066 | 2066 | 2066 | 2066 | 2066 | 2066 |
| Price | $1999 | $1699 | $1399 | $1199 | $999 | $599 | $389 | $339 | $242 |

There is a lot to take in here. The most interesting points are that Intel plans to one-up AMD Threadripper by offering an 18-core processor, but it also wants to change the perception of the X299-class platform by offering lower-priced, lower core count CPUs like the quad-core, non-HyperThreaded Core i5-7640X. We also see the first ever branding of Core i9.

Intel only provided detailed specifications up to the Core i9-7900X, a 10-core / 20-thread processor with a base clock of 3.3 GHz and a Turbo peak of 4.5 GHz using the new Turbo Boost Max Technology 3.0. It sports 13.75MB of cache thanks to an updated cache configuration, 44 lanes of PCIe 3.0 (an increase of 4 lanes over Broadwell-E), quad-channel DDR4 memory up to 2666 MHz and a 140 watt TDP. The new LGA2066 socket will be utilized. Pricing for this CPU is set at $999, which is interesting for a couple of reasons. First, it is $700 less than the starting MSRP of the 10c/20t Core i7-6950X from one year ago; obviously a big plus. However, there is still quite a way to go UP the stack, with the 18c/36t Core i9-7980XE coming in at a cool $1999.
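To put the pricing spread in perspective, here is a minimal Python sketch that divides the announced prices by the core counts from the specification table. The names `LINEUP` and `price_per_core` are mine, for illustration only, and price per core is a crude yardstick that ignores clocks, cache and PCIe lanes.

```python
# Announced prices (USD) and core counts, taken from the specification table.
# LINEUP and price_per_core are illustrative names, not anything from Intel.
LINEUP = {
    "Core i9-7980XE": (18, 1999),
    "Core i9-7960X": (16, 1699),
    "Core i9-7940X": (14, 1399),
    "Core i9-7920X": (12, 1199),
    "Core i9-7900X": (10, 999),
    "Core i7-7820X": (8, 599),
    "Core i7-7800X": (6, 389),
    "Core i7-7740X": (4, 339),
    "Core i5-7640X": (4, 242),
}


def price_per_core(cores: int, price_usd: int) -> float:
    """Return the price divided by the core count, rounded to cents."""
    return round(price_usd / cores, 2)


if __name__ == "__main__":
    for name, (cores, price) in LINEUP.items():
        per_core = price_per_core(cores, price)
        print(f"{name:15s} {cores:2d} cores  ${price:5d}  ${per_core:7.2f}/core")
```

By this rough measure the 10-core Core i9-7900X works out to about $100 per core, while the 18-core flagship lands at about $111 per core.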

intel1.jpg

The next CPU down the stack is compelling as well. The Core i7-7820X is the new 8-core / 16-thread HEDT option from Intel, with similar clock speeds to the 10-core above it, save for the higher base clock. It has 11MB of L3 cache and 28 lanes of PCI Express (4 more than Broadwell-E), and carries a $599 price tag. Compared to the 8-core 6900K, that is ~$400 lower, while the new Skylake-X part includes a 700 MHz clock speed advantage. That’s huge, and is a direct attack on the AMD Ryzen 7 1800X that sells for $499 today and cut Intel off at the knees this March. In fact, the base clock of the Core i7-7820X is only 100 MHz lower than the maximum Turbo Boost clock of the Core i7-6900K!

Continue reading about the Intel Core i9 series announcement!

Author:
Manufacturer: ARM

ARM Refreshes All the Things

This past April ARM invited us to visit Cambridge, England so they could discuss with us their plans for the next year.  Quite a bit has changed for the company since our last ARM Tech Day in 2016.  They were acquired by SoftBank, but continue to essentially operate as their own company.  They now have access to more funds, are less risk averse, and have a greater ability to expand in the ever-growing mobile and IoT marketplaces.

dynamiq_01.png

The ARM of today certainly is quite different from what we had known 10 years ago when we saw their technology used in the first iPhone.  The company back then had good technology, but a relatively small head count.  They kept pace with the industry, but were not nearly as aggressive as other chip companies in some areas.  Through the past 10 years they have grown not only in numbers, but in technologies that they have constantly expanded on.  The company became more PR savvy and communicated more effectively with the press and, in the end, with their primary users.  Where once ARM would announce new products and not expect to see shipping products for upwards of 3 years, we are now seeing the company be much more aggressive with their designs and getting them out to their partners so that production ends up happening in months as compared to years.

Several days of meetings and presentations left us a bit overwhelmed by what ARM is bringing to market towards the end of 2017 and most likely beginning of 2018.  On the surface it appears that ARM has only done a refresh of the CPU and GPU products, but once we start looking at these products in the greater scheme and how they interact with DynamIQ we see that ARM has changed the mobile computing landscape dramatically.  This new computing concept allows greater performance, flexibility, and efficiency in designs.  Partners will have far more control over these licensed products to create more value and differentiation as compared to years past.

dynamiq_02.png

We have previously covered DynamIQ at PCPer this past March.  ARM wanted to seed that concept before they jumped into more discussions on their latest CPUs and GPUs.  Previous Cortex products cannot be used with DynamIQ.  To leverage that technology we must have new CPU designs.  In this article we are covering the Cortex-A55 and Cortex-A75.  These two new CPUs on the surface look more like a refresh, but when we dig in we see that some massive changes have been wrought throughout.  ARM has taken the concepts of the previous A53 and A73 and expanded upon them fairly dramatically, not only to work with DynamIQ but also by removing significant bottlenecks that have impeded theoretical performance.

Continue reading our overview of the new family of ARM CPUs and GPU!

Author:
Subject: Processors
Manufacturer: Various

Application Profiling Tells the Story

It should come as no surprise to anyone who has been paying attention over the last two months that the latest AMD Ryzen processors and architecture are getting a lot of attention. Ryzen 7 launched with a $499 part that bested the Intel $1000 CPU at heavily threaded applications and Ryzen 5 launched with great value as well, positioning a 6-core/12-thread CPU against quad-core parts from the competition. But part of the story that permeated both the Ryzen 7 and the Ryzen 5 processor launches was the situation surrounding gaming performance, in particular 1080p gaming, and the surprising delta that we see in some games.

Our team has done quite a bit of research and testing on this topic. This included a detailed look at the first asserted reason for the performance gap, the Windows 10 scheduler. Our summary there was that the scheduler was working as expected and that minimal difference was seen when moving between different power modes. We also talked directly with AMD to find out its then-current stance on the results, which backed up our claims on the scheduler and presented a better outlook for gaming going forward. When AMD wanted to test a new custom Windows 10 power profile to help improve performance in some cases, we took part in that too. In late March we saw the first gaming performance update occur courtesy of Ashes of the Singularity: Escalation, where an engine update to utilize more threads resulted in as much as a 31% increase in average frame rate.

ping-amd.png

As a part of that dissection of the Windows 10 scheduler story, we also discovered interesting data about the CCX construction and how the two modules on the 1800X communicated. The result was significantly longer thread to thread latencies than we had seen in any platform before and it was because of the fabric implementation that AMD integrated with the Zen architecture.

This has led me down another hole recently, wondering if we could further compartmentalize the gaming performance of the Ryzen processors using memory latency. As I showed in my Ryzen 5 review, memory frequency and throughput directly correlate to gaming performance improvements, on the order of 14% in some cases. But what about looking at memory latency alone?

Continue reading our analysis of memory latency, 1080p gaming, and how it impacts Ryzen!!

Manufacturer: EKWB

Introduction and Technical Specifications

Introduction

02-block-with-mount-profile.jpg

Courtesy of EKWB

EK's Supremacy line of CPU waterblocks is well known for its performance and style. The latest version in this block line, the Supremacy MX, advances the design in the hopes of getting more optimized performance out of a less costly version of their award-winning block series. The base Supremacy MX CPU waterblock is a copper and plexi construction using the same jet-impingement and micro-channel design as that used in their previous block versions. The block comes fully assembled from the factory with a single CPU mounting bracket type (in this case, the Intel version). Note that additional CPU mounting kits are available for purchase. With an MSRP of $54.99, the Supremacy MX waterblock offers a compelling purchase in light of its performance potential.

03-block-closeup.jpg

Courtesy of EKWB

04-block-flyapart.jpg

06-block-mounted-lit.jpg

Courtesy of EKWB

The block is assembled with hex-head screws going through the copper base plate with rubber grommets ensuring the integrity of the block internals. The top aluminum cover plate is held to the plexi top using short hex-head screws that thread directly into the plexi top plate. The center inlet feeds the micro-channels embedded in the copper base plate through the jet-impingement assembly. The mounting bracket sits in between the top plexi plate and the copper base plate, making for an interesting upgrade if you want to switch out the CPU mount plate to use the block on a different CPU family (like going from Intel to AMD Ryzen for example). The aluminum top plate gives the block a sleek appearance and acts to redirect illumination from the side mounted LEDs (if you choose to use LEDs with the block that is).

Continue reading our review of the EK Supremacy MX CPU waterblock!

Author:
Subject: Processors
Manufacturer: AMD

The real battle begins

When AMD launched the Ryzen 7 processors last month to a substantial amount of fanfare and pent up excitement, we already knew that the Ryzen 5 launch would be following close behind. While the Ryzen 7 lineup was meant to compete with the Intel Core i7 Kaby Lake and Broadwell-E products, with varying levels of success, the Ryzen 5 parts are priced to go head to head with Intel's Core i5 product line. 

AMD already told us the details of the new product line including clock speeds, core counts and pricing, so there is little more to talk about other than the performance and capabilities we found from our testing of the new Ryzen 5 parts. Starting with the Ryzen 5 1600X, with 6 cores, 12 threads and a $249 price point, and going down to the Ryzen 5 1400 with 4 cores, 8 threads and a $169 price point, this is easily AMD's most aggressive move to date. The Ryzen 7 1800X at $499 was meant to choke off purchases of Intel's $1000+ parts; Ryzen 5 is attempting to offer significant value and advantage for users on a budget.

Today we have the Ryzen 5 1600X and Ryzen 5 1500X in our hands. The 1600X is a 6C/12T processor that will have a 50% core count advantage over the Core i5-7600K it is priced against but a 3x advantage in thread count because of Intel's disabling of HyperThreading on Core i5 desktop processors. The Ryzen 5 1500X has the same number of cores as the Core i5-7500 it will be pitted against, but 2x the thread count. 
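The core and thread advantages quoted above are simple ratios, and a small Python sketch makes the arithmetic explicit. The Ryzen figures come from the text; the Core i5 figures (4 cores, 4 threads) are what the stated 50% and 3x advantages imply. The names `PARTS` and `advantage` are illustrative.

```python
from fractions import Fraction

# (cores, threads) per part. Ryzen figures are stated in the text; the Core i5
# figures (4 cores, 4 threads, no HyperThreading) are implied by the 50% and
# 3x advantages quoted above.
PARTS = {
    "Ryzen 5 1600X": (6, 12),
    "Ryzen 5 1500X": (4, 8),
    "Core i5-7600K": (4, 4),
    "Core i5-7500": (4, 4),
}


def advantage(part: str, rival: str) -> tuple:
    """Return (core ratio, thread ratio) of `part` relative to `rival`."""
    cores, threads = PARTS[part]
    rival_cores, rival_threads = PARTS[rival]
    return Fraction(cores, rival_cores), Fraction(threads, rival_threads)


if __name__ == "__main__":
    matchups = [
        ("Ryzen 5 1600X", "Core i5-7600K"),
        ("Ryzen 5 1500X", "Core i5-7500"),
    ]
    for part, rival in matchups:
        core_ratio, thread_ratio = advantage(part, rival)
        print(f"{part} vs {rival}: "
              f"{float(core_ratio):.2f}x cores, {float(thread_ratio):.2f}x threads")
```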

01.jpg

How does this fare for AMD? Will budget consumers finally find a solution from the company that has no caveats?

Continue reading our review of the AMD Ryzen 5 1600X and 1500X processors!!

Author:
Subject: Processors
Manufacturer: AMD

Tweaks for days

It seems like it’s been months since AMD launched Ryzen, its first new processor architecture in about a decade, when in fact we are only four weeks removed. One of the few concerns about the Ryzen processors centered on their performance in some gaming tests, particularly at common resolutions like 1080p. While I was far from the only person to notice these concerns, our gaming tests clearly showed a gap between the Ryzen 7 1800X and the Intel Core i7-7700K and 6900K processors in Civilization 6, Hitman and Rise of the Tomb Raider.

hitman.png

A graph from our Ryzen launch coverage...

We had been working with AMD for a couple of weeks on the Ryzen launch and fed back our results with questions in the week before launch. On March 2nd, AMD’s CVP of Marketing John Taylor gave us a prepared statement that acknowledged the issue but promised that changes would come in the form of game engine updates. These software updates would need to be implemented by the game developers themselves in order to take advantage of the unique and more complex core designs of the Zen architecture. We had quotes from the developers of Ashes of the Singularity as well as the Total War series to back it up.

And while statements promising change are nice, it really takes some proof to get the often skeptical tech media and tech enthusiasts to believe that change can actually happen. Today AMD is showing its first result.

The result of 400 developer hours of work, the Nitrous Engine powering Ashes of the Singularity received an update today to version 26118 that integrates updates to threading to better balance the performance across Ryzen 7’s 8 cores and 16 threads. I was able to do some early testing on the new revision, as well as with the previous retail shipping version (25624) to see what kind of improvements the patch brings with it.

Stardock / Oxide CEO Brad Wardell had this to say in a press release:

“I’ve always been vocal about taking advantage of every ounce of performance the PC has to offer. That’s why I’m a strong proponent of DirectX 12 and Vulkan® because of the way these APIs allow us to access multiple CPU cores, and that’s why the AMD Ryzen processor has so much potential,” said Stardock and Oxide CEO Brad Wardell. “As good as AMD Ryzen is right now – and it’s remarkably fast – we’ve already seen that we can tweak games like Ashes of the Singularity to take even more advantage of its impressive core count and processing power. AMD Ryzen brings resources to the table that will change what people will come to expect from a PC gaming experience.”

Our testing setup is in line with our previous CPU performance stories.

Test System Setup
CPU: AMD Ryzen 7 1800X, Intel Core i7-6900K
Motherboard: ASUS Crosshair VI Hero (Ryzen), ASUS X99-Deluxe II (Broadwell-E)
Memory: 16GB DDR4-2400
Storage: Corsair Force GS 240 SSD
Sound Card: On-board
Graphics Card: NVIDIA GeForce GTX 1080 8GB
Graphics Drivers: NVIDIA 378.49
Power Supply: Corsair HX1000
Operating System: Windows 10 Pro x64

I was using the latest BIOS for our ASUS Crosshair VI Hero motherboard (1002) and upgraded to some Geil RGB (!!) memory capable of running at 3200 MHz on this board with a single BIOS setting adjustment. All of my tests were done at 1080p in order to return to the pain point that AMD was dealing with on launch day.

Let’s see the results.

ashes-1.png

ashes-2.png

These are substantial performance improvements with the new engine code! At both 2400 MHz and 3200 MHz memory speeds, and at both High and Extreme presets in the game (all running in DX12 for what that’s worth), the gaming performance on the GPU-centric benchmark is improved. At the High preset (which is the setting that AMD used in its performance data for the press release), we see a 31% jump in performance when running at the higher memory speed and a 22% improvement with the lower speed memory. Even when running in the more GPU-bottlenecked state of the Extreme preset, the performance improvement for the Ryzen processors with the latest Ashes patch is 17-20%!

DSC02636.jpg

It’s also important to note that Intel performance is unaffected, either for better or worse. Whatever work Oxide did to improve the engine for AMD’s Ryzen processors had NO impact on the Core processors, which is interesting to say the least. The cynic in me would expect any truly agnostic changes to the code to raise Intel’s multi-core performance at least a little bit.

So what exactly is happening to the engine with v26118? I haven’t had a chance to have an in-depth conversation with anyone at AMD or Oxide yet on the subject, but at a high level, I was told that this is what happens when instructions and sequences are analyzed for an architecture specifically. “For basically 5 years”, I was told, Oxide and other developers have dedicated their time to “instruction traces and analysis to maximize Intel performance” which helps to eliminate poor instruction setup. After spending some time with Ryzen and the necessary debug tools (and some AMD engineers), they were able to improve performance on Ryzen without adversely affecting Intel parts.

ping-amd.png

Core to core latency testing on Ryzen 7 1800X

I am hoping to get more specific detail in the coming days, but it would seem very likely that Oxide was able to properly handle the more complex core to core communication systems on Ryzen and its CCX implementation. We demonstrated earlier this month how thread to thread communication across core complexes causes substantial latency penalties, and that a developer that intelligently manages threads that have dependencies on the core complex can improve overall performance. I would expect this is at least part of the solution Oxide was able to integrate (and would also explain why Intel parts are unaffected).
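To illustrate the general idea of keeping dependent threads on one core complex, here is a minimal Python sketch. It is not Oxide's method, which has not been disclosed. The topology constants assume a Ryzen 7 1800X with two CCXs of four cores each and two SMT threads per core, and the logical CPU numbering (0-7 on the first CCX, 8-15 on the second) is an assumption for illustration. The function names and thread names are hypothetical.

```python
from typing import Dict, List

# Assumed topology for the Ryzen 7 1800X: 2 CCXs x 4 cores x 2 SMT threads.
# The numbering (logical CPUs 0-7 on CCX 0, 8-15 on CCX 1) is an assumption;
# real software should query the operating system for the actual layout.
CCX_COUNT = 2
CORES_PER_CCX = 4
THREADS_PER_CORE = 2
LOGICAL_PER_CCX = CORES_PER_CCX * THREADS_PER_CORE


def ccx_of(logical_cpu: int) -> int:
    """Return the index of the CCX that owns a logical CPU."""
    if not 0 <= logical_cpu < CCX_COUNT * LOGICAL_PER_CCX:
        raise ValueError(f"logical CPU {logical_cpu} is out of range")
    return logical_cpu // LOGICAL_PER_CCX


def crosses_ccx(cpu_a: int, cpu_b: int) -> bool:
    """True if two logical CPUs sit on different CCXs (the slow path)."""
    return ccx_of(cpu_a) != ccx_of(cpu_b)


def place_groups(groups: List[List[str]]) -> Dict[str, int]:
    """Greedily place each group of dependent threads inside a single CCX.

    Larger groups are placed first, each on the CCX with the most free
    logical CPUs. Returns a mapping of thread name to logical CPU.
    """
    free = {
        ccx: list(range(ccx * LOGICAL_PER_CCX, (ccx + 1) * LOGICAL_PER_CCX))
        for ccx in range(CCX_COUNT)
    }
    placement: Dict[str, int] = {}
    for group in sorted(groups, key=len, reverse=True):
        target = max(free, key=lambda ccx: len(free[ccx]))
        if len(group) > len(free[target]):
            raise ValueError("group does not fit within a single CCX")
        for thread in group:
            placement[thread] = free[target].pop(0)
    return placement


if __name__ == "__main__":
    # Hypothetical thread groups; threads in a group share data heavily.
    plan = place_groups([["render_a", "render_b", "render_c"], ["ai_a", "ai_b"]])
    for thread, cpu in plan.items():
        print(f"{thread}: logical CPU {cpu} (CCX {ccx_of(cpu)})")
```

The resulting CPU numbers would then be handed to whatever affinity mechanism the platform provides; on Linux, for example, `os.sched_setaffinity` accepts a set of CPU numbers.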

What is important now is that AMD takes this momentum with Ashes of the Singularity and actually does something with it. Many of you will recognize Ashes as the flagship title for Mantle when AMD made that move to change the programming habits and models for developers, and though Mantle would eventually become Vulkan and drive DX12 development, it did not foretell an overall shift as it hoped to. Can AMD and its developer relations team continue to make the case that spending time and money (which is what 400 developer hours equates to) to make specific performance enhancements for Ryzen processors is in the best interest of everyone? We’ll soon find out.

Author:
Subject: Processors, Mobile
Manufacturer: Qualcomm

A new start

Qualcomm is finally ready to show the world how the Snapdragon 835 Mobile Platform performs. After months of teases and previews, including the reveal that it was the first processor built on Samsung’s 10nm process technology and a mostly in-depth look at the architectural changes to the CPU and GPU portions of the SoC, the company let a handful of media get some hands-on time with a development reference platform and run some numbers.

To frame the discussion as best I can, I am going to include some sections from my technology overview. This should give some idea of what to expect from Snapdragon 835 and what areas Qualcomm sees providing the widest variation from the previous SD 820/821 products.

Qualcomm frames the story around the Snapdragon 835 processor with what they call the “five pillars” – five different aspects of mobile processor design that they have addressed with updates and technologies. Qualcomm lists them as battery life (efficiency), immersion (performance), capture, connectivity, and security.

slides1-6.jpg

Starting where they start, on battery life and efficiency, the SD 835 has a unique focus that might surprise many. Rather than talking up the improvements in performance of the new processor cores, or the power of the new Adreno GPU, Qualcomm is firmly planted on looking at Snapdragon through the lens of battery life. Snapdragon 835 uses half of the power of Snapdragon 801.

slides2-11.jpg

Since we already knew that the Snapdragon 835 was going to be built on the 10nm process from Samsung, the first such high performance part to do so, I was surprised to learn that Qualcomm doesn’t attribute much of the power efficiency improvements to the move from 14nm to 10nm. It makes sense – most in the industry see this transition as modest in comparison to what we’ll see at 7nm. Unlike the move from 28nm to 14/16nm for discrete GPUs, where the process technology was a huge reason for the dramatic power drop we saw, the Snapdragon 835 changes come from a combination of advancements in the power management system and offloading of work from the primary CPU cores to other processors like the GPU and DSP. The more a workload takes advantage of heterogeneous computing systems, the more it benefits from Qualcomm technology as opposed to process technology.

slides2-22.jpg

Continue reading our preview of Qualcomm Snapdragon 835 performance!

Author:
Subject: Processors
Manufacturer: AMD

Here Comes the Midrange!

Today AMD is announcing the upcoming Ryzen 5 CPUs.  A little bit was known about them from several weeks ago when AMD talked about their upcoming 6 core processors, but official specifications were lacking.  Today we get to see what Ryzen 5 is mostly about.

ryzen5_01.png

There are four initial SKUs that AMD is talking about this evening.  These encompass quad core and six core products.  There are two “enthusiast” level SKUs with the X designation while the other two are aimed at a less edgy crowd.

The two six core CPUs are the 1600 and 1600X.  The X version features the higher extended frequency range when combined with performance cooling.  That unit is clocked at a base 3.6 GHz and achieves a boost of 4 GHz.  This compares well to the top end R7 1800X, but it is short two cores and four threads.  The price of the R5 1600X is a very reasonable $249.  The 1600 does not feature the extended range, but it does come in at a 3.2 GHz base and 3.6 GHz boost.  The R5 1600 has an MSRP of $219.

ryzen5_04.png

When we get to the four core, eight thread units we see much the same stratification.  The top end 1500X comes in at $189 and features a base clock of 3.5 GHz and a boost of 3.7 GHz.  What is interesting about this model is that the XFR is raised by 100 MHz vs. other XFR CPUs.  So instead of an extra 100 MHz boost when high end cooling is present we can expect to see 200 MHz.  In theory this could run at 3.9 GHz in the extended state.  The lowest priced R5 is the 1400 which comes in at a very modest $169.  This features a 3.2 GHz base clock and a 3.4 GHz boost.
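The XFR headroom math is easy to sketch in Python. The boost clocks come from the text, and the 100 MHz headroom for the 1600X is inferred from the remark about "other XFR CPUs"; AMD has not confirmed sustained clocks, so treat the output as a theoretical ceiling. The names `XFR_PARTS` and `max_xfr_clock_ghz` are illustrative.

```python
# Boost clocks (MHz) from the text. XFR headroom follows the article's
# description: 100 MHz for other XFR CPUs (assumed to include the 1600X)
# and 200 MHz for the 1500X.
XFR_PARTS = {
    "Ryzen 5 1600X": {"boost_mhz": 4000, "xfr_mhz": 100},
    "Ryzen 5 1500X": {"boost_mhz": 3700, "xfr_mhz": 200},
}


def max_xfr_clock_ghz(boost_mhz: int, xfr_mhz: int) -> float:
    """Theoretical peak clock in GHz: rated boost plus XFR headroom."""
    return (boost_mhz + xfr_mhz) / 1000


if __name__ == "__main__":
    for name, spec in XFR_PARTS.items():
        peak = max_xfr_clock_ghz(spec["boost_mhz"], spec["xfr_mhz"])
        print(f"{name}: up to {peak:.1f} GHz with XFR")
```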

The 1400, 1500X, and 1600 CPUs come with Wraith cooling solutions.  The 1600X comes bare as it is assumed that users want to use something a bit more robust.  The R5 1400 comes with the lower end Wraith Stealth cooler while the R5 1500X and R5 1600 come with the bigger Wraith Spire.  The bottom 3 SKUs are all rated at 65 watts TDP.  The 1600X comes in at the higher 95 watt rating.  Each of the CPUs is unlocked for overclocking.

ryzen5_03.png

These chips will provide a more fleshed out pricing structure for the Ryzen processors and provide users and enthusiasts with lower cost options for those wanting to invest in AMD again.  These chips all run on the new AM4 platform, which is pretty strong in terms of features and I/O performance.

ryzen5_02.png

AMD is not shipping these parts today, but rather announcing them.  Review samples are not in hand yet and AMD expects world-wide availability by April 11.  This is likely a very necessary step for AMD as current AM4 motherboard availability is not at the level we were expecting to see.  We also are seeing some pretty quick firmware updates from motherboard partners to address issues with these first AM4 boards.  By April 11 I would expect to see most of the issues solved and a healthy supply of motherboards on the shelves to handle the influx of consumers waiting to buy these more midrange priced CPUs from AMD.

What AMD did not cover is how the four core products would be presented.  Would each be a single CCX with only 8 MB of L3 cache, or would AMD disable two cores in each CCX and present 16 MB of L3?  We currently do not have the answer to this.  Considering the latency between accessing different CCX units we can surely hope they only keep one CCX active.

ryzen5_05.png

Ryzen has certainly been a success for AMD and I have no doubt that their quarter will be pretty healthy with the estimated sales of around 1 million Ryzen CPUs since launch.  Announcing these new chips will give the mainstream and budget enthusiasts something to look forward to and plan their purchases around.  AMD is not announcing the Ryzen 3 products at this time.

Update: AMD got back to me this morning about a question I asked them about the makeup of cores, CCX units, and L3 cache.  Here is their response.

  • 1600X: 3+3 with 16MB L3 cache.
  • 1600: 3+3 with 16MB L3 cache.
  • 1500X: 2+2 with 16MB L3 cache.
  • 1400: 2+2 with 8MB L3 cache.

As with Ryzen 7, each core still has 512KB local L2 cache.
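AMD's response can be captured as data to see what it means per core. The Python sketch below uses only the figures in the response (active cores per CCX, total L3, and 512KB of L2 per core); the type and function names are illustrative.

```python
from typing import NamedTuple, Tuple


class Ryzen5Config(NamedTuple):
    cores_per_ccx: Tuple[int, int]  # active cores on each of the two CCXs
    l3_mb: int                      # total L3 cache in MB


# Figures exactly as given in AMD's response.
CONFIGS = {
    "1600X": Ryzen5Config((3, 3), 16),
    "1600": Ryzen5Config((3, 3), 16),
    "1500X": Ryzen5Config((2, 2), 16),
    "1400": Ryzen5Config((2, 2), 8),
}

L2_PER_CORE_KB = 512


def total_cores(cfg: Ryzen5Config) -> int:
    """Sum the active cores across both CCXs."""
    return sum(cfg.cores_per_ccx)


def l3_per_core_mb(cfg: Ryzen5Config) -> float:
    """Total L3 divided evenly by the number of active cores."""
    return cfg.l3_mb / total_cores(cfg)


def l2_total_kb(cfg: Ryzen5Config) -> int:
    """Combined L2 across all active cores."""
    return L2_PER_CORE_KB * total_cores(cfg)


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(f"R5 {name}: {total_cores(cfg)} cores, "
              f"{l3_per_core_mb(cfg):.2f} MB L3/core, {l2_total_kb(cfg)} KB L2 total")
```

Note that every Ryzen 5 part keeps both CCXs active, so the cross-CCX latency discussed above applies to the quad-core models as well.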

Manufacturer: RockIt Cool

Introduction

With the introduction of the Intel Kaby Lake processors and Intel Z270 chipset, unprecedented overclocking became the norm. The new processors easily hit a core speed of 5.0GHz with little more than CPU core voltage tweaking. This overclocking performance increase came with a price tag. The Kaby Lake processor runs significantly hotter than previous generation processors, a seeming reversal in temperature trends from previous generation Intel CPUs. At stock settings, the individual cores in the CPU were recorded in testing hitting up to 65C, and that's with a high performance water loop cooling the processor. Per reports from various enthusiast sites, Intel used inferior TIM (thermal interface material) in between the CPU die and underside of the CPU heat spreader, leading to increased temperatures when compared with previous CPU generations (in particular Skylake). This temperature increase did not affect overclocking much since the CPU will hit 5.0GHz speed easily, but does impact the means necessary to hit those performance levels.

As with the previous generation Haswell CPUs, a few of the more adventurous enthusiasts used known methods in an attempt to address the heat concerns of the Kaby Lake processor by delidding the processor. Unlike in the initial days of the Haswell processor, the delidding process is much more streamlined with the availability of delidding kits from several vendors. The delidding process still involves physically removing the heat spreader from the CPU, and exposing the CPU die. However, instead of cooling the die directly, the "safer" approach is to clean the die and underside of the heat spreader, apply new TIM (thermal interface material), and re-affix the heat spreader to the CPU. Going this route instead of direct-die cooling is considered safer because no additional or exotic support mechanisms are needed to keep the CPU cooler from crushing your precious die. However, calling it safe is a bit of an overstatement, since you are physically separating the heat spreader from the CPU surface and voiding your CPU warranty at the same time. Although if that was a concern, you probably wouldn't be reading this article in the first place.

Continue reading our Kaby Lake Relidding article!

Subject: Processors
Manufacturer: AMD

** UPDATE 3/13 5 PM **

AMD has posted a follow-up statement that officially clears up much of the conjecture this article was attempting to clarify. Relevant points from their post that relate to this article as well as many of the requests for additional testing we have seen since its posting (emphasis mine):

  • "We have investigated reports alleging incorrect thread scheduling on the AMD Ryzen™ processor. Based on our findings, AMD believes that the Windows® 10 thread scheduler is operating properly for “Zen,” and we do not presently believe there is an issue with the scheduler adversely utilizing the logical and physical configurations of the architecture."

  • "Finally, we have reviewed the limited available evidence concerning performance deltas between Windows® 7 and Windows® 10 on the AMD Ryzen™ CPU. We do not believe there is an issue with scheduling differences between the two versions of Windows.  Any differences in performance can be more likely attributed to software architecture differences between these OSes."

So there you have it, straight from the horse's mouth. AMD does not believe the problem lies within the Windows thread scheduler. SMT performance in gaming workloads was also addressed:

  • "Finally, we have investigated reports of instances where SMT is producing reduced performance in a handful of games. Based on our characterization of game workloads, it is our expectation that gaming applications should generally see a neutral/positive benefit from SMT. We see this neutral/positive behavior in a wide range of titles, including: Arma® 3, Battlefield™ 1, Mafia™ III, Watch Dogs™ 2, Sid Meier’s Civilization® VI, For Honor™, Hitman™, Mirror’s Edge™ Catalyst and The Division™. Independent 3rd-party analyses have corroborated these findings.

    For the remaining outliers, AMD again sees multiple opportunities within the codebases of specific applications to improve how this software addresses the “Zen” architecture. We have already identified some simple changes that can improve a game’s understanding of the "Zen" core/cache topology, and we intend to provide a status update to the community when they are ready."

We are still digging into the observed differences of toggling SMT compared with disabling the second CCX, but it is good to see AMD issue a clarifying statement here for all of those out there observing and reporting on SMT-related performance deltas.

** END UPDATE **

Editor's Note: The testing you see here was a response to many days of comments and questions to our team on how and why AMD Ryzen processors are seeing performance gaps in 1080p gaming (and other scenarios) in comparison to Intel Core processors. Several outlets have posted that the culprit is the Windows 10 scheduler and its inability to properly allocate work across the logical vs. physical cores of the Zen architecture. As it turns out, we can prove that isn't the case at all. -Ryan Shrout

Initial reviews of AMD’s Ryzen CPU revealed a few inefficiencies in some situations, particularly in gaming workloads running at the more common resolutions like 1080p, where the CPU becomes more of a bottleneck when coupled with modern GPUs. Lots of folks have theorized about what could possibly be causing these issues, and the most recent attention appears to have been directed at the Windows 10 scheduler and its supposed inability to properly place threads on the Ryzen cores for the most efficient processing.

I typically have Task Manager open while running storage tests (they are boring to watch otherwise), and I naturally had it open during Ryzen platform storage testing. I’m accustomed to how the IO workers are distributed across reported threads, and in the case of SMT capable CPUs, distributed across cores. There is a clear difference when viewing our custom storage workloads with SMT on vs. off, and it was dead obvious to me that core loading was working as expected while I was testing Ryzen. I went back and pulled the actual thread/core loading data from my testing results to confirm:

SMT on usage.png

The Windows scheduler has a habit of bouncing processes across available processor threads. This naturally happens as other processes share time with a particular core, with the heavier process not necessarily switching back to the same core. As you can see above, the single IO handler thread was spread across the first four cores during its run, but the Windows scheduler was always hitting just one of the two available SMT threads on any single core at one time.

My testing for Ryan’s Ryzen review consisted of only single threaded workloads, but we can make things a bit clearer by loading down half of the CPU while toggling SMT off. We do this by increasing the worker count (4) to be half of the available threads on the Ryzen processor, which is 8 with SMT disabled in the motherboard BIOS.

smtoff4workers.png

SMT OFF, 8 cores, 4 workers

With SMT off, the scheduler is clearly not giving priority to any particular core and the work is spread throughout the physical cores in a fairly even fashion.

Now let’s try with SMT turned back on and doubling the number of IO workers to 8 to keep the CPU half loaded:

smton8workers.png

SMT ON, 16 (logical) cores, 8 workers

With SMT on, we see a very different result. The scheduler is clearly loading only one thread per core. This could only be possible if Windows was aware of the 2-way SMT (two threads per core) configuration of the Ryzen processor. Do note that sometimes the workload will toggle between the two threads of a core every few seconds, but the total loading on each physical core will still remain at ~50%. I chose a workload that saturated its thread just enough for Windows to not shift it around as it ran, making the above result even clearer.
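The half-load observation above can be checked numerically. The following is a minimal Python sketch, not the tool used to produce the screenshots. It assumes logical cores are enumerated so that siblings 2k and 2k+1 share physical core k, and the utilization snapshot is a made-up illustrative sample rather than measured data.

```python
def physical_core_load(logical_util, threads_per_core=2):
    """Average sibling logical-core utilization (percent) into per-physical-core load."""
    if len(logical_util) % threads_per_core:
        raise ValueError("logical core count must be a multiple of threads_per_core")
    return [
        sum(logical_util[i:i + threads_per_core]) / threads_per_core
        for i in range(0, len(logical_util), threads_per_core)
    ]


def busy_siblings(logical_util, threads_per_core=2, busy_above=50.0):
    """Count how many SMT siblings on each physical core are busier than busy_above."""
    return [
        sum(1 for u in logical_util[i:i + threads_per_core] if u > busy_above)
        for i in range(0, len(logical_util), threads_per_core)
    ]


# Hypothetical snapshot: 8 workers on 16 logical cores, one busy thread per core.
sample = [100.0, 0.0, 0.0, 100.0] * 4
print(physical_core_load(sample))  # every physical core averages 50.0
print(busy_siblings(sample))       # one busy sibling per physical core
```

If the scheduler were unaware of SMT, some entries in the second list would be 2 while others would be 0, even though the overall CPU load would look the same.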

Synthetic Testing Procedure

While the storage testing methods above provide a real-world example of the Windows 10 scheduler working as expected, we do have another workload that can help demonstrate core balancing with Intel Core and AMD Ryzen processors. A quick and simple custom-built C++ application can be used to generate generic worker threads and monitor for core collisions and resolutions.

This test app has a very straightforward workflow. Every few seconds it generates a new thread, capping at N/2 threads total, where N is equal to the reported number of logical cores. If the OS scheduler is working as expected, on a 16-thread processor it should load 8 threads across 8 physical cores, though which logical core is used within each physical core will be based on very minute parameters and conditions going on in the OS background.

By monitoring the APIC_ID through the CPUID instruction, the first application thread monitors all threads and detects and reports on collisions - when a thread from our app is running on the same core as another thread from our app. That thread also reports when those collisions have been cleared. In an ideal and expected environment where Windows 10 knows the boundaries of physical and logical cores, you should never see more than one thread of a core loaded at the same time.
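The collision check described above can be sketched in a few lines. This is a Python illustration of the idea, not the C++ application itself. It assumes each thread has already reported a logical core number, and that the physical core is that number divided by two (an assumption for 2-way SMT; the real app derives placement from the APIC_ID). The function names are hypothetical.

```python
from collections import defaultdict


def worker_count(logical_cores):
    """Cap the number of worker threads at N/2, as the test app does."""
    return logical_cores // 2


def find_collisions(thread_to_logical, threads_per_core=2):
    """Return {physical_core: [thread ids]} for cores hosting more than one app thread."""
    by_core = defaultdict(list)
    for thread_id in sorted(thread_to_logical):
        # Assumed mapping: sibling logical cores 2k and 2k+1 share physical core k.
        core = thread_to_logical[thread_id] // threads_per_core
        by_core[core].append(thread_id)
    return {core: ids for core, ids in by_core.items() if len(ids) > 1}


# Hypothetical placement: threads 2 and 3 sit on logical cores 5 and 4,
# which are siblings on physical core 2, so they collide.
print(find_collisions({0: 0, 1: 2, 2: 5, 3: 4}))
```

A collision is considered cleared once a later sample of the same mapping returns an empty dictionary.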

app01.png

This screenshot shows our app working on the left and the Windows Task Manager on the right with logical cores labeled. While it may look like all logical cores are being utilized at the same time, in fact they are not. At any given point, only one of LCore 0 or LCore 1 is actively processing a thread. Need proof? Check out the modified view of the task manager where I copy the graph of LCore 1/5/9/13 over the graph of LCore 0/4/8/12 with inverted colors to aid viewability.

app02-2.png

If you look closely, by overlapping the graphs in this way, you can see that the threads migrate from LCore 0 to LCore 1, LCore 4 to LCore 5, and so on. The graphs intersect and fill in to consume ~100% of the physical core. This pattern is repeated for the other 8 logical cores on the right two columns as well. 

Running the same application on a Core i7-5960X Haswell-E 8-core processor shows a very similar behavior.

app03.png

Each pair of logical cores shares a single thread and when thread transitions occur away from LCore N, they migrate perfectly to LCore N+1. It does appear that in this scenario the Intel system is showing a more stable threaded distribution than the Ryzen system. While that may in fact incur some performance advantage for the 5960X configuration, the penalty for intra-core thread migration is expected to be very minute.

The fact that Windows 10 is balancing the 8 thread load specifically between matching logical core pairs indicates that the operating system is perfectly aware of the processor topology and is selecting distinct cores first to complete the work.

Information from this custom application, along with the storage performance tool example above, clearly shows that Windows 10 is attempting to balance work on Ryzen between cores in the same manner that we have experienced with Intel and its HyperThreaded processors for many years.

Continue reading our look at AMD Ryzen and Windows 10 scheduling!

Author:
Subject: Processors
Manufacturer: AMD

The right angle

While many in the media and enthusiast communities are still trying to fully grasp the importance and impact of the recent AMD Ryzen 7 processor release, I have been trying to complete my review of the 1700X and 1700 processors, in between testing the upcoming GeForce GTX 1080 Ti and preparing for more hardware to show up at the offices very soon. There is still much to learn and understand about the first new architecture from AMD in nearly a decade, including analysis of the memory hierarchy, power consumption, overclocking, gaming performance, etc.

During my Ryzen 7 1700 testing, I went through some overclocking evaluation and thought the results might be worth sharing sooner rather than later. This quick article is just a preview of what we are working on, so don’t expect to find the answers to Ryzen power management here, only a recounting of how I was able to get stellar performance from the lowest priced Ryzen part on the market today.

The system specifications for this overclocking test were identical to our original Ryzen 7 processor review.

Test System Setup

| Component | Configuration |
| --- | --- |
| CPU | AMD Ryzen 7 1800X, AMD Ryzen 7 1700X, AMD Ryzen 7 1700, Intel Core i7-7700K, Intel Core i5-7600K, Intel Core i7-6700K, Intel Core i7-6950X, Intel Core i7-6900K, Intel Core i7-6800K |
| Motherboard | ASUS Crosshair VI Hero (Ryzen), ASUS Prime Z270-A (Kaby Lake, Skylake), ASUS X99-Deluxe II (Broadwell-E) |
| Memory | 16GB DDR4-2400 |
| Storage | Corsair Force GS 240 SSD |
| Sound Card | On-board |
| Graphics Card | NVIDIA GeForce GTX 1080 8GB |
| Graphics Drivers | NVIDIA 378.49 |
| Power Supply | Corsair HX1000 |
| Operating System | Windows 10 Pro x64 |

Of note is that I am still utilizing the Noctua U12S cooler that AMD provided for our initial testing – all of the overclocking and temperature reporting in this story is air cooled.

DSC02643.jpg

First, let’s start with the motherboard. All of this testing was done on the ASUS Crosshair VI Hero with the latest 5704 BIOS installed. As I began to discover the different overclocking capabilities (BCLK adjustment, multipliers, voltage) I came across one of the ASUS presets. These presets offer pre-defined collections of settings that ASUS feels will offer simple overclocking capabilities. An option for higher BCLK existed, but the one that caught my eye was straightforward: 4.0 GHz.

asusbios.jpg

With the Ryzen 1700 installed, I thought I would give it a shot. Keep in mind that this processor has a base clock of 3.0 GHz, a rated maximum boost clock of 3.7 GHz, and is the only 65-watt TDP variant of the three Ryzen 7 processors released last week. Because of that, I didn’t expect the overclocking capability for it to match what the 1700X and 1800X could offer. Based on previous processor experience, when a chip is binned at a lower power draw than the rest of a family it will often have properties that make it disadvantageous for running at HIGHER power. Based on my results here, that doesn’t seem to be the case.

4.0.PNG

By simply enabling that option in the ASUS UEFI and rebooting, our Ryzen 1700 processor was running at 4.0 GHz on all cores! For this piece, I won’t be going into the drudge and debate on what settings ASUS changed to get to this setting or if the voltages are overly aggressive – the point is that it just works out of the box.

Continue reading our look at overclocking the new Ryzen 7 1700 processor!

Author:
Subject: Processors
Manufacturer: AMD

AMD Ryzen 7 Processor Specifications

It’s finally here and it’s finally time to talk about it. The AMD Ryzen processor is being released onto the world and, based on the buildup of excitement over the last week or so since pre-orders began, details on just how Ryzen performs relative to Intel’s mainstream and enthusiast processors are a hot commodity. While leaks have been surfacing for months and details seem to be streaming out from those not bound to the same restrictions we have been, I think you are going to find our analysis of the Ryzen 7 1800X processor to be quite interesting and maybe a little different as well.

Honestly, there isn’t much that has been left to the imagination about Ryzen, its chipsets, pricing, etc. with the slow trickle of information that AMD has been sending out since before CES in January. We know about the specifications, we know about the architecture, we know about the positioning; and while I will definitely recap most of that information here, the real focus is going to be on raw numbers. Benchmarks are what we are targeting with today’s story.

Let’s dive right in.

The Zen Architecture – Foundation for Ryzen

Actually, as it turns out, in typical Josh Walrath fashion, he wrote too much about the AMD Zen architecture to fit into this page. So, instead, you'll find his complete analysis of AMD's new baby right here: AMD Zen Architecture Overview: Focus on Ryzen

ccx.png

AMD Ryzen 7 Processor Specifications

Though we have already detailed the most important specifications for the new AMD Ryzen processors when the preorders went live, it’s worth touching on them again and reemphasizing the important ones.

|   | Ryzen 7 1800X | Ryzen 7 1700X | Ryzen 7 1700 | Core i7-6900K | Core i7-6800K | Core i7-7700K | Core i5-7600K | Core i7-6700K |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Architecture | Zen | Zen | Zen | Broadwell-E | Broadwell-E | Kaby Lake | Kaby Lake | Skylake |
| Process Tech | 14nm | 14nm | 14nm | 14nm | 14nm | 14nm+ | 14nm+ | 14nm |
| Cores/Threads | 8/16 | 8/16 | 8/16 | 8/16 | 6/12 | 4/8 | 4/4 | 4/8 |
| Base Clock | 3.6 GHz | 3.4 GHz | 3.0 GHz | 3.2 GHz | 3.4 GHz | 4.2 GHz | 3.8 GHz | 4.0 GHz |
| Turbo/Boost Clock | 4.0 GHz | 3.8 GHz | 3.7 GHz | 3.7 GHz | 3.6 GHz | 4.5 GHz | 4.2 GHz | 4.2 GHz |
| Cache | 20MB | 20MB | 20MB | 20MB | 15MB | 8MB | 8MB | 8MB |
| Memory Support | DDR4-2400 Dual Channel | DDR4-2400 Dual Channel | DDR4-2400 Dual Channel | DDR4-2400 Quad Channel | DDR4-2400 Quad Channel | DDR4-2400 Dual Channel | DDR4-2400 Dual Channel | DDR4-2400 Dual Channel |
| TDP | 95 watts | 95 watts | 65 watts | 140 watts | 140 watts | 91 watts | 91 watts | 91 watts |
| Price | $499 | $399 | $329 | $1050 | $450 | $350 | $239 | $309 |

All three of the currently announced Ryzen processors are 8-core, 16-thread designs, matching the Core i7-6900K from Intel in that regard. Though Intel does have a 10-core part branded for consumers, it comes in at a significantly higher price point (over $1500 still). The clock speeds of Ryzen are competitive with the Broadwell-E platform options, though they are clearly behind the curve when it comes to the clock capabilities of Kaby Lake and Skylake. With admittedly lower IPC than Kaby Lake, Zen will struggle in any purely single threaded workload with as much as a 500 MHz deficit in clock rate.
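To put the 500 MHz gap in perspective, here is a rough Python sketch using the first-order model that single-threaded performance scales with IPC times clock speed. That model is an assumption for illustration, not a benchmark result, and the `ipc_ratio` parameter is a hypothetical knob rather than a measured value.

```python
def relative_single_thread(clock_ghz, ref_clock_ghz, ipc_ratio=1.0):
    """Estimate performance relative to a reference CPU as IPC ratio times clock ratio."""
    return ipc_ratio * clock_ghz / ref_clock_ghz


# Ryzen 7 1800X boost (4.0 GHz) against Core i7-7700K turbo (4.5 GHz), equal IPC assumed.
ratio = relative_single_thread(4.0, 4.5)
print(f"{ratio:.3f} of the reference, a {100 * (1 - ratio):.1f}% deficit from clocks alone")
```

Any IPC disadvantage multiplies on top of this: an `ipc_ratio` below 1.0 widens the gap further.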

One interesting deviation from Intel's designs that Ryzen gets is a more granular boost capability. AMD Ryzen CPUs will be able to move between processor states in 25 MHz increments while Intel is currently limited to 100 MHz. If implemented correctly and effectively through SenseMI, this allows Ryzen to get 25-75 MHz of additional performance in a scenario where it was too thermally constrained to hit the next 100 MHz step.
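The 25-75 MHz figure falls out of simple rounding. The sketch below is a minimal illustration, assuming a hypothetical thermal ceiling expressed in MHz. It is not a model of how SenseMI actually selects clocks.

```python
def highest_step(limit_mhz, step_mhz):
    """Largest multiple of step_mhz that does not exceed limit_mhz."""
    return (limit_mhz // step_mhz) * step_mhz


def granularity_gain(limit_mhz, fine_step=25, coarse_step=100):
    """Extra MHz reachable with fine steps compared to coarse steps under the same limit."""
    return highest_step(limit_mhz, fine_step) - highest_step(limit_mhz, coarse_step)


# Hypothetical thermal ceilings between two 100 MHz steps.
for limit in (3700, 3740, 3775, 3799):
    print(limit, highest_step(limit, 100), highest_step(limit, 25), granularity_gain(limit))
```

The gain is 0 only when the ceiling is within 25 MHz above a 100 MHz boundary; otherwise it is 25, 50 or 75 MHz.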

DSC02636.jpg

XFR (Extended Frequency Range), supported on the Ryzen 7 1800X and 1700X (hence the "X"), "lifts the maximum Precision Boost frequency beyond ordinary limits in the presence of premium systems and processor cooling." The story goes that if you have better than average cooling, the 1800X will be able to scale up to 4.1 GHz in some instances for some undetermined amount of time. The better the cooling, the longer it can operate in XFR. While this was originally pitched to us as a game-changing feature that would bring extreme advantages to water cooling enthusiasts, it seems it was scaled back for the initial release. Getting only a 100 MHz increase, in the best case, seems a bit more like technology for technology's sake rather than offering new capabilities for consumers.

cpu2.jpg

Ryzen integrates a dual channel DDR4 memory controller with speeds up to 2400 MHz, matching what Intel can do on Kaby Lake. Broadwell-E has the advantage with a quad-channel controller, but how useful that ends up being will be interesting to see as we step through our performance testing.
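For a sense of scale, the theoretical peak bandwidth of these configurations can be worked out directly. The sketch below assumes standard 64-bit (8-byte) DDR4 channels and ignores real-world efficiency losses, so treat the results as upper bounds rather than measured throughput.

```python
BYTES_PER_TRANSFER = 8  # one 64-bit DDR4 channel moves 8 bytes per transfer


def peak_bandwidth_gbs(mega_transfers_per_s, channels):
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * channels / 1000


print(peak_bandwidth_gbs(2400, 2))  # dual channel DDR4-2400 (Ryzen, Kaby Lake)
print(peak_bandwidth_gbs(2400, 4))  # quad channel DDR4-2400 (Broadwell-E)
```

At the same DDR4-2400 speed, the quad-channel platform has exactly twice the theoretical peak of the dual-channel one. Whether applications can use it is a separate question.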

One area of interest is the TDP ratings. AMD and Intel have very different views on how this is calculated. Intel has made this the maximum power draw of the processor while AMD sees it as a target for thermal dissipation over time. This means that under stock settings the Core i7-7700K will not draw more than 91 watts and the Core i7-6900K will not draw more than 140 watts. And in our testing, they are well under those ratings most of the time, whenever AVX code is not being run. AMD’s 95-watt rating on the Ryzen 1800X, though, will very often be exceeded, and our power testing bears that out. The logic is that a cooler with a 95-watt rating and the behavior of thermal propagation give the cooling system time to catch up. (Interestingly, this is the philosophy Intel has taken with its Kaby Lake mobile processors.)

lisa-29.jpg

Obviously the most important line here for many of you is the price. The Core i7-6900K is the lowest priced 8C/16T option from Intel for consumers at $1050. The Ryzen R7 1800X has a sticker price less than half of that, at $499. The R7 1700X vs Core i7-6800K match is interesting as well, where the AMD CPU will sell for $399 versus $450 for the 6800K. However, the 6800K only has 6 cores and 12 threads, giving the Ryzen part an instant 25% boost in multi-threaded performance. The 7700K and R7 1700 battle will be interesting as well, with a 4-core difference in capability and a $30 price advantage to AMD.
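One simple way to compare these matchups is price per core. The following sketch uses only the prices and core counts from the specification table. The metric is an illustrative choice, not something AMD or Intel publishes, and it says nothing about per-core performance.

```python
# (price in USD, physical cores), taken from the specification table
PARTS = {
    "Ryzen 7 1800X": (499, 8),
    "Core i7-6900K": (1050, 8),
    "Ryzen 7 1700X": (399, 8),
    "Core i7-6800K": (450, 6),
    "Ryzen 7 1700": (329, 8),
    "Core i7-7700K": (350, 4),
}


def price_per_core(price_usd, cores):
    """Dollars paid per physical core."""
    return price_usd / cores


for name, (price, cores) in PARTS.items():
    print(f"{name}: ${price_per_core(price, cores):.2f} per core")
```

By this measure every Ryzen 7 part comes in under $65 per core, while the Intel parts range from $75 to over $130.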

Continue reading our review of the new AMD Ryzen 7 1800X processor!!

Author:
Subject: Processors
Manufacturer: AMD

What Makes Ryzen Tick

We have been exposed to details about the Zen architecture for the past several Hot Chips conventions as well as other points of information directly from AMD.  Zen was a clean sheet design that borrowed some of the best features from the Bulldozer and Jaguar architectures, as well as integrating many new ideas that had not been executed in AMD processors before.  The fusion of ideas from higher performance cores, lower power cores, and experience gained in APU/GPU design have all come together in a very impressive package that is the Ryzen CPU.

zen_01.jpg

It is well known that AMD brought back Jim Keller to head the CPU group after the slow downward spiral that AMD entered in CPU design.  While the Athlon 64 was a tremendous part for the time, the subsequent CPUs being offered by the company did not retain that leadership position.  The original Phenom had problems right off the bat and could not compete well with Intel’s latest dual and quad cores.  The Phenom II shored up their position a bit, but in the end could not keep pace with the products that Intel continued to introduce with their newly minted “tick-tock” cycle.  Bulldozer had issues out of the gate and did not have performance numbers that were significantly greater than the previous generation “Thuban” 6 core Phenom II product, much less the latest Intel Sandy Bridge and Ivy Bridge products that it would compete with.

AMD attempted to stop the bleeding by iterating and evolving the Bulldozer architecture with Piledriver, Steamroller, and Excavator.  The final products based on this design arc seemed to do fine for the markets they were aimed at, but certainly did not regain any marketshare with AMD’s shrinking desktop numbers.  No matter what AMD did, the base architecture just could not overcome some of the basic properties that impeded strong IPC performance.

52_perc_design_opt.png

The primary goal of this new architecture is to increase IPC to a level consistent with what Intel has to offer.  AMD aimed to increase IPC by at least 40% over the previous Excavator core.  This is a pretty aggressive goal considering where AMD was with the Bulldozer architecture that was focused on good multi-threaded performance and high clock speeds.  AMD claims that it has in fact increased IPC by an impressive 52% from the previous Excavator based core.  Not only has AMD seemingly hit its performance goals, but it exceeded them.  AMD also plans on using the Zen architecture to power everything from mobile products to the highest TDP parts offered.

 

The Zen Core

The basis for Ryzen is the CCX module.  Each module contains four Zen cores along with 8 MB of shared L3 cache.  Each core has 64 KB of L1 I-cache and 32 KB of L1 D-cache, along with 512 KB of L2 cache.  These caches are inclusive.  The L3 cache acts as a victim cache which partially copies what is in L1 and L2 caches.  AMD has improved the performance of their caches to a very large degree as compared to previous architectures.  The arrangement here allows the individual cores to quickly snoop any changes in the caches of the others for shared workloads.  So if a cache line is changed on one core, other cores requiring that data can quickly snoop into the shared L3 and read it.  Doing this allows the core doing the actual work to not be interrupted by cache read requests from other cores.
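These per-CCX figures can be tallied for an 8-core part. The sketch below assumes 512 KB of L2 per core and that the advertised cache total counts L2 plus L3. Both are assumptions used for illustration, and the constant names are made up.

```python
CORES_PER_CCX = 4
L3_PER_CCX_MB = 8
L2_PER_CORE_KB = 512  # assumed to be a per-core figure


def total_cache_mb(cores):
    """Combined L2 + L3 capacity in MB for a chip built from whole CCX modules."""
    if cores % CORES_PER_CCX:
        raise ValueError("core count must be a multiple of the CCX size")
    ccx_count = cores // CORES_PER_CCX
    l3_mb = ccx_count * L3_PER_CCX_MB
    l2_mb = cores * L2_PER_CORE_KB / 1024
    return l2_mb + l3_mb


print(total_cache_mb(8))  # two CCX modules: 16 MB of L3 plus 4 MB of L2
```

Under those assumptions the result lines up with the 20MB cache figure quoted for the Ryzen 7 processors.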

ccx.png

l2_cache.png

l3_cache.png

Each core can handle two threads, but unlike Bulldozer has a single integer core.  Bulldozer modules featured two integer units and a shared FPU/SIMD.  Zen gets rid of CMT for good and we have a single integer unit and a single FPU for each core.  The core can address two threads by utilizing AMD’s version of SMT (simultaneous multi-threading).  There is a primary thread that gets higher priority while the second thread has to wait until resources are freed up.  This works far better in the real world than my explanation suggests, as resources are constantly being shuffled about and the primary thread will not monopolize all resources within the core.

Click here to read more about AMD's Zen architecture in Ryzen!

Author:
Subject: Processors
Manufacturer: AMD

Get your brains ready

Just before the weekend, Josh and I got a chance to speak with David Kanter about the AMD Zen architecture and what it might mean for the Ryzen processor due out in less than a month. For those of you not familiar with David and his work, he is an analyst and consultant on processor architecture and design through Real World Tech while also serving as a writer and analyst for the Microprocessor Report as part of the Linley Group. If you want to see a discussion forum that focuses on architecture at an incredibly detailed level, the Real World Tech forum will have you covered - it's an impressive place to learn.

zenpm-4.jpg

David was kind enough to spend an hour with us to talk about a recently-made-public report he wrote on Zen. It's definitely a discussion that dives into details most articles and stories on Zen don't broach, so be prepared to do some pausing and Googling of phrases and technologies you may not be familiar with. Still, for any technology enthusiast that wants to get an expert's opinion on how Zen compares to Intel Skylake and how Ryzen might fare when it's released this year, you won't want to miss it.

High Bandwidth Cache

Apart from AMD’s other new architecture due out in 2017, its Zen CPU design, there is no other product that has had as much build-up and excitement surrounding it as its Vega GPU architecture. After the world learned that Polaris would be a mainstream-only design that was released as the Radeon RX 480, the focus for enthusiasts came straight to Vega. It’s been on the public facing roadmaps for years and signifies the company’s return to the world of high end GPUs, something they have been missing since the release of the Fury X in mid-2015.

slides-2.jpg

Let’s be clear: today does not mark the release of the Vega GPU or products based on Vega. In reality, we don’t even know enough to make highly educated guesses about the performance without more details on the specific implementations. That being said, the information released by AMD today is interesting and shows that Vega will be much more than simply an increase in shader count over Polaris. It reminds me a lot of the build-up to the Fiji GPU release, when the information and speculation about how HBM would affect power consumption, form factor and performance flourished. What we can hope for, and what AMD’s goal needs to be, is a cleaner and more consistent product release than how the Fury X turned out.

The Design Goals

AMD began its discussion about Vega last month by talking about the changes in the world of GPUs and how the data sets and workloads have evolved over the last decade. No longer are GPUs only worried about games; instead they must address professional workloads, enterprise workloads and scientific workloads. Even more interestingly, just as we have discussed the growing gap between CPU performance and CPU memory bandwidth, AMD posits that the gap between memory capacity and GPU performance is a significant hurdle and limiter to performance and expansion. Game installs, professional graphics sets, and compute data sets continue to skyrocket. Game installs now are regularly over 50GB but compute workloads can exceed petabytes. Even as we saw GPU memory capacities increase from megabytes to gigabytes, reaching as high as 12GB in high end consumer products, AMD thinks there should be more.

slides-8.jpg

Coming from a company that chose to release a high-end product limited to 4GB of memory in 2015, it’s a noteworthy statement.

slides-11.jpg

The High Bandwidth Cache

Bold enough to claim a direct nomenclature change, Vega 10 will feature an HBM2-based high bandwidth cache (HBC) along with a new memory hierarchy to call it into play. This HBC will be a collection of memory on the GPU package just like we saw on Fiji with the first HBM implementation and will be measured in gigabytes. Why the move to calling it a cache will be covered below. (But can’t we all get behind the removal of the term “frame buffer”?) Interestingly, this HBC doesn’t have to be HBM2 and in fact I was told that you could expect to see other memory systems on lower cost products going forward; cards that integrate this new memory topology with GDDR5X or some equivalent seem assured.

slides-13.jpg

Continue reading our preview of the AMD Vega GPU Architecture!

Author:
Subject: Processors, Mobile
Manufacturer: Qualcomm

Semi-custom CPU

With the new year comes a new push for performance, efficiency and feature leadership from Qualcomm and its Snapdragon line of mobile SoCs. The Snapdragon 835 was officially announced in November of last year when the partnership with Samsung on 10nm process technology was announced, but we now have the freedom to share more of the details on this new part and how it changes Qualcomm’s position in the ultra-device market. Devices with the new 835 part won’t be on the market for several more months, though announcements are likely coming at CES this year.

slides1-5.jpg

Qualcomm frames the story around the Snapdragon 835 processor with what they call the “five pillars” – five different aspects of mobile processor design that they have addressed with updates and technologies. Qualcomm lists them as battery life (efficiency), immersion (performance), capture, connectivity, and security.

slides1-6.jpg

Starting where they start, on battery life and efficiency, the SD 835 has a unique focus that might surprise many. Rather than talking up the improvements in performance of the new processor cores, or the power of the new Adreno GPU, Qualcomm is firmly planted on looking at Snapdragon through the lens of battery life. Snapdragon 835 uses half of the power of Snapdragon 801.

slides2-2.jpg

The company touts usage claims of 1+ day of talk time, 5+ days of music playback, 11 hours of 4K video playback, 3 hours of 4K video capture and 2+ hours of sustained VR gaming. These sound impressive, but as we must always do in this market, we will have to wait for consumer devices from Qualcomm partners to really measure how well this platform will do. Going through a typical power user comparison of a device built on the Snapdragon 835 to one using the 820, Qualcomm thinks it could result in 2 or more hours of additional battery life at the end of the day.

We have already discussed the new Quick Charge 4 technology, which can offer 5 hours of use with just 5 minutes of charge time.

Continue reading our preview of the Qualcomm Snapdragon 835 SoC!

Author:
Subject: Processors
Manufacturer: Intel

Architectural Background

It probably doesn't surprise any of our readers that there has been a tepid response to the leaks and reviews that have come out about the new Core i7-7700K CPU ahead of the scheduled launch of Kaby Lake-S from Intel. Replacing the Skylake-based 6700K part as the new "flagship" consumer enthusiast CPU, the 7700K has quite a bit stacked against it. We know that Kaby Lake is the first in the new sequence of tick-tock-optimize, and thus there are few architectural changes to any portion of the chip. However, that does not mean that the 7700K and Kaby Lake in general don't offer new capabilities (HEVC) or performance (clock speed). 

The Core i7-7700K is in an interesting spot as well with regard to motherboards and platforms. Nearly all motherboards that run the Z170 chipset will be able to run the new Kaby Lake parts without requiring an upgrade to the newly released Z270 chipset. However, the likelihood that any user on a Z170 platform today using a Skylake processor will feel the NEED to upgrade to Kaby Lake is minimal, to say the least. The Z270 chipset only offers a couple of new features compared to last generation, so the upgrade path is again somewhat limited in excitement.

Let's start by taking a look at the Core i7-7700K and how it compares to the previous top-end parts from the consumer processor line and then touch on the changes that Kaby Lake brings to the table.

slides06.jpg

With the beginning of CES just days away (as I write this), Intel is taking the wrapping paper off of its first gift of 2017 to the industry. As you can see from the slide above, more than just the Kaby Lake-S consumer socketed processors are launching today, but other components including Iris Plus graphics implementations and quad-core notebook implementations will need to wait for another day.

slides10.jpg

For DIY builders and OEMs, Kaby Lake-S, now known as the 7th Generation Core Processor family, offers some changes and additions. First, we will get a dual-core HyperThreaded processor with an unlocked designation in the Core i3-7350K. Other than the aforementioned Z270 chipset, Kaby Lake will be the first platform compatible with Intel Optane memory. (To be extra clear, I was told that previous processors will NOT be able to utilize Optane in its M.2 form factor.)

slides11.jpg

Though we have already witnessed Lenovo announcing products using Optane, this is the first official Intel discussion about it. Optane memory will be available in M.2 modules that can be installed on Z270 motherboards, improving snappiness and responsiveness. It seems this will be launched later in the quarter as we don't have any performance numbers or benchmarks to point to demonstrating the advantages that Intel touts. I know both Allyn and I are very excited to see how this differs from previous Intel caching technologies.

|   | Core i7-7700K | Core i7-6700K | Core i7-5775C | Core i7-4790K | Core i7-4770K | Core i7-3770K |
| --- | --- | --- | --- | --- | --- | --- |
| Architecture | Kaby Lake | Skylake | Broadwell | Haswell | Haswell | Ivy Bridge |
| Process Tech | 14nm+ | 14nm | 14nm | 22nm | 22nm | 22nm |
| Socket | LGA 1151 | LGA 1151 | LGA 1150 | LGA 1150 | LGA 1150 | LGA 1155 |
| Cores/Threads | 4/8 | 4/8 | 4/8 | 4/8 | 4/8 | 4/8 |
| Base Clock | 4.2 GHz | 4.0 GHz | 3.3 GHz | 4.0 GHz | 3.5 GHz | 3.5 GHz |
| Max Turbo Clock | 4.5 GHz | 4.2 GHz | 3.7 GHz | 4.4 GHz | 3.9 GHz | 3.9 GHz |
| Memory Tech | DDR4 | DDR4 | DDR3 | DDR3 | DDR3 | DDR3 |
| Memory Speeds | Up to 2400 MHz | Up to 2133 MHz | Up to 1600 MHz | Up to 1600 MHz | Up to 1600 MHz | Up to 1600 MHz |
| Cache (L4 Cache) | 8MB | 8MB | 6MB (128MB) | 8MB | 8MB | 8MB |
| System Bus | DMI3 - 8.0 GT/s | DMI3 - 8.0 GT/s | DMI2 - 6.4 GT/s | DMI2 - 5.0 GT/s | DMI2 - 5.0 GT/s | DMI2 - 5.0 GT/s |
| Graphics | HD Graphics 630 | HD Graphics 530 | Iris Pro 6200 | HD Graphics 4600 | HD Graphics 4600 | HD Graphics 4000 |
| Max Graphics Clock | 1.15 GHz | 1.15 GHz | 1.15 GHz | 1.25 GHz | 1.25 GHz | 1.15 GHz |
| TDP | 91W | 91W | 65W | 88W | 84W | 77W |
| MSRP | $339 | $339 | $366 | $339 | $339 | $332 |
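Since Kaby Lake's gains over Skylake come mostly from clock speed, the generational uplift can be read straight off the table. A quick sketch of the arithmetic follows; it treats clock speed as the only variable, which is a simplification.

```python
def pct_gain(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100


# Core i7-6700K to Core i7-7700K, clock speeds in GHz from the table
print(f"Base clock:  {pct_gain(4.0, 4.2):.1f}%")
print(f"Turbo clock: {pct_gain(4.2, 4.5):.1f}%")
```

With both parts listed at the same $339 MSRP, that clock bump is effectively the whole CPU-side value proposition on paper.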

Continue reading our review of the Intel Core i7-7700K Kaby Lake processor!!

Author:
Subject: Processors
Manufacturer: AMD
Tagged: Zen, ryzen, processor, cpu, amd

Ryzen coming in 2017

As much as we might want it to be, today is not the day that AMD launches its new Zen processors to the world. We’ve been teased with it for years now, with trickles of information at event after event…but we are going to have to wait a little bit longer with one more tease at least. Today, AMD is announcing the official branding of the consumer processors based on Zen, previously code named Summit Ridge, along with a clock speed data point and a preview of five technologies that will help it be competitive with the Intel Core lineup.

ryzen-22.jpg

The future consumer desktop processor from AMD will now officially be known as Ryzen. That’s pronounced “RISE-IN” not “RIS-IN”, just so we are all on the same page. CEO Lisa Su was on stage during the reveal at a media event last week and claimed that while media, fans and AMD fell in love with the Zen name, it needed a differentiation from the architecture itself. The name is solid – not earth shattering though I foresee a long life of mispronunciation ahead of it.

Now that we have the official branding behind us, let’s get to the rest of the disclosed information we can reveal today.

ryzen-24.jpg

We already knew that Summit Ridge would ship with an 8 core, 16 thread version (with lower core counts at lower prices very likely) but now we know a frequency and a cache size. AMD tells us that there will be a processor (the flagship) that will have a base clock of 3.4 GHz with boost clocks above that. How much above that is still a mystery – AMD is likely still tweaking its implementation of boost to get as much performance as possible for launch. This should help put those clock speed rumors to rest for now.

The 20MB of cache matches the Core i7-6900K, though obviously with some dramatic architecture differences between Broadwell and Zen, the effect and utilization of that cache will be interesting to measure next year.

ryzen-10.jpg

We already knew that Ryzen will be utilizing the AM4 platform, but it’s nice to see AMD reiterate a modern feature set and expandability. DDR4 memory, PCI Express Gen3, native USB 3.1 and NVMe support: these are all necessary building blocks for a modern consumer and enthusiast PC. We still need to see how many of these ports the chipset offers and how aggressive motherboard companies like ASUS, MSI and Gigabyte are in their designs. I am hoping there are as many options as we would see for an X99/Z170 platform, including budget boards in the $100 space as well as “anything and everything” options for those types of buyers that want to adopt AMD’s new CPU.

Continue reading our latest preview of AMD Zen, now known as Ryzen!

Author:
Manufacturer: NVIDIA

A Holiday Project

A couple of years ago, I performed an experiment around the GeForce GTX 750 Ti graphics card to see if we could upgrade basic OEM, off-the-shelf computers to become competent gaming PCs. The key to this potential upgrade was that the GTX 750 Ti offered a great amount of GPU horsepower (at the time) without the need for an external power connector. Lower power requirements on the GPU meant that even the most basic of OEM power supplies should be able to do the job.

That story was a success, both in terms of the result in gaming performance and the positive feedback it received. Today, I am attempting to do that same thing but with a new class of GPU and a new class of PC games.

The goal for today’s experiment remains pretty much the same: can a low-cost, low-power GeForce GTX 1050 Ti graphics card that also does not require any external power connector offer enough gaming horsepower to upgrade current shipping OEM PCs to "gaming PC" status?

Our target PCs for today come from Dell and ASUS. I went into my local Best Buy just before the Thanksgiving holiday and looked for two machines that varied in price and relative performance.

01.jpg

Specification | Dell Inspiron 3650      | ASUS M32CD-B09
Processor     | Intel Core i3-6100      | Intel Core i7-6700
Motherboard   | Custom                  | Custom
Memory        | 8GB DDR4                | 12GB DDR4
Graphics Card | Intel HD Graphics 530   | Intel HD Graphics 530
Storage       | 1TB HDD                 | 1TB Hybrid HDD
Case          | Custom                  | Custom
Power Supply  | 240 watt                | 350 watt
OS            | Windows 10 64-bit       | Windows 10 64-bit
Total Price   | $429 (Best Buy)         | $749 (Best Buy)

The specifications of these two machines are relatively modern for OEM computers. The Dell Inspiron 3650 uses a modest dual-core Core i3-6100 processor with a fixed clock speed of 3.7 GHz. It has a 1TB standard hard drive and a 240 watt power supply. The ASUS M32CD-B09 PC has a quad-core HyperThreaded processor with a 4.0 GHz maximum Turbo clock, a 1TB hybrid hard drive and a 350 watt power supply. Both of the CPUs share the same Intel brand of integrated graphics, the HD Graphics 530. You’ll see in our testing that not only is this integrated GPU unqualified for modern PC gaming, but it also performs quite differently based on the CPU it is paired with.

Continue reading our look at upgrading an OEM machine with the GTX 1050 Ti!!

Author:
Subject: Processors
Manufacturer: Intel

Introduction

In August at the company’s annual developer forum, Intel officially took the lid off its 7th generation of Core processor series, codenamed Kaby Lake. The build up to this release has been an interesting one as we saw the retirement of the “tick tock” cadence of processor releases and instead are moving into a market where Intel can spend more development time on a single architecture design to refine and tweak it as the engineers see fit. With that knowledge in tow, I believed, as I think many still do today, that Kaby Lake would be something along the lines of a simple rebrand of current shipping product. After all, since we know of no major architectural changes from Skylake other than improvements in the video and media side of the GPU, what is left for us to look forward to?

As it turns out, the advantages of the 7th Generation Core processor family and Kaby Lake are more substantial than I expected. I was able to get a hold of two different notebooks from the HP Spectre lineup, as near to identical as I could manage, with the primary difference being the move from the 6th Generation Skylake design to the 7th Generation Kaby Lake. After running both machines through a gamut of tests ranging from productivity to content creation and of course battery life, I can say with authority that Intel’s 7th Gen product deserves more accolades than it is getting.

Architectural Refresher

Before we get into the systems and to our results, I think it’s worth taking some time to quickly go over some of what we know about Kaby Lake from the processor perspective. Most of this content was published back in August just after the Intel Developer Forum, so if you are sure you are caught up, you can jump right along to a pictorial look at the two notebooks being tested today.

At its core, the microarchitecture of Kaby Lake is identical to that of Skylake. Instructions per clock (IPC) remain the same with the exception of dedicated hardware changes in the media engine, so you should not expect any performance differences with Kaby Lake except with improved clock speeds.

Also worth noting is that Intel is still building Kaby Lake on 14nm process technology, the same used on Skylake. The term “same” is debatable, though, as Intel claims that improvements made in the process technology over the last 24 months have allowed them to expand clock speeds and improve on efficiency.

core.jpg

Dubbing this new revision of the process as “14nm+”, Intel tells me that they have improved the fin profile for the 3D transistors as well as channel strain while more tightly integrating the design process with manufacturing. The result is a 12% increase in process performance; that is a sizeable gain in a fairly tight time frame even for Intel.

That process improvement directly results in higher clock speeds for Kaby Lake when compared to Skylake when running at the same target TDPs. In general, we are looking at 300-400 MHz higher peak clock speeds in Turbo Boost situations when compared to similar TDP products in the 6th generation. Sustained clocks will very likely remain voltage / thermally limited but the ability to spike up to higher clocks for even short bursts can improve performance and responsiveness of Kaby Lake when compared to Skylake.
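To put the 300-400 MHz figure in relative terms, here is a quick back-of-the-envelope sketch. The 3.1 GHz Skylake Turbo baseline is an assumption chosen purely for illustration and is not a number from Intel's disclosure; the function name is likewise made up for this example.

```python
def uplift_percent(base_mhz: float, gain_mhz: float) -> float:
    """Return the relative clock gain, in percent, of adding gain_mhz to base_mhz."""
    return 100.0 * gain_mhz / base_mhz


# Hypothetical Skylake peak Turbo clock, assumed only for illustration.
ASSUMED_SKYLAKE_TURBO_MHZ = 3100

# The 300-400 MHz range quoted above for Kaby Lake's peak clock gain.
low = uplift_percent(ASSUMED_SKYLAKE_TURBO_MHZ, 300)
high = uplift_percent(ASSUMED_SKYLAKE_TURBO_MHZ, 400)

print(f"Peak clock uplift: {low:.1f}% to {high:.1f}%")
```

With that assumed baseline, the gain works out to roughly 10 to 13 percent at peak; a higher baseline clock would shrink the percentage accordingly.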

Along with higher fixed clock speeds for Kaby Lake processors, tweaks to Speed Shift will allow these processors to get to peak clock speeds more quickly than previous designs. I extensively tested Speed Shift when the feature was first enabled in Windows 10 and found that the improvement in user experience was striking. Though the move from Skylake to Kaby Lake won’t be as big of a change, Intel was able to improve the behavior.

The graphics architecture and EU (execution unit) layout remain the same as in Skylake, but Intel was able to integrate a new video decode unit to improve power efficiency. That new engine can work in parallel with the EUs to improve performance throughput as well, but obviously at the expense of some power efficiency.

peca-8.jpg

Specific additions to the codec lineup include decode support for 10-bit HEVC and 8/10-bit VP9 as well as encode support for 10-bit HEVC and 8-bit VP9. The video engine adds HDR support with tone mapping though it does require EU utilization. Wide Color Gamut (Rec. 2020) is prepped and ready to go according to Intel for when that standard starts rolling out to displays.

Performance levels for these new HEVC encode/decode blocks are set to allow for 4K 120 Mbps real-time on both the Y-series (4.5 watt) and U-series (15 watt) processors.
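For a sense of scale, the short sketch below converts that bitrate into data volume. It assumes the figure means 120 megabits per second and uses decimal megabytes (1 MB = 8 megabits); the function name is hypothetical.

```python
def stream_size_megabytes(bitrate_mbps: float, seconds: float) -> float:
    """Data volume in decimal megabytes for a stream at bitrate_mbps over the given seconds."""
    # 8 megabits per megabyte, so divide the bitrate by 8 to get MB per second.
    return bitrate_mbps / 8.0 * seconds


BITRATE_MBPS = 120  # assumed to mean megabits per second

per_second = stream_size_megabytes(BITRATE_MBPS, 1)
per_minute = stream_size_megabytes(BITRATE_MBPS, 60)

print(f"{per_second:.0f} MB per second, {per_minute:.0f} MB per minute")
```

Under those assumptions, the media engine has to handle about 15 MB of compressed video every second, or about 900 MB per minute, inside a 4.5 watt or 15 watt power envelope.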

It’s obvious that the changes to Kaby Lake from Skylake are subtle and even I found myself overlooking the benefits that it might offer. While the capabilities it has will be tested on the desktop side at a later date in 2017, for thin and light notebooks, convertibles and even some tablets, the 7th Generation Core processors do in fact take advantage of the process improvements and higher clock speeds to offer an improved user experience.

Continue reading our look at Kaby Lake mobile performance!