Skylake Architecture Comes Through

Intel’s Speed Shift technology, as part of the Skylake architecture, is finally available to consumers.

When Intel finally revealed the details surrounding its latest Skylake architecture design back in August at IDF, we learned for the first time about a new technology called Intel Speed Shift. The feature moves some of the control of CPU clock speed and ramp-up away from the operating system and into hardware, giving more control to the processor itself and making it less dependent on Windows (and presumably, in the future, other operating systems). This allows the clock speed of a Skylake processor to climb higher, faster, which improves responsiveness for the user.

It's pretty clear that Intel is targeting this feature addition at tablets and 2-in-1s, where the finger/pen to screen interaction is highly reliant on immediate performance to enable improved user experiences. It has long been known that one of the biggest performance deltas between iOS from Apple and Android from Google centers on the ability of the machine to FEEL faster when doing direct interaction, regardless of how fast the background rendering of an application or web browser actually is. Intel has been on a quest to fix this problem for Android for some time, where it has the ability to influence software development, and now it is bringing that emphasis to Windows 10.

With the most recent Windows 10 update, to build v10586, Intel Speed Shift has finally been enabled for Skylake users. And since you cannot disable the feature once it's installed, this is the one and only time we'll be able to measure performance in our test systems. So let's see if Intel's claims of improved user experiences stand up to our scrutiny.

What is Speed Shift?

I wrote about the technical details of Speed Shift back in August, but for those of you who might have missed it, here is a summary:

Easily the most interesting new feature in terms of power is called Intel Speed Shift Technology. This feature actually moves much of the control of P-states (performance states) from the operating system to the architecture itself. P-states are what tell the CPU to move between frequencies in order to balance performance and power consumption. In previous designs, Windows and other operating systems would perform the actual state changes. With Speed Shift, Intel is able to directly change the P-states on the processor, and this results in a 30x improvement in the speed of that transition.

Image Credit: AnandTech

Why is this useful? First, the speed improvement in that transition should result in added “snappiness” in areas where the frequency needs to increase quickly as a result of user interaction or application need, lowering the apparent latency of some actions. Also, this gives the Skylake processors the ability to manage things like low residency workloads better. Take video recording as a good example of this type of workload. Traditionally, a CPU would increase frequency to get through a set of work as quickly as possible and return to idle as fast as possible. For applications that run consistent and repeated, but non-demanding, workloads it might be more efficient to keep the CPU at a slightly higher frequency the entire time rather than spiking up and down repeatedly. Intel Speed Shift gives Skylake that capability.

There are some caveats of course – this only works with Windows 10 today as it requires some unique communication between the processor and OS. For older operating systems like Windows 8 or even other OS paths like Linux, Speed Shift won’t work out of the box. Intel says they have started engaging with the open source community to integrate support for it, which is great, but until then you’ll essentially be reverting to legacy P-state controls on Skylake hardware. Also, Intel's engineers told us that Speed Shift works within a "window" of OS-based performance states so it seems that Skylake does not have complete autonomy when it comes to selecting core frequencies.

Our Testing Methodology

Testing a feature like this is pretty complicated, and requires some new methods. In total we have three different measurements to show to you: one custom application built by Intel that creates a workload spike, a pair of standard benchmarks (SunSpider and WebXPRT) and then a high speed camera capture of us touch scrolling on Chrome and Edge web browsers. You really do need all three to see how the technology works and how it affects the total performance of Skylake-powered notebooks and tablets.

Custom Application Results

The custom application was built by Intel and used internally to evaluate the gains seen with Speed Shift technology. The workload is incredibly simple – the code puts a value in a register, then shifts it, and loops this way for a set number of iterations. The code includes the "__asm" declaration to prevent compiler optimizations, thus making sure all the work is actually run as intended rather than being simplified out. Essentially this is an artificial way to keep the front end of the CPU busy, and it emulates the spike in activity that you might see from a touch screen interaction.

The program then calculates the clock speed of the processor by using knowledge of how fast that work gets done – Intel knows how fast it should get done at a certain clock speed and thus it can determine what clock the CPU is running at in that instant by measuring the process time. So, to be clear, the clock speed measurements we are showing you below are being calculated, not reported by the CPU, etc. Still, the idea is to show us how a system with and without Speed Shift behaves when the same workload is being applied.
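
To make the idea concrete, here is a minimal sketch of that approach in Python. It is not Intel's tool: the function names, the chunk size, and the 3000 MHz peak value are all assumptions for illustration, and interpreter overhead means the numbers it prints are only rough relative estimates. The principle is the same, though: run a fixed chunk of shift work, time it, and infer the clock from how long it took compared to a known reference.

```python
import time

MASK_32 = 0xFFFFFFFF


def run_fixed_work(iterations: int) -> int:
    """Run a fixed number of shift operations and return the elapsed time in ns."""
    value = 1
    start = time.perf_counter_ns()
    for _ in range(iterations):
        # Shift left within 32 bits; reset to 1 once the bit falls off the end.
        value = ((value << 1) & MASK_32) or 1
    return time.perf_counter_ns() - start


def estimated_clock_mhz(elapsed_ns: float, reference_ns: float,
                        reference_mhz: float) -> float:
    """Infer clock speed from how long a known amount of work took.

    If the same work takes twice as long as it does at reference_mhz,
    the clock is assumed to be running at half that speed.
    """
    return reference_mhz * reference_ns / elapsed_ns


if __name__ == "__main__":
    ITERATIONS = 20_000        # hypothetical chunk size
    ASSUMED_PEAK_MHZ = 3000.0  # assumption: peak clock of the machine under test
    # Calibrate: treat the fastest of several runs as the time at peak clock.
    reference_ns = min(run_fixed_work(ITERATIONS) for _ in range(50))
    for _ in range(10):
        elapsed = run_fixed_work(ITERATIONS)
        mhz = estimated_clock_mhz(elapsed, reference_ns, ASSUMED_PEAK_MHZ)
        print(f"{mhz:.0f} MHz (estimated)")
```

The calibration step stands in for the "known" time at a given clock speed that Intel's application has built in; a real tool would use a hand-written assembly loop so the cost per iteration is predictable.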

Let's see the results.

This is a full run of the application on a Microsoft Surface Pro 4 with a Core i5 processor. We ran the benchmark on the system before the most recent Windows 10 update (10240) and then after installing it (10586) to see how much difference we would get in clock speed scaling. The results are obvious at first glance – clearly the blue line that represents the system with Speed Shift enabled spikes to its peak of 3.0 GHz much more quickly than the green line from the previous version of Windows.

Keep in mind that the total run time of the benchmark doesn't really matter here – we are just looking to see how quickly the CPUs can spike up to their peak clock speed.

Let's zoom in a little bit on the early part of that result. You can see that with Speed Shift on v10586 of Windows 10, the blue line starts at around 800 MHz and within 6ms is able to hit its peak of 3.0 GHz. In contrast, the green line of v10240 of Windows 10, without Speed Shift, takes over 60ms to reach the same clock speed, hitting and staying at an intermediate speed of ~2.1 GHz at just past 30ms.

These are small time scales to be sure, but when you need immediate CPU performance for a scrolling browser window, for example, the gap between 6ms and 60ms for the animation to begin is going to be very noticeable. 
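
For anyone who wants to pull the same "time to peak" figure out of their own clock traces, a small helper like the one below would do it. This is only a sketch: the function name is hypothetical, and the two short traces contain just the Surface Pro 4 points quoted in the text (with "just past 30ms" and "over 60ms" rounded to 30 and 60), not the full measured data.

```python
def time_to_reach(trace, target_mhz):
    """Return the first timestamp (ms) at which the trace reaches target_mhz.

    trace is a list of (time_ms, clock_mhz) samples in time order.
    Returns None if the target is never reached.
    """
    for time_ms, clock_mhz in trace:
        if clock_mhz >= target_mhz:
            return time_ms
    return None


# Approximate points taken from the description above, not raw samples.
with_speed_shift = [(0, 800), (6, 3000)]
without_speed_shift = [(30, 2100), (60, 3000)]

if __name__ == "__main__":
    print("With Speed Shift:   ", time_to_reach(with_speed_shift, 3000), "ms")
    print("Without Speed Shift:", time_to_reach(without_speed_shift, 3000), "ms")
```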

Here's another example, using a Lenovo Yoga 900 Skylake machine with a Core i7 processor. In this instance, the Speed Shift enabled OS is able to jump from ~500 MHz to 2.8 GHz in about 4ms and then hits 3.1 GHz by 12ms. In contrast, the non-Speed Shift enabled version of Windows 10 jumps to ~1200 MHz at the 32ms mark and is able to get to 3.1 GHz at about the 47ms mark.

It's abundantly clear from this data that Speed Shift can do exactly what Intel said it would – decrease the amount of time it takes for CPU clock speeds to scale when performance is needed most. But how does that affect modern benchmarks?

Benchmark Results

To find out, I ran both the Surface Pro 4 and the Lenovo Yoga 900 through two standard Web-based tests: SunSpider and WebXPRT. Both are used by reviewers and hardware vendors to measure certain aspects of platform performance. 

In SunSpider, in both the Chrome and Edge browsers, we see improvements of 5-8% in total scores.

WebXPRT is a benchmark that runs through some common tasks like photo enhancement, stock pricing and sales graphs in order to measure platform performance in browsers. In both the Lenovo Yoga 900 and the Surface Pro 4 we see significant improvements in performance with the Speed Shift enabled version of Windows 10. Ranging from 22% up to 34%, these results indicate that the real-world benefits of Speed Shift apply to multiple workloads. 

Real-World Results

All of the graphs and data above are important, but easily the most conclusive evidence that I was able to find that Intel Speed Shift can improve the user experience came about by using a high speed camera to record interactions with these machines. By using a 240 FPS camera, a mirror and common web sites we can measure the time it takes for a system to respond to touch motion.

For those of you interested in the exact methodology here:

  • Place mirror at ~90 degree angle to the touch screen being tested – this allows you to see both the motion of animation and the exact moment contact is made with the screen or the exact moment movement begins.
     
  • Set up the 240 FPS camera (iPhone 6s in this case) to point at the screen/mirror
     
  • Record several motions that are easily repeatable (in our case, simple swipes up)
     
  • Run 4 times for each test, closing browser each time, averaging results
     
  • Calculate latency of response by counting frames between finger motion and first movement on display. Each frame of the video takes 4.16ms. For example, 30 frames x 4.16ms = 124.8ms total latency +/- 4.16ms. (Also note that v-sync delay plays a part here, but with multiple runs and averaged results we can more-or-less remove its variance.)
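
The frame-counting arithmetic in that last step can be sketched in a few lines of Python. The function names are my own, and the only inputs taken from the methodology are the 240 FPS capture rate and the idea of averaging several runs; no measured frame counts are reproduced here.

```python
from statistics import mean

CAMERA_FPS = 240  # capture rate of the high speed camera


def frames_to_latency_ms(frames: int, fps: int = CAMERA_FPS) -> float:
    """Convert a count of video frames into elapsed milliseconds."""
    return frames * 1000.0 / fps


def average_latency_ms(frame_counts, fps: int = CAMERA_FPS) -> float:
    """Average the latency over several recorded runs of the same swipe."""
    return mean(frames_to_latency_ms(count, fps) for count in frame_counts)


if __name__ == "__main__":
    frame_ms = 1000.0 / CAMERA_FPS
    print(f"One frame lasts {frame_ms:.2f} ms, which is also the +/- error")
    print(f"30 frames = {frames_to_latency_ms(30):.1f} ms")
```

Using the unrounded frame time (1000 / 240 ms), 30 frames works out to 125.0ms rather than 124.8ms; the small gap comes from rounding each frame to 4.16ms.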

I tested scrolling on our own pcper.com in both Google Chrome and Microsoft Edge, then added in a scroll test of Google Maps in Chrome for good measure.

The results are better than I could have expected. In the Edge browser we saw our average latency from finger motion to actual pixel movement decrease from 158.1ms to 91.5ms, a 72% speedup! And in Chrome, with the same content, that time dropped from 170.5ms to 120.6ms, a 41% speedup.
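
A quick note on how those percentages are derived: they compare the old time to the new time (old divided by new), which is not the same thing as the percentage by which the latency itself shrank. The short sketch below, with function names of my own choosing, computes both from the averages quoted above.

```python
def speedup_pct(before_ms: float, after_ms: float) -> float:
    """How much faster the new result is, as a percentage (old / new - 1)."""
    return (before_ms / after_ms - 1.0) * 100.0


def latency_reduction_pct(before_ms: float, after_ms: float) -> float:
    """How much of the original latency was removed, as a percentage."""
    return (1.0 - after_ms / before_ms) * 100.0


if __name__ == "__main__":
    for browser, before, after in [("Edge", 158.1, 91.5), ("Chrome", 170.5, 120.6)]:
        print(f"{browser}: {speedup_pct(before, after):.1f}% speedup, "
              f"{latency_reduction_pct(before, after):.1f}% less latency")
```

By this math, the Edge result is roughly a 72.8% speedup or a 42.1% cut in latency, and the Chrome result is roughly a 41.4% speedup or a 29.3% cut in latency.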

The Google Maps test, by far the most strenuous workload (we had it loaded with satellite data), saw a ~20% speedup in Google Chrome.

When I set out with this testing I didn't know what to expect, and I didn't really think I would be able to tell the difference just using the Surface devices. At one point in my testing I had the Surface Pro 4 updated to v10586 of Windows 10 while the Surface Book was still on v10240. I set them up side by side and loaded up pcper.com, and the response difference was immediately evident. That 158ms to 91ms change is not an exaggeration of data.

Closing Thoughts

Without a doubt, Intel Speed Shift technology has dramatically improved the ability of the Intel + Windows platform to respond to user interaction. Though it will be most easily discernible with touch screen configurations, the same technology will apply to mouse-based control. Intel has been oddly quiet about the inclusion of this feature with the new Windows 10 v10586 update and I can't quite understand why – the only reason might be questions about proper platform support for Speed Shift in the EFI of currently shipping systems. I did confirm with Intel that both of the MS Surface devices and the Yoga 900 had it properly enabled, but I haven't found a way to detect the capability in other machines as of this writing.

For desktop users I am still trying to figure out how the benefits of Speed Shift might apply – clearly the advantages for standard Windows 10 usage would still be there. But whether or not the motherboards and platforms are enabled for it is up in the air; I'm working on getting feedback both from Intel as well as the motherboard vendors currently. UPDATE: It looks like nearly all currently shipping motherboards already have support for Speed Shift enabled as well. All it takes is the upgrade to the latest Windows 10 v10586 to enable!

As it stands now, this is just another reason to see Skylake notebooks and tablets as improved over previous generations. If you were waiting for a reason to go ahead and let Windows 10 upgrade to last week's major release, I think you have your answer.