Core and Interconnect
The Skylake architecture is Intel’s first to get a full release on the desktop in more than two years. While that might not seem like a long time in the grand scheme of technology, for our readers and viewers that is a noticeable change and shift from recent history that Intel has created with the tick-tock model of releases. Yes, Broadwell was released last year and was solid product, but Intel focused almost exclusively on the mobile platforms (notebooks and tablets) with it. Skylake will be much more ubiquitous and much more quickly than even Haswell.
Skylake represents Intel’s most scalable architecture to date. I don’t mean only frequency scaling, though that is an important part of this design, but rather in terms of market segment scaling. Thanks to brilliant engineering and design from Intel’s Israeli group Intel will be launching Skylake designs ranging from 4.5 watt TDP Core M solutions all the way up to the 91 watt desktop processors that we have already reviewed in the Core i7-6700K. That’s a range that we really haven’t seen before and in the past Intel has depended on the Atom architecture to make up ground on the lowest power platforms. While I don’t know for sure if Atom is finally trending towards the dodo once Skylake’s reign is fully implemented, it does make me wonder how much life is left there.
Scalability also refers to the package size – something that ensures that the designs the engineers created can actually be built and run in the platform segments they are targeting. Starting with the desktop designs for LGA platforms (DIY market) that fits on a 1400 mm2 design on the 91 watt TDP implementation Intel is scaling all the way down to 330 mm2 in a BGA1515 package for the 4.5 watt TDP designs. Only with a total product size like that can you hope to get Skylake in a form factor like the Compute Stick – which is exactly what Intel is doing. And note that the smaller packages require the inclusion of the platform IO chip as well, something that H- and S-series CPUs can depend on the motherboard to integrate.
Finally, scalability will also include performance scaling. Clearly the 4.5 watt part will not offer the user the same performance with the same goals as the 91 watt Core i7-6700K. The screen resolution, attached accessories and target applications allow Intel to be selective about how much power they require for each series of Skylake CPUs.
The fundamental design theory in Skylake is very similar to what exists today in Broadwell and Haswell with a handful of significant and hundreds of minor change that make Skylake a large step ahead of previous designs.
This slide from Julius Mandelblat, Intel Senior Principle Engineer, shows a higher level overview of the entirety of the consumer integration of Skylake. You can see that Intel’s goals included a bigger and wider core design, higher frequency, improved right architecture and fabric design and more options for eDRAM integration. Readers of PC Perspective will already know that Skylake supports both DDR3L and DDR4 memory technologies but the inclusion of the camera ISP is new information for us.
Subject: Editorial, General Tech, Processors | August 3, 2011 - 06:11 AM | Scott Michaud
Tagged: Netburst, architecture
It is common knowledge that computing power consistently improves throughout time as dies shrink to smaller processes, clock rates increase, and the processor can do more and more things in parallel. One thing that people might not consider: how fast is the actual architecture itself? Think of the problem of computing in terms of a factory. You can increase the speed of the conveyor belt and you can add more assembly lines, but just how fast are the workers? There are many ways to increase the efficiency of a CPU: from tweaking the most common or adding new instruction sets to allow the task itself to be simplified; to playing with the pipeline size for proper balance between constantly loading the CPU with upcoming instructions and needing to dump and reload the pipe when you go the wrong way down an IF/ELSE statement. Tom’s Hardware wondered this and tested a variety of processors since 2005 with their settings modified such that they could only use one core and only be clocked at 3 GHz. Can you guess which architecture failed the most miserably?
Pfft, who says you ONLY need a calculator?
(Image from Intel)
Netburst architecture was designed to get very large clock rates at the expensive of heat -- and performance. At the time, the race between Intel and its competitors was clock rate: the higher the clock the better it was for marketers despite a 1.3 GHz Athlon wrecking a 3.2 GHz Celeron in actual performance. If you are in the mood for a little chuckle, this marketing strategy was all destroyed when AMD decided to name their processors “Athlon XP 3200+” and so forth rather than by their actual clock rate. One of the major reasons that Netburst was so terrible was branch prediction. Branch prediction is a strategy you can use to speed up a processor: when you reach a conditional jump from one chunk of code to another, such as “if this is true do that, otherwise do this”, you do not know for sure what will come next. Pipelining is a method of loading multiple commands into a processor to keep it constantly working. Branch prediction says: “I think I’ll go down this branch” and loads the pipeline assuming that is true; if you are wrong, you need to dump the pipeline and correct your mistake. One way that Pentium Netburst kept high clock rates was by having a ridiculously huge pipeline, 2-4x larger than the first generation of Core 2 parts which replaced it; unfortunately the Pentium 4 branch prediction was terrible keeping the processor stuck needing to dump its pipeline perpetually.
The sum of all tests... at least time-based ones.
(Image from Tom's Hardware)
Now that we excavated Intel’s skeletons to air them out it is time to bury them again and look at the more recent results. On the AMD side of things, it looks as though there has not been too much innovation on the efficiency side of things only now getting within range of the architecture efficiency that Intel had back in 2007 with their first introduction of Core 2. Obviously efficiency per core per clock means little in the real world as it tells you neither about raw performance of a part nor how power efficient it is. Still, it is interesting to see how big of a leap Intel made away from their turkey of an architecture theory known as Netburst and model the future around the Pentium 3 and Pentium M architectures. Lastly, despite the lead, it is interesting to note exactly how much work went into the Sandy Bridge architecture. Intel, despite an already large lead and focus outside of the x86 mindset, still tightened up their x86 architecture by a very visible margin. It might not be as dramatic as their abandonment of Pentium 4, but is still laudable in its own right.