Introduction to the Nehalem Architecture

Intel’s Nehalem-based Core i7 processors have been the talk of the tech world for over two years as we have tracked their development, the technology inside the architecture and the performance they might bring. No more guessing is required, though: the Core i7 CPUs are here and we have all the performance data and analysis you need to understand Intel’s newest processor technology.


We Begin Once More

This is an article we have all been anticipating for years now, as it introduces the most dramatic shift in Intel processing technology since the introduction of the front-side bus. Ironically, it is this shift that will finally remove the FSB from Intel products for good. The Nehalem core architecture has been the focus of most of Intel’s Developer Forums for the last 24 months, and the culmination of that technology, marketing and product work arrives today.

Intel’s Core i7 processors will bring a dramatic set of changes to enthusiasts and the PC community in general, including a new processor, new CPU socket, new memory architecture, new chipset, new motherboards and new overclocking methods. All of that and more will be addressed in our review today, so be prepared for a LOT of valuable information.

The Nehalem Architecture – Years of data summed up

We have done more than our share of technical documentation of the architecture and design, enough that I feel duplicating all of it here would be something of a disservice to our frequent readers. I will highlight the most important architectural shifts in the Nehalem design here, but I still encourage you to read my much more in-depth look at the processor design published in August: Inside the Nehalem: Intel’s New Core i7 Microarchitecture.

Nehalem Revolution: Intel's Core i7 Processor Complete Review - Processors 102

Here you can see a die shot of the new Nehalem processor, in this iteration a four-core design with two separate QPI links and an L3 cache that is large in relation to the rest of the chip. The primary goal of Nehalem was to take the big performance advantages that the Core 2 CPUs have and modularize them. Now with the Nehalem design, which will be branded as the Intel Core i7, Intel can easily create a range of processors from 1 core to 8 cores depending on the application and market demands. Eight-core CPUs will be found in servers, while you’ll find dual-core machines in the mobile market several months after the initial desktop introduction. The number of QPI (QuickPath Interconnect) links can also vary in order to improve CPU-to-CPU communication.


At a high level, the Nehalem core adds some key features to the processor designs we currently have with Penryn. SSE instructions get a bump to revision 4.2, the branch prediction and pre-fetch algorithms are improved, and simultaneous multi-threading (SMT) makes a return after a brief hiatus following the NetBurst architecture.

HyperThreading Returns


I mentioned before that Intel is using Nehalem to mark the return of HyperThreading to its bag of weapons in the CPU battle; the process is nearly identical to that of the older NetBurst processors and allows two threads to run on a single CPU core. But SMT (simultaneous multi-threading), or HyperThreading, is also key to keeping the 4-wide execution engine fed with work. With the larger caches and much higher memory bandwidth that the chip provides, this is a very important addition.

Intel claims that HyperThreading is an extremely power-efficient way to increase performance: it takes up very little die area on Nehalem yet has the potential for great performance gains in certain applications. This is obviously much more efficient than adding another core to the die, but just as obviously it has some drawbacks compared to that method.


Here you can see Intel’s estimates of how much HyperThreading can help performance in specific applications. Surprisingly, one of the best performers is the 3DMark Vantage CPU test, which simulates AI and physics on the processor, while POV-Ray 3.7 still sees a huge 30% boost in performance for this relatively small addition in logic.
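To put that 30% figure in perspective, here is a rough back-of-the-envelope sketch. The function names are made up for illustration, and the model assumes (optimistically) that the workload scales linearly across cores, so treat the result as a ballpark rather than a measurement.

```python
def logical_processors(physical_cores: int, threads_per_core: int = 2) -> int:
    """Hardware threads the OS sees when HyperThreading is enabled."""
    return physical_cores * threads_per_core


def equivalent_cores(physical_cores: int, ht_gain: float) -> float:
    """Throughput with HyperThreading, in units of one non-HT core.

    Assumption: the application scales linearly with core count, so the
    HyperThreading gain applies uniformly across the whole chip.
    """
    return physical_cores * (1.0 + ht_gain)


if __name__ == "__main__":
    cores = 4    # the quad-core die discussed in this article
    gain = 0.30  # Intel's POV-Ray 3.7 estimate quoted above

    print("Logical processors:", logical_processors(cores))
    print("Equivalent non-HT cores:", round(equivalent_cores(cores, gain), 2))
```

Under those assumptions, a quad-core part with a 30% HyperThreading gain behaves like a little over five non-HT cores in that one workload, which is why a small amount of extra logic looks so attractive next to the die area of an additional core.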

Welcome to the Uncore, we got fun and games…


A new term Intel is bringing to the world with this modular design is the “uncore”: basically all of the sections of the processor that are separate from the cores and their self-contained cache. Features like the integrated memory controller, QPI links and shared L3 cache fall into the “uncore” category. All of the components that you see are completely modular; Intel can add cores, QPI links, integrated graphics (coming later in 2009) and even another IMC if it desired.


