IDF 2012: Intel Haswell Architecture Revealed

Author: Ryan Shrout
Subject: Processors
Manufacturer: Intel

Microarchitecture and Graphics System Improvements

While the Haswell design is based mainly on the architecture we saw introduced with Sandy Bridge, Intel made a number of changes to improve performance in the more traditional way, with an eye towards IPC (instructions per clock).

There were no changes in the key pipelines of Haswell, but there were many areas that Intel said are "typical improvement points" for the company.  The branch predictor has been improved, as this is usually the best return on time investment from a CPU-design standpoint.  Intel also enlarged the buffers in the OOO (out-of-order) structures to help the processor find parallelism and take advantage of it.

Throughput also sees a boost: the reservation station now has eight total ports, adding another ALU, another branch unit, and a store-address unit.  This gives Haswell some improved metrics like two branches per cycle and two floating point MADDs (multiply-adds) per cycle, both improvements over what we saw in Sandy Bridge and Ivy Bridge processors.
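
To put the "two floating point MADDs per cycle" figure in context, here is a rough back-of-the-envelope sketch of peak FLOPs per core per cycle.  It assumes 256-bit vector registers, counts a multiply-add as two floating point operations per lane, and assumes Sandy Bridge and Ivy Bridge can issue one vector add plus one vector multiply per cycle.  None of those figures come from this article, and the function names are purely illustrative.

```python
# Back-of-the-envelope peak FLOPs per core per cycle.
# Assumptions (not stated in the article): 256-bit vector registers, and a
# multiply-add counted as two floating point operations per lane.

VECTOR_BITS = 256  # assumed vector register width


def lanes(element_bits: int) -> int:
    """Number of elements that fit in one vector register."""
    return VECTOR_BITS // element_bits


def peak_flops_per_cycle(vector_ops_per_cycle: int,
                         flops_per_op: int,
                         element_bits: int) -> int:
    """Peak FLOPs per cycle = issue rate x FLOPs per lane x number of lanes."""
    return vector_ops_per_cycle * flops_per_op * lanes(element_bits)


# Haswell as described above: two MADDs per cycle, 2 FLOPs each per lane.
haswell_sp = peak_flops_per_cycle(2, 2, 32)
haswell_dp = peak_flops_per_cycle(2, 2, 64)

# Sandy Bridge / Ivy Bridge under the assumption of one vector add plus one
# vector multiply per cycle: two vector ops, 1 FLOP each per lane.
sandy_sp = peak_flops_per_cycle(2, 1, 32)
sandy_dp = peak_flops_per_cycle(2, 1, 64)

if __name__ == "__main__":
    print("single precision FLOPs/cycle:", sandy_sp, "->", haswell_sp)
    print("double precision FLOPs/cycle:", sandy_dp, "->", haswell_dp)
```

Under those assumptions the peak rate doubles in both precisions.  These are theoretical ceilings, and real code will only approach them when it is dominated by multiply-add operations.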

New compute instructions expand on AVX, doubling both single precision and double precision FLOPs per core per cycle.  Other new instructions accelerate very specific algorithms, with updates for bit extract and deposit, bit manipulation, rotates, etc.
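
For readers unfamiliar with "extract and deposit" operations, the sketch below emulates them in plain Python.  It assumes the instructions behave like a parallel bit extract (gather the bits selected by a mask into the low end of the result) and a parallel bit deposit (scatter low bits back out to the mask positions).  The function names `bit_extract` and `bit_deposit` are hypothetical, and the loop only illustrates the semantics; hardware would do this in a single instruction rather than one bit at a time.

```python
def bit_extract(value: int, mask: int) -> int:
    """Gather the bits of `value` selected by `mask` into the low bits of the result."""
    result = 0
    out_pos = 0
    while mask:
        lowest = mask & -mask       # isolate the lowest set bit of the mask
        if value & lowest:
            result |= 1 << out_pos  # copy that bit to the next low position
        out_pos += 1
        mask &= mask - 1            # clear the bit we just handled
    return result


def bit_deposit(value: int, mask: int) -> int:
    """Scatter the low bits of `value` into the positions selected by `mask`."""
    result = 0
    in_pos = 0
    while mask:
        lowest = mask & -mask       # isolate the lowest set bit of the mask
        if (value >> in_pos) & 1:
            result |= lowest        # place the next low bit at that position
        in_pos += 1
        mask &= mask - 1            # clear the bit we just handled
    return result


if __name__ == "__main__":
    packed = bit_extract(0b10110100, 0b11110000)
    print(bin(packed))                           # high nibble moved to the low end
    print(bin(bit_deposit(packed, 0b11110000)))  # and scattered back again
```

Operations like these turn up in bit-field packing, hashing, and compression code, where the software loop above would otherwise cost one iteration per mask bit.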

The cache implementation also sees interesting changes with Haswell, including a doubling of the L1 bandwidth to 32 bytes per access and an L2 cache that can deliver one cache line read every cycle.  Seeing both L1 and L2 cache bandwidths double in a single generation without changing the organization and size of those structures is impressive, though it calls for more explanation from Intel as well.

Another big upcoming change is the introduction of Transactional Synchronization Extensions (TSX).  TSX is a method to improve concurrency and multithreading with as little work for the programmer as possible.  By using these new ISA extensions, a developer can apply simple prefixes and suffixes to code blocks to mark them as transactional regions that can run in parallel with other threads.  The hardware then manages the transactional updates and restarts execution if a block cannot complete, for example because another thread touched the same data.
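
The "attempt the block, and restart if it could not complete" idea can be sketched in software.  To be clear, the example below is an analogy and not TSX itself: it uses no TSX instructions, a version counter plus a short lock stands in for the conflict detection that the hardware would perform, and every name in it (`VersionedCell`, `run_transaction`, `demo`) is hypothetical.

```python
import threading


class VersionedCell:
    """A shared value with a version counter used to detect conflicting writes."""

    def __init__(self, value: int) -> None:
        self.value = value
        self.version = 0
        # Stand-in for hardware conflict detection; held only very briefly.
        self._commit_lock = threading.Lock()

    def snapshot(self):
        """Read the value and the version it belongs to."""
        with self._commit_lock:
            return self.value, self.version

    def try_commit(self, expected_version: int, new_value: int) -> bool:
        """Publish new_value only if nobody has committed since the snapshot."""
        with self._commit_lock:
            if self.version != expected_version:
                return False  # conflict detected: the caller must restart
            self.value = new_value
            self.version += 1
            return True


def run_transaction(cell: VersionedCell, update) -> int:
    """Run `update` speculatively, restarting on conflict. Returns attempts used."""
    attempts = 0
    while True:
        attempts += 1
        value, version = cell.snapshot()
        if cell.try_commit(version, update(value)):
            return attempts


def demo(threads: int = 4, increments: int = 1000) -> int:
    """Several threads increment one counter with no lock around the whole update."""
    cell = VersionedCell(0)

    def worker() -> None:
        for _ in range(increments):
            run_transaction(cell, lambda v: v + 1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return cell.value


if __name__ == "__main__":
    print(demo())
```

The appeal of doing this in hardware is that the programmer only marks the region, and threads that never actually touch the same data pay no serialization cost.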

While this might be a pretty specialized topic for our audience, the implications are impressive.  Increasing the parallelization of software is one of the key issues holding back innovation on many levels.  We have seen the GPU vendors fight this (think CUDA) for years, and Intel's continued push into the MIC (Many Integrated Core) markets will require it as well.

Graphics System Improvements

Perhaps more important than the x86 core changes are the improvements Intel has made to the integrated processor graphics.  While Ivy Bridge was rumored to be the death knell for discrete GPUs in the mobile market, both NVIDIA and AMD were able to find a place to market and sell their parts.  Haswell looks to be much less forgiving.

The truth is that the graphics and media overview for Haswell is very similar to that of Ivy Bridge, including the same six-domain partitioned architecture we saw at IDF last year.  Domain 1 includes the typical setup and front-end action, domain 2 handles rasterization, and domain 3 has the compute units (shaders) that Intel calls Execution Units (EUs).  The fourth domain houses the codec engine, domain 5 is for video enhancement, and domain 6 is for displays.

This iteration will include support for DirectX 11, OpenCL 1.2, and OpenGL 4.0.

This segmentation of the processor graphics allows for the same kind of modularity that the entire Haswell design is dependent on.  While the GT1 and GT2 options will still exist (as they do today with Ivy Bridge), the new hotness is the GT3 option, which essentially doubles the computing power of the GPU by doubling up on the block Intel calls a "slice".

As I mentioned before, Haswell has decoupled the ring interconnect from the CPU cores, so the GPU is able to push that bus harder to increase memory bandwidth without increasing the voltage to the CPU cores.  Doing so lowers the overall power consumption.

Obviously the setup stages of the processor graphics needed to be improved in order to handle the increased performance of the GT3 iteration, so Intel has doubled the performance of most fixed function units.  The setup is able to push about 500 GB/s of internal bandwidth, which should be enough to keep the execution units (EUs) of the GT3 fed.

Finally, the texture sampler on the new processor graphics will see as much as a 4x improvement for some modes.

During Dadi Perlmutter's keynote today, we did see a comparison between Ivy Bridge and Haswell running the DX11 Unigine Heaven benchmark – though no specific settings were given.

Though you can't really see it in a still photo, the Haswell result was easily a 2-3x improvement in frame rate, judging solely by how the animation looked.  While we will likely have to wait until Q1 or later in 2013 to get the full details, Haswell's graphics performance looks impressive.

September 11, 2012 | 09:30 PM - Posted by tbone (not verified)

AMD take notes!

September 12, 2012 | 01:27 PM - Posted by amadsilentthirst

I'm still rocking a 3Ghz E8400

Wondering if I can hang on until Haswell, not seen my CPU under much strain to be honest, even when gaming...but a new Motherboard RAM etc is on the cards soon anyway.

Do we have a rough idea on release of desktop chips?
June 2013?

September 12, 2012 | 05:22 PM - Posted by Ryan Shrout

I think that time frame is a pretty good guess, based on the amount of information we have been given at IDF and the Q1 2013 availability launch time given during the keynotes.

January 12, 2013 | 02:54 PM - Posted by arbiter

The leaked Intel road map you had here on the site said Q2, but seeing how close it is, only a few months or so off, if you can make do with that old E8000 you might as well wait.

September 13, 2012 | 11:43 AM - Posted by Nebulis01

I'm in the same boat. I won an Intel Core 2 Extreme Q9650 and built a system around that. I've yet to find any game that's truly pushed it. I was kind of hoping to move back with Piledriver but it will probably be a Haswell based upgrade.

June 7, 2013 | 05:30 PM - Posted by Hal Haswell (not verified)

With a surname of HASWELL I am mildly curious as to where the codename HASWELL came from. Anyone in a position to address this?

June 7, 2013 | 08:02 PM - Posted by Jeremy Hellstrom

at a guess, Haswell, CO, USA ... they've been using small cities as naming conventions for a while
