AMD Phenom 9600 and 9900 Review: Barcelona on the Desktop
The Phenom CPU Architecture
Today marks one of the most pivotal days in AMD's recent history -- the launch of their new desktop core architecture to replace the ever-lasting and ever-loved Athlon 64 core. Since 2003, the on-die memory controller and 64-bit addressing have made AMD a dominant player in the CPU world and not just a footnote in Intel's resume. Recent months have been hard on the Athlon 64, though: Intel's Core 2 series of processors are both powerful and power efficient, leaving AMD's aging architecture behind.
AMD is hoping that Barcelona, aka Agena, aka Phenom, will help push them back into the performance fold.
Introducing AMD's Agena / Barcelona Architecture
The architecture of the new AMD Phenom processor is based on the Barcelona core, known as the Agena core in the desktop market. We have taken a couple of looks at the Barcelona technology before, but for the final release of the desktop Phenom processor we'll go over the design changes in more detail.
Today's Phenom processor is built on a 65nm process, as shown above, and features about 450 million transistors on a die of about 288 mm^2. Compared to Intel's Yorkfield, which has over 800 million transistors, the Phenom core looks like a lightweight, though once you take Intel's 45nm process into account, Yorkfield actually has the smaller die at 214 mm^2.
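To put those die figures side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above; the 800 million figure is a lower bound ("over 800 million"), and the "ideal shrink" line assumes area scales with the square of the process node, which is a textbook simplification rather than anything AMD or Intel publish.

```python
# Figures quoted in the article; Yorkfield's count is a lower bound.
phenom = {"transistors_m": 450, "die_mm2": 288, "process_nm": 65}
yorkfield = {"transistors_m": 800, "die_mm2": 214, "process_nm": 45}


def density(chip):
    """Millions of transistors per square millimetre."""
    return chip["transistors_m"] / chip["die_mm2"]


# How much denser Yorkfield is than Phenom, based on the quoted numbers.
density_ratio = density(yorkfield) / density(phenom)

# Idealised gain from the node change alone (assumption: area ~ node^2).
ideal_shrink = (phenom["process_nm"] / yorkfield["process_nm"]) ** 2

print(f"Phenom:    {density(phenom):.2f} M transistors/mm^2")
print(f"Yorkfield: {density(yorkfield):.2f} M transistors/mm^2")
print(f"Density ratio: {density_ratio:.2f}x (ideal 65nm -> 45nm shrink: {ideal_shrink:.2f}x)")
```

On these numbers Yorkfield packs in at least twice the transistors per unit area, which is roughly what the process difference alone would suggest.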
HT 3.0 provides "up to" 20.8 GB/s of raw bandwidth if the frequency and bus width are topped out; the previous HT 1.0 and 2.0 revisions offered 6.4 GB/s and 8.0 GB/s of raw bandwidth. The new HT 3.0 bus on the Phenom processors will run at a default effective rate of 3.6 GT/s, compared to 2.0 GT/s for the HT 2.0 links on current Athlon AM2 processors. Phenom CPUs can still work in motherboards built for HT 1.0 and HT 2.0, as the interconnect specs are backwards compatible.
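For readers wondering where those bandwidth figures come from, here is a minimal sketch in Python. It assumes a 16-bit link, two transfers per clock, and both directions added together; the link clocks of 0.8, 1.0 and 2.6 GHz are our assumption rather than something stated above. Under those assumptions the arithmetic lands on the same 6.4, 8.0 and 20.8 GB/s figures.

```python
def ht_aggregate_gbs(clock_ghz, link_bits=16):
    """Aggregate HyperTransport bandwidth in GB/s for one link.

    Assumes double data rate (two transfers per clock) and sums both
    directions, which appears to be how the headline figures are quoted.
    """
    transfers_per_sec = clock_ghz * 2       # GT/s, double data rate
    bytes_per_transfer = link_bits / 8      # per direction
    return transfers_per_sec * bytes_per_transfer * 2  # both directions


# Assumed link clocks in GHz (not stated in the article).
HT_CLOCKS = {"HT 1.0": 0.8, "HT 2.0": 1.0, "HT 3.0": 2.6}

for name, clock in HT_CLOCKS.items():
    print(f"{name}: {ht_aggregate_gbs(clock):.1f} GB/s")
```

Reading the Phenom's default 3.6 figure as an effective transfer rate (that is, a 1.8 GHz link clock), `ht_aggregate_gbs(1.8)` works out to 14.4 GB/s, comfortably short of the "up to" number.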
The new memory controller hasn't changed much from the one in the Athlon X2 CPUs -- it still runs DDR2 memory officially at 800 MHz but more or less unofficially at 1066 MHz. AMD has already stated that DDR3 memory support will probably arrive in late 2008 or early 2009, when its pricing reaches parity with DDR2.
Phenom processors actually have TWO memory controllers on them -- both of which are 64-bit. They can work in tandem (known as ganged) with matched DIMMs and thus provide a full 128-bit memory access. Where the DIMMs do not match, or when the user changes a BIOS setting, they can work independently to provide standard dual-channel 64-bit memory accesses. The performance difference between these two settings is still an open question, for reasons we'll explain in a bit.
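It is worth noting that the two modes do not differ in theoretical peak bandwidth, only in how accesses are spread across the controllers. A quick sketch in Python, treating the quoted 800 MHz and 1066 MHz as the DDR2-800 and DDR2-1066 transfer rates (our reading, in millions of transfers per second):

```python
def peak_bandwidth_gbs(transfers_mts, bus_bits):
    """Theoretical peak bandwidth in GB/s for a memory bus."""
    return transfers_mts * 1e6 * (bus_bits / 8) / 1e9


for speed in (800, 1066):
    ganged = peak_bandwidth_gbs(speed, 128)       # one 128-bit channel
    unganged = 2 * peak_bandwidth_gbs(speed, 64)  # two independent 64-bit channels
    print(f"DDR2-{speed}: ganged {ganged:.1f} GB/s, unganged {unganged:.1f} GB/s")
```

Both columns come out identical, so any real-world gap between ganged and unganged operation has to come from access patterns rather than raw throughput.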
The new L3 cache is shared by the four cores of the Phenom processor, which use it to communicate with each other as well as to cut down the time spent waiting on main memory. The cache is also used to buffer data being written to memory, reducing the frequency of writes to system memory. This should in turn improve the performance of memory reads, which are typically more important for system speed.
The Phenom also implements the same C1E power state that Intel's Core 2 Duo processors use for power savings; this is important because Windows and other operating systems rely on it to improve power efficiency. AMD's chips can enter it when all four cores are idle, at which point they physically disconnect the HyperTransport links, put the memory modules in a lower power mode and lower internal clocks as well.
We should note of course that even though all of these new power features are great for users of motherboards and chipsets that support them, the Phenom chips are also backwards compatible with parallel VID control used in the current generation of AM2 motherboards.
There are other performance enhancements to the Barcelona core as well, including wider data paths from the memory controller to the cores and 128-bit wide floating point units. These 128-bit units can execute a full 128-bit SSE operation in a single pass, for example, where the Athlon 64 had to split it into a pair of 64-bit operations.