Some Initial Thoughts on Propus, AMD's Quad-core Athlon
Lower Price, Better Margins (hopefully)
A few days ago Jeremy linked to an article which showed off the Propus die from AMD, along with a few tidbits of information. Propus is a quad core design that is based on the Phenom II architecture, but with a few changes. This initial analysis of Propus assumes that the die shot is real and that the source's information about die size is correct.
At the beginning of this month I reviewed two new dual core products from AMD. The Phenom II X2 550 is not all that new, as it is a full quad core Phenom II part (Deneb) with two cores disabled (making it a Callisto). It still features the full 6 MB of L3 cache of Deneb, so overall it is a very fast part for a dual core. The other product, the Athlon II X2 250, is far more interesting in many ways. This product, codenamed Regor, is also based on the Phenom II architecture, but it does not feature any L3 cache and it has larger L2 caches, at 1 MB per core.
The full Deneb core which powers the Phenom II series of parts. It is one large part, but it has very little wasted space.
The performance of the Athlon II X2 250 is fairly close to that of the much larger, and more expensive, Phenom II X2. The Phenom II is faster for two reasons: its 100 MHz clockspeed advantage, and the large L3 cache that the cores can communicate with. The Athlon II X2 still does very well though, and it is a relatively small part compared to the approximately 264 mm squared Phenom II X4/X3/X2. At 117.5 mm squared, it is nearly as small as the 45 nm Core 2 Duo, which is around 107 mm squared.
Looking at the die shot, it becomes readily apparent that Propus utilizes 512 KB of L2 cache per core, down from the 1 MB of L2 per core with Regor. This cuts down the transistor count per core by a significant amount, as well as the overall die size of the processor. If the information is correct, Propus has a relatively slender die size of 164 mm squared, which is pretty terrific for a modern, monolithic quad core part. In fact, it is far below the 214 mm squared that a Core 2 Quad sports.
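To put the die sizes quoted above side by side, here is a small sketch that works out the ratios. The areas are the figures from this article (the Propus number is the rumored one and may turn out to be wrong); the dictionary keys and the `area_ratio` helper are just names made up for this illustration.

```python
# Die sizes in mm^2, as quoted in this article. The Propus figure is
# rumored and unconfirmed; the others are approximate.
DIE_AREA_MM2 = {
    "Deneb": 264.0,         # Phenom II X4/X3/X2
    "Regor": 117.5,         # Athlon II X2
    "Core 2 Duo": 107.0,    # 45 nm dual core
    "Propus": 164.0,        # Athlon II X4 (rumored)
    "Core 2 Quad": 214.0,   # total for the quad core package
}


def area_ratio(part: str, reference: str) -> float:
    """Return the die area of `part` as a fraction of `reference`."""
    return DIE_AREA_MM2[part] / DIE_AREA_MM2[reference]


if __name__ == "__main__":
    comparisons = [
        ("Propus", "Deneb"),
        ("Propus", "Core 2 Quad"),
        ("Regor", "Core 2 Duo"),
    ]
    for part, reference in comparisons:
        ratio = area_ratio(part, reference)
        print(f"{part} is {ratio:.0%} the size of {reference}")
```

On these numbers Propus comes out at roughly 62% of the Deneb die and about 77% of the silicon in a Core 2 Quad, while Regor is about 10% larger than the 45 nm Core 2 Duo.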
The only real problem here is that AMD has not done a whole lot to the Phenom II architecture, and the lack of large L2 caches, as well as the total lack of an L3 cache, will make the product slower than a comparably clocked Phenom II. The architecture certainly appears to enjoy larger caches, even though it has a pretty fast integrated memory controller. Plus the lack of L3 cache in a quad core design makes for interesting cache coherency issues. From my understanding, the crossbar that is built into the architecture does not actually work as intended, and cache coherency is handled either by the L3 or main memory. This is not exactly a deal breaker, as the Core 2 Quads have to utilize main memory for cache coherency as well (the two Core 2 Duo dies used in the Core 2 Quad products are connected by the FSB, and not by any kind of internal crossbar or shared L3 cache). How much slower Propus will be is a big question, as it also depends on how fast AMD decides to clock the northbridge portion of the chip. A northbridge clocked faster than the standard 2 GHz found on the Phenom IIs will lower overall latency to main memory and help to better utilize available bandwidth. Obviously no official word on clockspeeds has been given, but we can expect the cores to be clocked from 2.4 GHz to 3.0 GHz at release.
Propus will only feature one HyperTransport connection, as it will not be used in the server market. There is a bit of wasted die space around the edges where the other HT units would usually be located. The entire southern edge of the die is dominated by the memory controller.
The last thing that needs to be covered here is the potential TDP of these products. Since the Propus core is around 60% of the size of the Phenom II, we can expect the lower 2.4 GHz versions to be around 65 watts TDP, and potentially as low as 45 watts TDP. The higher 3 GHz version would probably hit around 89 watts in the worst case scenario. Something else to consider is that with the continual advances on GlobalFoundries' 45 nm process, and the lack of large caches on the chip, we may even see TDPs dip into the 75 watt area for a 3 GHz part. This would make the part very competitive with what Intel has to offer, and because the die size is so much smaller than competing quad cores from Intel, we can expect prices to be in the sub $160 range for the highest clocked Athlon II X4 (and obviously X3 versions as well).
AMD has started to clean up their sub $160 product offerings, and we have already seen the Athlon X2 7750 quietly fade away from retail availability. By the time of the release of this product, I doubt we will see many 65 nm parts from AMD in any significant quantities. This move should improve AMD's margins, while still allowing them to be more competitive with Intel in the value market.
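The margin argument comes down to how many chips fit on a wafer. As a rough sanity check, the sketch below applies a commonly used gross-dies-per-wafer approximation to the Deneb and rumored Propus die sizes. It assumes 300 mm wafers (my assumption, not something stated above), ignores yield, scribe lines, and die aspect ratio entirely, and `gross_dies_per_wafer` is just an illustrative name, so treat the output as a ballpark figure only.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float,
                         wafer_diameter_mm: float = 300.0) -> int:
    """Estimate candidate dies per wafer, before any yield loss.

    Uses the common approximation
        wafer_area / die_area - pi * diameter / sqrt(2 * die_area)
    where the second term accounts for partial dies lost at the edge.
    The 300 mm default wafer diameter is an assumption.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


if __name__ == "__main__":
    deneb = gross_dies_per_wafer(264.0)   # Phenom II, approximate die size
    propus = gross_dies_per_wafer(164.0)  # rumored Propus die size
    print(f"Deneb:  ~{deneb} candidate dies per wafer")
    print(f"Propus: ~{propus} candidate dies per wafer")
    print(f"Propus/Deneb: ~{propus / deneb:.2f}x")
```

Under these assumptions Propus gets somewhere around two thirds more candidate dies out of each wafer than Deneb, and smaller dies also tend to yield better, which is where the improved margins would come from.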
Regor, otherwise known as the Athlon II X2, is a very compact and efficient design. If AMD had decided to cut performance down another notch and go with 512 KB L2 caches, it likely would have been a 100 mm squared chip at most. A smaller die, however, would have posed other engineering issues, such as how to further shrink the memory controller and HyperTransport controller so that they would fit around the edge. By adding that extra 512 KB of L2 cache per core, AMD avoided having to redesign those units and had enough space around the edge to insert them essentially unchanged from the earlier Deneb/Shanghai designs.
It appears as though we will not see the new Athlon II X3 and X4 products until later this summer. My guess would be a late July/early August launch so as to get some momentum with the back to school crowd. Considering how the Athlon II X2 has performed, I am looking forward to the release of the other members of the Athlon II family. While they will not beat the Core 2 Quads at the same clockspeed, my guess is that they certainly will beat what Intel has at those specific price points.