AMD FX-8150 Processor Review - Can Bulldozer Unearth an AMD Victory?

Author: Ryan Shrout
Subject: Processors
Manufacturer: AMD

Bulldozer Architecture (continued)

The text below was taken from Bulldozer at ISSCC 2011 - The Future of AMD Processors.

The second topic covered at ISSCC was the “40-Entry Unified Out-of-Order Scheduler and Integer Execution Unit for the AMD Bulldozer x86-64 Core”.  Single-thread performance is still of great importance for modern processors, and it is an area where AMD has lagged behind the competition.  The first work toward better single-thread performance went into fetch/prefetch, branch prediction, and decode.  AMD still has not covered those portions in depth, other than to say that a lot of work has been done on each individual unit.

Each integer unit has its own scheduler.  Each integer unit comprises two execution units and two address generation units.  The execution units are further specialized so that one handles multiply and the other divide.  These again are newly designed units that have very little in common with previous processor architectures.

The schedulers have some very interesting wrinkles to them.  First is the support for 40-entry, out-of-order scheduling.  Each scheduler also supports up to four 64-bit instructions in flight.  Michael Golden presented the paper, and his description of the clock characteristics of these tightly knit units is as follows:

The out-of-order scheduler must efficiently pick up to four ready instructions for execution and wake up dependent instructions so that they may be picked in the next cycle. The execution units must compute results in a single cycle and forward them to dependent operations in the following cycle. All of this is required so that the module gives high architectural performance, measured in the number of instructions completed per cycle (IPC).
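The pick-and-wake behavior described in that quote can be sketched with a toy scheduler model.  This is a hedged illustration only: the class, the function names, and the sample instruction stream are invented for the example and are not AMD's actual logic; a 40-entry window would simply cap the size of the list passed in.

```python
# Toy model of the "pick up to four ready instructions, then wake their
# dependents" loop described in the quote. All names are illustrative.

class Instr:
    def __init__(self, name, deps):
        self.name = name        # label for display
        self.deps = set(deps)   # labels of instructions whose results we need

def schedule(window, width=4):
    """Greedily issue up to `width` ready instructions per cycle."""
    done, cycles = set(), []
    pending = list(window)
    while pending:
        # "pick": ready instructions are those with all dependencies complete
        ready = [i for i in pending if i.deps <= done][:width]
        if not ready:
            raise RuntimeError("dependency cycle in instruction window")
        cycles.append([i.name for i in ready])
        # "wakeup": results forwarded this cycle make dependents pickable next
        done |= {i.name for i in ready}
        pending = [i for i in pending if i.name not in done]
    return cycles

prog = [Instr("a", []), Instr("b", []), Instr("c", ["a"]),
        Instr("d", ["a", "b"]), Instr("e", ["c", "d"])]
print(schedule(prog))  # [['a', 'b'], ['c', 'd'], ['e']]
```

The point of the model is the IPC constraint Golden describes: an instruction picked in one cycle forwards its result so its dependents can be picked in the very next cycle.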

Perhaps the most interesting aspect of these new designs is the use of standard cells versus fully custom cells.  Place and route of standard cells can be automated, making it relatively easy to create complex designs quickly.  Custom cell layout is very complex and time consuming, but it is more power efficient and switches faster than standard cell designs.  Somehow AMD has taken a standard cell design, implemented on GLOBALFOUNDRIES' 32 nm SOI process, and made it perform at custom cell levels.  The integer execution units and the scheduler run at the same 3.5 GHz+ speed as the rest of the chip, even though portions of the design are made with standard cells.


The full 8-core / 4-module Bulldozer Architecture found in AMD FX.

This apparently has allowed AMD to rapidly prototype these designs.  That means delivering to market faster than with a fully custom part, and it also allows AMD to further test the performance and attributes of the standard cell design and change it without the time and manpower constraints of custom cells.  How AMD has achieved this is beyond me; implementing standard cell design rules while achieving custom cell performance has been the holy grail of CPU/GPU design.  Obviously this has limitations, as the entire processor is not built from standard cells.  I believe Intel also utilizes some standard cell techniques in its latest series of processors, so AMD is not exactly alone here.

Power

Previous AMD processors were not designed from the ground up to implement complex and efficient power saving schemes.  Since Bulldozer is an entirely new design, the engineers were able to build power saving into the processor far more effectively.  Over the years we have seen small steps forward from AMD in power saving techniques, but Bulldozer will be the first desktop/server product with a fully comprehensive suite of power saving technologies.

The CPU, in typical workloads (which obviously do not include "Furmark" in SLI/CrossFire situations), draws the majority of power in a system.  Reducing power draw significantly at that one component therefore decreases overall system draw to a great degree.

AMD now has fully gated power to the individual cores, which allows them to be completely turned off when not in use.  Sharing functional units (such as fetch and decode) between cores, rather than replicating them per core, also cuts down on the complexity, and thereby the power draw, of the overall processor relative to its number of logical cores.  The clock grid (which distributes clock signals throughout the processor) has also been radically redesigned to be less of a power sink while still keeping the processor clicking along efficiently.

Clock gating, which turns off individual components such as execution units, has been implemented much more thoroughly.  There are roughly 30,000 clock enables throughout the design, which should allow an unprecedented amount of power savings (and heat reduction) even when the CPU is at high usage rates.  Even though a processor might be at 100% utilization, not all functional units are being used or need to be clocked.  By having highly granular control over which units can be gated, overall TDP and heat production can be reduced dramatically even at high utilization rates.
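As a rough illustration of why per-unit clock enables matter, here is a toy power model.  All unit names and wattages below are made-up numbers for the example, not AMD figures; the real design has around 30,000 enables at far finer granularity.

```python
# Toy model of fine-grained clock gating: dynamic power is only burned by
# units whose clock enable is asserted. Names and wattages are invented.

UNIT_POWER_W = {
    "int_alu0": 1.2,      # integer execution unit 0
    "int_alu1": 1.2,      # integer execution unit 1
    "fpu": 3.0,           # shared floating-point unit
    "decode": 2.0,        # shared fetch/decode front end
    "l2_interface": 1.5,  # cache interface
}

def dynamic_power(enabled_units):
    """Sum dynamic power over the units whose clock enables are asserted."""
    return sum(UNIT_POWER_W[u] for u in enabled_units)

# An integer-only workload at "100% utilization" still leaves the FPU gated:
everything_on = dynamic_power(UNIT_POWER_W)                       # about 8.9 W
int_workload = dynamic_power(["int_alu0", "int_alu1", "decode"])  # about 4.4 W
```

The gap between the two numbers is the payoff of gating: utilization as reported by the OS says nothing about which functional units actually need a clock.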

AMD Turbo Core has also received a great deal of attention.  The current Turbo Core in the X6 processors is somewhat underwhelming given the overall complexity of AMD's implementation.  For example, when three or fewer cores are being utilized on the X6 1090T, those cores will clock up to 3.6 GHz while the other three drop to 800 MHz.  There is no real fine tuning of performance or TDP here, just an "on/off" switch that clocks half of the cores 400 MHz higher while downclocking the rest.  This is fairly basic compared to Intel's system.  Now it seems that AMD is implementing a scheme much like Intel's: Turbo frequencies that vary with the number of active cores, much closer to what Intel offers with Sandy Bridge.
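The contrast between the two schemes can be expressed as a pair of tiny lookups.  The X6 1090T clocks (3.2 GHz base, 3.6 GHz turbo at three or fewer active cores, 800 MHz for idle cores) come from the description above; the graded table shows only the shape of a per-core-count scheme, with invented values that are not real Intel or AMD specifications.

```python
# Sketch of the two turbo schemes discussed above. Graded-table values are
# illustrative assumptions, not vendor data.

def turbo_x6_1090t(active_cores):
    """Coarse on/off scheme: half the cores turbo, the rest drop to 800 MHz."""
    if 1 <= active_cores <= 3:
        return {"active_mhz": 3600, "idle_mhz": 800}
    return {"active_mhz": 3200, "idle_mhz": 3200}

# A finer-grained, Sandy Bridge-style approach instead picks a clock for
# each active-core count:
GRADED_TURBO_MHZ = {1: 4200, 2: 4100, 3: 3900, 4: 3800}

def turbo_graded(active_cores, base_mhz=3600):
    """Per-core-count turbo table, falling back to base clock when loaded."""
    return GRADED_TURBO_MHZ.get(active_cores, base_mhz)
```

The graded table is what lets a processor trade TDP headroom for clock speed continuously rather than flipping a single switch.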

Thanks to Bulldozer's ground-up design, with its focus on decreasing power draw and heat production, we will see a nice reduction in power utilized across the entire processor.

 

Bulldozer is a comprehensive blank-sheet design, a jump very similar to the one the company took going from the K5/K6 to the original Athlon.  AMD certainly hopes that it will be able to compete more adequately with Intel in overall performance per watt, as well as die size and transistor count.  When the Phenom was originally detailed, many thought it would prove to be the counter to the Core 2 that AMD needed, but unfortunately that design was not forward-thinking enough to compete adequately.  Up through the current generation of parts, Intel was able to use fewer transistors and a smaller die to create products significantly faster than what AMD could provide.

October 12, 2011 | 01:23 PM - Posted by Oskars (not verified)

The only other redeeming answer could be that production techniques for Bulldozer wafers must be dirt cheap and fast paced. Some Interlagos 16-core processors were mentioned to be around 85 W at 1.6 GHz, so in some "lucky wafer" cases it could be considered an efficient chip. Hard to fathom if that is worth anything.
But really, if software wasn't ready for proper Bulldozer computing scenarios, they could just have made an advanced 8-core Thuban based on Llano cores, with the additional L3 cache and the already enlarged 1MB L1 cache per core. That shouldn't have taken more than 1.5 billion transistors. There's just the question of production cost...


October 13, 2011 | 05:30 AM - Posted by Anonymous (not verified)

Could it be because most applications are compiled using the Intel compiler and are therefore optimized for that architecture?

I've read about this before: removing something from Intel's compiler boosted VIA CPU performance by almost 50%.

Kindly explain, Ryan. Thanks.

October 13, 2011 | 02:25 PM - Posted by mpiggott (not verified)

I was wondering the same; for a long time Intel was purposefully limiting SSE optimization to CPUs which returned Intel's manufacturer string. (Instead of using CPU feature flags as intended.)

I believe Intel agreed to end this practice last year, but it depends on when they actually implemented it and when the affected benchmarks were released....

Unfortunately, as end users may also be using software compiled with the bogus compilers, the results shown may be representative (until people stop using old software).

October 13, 2011 | 07:07 PM - Posted by Ryan Shrout

No guys, that is not the issue here. And anyone who says the compiler is getting a 50% perf advantage is probably lying.

October 13, 2011 | 02:13 PM - Posted by mpiggott (not verified)

I would imagine a large chunk of the transistor difference is from the difference in L2 cache sizes.

October 12, 2011 | 09:13 PM - Posted by Anonymous (not verified)

I have a system with a GTX 580 and an older Intel i5-750 processor. I can run all of the games that I've seen tested at almost exactly the same frame rate (within 4 FPS, on games running 60 FPS or less) as systems with better processors. (Although, if you want SLI/CrossFire, older processors like mine may not keep up.)

If you are running a system just for gaming, it seems more useful to have a beefy GPU. I think the way the games were tested in this review is perfectly acceptable, because it shows REAL WORLD gaming performance.

(I run all my games at 1080p. My i5-750 is clocked at 3.8 GHz with turbo on (air cooling). I use an EVGA P55 FTW motherboard. My GTX 580 runs at stock clocks.)

Nice review, keep it up guys!

October 13, 2011 | 12:24 AM - Posted by Bill (not verified)

Not everyone is running 1080p monitors yet. I run at 1680x1050 and will do so for quite a while yet. So to me at least, CPU performance in a game definitely matters. When I saw just how pathetically the 8150 did in these benchmarks, I couldn't believe it.

October 13, 2011 | 06:39 AM - Posted by mike2977 (not verified)

Ok, so for my next computer, I have all the parts except the CPU and motherboard. I was planning to go Bulldozer instead of Sandy Bridge, but now I'm wondering if that's the best decision. All things being equal, and price not an issue, would one want to go top of the line FX or top of the line i7? All around machine; some games; some digital processing.

Then there's the SSD issue. About a year ago I put an SSD into an HP Core i7 desktop and reinstalled Windows on 64 GB, with everything else on a 1 TB drive. I was thinking of going the same way with the new computer (a larger 3rd-generation SSD), but now, with the new Intel chipset and motherboards capable of caching with a small SSD, that enters into the equation of deciding between Intel and AMD. Again, which would be the 'better' machine? For gaming? For video processing?

Perhaps a good discussion topic for This week in Computer Hardware?

Mike Ungerman
Orlando, Fl

October 13, 2011 | 07:29 AM - Posted by Anonymous (not verified)

I’m more interested in the performance in a virtual environment. How does it perform with VMs (VirtualBox, hypervisors, etc.)? From a cost point of view, will it handle my needs, or should I spend the extra $$$$ for Intel? Thanks

October 13, 2011 | 07:34 AM - Posted by KILLfactor

1st, thanks for testing and showing a Core 2 Quad in your review... many people still have Core 2 Duos/Quads, as they pretty much put Intel back on the map a few years ago and are still very good CPUs to this day.

I have a Q9550 @ 4 GHz on air and it's perfect.

2nd, though, I'm disappointed with the gaming benchmarks and reviewing, because you used very few games and only one video card. What the results show is a GPU limitation; they are not really testing the CPU.

This kind of testing only shows one thing, which is pretty damn obvious: at high resolutions and settings in games, even a single GTX 580 is the limit and the CPU is idling.

These tests do not show the strengths and weaknesses of a CPU, as the CPU is not working hard at all (GPU limit).

You either need to lower the resolution to show how well the various games use the cores and respond to different CPUs, or use SLI/CrossFire setups, which DO often put A LOT more stress on the CPU and separate the sheep from the lions :)

Please do SLI or CrossFire testing and let's see how this CPU holds up!

October 14, 2011 | 08:22 AM - Posted by Prodeous (not verified)

Did you use the ASUS motherboard that was supplied as part of the kit from AMD? If so, there might be some issues related to the board. Post below.

http://www.tomshardware.co.uk/forum/315775-10-asus-crosshair-giving-bias...

There is some information that the ASUS Crosshair is not performing as well. Two sites used other motherboards, ASRock as well as Gigabyte, and showed a much different picture of performance.

I would really like to see verification from my trusted site.

October 14, 2011 | 10:01 AM - Posted by Ryan Shrout

I guess I'll ask around, but I am about 99.99% sure that the motherboard isn't making a big difference here. If the large majority of sites saw the same results and none of us thought anything fishy was going on, chances are it wasn't.

But like I said, I can test another board from MSI or Gigabyte after the weekend when I return home.

October 14, 2011 | 12:58 PM - Posted by Craig (not verified)

I believe that AMD might be holding out a little here.

Think about it long term: AMD (unless I am misinformed) has stopped production on everything that doesn't use the new Bulldozer design. I think they did it a while ago.

Now they have these Bulldozers coming in equal to the Phenom II X6s. Piledriver is due out Q1 next year, so I'm thinking it's either:

Bulldozer is meant to replace all current AMD chips. This brings all AMD users up to the same platform (AM3+), and all their factories can focus on streamlining the manufacturing of these new FX models. Then Piledriver will come in, replacing the FX-8150 as the flagship, and be so far up Intel's smoke pipe that they sit there and think, WTF just happened here?

OR

AMD scrapped all previous AM3/AM2+/AM2 manufacturing and started making these, then realized something was up and they were not performing. So to buy themselves some time, they released these (which aren't bad; they're not great, but not bad either) and are now working their asses off perfecting it with Piledriver, letting Intel snigger for the moment, as AMD has another Athlon 64 up their sleeve; they just need to fix the kinks.

Or option 3:

The AMD CPU division is now run by trained chimps and is about to sink.

October 14, 2011 | 04:29 PM - Posted by 3dbomb (not verified)

Option 4

AMD realise the real sustainable money is in servers. Everything about Bulldozer points to AMD migrating slowly from client to server. Do you really think AMD's plan to bring out a great CPU for gamers was to go for an 8-core model when games just aren't that well threaded, and that's likely to be the case for years to come?

The marketing spin from AMD is transparent to anyone who knows tech. They're trying to sell you a CPU that's transitioning to being a full-on server design. Massively threaded: just what the server world wants.

I think they have a good few years yet of trying to squeeze every last bit of profit out of the value market, the gamers, and the enthusiasts, but their plan appears to be simple: slowly increase the clock speed of Bulldozer over the next x years and make a real play for the high-end server market.

Think about it this way: AMD is a small company compared to Intel. It doesn't have the resources to develop CPUs that will win big in all the different markets CPUs play in. So why not sell server CPUs to the clueless, use the bad parts for the value market, and focus all real resources on making the best server CPUs? They can beat Intel on price, and Intel can do nothing but lower prices to compete, something they've never wanted to do in the server space.

It's interesting watching this play out. All this nonsense about compilers and Windows 8 unleashing the true power of Bulldozer. The real story for me is how AMD has managed to convince at least two markets that it's making CPUs for them with just a little spin, when in reality it's passing off its R&D to gamers and enthusiasts. (AMD knew it would take several years to build up to really fast, massively parallel server-class CPUs, so why not sell that research along the way as the Bulldozer FX-8150, the 8-core super CPU for a new generation!) Finally, it uses its failures at the factory to supply the value market with a few cores; they don't need more, and it's pretty much free money to keep the server machine fueled.

It's pure genius really if you stop and look at the big picture. Pretty crappy for the AMD fans that have supported them all these years, but maybe the moral there is: don't think of huge corporations as your best friend, heh.

October 14, 2011 | 06:08 PM - Posted by drbaltazar (not verified)

http://www.hardwareheaven.com/reviews/1285/pg1/amd-fx-8150-black-edition...
These guys gained some!
ASUS boards are AM3 and the FX is AM3+.
Also, it's a CPU+GPU against a CPU (i7-2600K vs. FX-8150). Come on, if they are not both plain CPUs then websites shouldn't benchmark them against each other; comparing a CPU+GPU to a CPU doesn't make sense.
I can't enumerate all the issues, but suffice it to say that most of the benchmarks on the web are bogus. HardwareHeaven didn't use the AMD kit, but they still compared it to the i7-2600K, lol, so even though they got better numbers it is still useless data.
I sure hope websites compare apples with apples, like the i7 960 to 990 series; those are CPUs, not CPU+GPUs!

October 17, 2011 | 08:43 AM - Posted by RickCain (not verified)

I suspect it's merely a problem of the software having to catch up with the hardware. It took quite some time before the AMD64 even had 64-bit software to run, and initial tests had 32-bit equivalents spanking the 64-bit systems.

Not bad hardware, just bad programming.

AMD went out on a limb with a completely new architecture. Intel is just squeezing what's left out of Core 2.

November 4, 2011 | 02:18 PM - Posted by Kenmore (not verified)

The way I look at it is this: soon your PC will be gone and you will be running your monitors off of thin clients. So if AMD can beef up their CPUs to run several thin clients, even for gaming, then they will be way ahead, and everyone will look back thinking, wow, AMD was right on the money switching to 8+ core CPUs, especially since most games get their speed from GPUs these days anyway. Dunno, just a thought.

I can just imagine everyone having a main server in their house. I am already in the process of setting that up as we speak. Still in the planning stage, but I think it only makes sense, outside of my gaming rig that is. Just need to figure out a few details. But I am thinking I may use the Bulldozer as the CPU in the server, unless something better comes out by then. My house is already hardwired with Cat5 in every room, so it makes sense to me, unless anyone has a better suggestion.

November 21, 2011 | 11:18 AM - Posted by meshrakh (not verified)

It's just a matter of time until AMD regains the king of the hill that Intel has held since the C2D. But will the tide turn when Intel has already washed every shore of opportunity with their vast resources? Let's face it: even though this chip seems to be a failure, it has opened up a whole new thing in the computer world. Multi-threading is a thing of the past; "multi-core" functionality is soon to rise. Let us be thankful that a company such as AMD has the guts to restructure the processor, so that we can see new insight come out of it. Bulldozer may not compete with the SB i5 and i7, but it will give software developers, especially Microsoft, the push to utilize those monstrous 8-core chips for a better performing computer. Remember, "two is better than one"; the time will come when computers recognize that one is not two.

January 11, 2012 | 06:04 AM - Posted by Anonymous (not verified)

One thing we need to keep in mind: eight cores will only be fully supported by Windows 8. Right now AMD and Microsoft are working hand in hand on a patch to at least boost eight-core performance in Windows 7.

March 6, 2012 | 07:47 AM - Posted by Anonymous (not verified)

There's a patch for all versions of Windows out already: it's called Linux!

April 4, 2012 | 05:01 PM - Posted by Jeff (not verified)

Well, I can say one thing after looking at how many programs are compiled: most are optimized with Intel's instruction set and not AMD's. AMD has its own set of CPU instructions for the FX chip, and as of yet no... programs or benchmarks have been compiled with them. With the new MSVC 2011 the AMD instructions will be available for devs, but it will take some time for them to reach the market.

April 4, 2012 | 05:03 PM - Posted by Jeff (not verified)

I need to correct myself: the Windows 8 preview does have some of the AMD instructions precompiled.

May 8, 2012 | 03:30 PM - Posted by skeptic007 (not verified)

People bash AMD too much; it really isn't that bad. It's mainly multi-purpose, for doing many things at once on one computer. AMD is much better than Intel in that respect.
Understand that Intel has a set standard, so it's easy to work with. You can overclock, but it's not really meant for that.
AMD is meant for overclocking; I don't know a single AMD product that's not overclocked. What I've noticed when I've run tests with an AMD product is that the more I have on my screen, the faster it gets. AMD is a product that needs to be worked, while Intel has that set standard; my testing with Intel is that the more I have and do on my screen, the slower it gets. But I do consider Intel to have the advantage, because people want that set standard: if Intel is working at 80% it'll stay there, while AMD will be at 70% and needs to be worked to get there. So if you want AMD to work faster, open up a lot of pages and start working it.
If you're looking for just gaming, Intel is the way to go, but if you're cool and do a bunch of other stuff, AMD is for you.
And I would recommend GTX graphics cards; they're the best in my opinion. But I haven't worked with AMD graphics cards, so I can't comment on those; it's just what I use.

June 14, 2012 | 10:25 PM - Posted by Madhavi (not verified)

Sir,
I want a new PC for animation and graphics design. I am not that technically sound. Someone suggested the FX 8150 to me. Can you help me?

July 4, 2012 | 08:11 AM - Posted by honestann (not verified)

Just to throw in a comment that is a bit special-case, but certainly matters to me. I've been writing a 3D game/simulation engine for a while now, and all of a sudden I noticed my linux computer (with FX8150) was much faster than my windoze computer (with a slightly older 4-core Phenom II CPU at the same clock speed).

When I tracked down the reason, it was because the older Phenom II cannot execute the 256-bit AVX/FMA instructions. I have 32-bit and 64-bit versions of key SIMD assembly-language routines, and the 256-bit AVX/FMA versions are almost twice as fast! Since they are fairly key routines, this one advantage of the FX8150 (AVX/FMA) makes a huge difference to me!

I just bought a new motherboard and another FX8150 for my windoze computer, so it is on a level playing field with my linux system.

PS: From my perspective, 64-bit SIMD with 16 ymm registers and AVX/FMA instructions is a BIG deal. True, many people couldn't care less, and many applications that could benefit haven't been rewritten to take advantage of these new instructions.

Oh, and BTW, the speed comparison between my assembly-language routines and compiled C code with optimization turned up to maximum is hilarious: 6 to 12 times faster!

July 14, 2012 | 11:17 PM - Posted by Anonymous (not verified)

Comparisons are invalid unless you use 1866 memory with the 8150. The 1090T does not support 1866. Why would you dumb down a comparison when you could show the 8150 with 1866 vs. a 1090T with 1333? Why not show the best they both can do?

A major advantage of the 8150 is the ability to run faster memory.
