AMD's Processor Shift: The Future Really is Fusion

Author: Josh Walrath
Subject: Editorial
Manufacturer: AMD

The Future is Still Fusion

So where am I going with all this?  Well, the answer is obvious and has been looking us in the face.  Up and down the line, AMD is going with the APU.  We just have not gotten to the point where this is applicable to every marketplace.  There are also some very large hurdles still in the way.

Kaveri will be the first true incarnation of the Fusion idea.  It will feature hUMA support, which gives the CPU and GPU portions a shared memory address space, as well as other tricks and features which will allow the GPU portion to handle greater workloads outside of graphics when necessary.  Kaveri will be hampered at the beginning due to the lack of software support.  The HSA Foundation still has not finalized the specification, and that is expected next year.  Software coding is still in the relative dark ages and will stay that way until tools like HSAIL (HSA Intermediate Language) are implemented.  C++ and Java are on their way to natively supporting hUMA type architectures, but they again are not quite there yet.  Most of the heavy lifting on the software side will appear in late 2014.
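As a rough illustration of why a shared address space matters, here is a toy model in Python.  It is not HSA or hUMA code and uses no real GPU API; the function names, the `TransferLog` class, and the 8 bytes per value are all assumptions made up for this sketch.  It only counts how many bytes have to be copied when the "GPU" works on its own buffer versus when it works on the same memory the CPU already holds.

```python
from dataclasses import dataclass
from typing import List

BYTES_PER_VALUE = 8  # assumption: each value is a 64-bit float


@dataclass
class TransferLog:
    """Hypothetical counter for bytes copied between CPU and GPU memory."""
    bytes_copied: int = 0


def run_with_separate_memory(data: List[float], log: TransferLog) -> List[float]:
    """Model a GPU with its own memory: copy in, compute, copy back."""
    device_buffer = list(data)                      # copy host -> device
    log.bytes_copied += len(data) * BYTES_PER_VALUE
    device_result = [x * 2.0 for x in device_buffer]
    host_result = list(device_result)               # copy device -> host
    log.bytes_copied += len(device_result) * BYTES_PER_VALUE
    return host_result


def run_with_shared_memory(data: List[float], log: TransferLog) -> List[float]:
    """Model a shared address space: the 'GPU' updates the same list in place."""
    for i, x in enumerate(data):
        data[i] = x * 2.0                           # no copy is recorded
    return data


if __name__ == "__main__":
    separate_log, shared_log = TransferLog(), TransferLog()
    run_with_separate_memory([1.0, 2.0, 3.0], separate_log)
    run_with_shared_memory([1.0, 2.0, 3.0], shared_log)
    print("separate memory, bytes copied:", separate_log.bytes_copied)
    print("shared memory, bytes copied:", shared_log.bytes_copied)
```

The model leaves out everything that makes real hardware hard (cache coherency, bandwidth, latency), so treat it as a picture of the idea rather than a performance estimate.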

The Gigabyte F2A88X-UP4 is a very full featured board aimed directly at enthusiasts.  It will likely be quite affordable as well.

This situation is actually quite similar to the transition to 64 bit processing.  AMD released the Athlon 64 series into a marketplace without a functional 64 bit OS on the desktop side.  Microsoft eventually came out with WinXP 64, but it was far from a runaway success and support for many peripherals was often quite lacking.  It was not until Vista and Win7 that we had a fully functioning 64 bit OS.  By the time these were released, Intel had regained the lead in performance with 64 bit processing.  AMD will be the first out of the gate with a fully functional hUMA APU.  Intel does have a heterogeneous processing product in Knights Corner, but this is not a desktop product.  In fact, Knights Corner sits in a similar position to the consumer as Itanium did back in 2004.  This is not to say that Intel does not have a plan for heterogeneous computing, as we have seen their integrated graphics portion become much more serious in terms of performance and compatibility.  So far Intel has not laid out plans to implement a solution like HSA/hUMA, but as with desktop 64 bit computing in the early 2000s Intel is holding their cards close to their chest.

AMD is moving away from AM3+ as the enthusiast platform, and we are seeing the first stages of this with FM2+.  Just as Intel moved away from socket 1366 to socket 1156/1155/1150, AMD will be doing the same for FM2+ and the enthusiast.  The first signs of this actually come from Gigabyte with their latest A88X/FM2+ announcements.  These are most certainly enthusiast level products which will leverage the advantages of Kaveri over the previous Trinity/Richland parts (namely PCI-E 3.0, hUMA).  The only real problem here is that GLOBALFOUNDRIES’ 28 nm process is late to market for AMD and their CPUs.  Not only that, but the die shrink from 32 nm to 28 nm is not enough for AMD to implement a full GCN unit with a four module CPU (8 threads) economically and at a 100 to 125 watt TDP envelope.  This is the primary reason why we will continue to see AM3+ Vishera processors sold so that AMD has a higher performance, high thread count CPU that they can offer the market.  This could potentially be offset by a little clue that AMD gave some time back on a roadmap; there is the distinct possibility of releasing a 3 module Kaveri part next year well after the two module units have been on the market a while.  If this 3 module part comes to market in a timely manner, it could potentially be a viable enthusiast level part in terms of thread count and performance.

The G1.Sniper A88X is the very top of the line FM2+ board that Gigabyte is showing.  This board is very much over the top and will be one of the first truly high end FM2+ boards available.  We have yet to see if companies like Asus will release other high end boards that will compete in this space.

30% of AMD’s business is still the non-APU designs.  This means that AMD has to support these platforms which require higher performance and thread counts than what upcoming APUs can handle.  One of the hurdles that AMD has yet to get over is to effectively integrate APUs into a server environment.  Theoretically these parts are ideal for high performance computing, but AMD has a ways to go before they have the infrastructure in place to launch the APU into this space.  The building blocks are there, as evidenced by their purchase of SeaMicro.  Not only does AMD have a leg up in the micro-server market, but they have some significant IP that they can leverage with larger chips.  The Fabric that SeaMicro uses to communicate between nodes is high speed and features relatively low latency.  AMD could beef up the design to work with the larger server chips and provide enough bandwidth and low latency communication as to render HyperTransport obsolete.  AMD could then focus all of their designs on using PCI-E 3.0 interconnects rather than have two families of CPUs/APUs which require either PCI-E 3.0 or HyperTransport.

AMD still has a lot of low level design work to do if they are planning a full scale assault on the server market with APUs.  Cache coherency using the SeaMicro Fabric, balancing NUMA vs. hUMA memory support, validating new southbridge products for the server market, and a variety of other issues keep AMD from releasing new Opterons based on Steamroller/GCN APUs over the next year.  Excavator looks to be the architecture that will pave the way towards a true Opteron APU that will compete against Xeons of the time.

During the next several years, core counts should not go up dramatically on the desktop and notebook market.  Tablets and handhelds will follow suit and not go crazy with core counts (except MediaTek and that pesky 8 core unit they are pushing).  Multi-core aware software is still not entirely common, much less applications which can utilize more than four threads.  Instead, I think we will see both sides continue to dedicate die space to more graphics functionality.  More CPU cores do not necessarily mean better overall performance and value in most workloads.  I think Intel certainly sees that with their desktop parts as evidenced by Haswell.  The majority of Intel’s CPUs in the desktop space support four threads maximum, until they get to the enthusiast level parts which will support eight to twelve threads.
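To put rough numbers on the point that more cores do not automatically mean more performance, here is a minimal sketch using Amdahl's law.  The article does not name that law; it is brought in here as a standard way to estimate scaling.  The function name and the sample parallel fractions are illustrative assumptions, not measurements of any real application.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many cores exist;
    # only the parallel part is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed workloads: one half parallel, one 90% parallel.
    for fraction in (0.5, 0.9):
        for cores in (2, 4, 8):
            speedup = amdahl_speedup(fraction, cores)
            print(f"{fraction:.0%} parallel, {cores} cores: {speedup:.2f}x")
```

Under this idealized model, a workload that is half serial goes from a 1.60x speedup on four cores to only about 1.78x on eight, which is why doubling the core count often buys so little in everyday software.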

HSA/hUMA has the potential to really change the landscape of personal computing (desktops, notebooks, and tablets/handhelds).  Many things have to fall into place before it can really take off though.  It is hard to say when that will happen and if Intel will catch up quickly.

2014 will certainly be a year of transition for AMD.  By the end of 2015 I fully expect their lineup of products from top to bottom to be APU based.  For the higher module count parts, we have to wait for 22/20 nm process nodes to open up for AMD.  This should happen in late 2014.  This transition will not be seamless, nor will it be smooth.  AMD has to continue to convince users that their higher end offerings for AM3+ and C32/G34 are good enough to compete with Intel, all the while working on getting the APU above the $150 price point.  Eventually at 20/22 nm AMD will have a four module (or more) APU available that will satisfy enthusiasts and power users alike, but that is some time off.  Until then, AMD will stave off the competition by pushing Kaveri, keeping Kabini competitive, and keeping the royalties from the semi-custom group rolling in.  AMD continues to have a solid foundation of products, but they have certainly lost mindshare from some very vocal groups.  While many have criticized AMD for purchasing ATI when they did, the technology and expertise from ATI is really one of the major things keeping AMD afloat at this time.  The future really is fusion, and while AMD struggles with CPU performance, they are a couple steps ahead of the competition in creating a truly heterogeneous solution.  The situation is eerily similar to the transition to 64 bit computing back in 2003/2004.

September 4, 2013 | 06:45 PM - Posted by 16stone

I agree!

September 4, 2013 | 07:02 PM - Posted by Coupe

This is one of the most well written articles I have read in a long time.

September 4, 2013 | 08:09 PM - Posted by boothman

Was really hoping for an updated AM3+ chipset (995FX?) and perhaps the commercial release of the 95W FX-8300, or a further tweaked Vishera (not 9590 btw). Something to keep high core users engaged until Kaveri or Carrizo is ready with 6-8 core offerings.

I agree with Coupe, very well written article Josh. An excellent read!

September 4, 2013 | 08:31 PM - Posted by ArcRendition

Fantastic article Josh. Informative, concise and clearly articulated.

September 4, 2013 | 09:28 PM - Posted by praack

nice article, i remember being very much in support with amd's first white papers on processor design into the apu space, but then saw the result and the push to entry level.

so never bought into the apu because the business side kept it as an entry level part.

if they can get the speeds up, get the graphics up and allow them to communicate with more than low end parts - maybe it will be the processor amd needs- if not - well more people will move to intel

September 4, 2013 | 09:48 PM - Posted by Brett from Australia (not verified)

Very well written article Josh we have been waiting for some time to see where AMD was going with their CPU's\APU's. I was particularly interested to see what lay ahead with FM2+ parts, interesting roadmap lies ahead.

September 4, 2013 | 10:33 PM - Posted by Bill (not verified)

What I got out of that article is if you're waiting for AMD to come out with something better to compete with Intel, stop waiting and buy Intel now. As usual, AMD will only state that something better is on the horizon and it never materializes.

September 4, 2013 | 11:00 PM - Posted by ArcRendition

That's the opposite of what the article was conveying. AMD isn't trying to compete with Intel directly. They're trying to cultivate a new "Fusion" APU standard with hUMA and consequently change the landscape of CPU dynamism.

September 9, 2013 | 09:56 AM - Posted by Anonymous Coward (not verified)

The problem is that nobody needs "fusion", or "huma", whatever the hell that is.

The bottom end of the market cares about cost only and nothing else, and margins are so slim it doesn't really matter how many units you sell.

For the midrange and up the performance of the CPU cores matters. AMD seems hell bent on using words to convince others that this doesn't matter, but right now it's ALL that matters.

People want fast cores, otherwise what's the point in upgrading? 8 cores? Who cares, the market for 8 slow cores is almost zero. The market for huma is exactly zero.

September 9, 2013 | 10:22 AM - Posted by Josh Walrath

Qualcomm seems to be doing pretty good with low cost/low margin chips.

September 19, 2013 | 12:14 PM - Posted by Anonymous (not verified)

CPU's are obsoleted already. Intel does not own any high-end GPU's and does not own high-end GPU IP.

The current Intel CPU architecture is already obsoleted and not much you can enhance it, besides improving the chip foundry process which is very expensive.

AMD has a new architecture that will be really released by the beginning of next year. The current AMD APU architecture is work in progress.

AMD APU fusion design is the way to go. Integrating the CPU and GPU functions into one seamless processor is the way to remote all the current architecture bottlenecks. i.e. two different memory banks, data clogging the bus, etc.

Soon the AMD APU's CPU/GPU elements will share the same memory, will be able to use GDDR5 for GPU performance, etc.

Intel doesn't have anything close to the AMD APU's, and they don't have any high performance GPU's.

Systems with only an Intel CPU are complete dogs. Intel desperately needs and is totally dependent on NVIDIA and AMD high-end GPU's in a system to be able to be considered an high end system.

AMD has both great CPU's, great GPU's, great Crossfire technology for multi-GPU performance scaling, etc.

Intel's marketing catches all naive people that think that what makes a high-end system is the CPU.

Any high end system needs a good CPU and one or more high-end GPU's, which Intel do not own and have failed miserably in designing one (Larrabbee, etc).

So there is no high-end system made of Intel only parts, but THERE ARE HIGH END SYSTEMS MADE OF AMD ONLY PARTS...!!

Intel needs NVIDIA and AMD to be able to become a high end system, AMD DOES NOT need any other company.

So the bottom line is there are no INTEL high end systems, because they all need NVIDIA and AMD high end GPU's or it will not be called a high end gaming/etc. system.

September 25, 2013 | 03:33 PM - Posted by RJ (not verified)

Intel will do exactly the same thing AMD did - buy a GPU company (nVidia will be easy prey). They will then take a while to chew it up and to integrate everything, like AMD did. I am afraid, however, that at the end of that misstep Intel will still have a superior product to AMD, if nothing else - because it has better (less nm) production.

Its going to be the story of 2000-2005 all over again - AMD comes up with great idea but after couple of years INTC's money stash will make up the difference and Intel's technological processes will propel it to undisputed leader again.

January 1, 2014 | 06:29 AM - Posted by Eric Slattery (not verified)

Intel is not allowed to buy Nvidia or AMD because AMD is their only big competitor (Monopoly law) and Nvidia is with Tegra, so they cannot be owned by anyone else, and technically it is cheap Nvidia Graphics on the Intel die, they are just that terrible though.......AMD is developing exclusively for itself, and then other people in the HSA (ARM, Samsung, etc etc) are all going to work to utilize the hardware, making it widely available, and being integrated. AMD may have patented its current idea to bridge the CPU and GPU for parallel processing and offloading, making Intel have to pay them or come up with something on their own to use that set up, just as Intel pays AMD for x64 architecture to manufacture. Right now APUs will be best used in the mobile market, but they will eventually catch up and be used on the Desktop, wide stream, and we are already seeing integration on servers too. AMD is benefiting at the moment too that their GPUs are being bought like hot cakes for cryptomining. If intel was the sole provider, we would get a processor every year that is 3-5% better than last year, with maybe a slightly smaller die. At least AMD has made innovations with their chips, and now that software is taking advantage of the CPU modules of Piledriver and Bulldozer, their multithreaded performance is booming in many applications, especially with the FX-8350 keeping up with the 300-500 dollar i7s. Intel has not really done much innovation in the CPU side sadly......they did not push the smaller manufacturing sizes, they have someone else do the manufacturing process, and then there has to be a market for it, which is the mobile market and that is driving lower power usage, but they have been rather stale for a few years.

September 5, 2013 | 12:38 AM - Posted by ezjohny

I think AMD made a good decision, but they need some work still with their "Single Core Speed" for the processor. For the desktop side lets hope they get the hard core gamers on APU's!!!

September 5, 2013 | 02:12 AM - Posted by puppetworx (not verified)

Cool article Josh. I like this prospective, analysis-based type of article and I'd like to see more of them. Some of this information and analysis would have been included in product reviews but a lot of it wouldn't and besides it's nice to have it all brought together and extrapolated on. Very cool.

This all makes me wonder how and if NVIDIA is going to gain any traction

@ezjohnny Single core speed should with any luck be less significant in the near future, at least for gaming. Yeah I know, we've been hearing that one for years, but with the Xbone and PS4 both housing 8-core processors clocked at a measly 1.6GHz multi-core optimization simply has to happen.

Hardcore gamers on APUs though... you crazy!

September 5, 2013 | 02:57 AM - Posted by Anonymous (not verified)

While I can certainly appreciate the fact that AMD's plans are coming along nicely, as a tech enthusiast living in South Africa this doesn't give me any solace in the fact that they are working on better products. I'm planning on upgrading my PC next year, but what exactly do I have to look forward to?

I mean, I have the choice of Haswell and a new chipset, AM3+ with outdated everything (but more cores) or FM2+, which will taper out at four cores and won't see anything capable of addressing six or more threads. I'd like to remain with AMD as my Athlon X3 has served me well, but at the rate they've been moving I might as well move to Intel.

And that sucks. I don't like gimped products that have been restricted primarily because Intel knows they have the market by the balls.

September 5, 2013 | 03:47 AM - Posted by imadman

Superb article, congrats Josh!

September 5, 2013 | 06:07 AM - Posted by Anonymous (not verified)

Warsaw will be piledriver based, basically a refresh of current opterons. So just a few tweaks, like richland is to trinity. Look at the roadmap pic here: http://www.pcper.com/news/General-Tech/AMDs-plans-keep-their-ARMs-server-room

The thing that bothers me about going all-APU is the fact that an APU will never be as powerful as discrete CPU+GPU setup, simply because it's easier to make two 250mm chips than it is to make one 500mm chip. So for gamers and enthusiasts who want a discrete CPU, AMD won't have anything on offer (assuming kaveri isn't a CPU jesus that crushes intel's performance), and even intel's chips will have 50% wasted on an unused GPU.

Perhaps some day someone will figure out how to make use of the iGPU for compute even while a discrete graphics card is in use, or perhaps we might see multi socket motherboards that you can plug 2-4 APUs into in some sort of crossfire arrangement?

It really bugs me to have so much wasted silicon lying around.

September 6, 2013 | 10:17 AM - Posted by Josh Walrath

I will be interested to finally hear what Warsaw actually is.  They haven't done well in educating the public/press about it.

All APU is not necessarily a bad thing.  With HSA there really could be some usage case scenarios where that GPU portion is utilized to a large degree, even with a standalone GPU in the system.  Collision and physics in game could really use that type of horsepower without having to dedicate the GPU to those calculations (and thereby potentially decreasing performance over a hUMA solution due to context switching, copying memory, etc.).  AMD has a long ways to go before it can be used in such a situation, but I think it is coming.  The APU is a good low end processor as well as you still get decent 3D graphics performance essentially for free.

October 15, 2013 | 05:35 AM - Posted by Anonymous (not verified)

The thing that bothers me about going all-APU is the fact that an APU will never be as powerful as discrete CPU+GPU setup, simply because it's easier to make two 250mm chips than it is to make one 500mm chip. So for gamers and enthusiasts who want a discrete CPU, AMD won't have anything on offer (assuming kaveri isn't a CPU jesus that crushes intel's performance), and even intel's chips will have 50% wasted on an unused GPU.

Perhaps some day someone will figure out how to make use of the iGPU for compute even while a discrete graphics card is in use...

DUDE, this is exactly what HSA means!
The CPU part covers the floating point calculations, while the iGPU covers the integer calcs. The discrete GPU then has more than enough headroom to take care of the rest.

So basically AMD turns their biggest disadvantage into their biggest advantage. Their CPUs excel at floating point calcs and they offset their poor integer calcs with the power of the igpu, making the APUs far more capable at pretty much everything than every traditional CPU could ever be.

January 8, 2014 | 12:30 PM - Posted by HikingMike (not verified)

Switch integer and floating point.

September 5, 2013 | 09:17 AM - Posted by gamerk2 (not verified)

"Multi-core aware software is still not entirely common, much less applications which can utilize more than four threads."

I take issue with this. Ever take a look at application thread counts in task manager/process explorer?

The main issue is that most of the work is serial; you do things in a logical order, with very little opportunity to break up processing. This results in non-scaling applications, irrespective of how many threads they use. (The side-effect of this is giving Intel a performance edge in most tasks, another reason why AMD would be wise to abandon BD and its derivatives). The things that do easily scale, such as media encoding, physics, and rendering are already offloaded to the GPU, leaving the CPU with all the non-scaling workloads.

Likewise, thread management is not the domain of the developer, but the OS. Windows assigns threads to cores. At any given instant, the highest priority thread(s) that are ready to run are executed. The scheduler does a lot of work behind the scenes to adjust priorities, but that's the gist of windows thread management. After the developer invokes CreateThread() (or one of its many wrappers), their job ends.

Point being: any application that uses more than one thread (and almost all do) is, by definition, multi-core aware. The issue is one of processing: Can you logically break up processing in such a way as to perform processing on two different tasks at the same time, and not reduce performance due to thread overhead, deadlocks, and all the other performance penalties that start to crop up as you utilize more threads? For most tasks, the answer is no; after two or three threads, it becomes too difficult to break up work in such a way where you gain performance. Hence the current state of software.

/rant

September 6, 2013 | 10:23 AM - Posted by Josh Walrath

I think we are actually saying the same thing, but you were able to codify this a whole lot better than I did!  When multi-core CPUs first came out and MS announced that Windows supported multi-threaded environments, many people automatically assumed that Windows would magically cause many programs to go faster just because it could divide up the workload in any application.  Of course, this assumption is incorrect because as you mentioned above, not all programs will benefit from multi-threading due to their workload being more serial.

One of the things I constantly am reading is "all these lazy programmers who can't take the time to make their app more multi-thread friendly" and that of course is not entirely true.  Sure, there are some lazy programmers who could potentially extract more parallelism from their code, but for the most part these guys do understand their programs and have already done a lot of work to improve performance in a multi-core environment.  Plus, as mentioned before, some programs just can't be broken up to address multiple threads.

Great comments though!

September 5, 2013 | 12:29 PM - Posted by TopHatKiller (not verified)

"Anon" is correct; Warsaw is a base?-respun piledrive, done for cost no doubt. 2015 AMD launches new 2/4p server architecture and platform; intergrated gen3 so nearer fm then am? Very high end desktop cpu could/will be derived from that. Core count: up to 20/chip on 0.2. SteamrollerB or Excavator. Seems certain no am3 follow up though - this chip willbe fm2+ or 3. AMD has made clear their commitment to the GROWING high end server market, not just micro, and 15h chips will be released in this market, it is foolish and overly pesimistic to assume otherwise.
This made me laugh: Hasnot is a quad because Intel knows high core counts do not translate into higher performance on the desktop? Giggle, you silly silly man, Intel is screwing us all delivering pathetic low-cost parts, keeping price skyhigh and pocketing massive profits.So many sites and journos not only never combat intel on its policies, in particular the hugh screw-up over trigate,but say tar thanks oh mighty one for screwing my bankaccount.

September 6, 2013 | 10:28 AM - Posted by Josh Walrath

You should watch our podcast, we actually discuss how Intel is actually screwing us over.  I believe it is in the Ivy Bridge E section that we actually take Intel to task for not delivering the goods in a timely manner and making us pay for it out the teeth.  Consider the overall perf. difference between a i7 2600/2700K and the latest Haswell based 4770K.  Sure, there is a perf difference, but it is pretty marginal considering the amount of years that the 2600K has been out.  We saw this back in the days of the K6/Pentium II, and Intel is acting remarkably similar (conservative roadmaps for CPUs with high prices attached to them).

Tri-gate is not a bad idea, but I think that Intel did not expect the huge jump in power when the speed was raised above 3.9 GHz with these chips.  It looks like planar/FD-SOI will actually have superior power/switching characteristics at 22/20 nm.  Sorta wish Intel had utilized FD-SOI, but they prefer to stay with bulk silicon and keep margins higher.

September 6, 2013 | 12:47 PM - Posted by TopHatKiller (not verified)

Thanks for the reply. Sadly I just never listen to podcasts. Perhaps the 'silly' repetition was a little rude.
I'm still expecting 2015 to be AMD payback time, but God knows AMD has cancelled so many cpus over the last coupla years - who can really say what the high end server / culled desktop parts will actually be anymore?

September 6, 2013 | 01:20 PM - Posted by Josh Walrath

I think they are on a good track with Steamroller.  We will see some nice IPC improvements as well as much better thread handling (namely two modules can simultaneously handle 4 threads, unlike current iterations which can do one thread per clock per module).  Can't wait to get our hands on Kaveri and see what the combination of Steamroller and GCN can give desktop users.

September 5, 2013 | 02:00 PM - Posted by MarkT (not verified)

I'll be honest I was sad the entire time reading this article, nothing gave me hope, why, oh why this is good news.....

September 5, 2013 | 02:27 PM - Posted by Computer Ed (not verified)

Welcome to the church brothers, nice to see others arriving at this future. The thing I love is so many web sites starting to finally see this and I was talking about this very move in 2011 :-)

Better late than never, welcome to the show...

September 5, 2013 | 03:15 PM - Posted by Anonymouspipm1 (not verified)

It'll be disappointing if a 4-threaded Kaveri APU still doesn't have any L3 cache & doesn't out-performance 4-thread Vishera with L3 cache CPU. :(
