
AMD A8-7600 Kaveri APU Review - HSA Arrives

Author:
Subject: Processors
Manufacturer: AMD

The AMD Kaveri Architecture

Kaveri: AMD’s New Flagship Processor

How big is Kaveri?  We already know the die size of it, but what kind of impact will it have on the marketplace?  Has AMD chosen the right path by focusing on power consumption and HSA?  Starting out an article with three questions in a row is a questionable tactic for any writer, but these are the things that first come to mind when considering a product the likes of Kaveri.  I am hoping we can answer a few of these questions by the end of this article, but alas it seems as though the market will have the final say as to how successful this new architecture is.

AMD has been pursuing the “Future is Fusion” line for several years, but it can be argued that Kaveri is truly the first “Fusion” product that completes the overall vision for where AMD wants to go.  The previous several generations of APUs were initially not all that integrated in a functional sense, but the complexity and completeness of that integration has been improved upon with each iteration.  Kaveri takes this integration to the next step, and one which fulfills the promise of a truly heterogeneous computing solution.  While AMD has the hardware available, we have yet to see if the software companies are willing to leverage the compute power afforded by a robust and programmable graphics unit powered by AMD’s GCN architecture.

(Editor's Note: The following two pages were written by our own Josh Walrath, discussing the technology and architecture of AMD Kaveri.  Testing and performance analysis by Ryan Shrout starts on page 3.)

Process Decisions

The first step in understanding Kaveri is taking a look at the process technology that AMD is using for this particular product.  Since AMD divested itself of their manufacturing arm, they have had to rely on GLOBALFOUNDRIES to produce nearly all of their current CPUs and APUs.  Bulldozer, Piledriver, Llano, Trinity, and Richland based parts were all produced on GF’s 32 nm PD-SOI process.  The lower power APUs such as Brazos and Kabini have been produced by TSMC on their 40 nm and 28 nm processes respectively.


Kaveri will take a slightly different approach here.  It will be produced by GLOBALFOUNDRIES, but it will forgo the SOI and utilize a bulk silicon process.  28 nm HKMG is very common around the industry, but few pure play foundries were willing to tailor their process to the direct needs of AMD and the Kaveri product.  GF was able to do such a thing.  APUs are a different kind of animal when it comes to fabrication, primarily because the two disparate units require different characteristics to perform at the highest efficiency.  As such, compromises had to be made.


GPUs perform best using high density transistors running at lower speeds, as more parallel units can be packed into a chip.  The lower clock speeds are not necessarily a hindrance to these massively parallel processors, so the focus is primarily that of maximizing transistor count to die space.  CPUs on the other hand seem to work better with more spacing between transistors and being able to run at a higher clock speed without breaking any power and TDP envelopes.  These are generalizations, but the truth of the matter is that CPUs and GPUs are very different beasts when it comes to design considerations at a very low level.

The 28 nm bulk/HKMG process at GF is more of a compromise that is optimized for good performance for both the GPU and CPU.  It offers good enough density and good enough speed to make for a competitive product in the marketplace.  It is a bit more biased towards the GPU portion, as the CPU takes a hit when it starts to run at the higher TDPs.  So at 95 watts, the CPU portion of Kaveri is running as fast as it can while being constrained by TDP concerns.  Even though 28 nm HKMG in theory should offer a little more headroom than the previous 32 nm PD-SOI based process, in the end Kaveri will run oh-so-slightly slower than the previous generation Richland in terms of raw CPU clockspeed.  The GPU portion will run significantly slower than the previous VLIW4 based part in Richland.  These are not necessarily bad things, because the efficiency improvements in both the CPU and GPU offset the clockspeed disadvantages.

 

Steamroller Improvements

Some years back AMD decided to go the CMT (clustered multi-threading) route for multi-threaded efficiency vs. die cost.  The first product to sport these new cores was the Bulldozer based FX-8150.  The results were not very positive.  The part showed some real issues with power consumption, heat production, and single threaded performance.  While it did very well in heavily multi-threaded apps, it was not exactly a winning formula.  The next update to the architecture was Piledriver.  This is found in both the Trinity/Richland line of APUs as well as the FX 8300/6300/4300 series of parts.  Piledriver had some small improvements in performance per clock, but the biggest improvement was power.  Piledriver did not get as hot or pull as much power per clock as did Bulldozer.


Kaveri introduces the new Steamroller architecture for the CPU portion of the APU.  Steamroller is another improvement over Piledriver, especially in terms of performance per clock.  Kaveri is comprised of two Steamroller modules which each contain two cores, so a two module unit can address four threads.  The front end of the module was reworked in a very significant manner to improve not only single thread performance, but also multi-threaded performance as well.

The biggest improvement is the addition of another decoder.  Previous iterations only had one instruction decode unit per module, so each module was limited to one thread per clock.  We can see right off the bat that single threaded performance will suffer because a good portion of the execution units in each core will be waiting for instructions every clock.  Multi-threading also suffers because each module only addresses half of the potential threads vs. core count.

AMD did not just stop there.  They improved essentially every piece of the front end, as well as how the D-caches handle and store data.  The integer and floating point units look to be left untouched, but every other aspect of the chip was touched upon and improved by AMD’s engineers.  The integer and floating point/SIMD units were seemingly fast enough for the job, but they just could not be fed data and instructions effectively and efficiently.

AMD showed us estimates of a peak 20% improvement in performance per clock.  They then told us that in most real-world situations that number is likely to be 10%.  Still, this is a pretty big jump in single thread performance, and it will be able to handle multi-threaded loads more efficiently as well.
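One way to read these figures next to the small clock-speed regression mentioned in the process section is to treat single-thread throughput as roughly performance per clock multiplied by frequency. The sketch below is a minimal illustration under that simplified model; the function names are our own, and the clock speeds in the example are hypothetical placeholders, not published Kaveri or Richland specifications.

```python
def relative_performance(ipc_gain: float, old_clock_ghz: float, new_clock_ghz: float) -> float:
    """Throughput of the new part relative to the old one (1.0 = equal).

    Assumes throughput scales as (performance per clock) x (clock speed),
    which ignores memory, turbo behaviour, and TDP throttling.
    """
    return (1.0 + ipc_gain) * (new_clock_ghz / old_clock_ghz)


def break_even_clock(ipc_gain: float, old_clock_ghz: float) -> float:
    """Lowest new clock at which the per-clock gain still cancels the clock deficit."""
    return old_clock_ghz / (1.0 + ipc_gain)


if __name__ == "__main__":
    # Hypothetical clocks chosen only to illustrate the trade-off.
    old_clock, new_clock = 4.0, 3.8

    for label, gain in (("peak estimate", 0.20), ("typical estimate", 0.10)):
        ratio = relative_performance(gain, old_clock, new_clock)
        floor = break_even_clock(gain, old_clock)
        print(f"{label}: {ratio:.3f}x overall, break-even at {floor:.2f} GHz")
```

Under this model, a part with a 10% per-clock gain only falls behind its predecessor once its clock drops by more than about 9%, which is why a slightly lower clock speed need not mean lower performance.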

Power does not seem to be an issue with this design, though as mentioned in the process section AMD did take a hit in CPU performance in the high TDP range.  With more tweaking of the process we can expect faster parts to be released down the line, but for now the A10-7850K will be the top SKU for this introduction.  Also, AMD will be offering these products in the 15 watt TDP range later on this year.  That is a pretty significant range of TDPs for essentially a single design.  AMD did disclose all of the power saving features, but they seem to be very comparable to what was introduced with Richland.

 

Definition of Compute Cores

AMD is coming out with a new description for cores with Kaveri.  Compute cores were bandied about during the tech day, and they actually make a bit of sense.  At CES, NVIDIA came out with their “192 core” Tegra K1, but that actually seems a bit of a misnomer as compared to how AMD is defining “cores”.  Those Tegra cores are more akin to SIMD units than standalone cores.  My understanding is that a single SMX unit could be considered a “compute core”.


On the other hand, AMD’s GCN compute clusters can be defined as cores in a more historical sense.  The top end APU has a total of 12 compute cores; 4 of them are the CPU cores in the Steamroller modules, while the other 8 are the GCN units.  Each GCN unit contains 4 x 16 wide vector units (SIMD), a single scalar unit, branch and message unit, a scheduler, texture and texture fetch units, and a bunch of cache.  Each GCN unit has around 146 KB of cache divided between vector registers, a scalar register, local data share, and L1 cache.  It also has such basics as a program counter, which certainly fits in with their traditional definition of cores.  Each GCN unit can theoretically assign new jobs/work to the CPU when needed.  While you certainly can’t boot up an OS from a GCN core, it can do a significant amount of work independently from the CPU.
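The compute-core arithmetic above can be tallied in a few lines. This is a minimal sketch that uses only the figures quoted in this section (4 CPU cores, 8 GCN units, 4 vector units of 16 lanes each); the constant and function names are our own labels for illustration, not AMD terminology.

```python
# Figures quoted in the text for the top-end Kaveri APU.
CPU_CORES = 4           # two Steamroller modules x two cores each
GCN_UNITS = 8           # GCN units that AMD counts as "compute cores"
SIMD_PER_GCN_UNIT = 4   # vector (SIMD) units inside each GCN unit
LANES_PER_SIMD = 16     # width of each vector unit


def compute_cores(cpu_cores: int, gcn_units: int) -> int:
    """Compute-core count as described in the text: CPU cores plus GCN units."""
    return cpu_cores + gcn_units


def vector_lanes(gcn_units: int, simd_per_unit: int, lanes_per_simd: int) -> int:
    """Total SIMD lanes across all GCN units."""
    return gcn_units * simd_per_unit * lanes_per_simd


if __name__ == "__main__":
    print("compute cores:", compute_cores(CPU_CORES, GCN_UNITS))
    print("lanes per GCN unit:", SIMD_PER_GCN_UNIT * LANES_PER_SIMD)
    print("total vector lanes:", vector_lanes(GCN_UNITS, SIMD_PER_GCN_UNIT, LANES_PER_SIMD))
```

Counting this way gives 12 compute cores, each GCN unit contributing 64 vector lanes, which is the distinction the text draws against marketing that counts every individual SIMD lane as a "core".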

January 14, 2014 | 05:56 AM - Posted by TinkerToyTech

"Now that it is 2014, AMD has marked off the PCI-E 3.0 checkbox for their OEM partners and have opened the door for future, higher performing FX processors utilizing the FM2+ socket infrastructure."

Editor - should this be FM3+ ??

January 14, 2014 | 06:38 AM - Posted by Josh Walrath

Not that we know of.  FM2+ is what Kaveri is based on, and it supports PCI-E 3.0.  That socket should be around a long time.

January 15, 2014 | 05:35 AM - Posted by Prodeous

I think the issue was when reading "higher performing FX processors utilizing the FM2+ socket infrastructure."

First thing comes to mind is the FX-xxxx series, not the Axx-xxxx series. hence the reference to AM3+. maybe change "FX" to "steamroller" or some other reference outside of "FX"?

January 15, 2014 | 05:37 AM - Posted by Prodeous

well it seems the person was referring to FM3+ not AM3+ socket.. my bad. got confused :P

January 15, 2014 | 05:49 AM - Posted by Josh Walrath

Sources at AMD have stated that FX branded processors will be back, but AM3 is a dead end.  These things point to AMD eventually releasing a FX processor on FM2+.  Now, this FX processor will be a APU and not the traditional FX products we have seen so far.

January 14, 2014 | 06:30 AM - Posted by AMDbumlover (not verified)

why are you the only reviewer who didnt get the a10? also what about comparisons with iris pro, is that still in the works?

January 14, 2014 | 08:52 AM - Posted by renz (not verified)

TR doesn't get A10 either

January 14, 2014 | 12:42 PM - Posted by Ryan Shrout

We definitely were not the only ones to NOT get an A10 part, but we were short on time after CES to go out and source one from a different place, that's for sure.  

January 14, 2014 | 06:44 AM - Posted by gamerk2 (not verified)

CPU wise, not much of an upgrade. Kaveri is still relegated to low end PC's and laptops.

January 14, 2014 | 07:14 AM - Posted by Anonymous (not verified)

Sadly, this seems true. I understand now why there won't be an FX Steamroller CPU; it's just nowhere near competitive to the Intel counterparts. As a longtime AMD enthusiast, I am saddened by this, but by the same token, if I were the CEO of AMD, it would be hard for me to make a business case to invest the engineering resources to catch up (strictly referring to integer performance). The future seems to be phablets, tablets, and convertibles.

January 14, 2014 | 07:38 AM - Posted by collie (not verified)

but that is kinda the point isnit? race to the bottom.
Most users buy the cheapest option possible, that is why there are so many people with atom (pre silvermont) and low end celeron systems, constantly complaining how shity their laptop is.
A good enough cpu {and lets face it modern day low end cpu's are more than powerful enough for 90% of home users} with a good entry level gpu for around $600 will be the biggest sellers. It's just good business. AMD pushing the low end by making said low end systems good enough to play games at very decent quality will encourage the pc gaming ecosystem to once and for all dominate the console.

January 14, 2014 | 02:29 PM - Posted by Anonymous (not verified)

The money(Profits) have never been in High End gaming, for Intel, at least, and have never been in high end gaming. Intel has always developed for the server, and mainstream market. Intel bases its chips for the enthusiast market around its server SKUs, with the server specific functionality removed, or fused off! Intel has always subsidized its gaming SKUs, with its profits from its server, and mainstream sales! AMD can not afford to do this subsidizing, and never really could, to the degree that Intel could! The whole profitable part of the market has shifted to the Mobile Tablet/Phone, and low cost laptop/chromebook markets, that is where the money is, and AMD currently can only remain viable as a ongoing concern, by shifting its resources towards the GPU/APU market where it beats Intel, and competes with Nvidia! AMD does provide Intel with plenty of competition in the LOW cost, low to midrange(With Kaveri) CPU/APU market! Intel is in deep trouble, in the low cost x86 market, and currently is not a factor in the Mobile CPU/APU market!

Loan AMD half a billion to restart its high end development, and you better have a few billion in reserve for a revolving line of funding, because that is the level of subsidizing that gaming high end development costs!

Put your big money where your pouts are!

January 14, 2014 | 02:35 PM - Posted by Anonymous (not verified)

Edit: gaming high end development
TO: gaming high end CPU development

January 14, 2014 | 08:07 AM - Posted by mAxius

well this cements amd's mobile shift they probably have nothing to compete with intel's performance on the desktop/server side till excavator or after... I will give them credit though they were handed a bunch of lemons and made the best lemonade they could.

January 14, 2014 | 10:54 AM - Posted by Chipshot (not verified)

Congrats to AMD! A $120 Kaveri and siblings that beats higher priced Intel i3/i5 at PCMark8 and even the i7-4770k at games like Battlefield 4.

The last time AMD won in PCMark8 over equivalent Intel CPUs they took the majority of market share and with HSA acceleration, TrueAudio and Mantle, they are likely to do it again.

January 15, 2014 | 02:30 AM - Posted by SAnonymous (not verified)

Erm....which graphs are you looking at? Get off the weed

January 16, 2014 | 09:15 AM - Posted by StewartGraham (not verified)

Probably referring to the 7850k APU

January 14, 2014 | 11:03 AM - Posted by Dreadteir

I've always wondered with AMD how long it would be before they try and make a push to have a GDDR5 memory slot included on Fusion Motherboards. Presumably it would give the built in GCN cores quite a boost in performance and gaming.

January 14, 2014 | 11:47 AM - Posted by Anonymous (not verified)

Intel IPC has been at a virtual standstill since Sandy Bridge (2009) with most of the improvements that do exist coming from tangential features such as dynamic turbo/power modes.

AMD, what the hell are you doing?? Five years. Intel has gifted you five_goddamn_years of making virtually zero IPC progress, and you still can't catch them?

It's like a modern version of the goddamn tortoise and the hare with the hare deliberately doing everything it can to stall on purpose, and yet is still unable to lose the race.

January 14, 2014 | 09:48 PM - Posted by Nilbog

To be fair in the end, the tortoise won the race. Though your point still stands.

I also don't think Intel is stalling for them to catch up (though that would be nice). Intel just plain doesn't care, they consider ARM the threat now. They know AMD isn't going to be catching up for quite a while. Given the way things are going, they can just sit on new stuff until someone actually comes to compete.

January 15, 2014 | 09:53 AM - Posted by Anonymous (not verified)

Intel is not in the technology improving business, they are in the technology Milking business, Milking those customers for Profit business!!

January 16, 2014 | 09:17 AM - Posted by StewartGraham (not verified)

I'm sure AMD would be just fine if they had Intel's budget.

January 14, 2014 | 12:29 PM - Posted by Dude (not verified)

Everyone is focusing on the APU stuff.

I'm surprised that im about to update my 1st gen i7 940 to a haswell 4770k, and the actual performance gain is *only* 2x. For a 5-6 years update cycle, that is quite disappointing :/

quick question btw, is there a haswell cpu around that price range with no gpu but more power? i wish i could drop the GPU entirely and spend the same or a tad more money on a faster cpu.

January 14, 2014 | 12:44 PM - Posted by Ryan Shrout

You can find some models that don't have the GPU portion but they aren't going to run faster than the models WITH processor graphics.  :/

January 14, 2014 | 12:51 PM - Posted by nashathedog (not verified)

Haswell-e is not due yet, Sandy-e or the newer Ivy-bridge-e are the only options. Because Ivy-e uses the same platform as Sandy-e I decided to go with a 4770k for now and I'm saving for when Haswell-e is released. From what I've read that will have a fairly decent performance improvement. Ivy-e is not much better than Sandy-e just like the Sandy/Ivy-Haswell improvements are minimal.

January 14, 2014 | 12:57 PM - Posted by Anonymous (not verified)

Please do not forget to define AMD's mobile tablet based APUs use of the Mobile/Full versions of openCL, openGL, etc, as Nvidia's Tegra K1, now supports the desktop/discrete GPU, full versions of openCL, etc., on Nvidia's new mobile Tegra K1 based platforms! AMD needs to offer Full version support of OpenCL, etc. drivers on any SKUs that compete with the Nvidia Tegra K1s! In the Future, with respect to any reviews of Mobile devices built around the AMD Kaveri mobile APUs in competition with the Nvidia Tegra K1 based mobile devices based "APU Type" CPU/GPU systems, please make sure to tell the reader if the mobile device will allow loading of a full Linux distro, and if that mobile device's CPU/GPU APU, or "APU" type(Nvidia k1, etc) system supports the full openCL, etc. versions of the drivers! I am seriously looking for the K1 based tablet devices, and their K1 based ability to run full desktop style applications via full version openCL, OpenGL driver support, to allow me to run Blender 3d(Light Mesh Modeling) and Gimp for graphics, on a mobile tablet, that can run a full Linux Distro. Full openCL, openGL driver support on a Tablet/Mobile processor(K1, YES), (Kaveri, ??), is big news, and I look forward to your device reviews.

January 14, 2014 | 01:22 PM - Posted by Sean (not verified)

here's the problem, One chip, one price, just like how the Nvidia didn't get the consoles.. all three.. they will not be able to beat the price/performance point. I own two NVidia cards. from a business stand point, this spells bad news for Nvidia and anyone else in the smaller factor market. lots of power, all in one, less Watts. this is not high end. I hope they release high end chips but.. I'm beginning to think its not going to happen in the economy

January 14, 2014 | 03:18 PM - Posted by Anonymous (not verified)

What did the post, that you replied to, have to say, about what you are talking about? the Poster needs full Linux capability, and full OpenCL, OpenGL from a Mobile "APU"/APU type device(tablet) Nvidia's K1, can provide FULL openCL, openGL support, and the poster hopes AMD can provide an equivalent level of support, with its competing SKUs!
The poster will buy any tablet, even it was made by marvin the martian, if it meets the posters needs, the poster wants a tablet that runs a full linux rooted distro, that can run Blender 3d, and gimp(both require Full OpenGL, and Maybe some Full OpenCL support, and run under windows[Hell no], or Linux)! the Poster would prefer a mobile x86 AMD platform(if it has full OpenGL, openCL, etc. support like The K1), but will use the K1 if there is A blender 3d, and Gimp, Arm based build available, to run on the linux distro based ARM platform! The K1 will compete very well in its intended form factor against the AMD kaveri tablet APUs! BUT no Blender 3d, no Gimp, and No full linux rooted distro, on the tablet, NO BUY!

January 16, 2014 | 09:24 AM - Posted by StewartGraham (not verified)

The Poster's statement was poorly written and incredibly verbose making it more difficult to distill relevant content.

January 14, 2014 | 04:15 PM - Posted by Anonymous (not verified)

Good upgrade when compared to the previous generation... it is matching and at times even beat the A10 5800K despite being a lower end part. Yes, it doesn't beat a core i3 in single threaded stuff but the multi-threaded performance is decent.
