AMD Spills more Kaveri Beans: AMD APU13

Author:
Subject: Processors
Manufacturer: AMD

More Details from Lisa Su

The executives at AMD like to break their own NDAs. Then again, they are the ones typically setting these NDA dates, so it isn’t a big deal. It is no secret that Kaveri has been in the pipeline for some time. We knew a lot of the basic details of the product, but certain things were still missing. Lisa Su went onstage and shared a few new details with us.

Kaveri will be made up of 4 “Steamroller” cores, which are enhanced versions of the previous Bulldozer/Trinity/Vishera families of products.  Nearly everything in the processor is doubled.  It now has dual decode, more cache, larger TLBs, and a host of other smaller features that all add up to greater single thread performance and better multi-threaded handling and performance.   Integer performance will be improved, and the FPU/MMX/SSE unit now features 2 x 128 bit FMAC units which can “fuse” and support AVX 256.

However, there was no mention of the fabled 6 core Kaveri.  At this time, it is unlikely that particular product will be launched anytime soon. 

The GCN cores in Kaveri are exactly as advertised, but we were not entirely certain how many were going to be included. Lisa mentioned that there will be up to 8 GCN compute units, which comes out to 512 stream units. GCN has turned out to be a very flexible and efficient architecture for AMD, and a lot of stream units can be fit per mm-squared. These GCN units are also DX 11.2 compliant. Lisa did not give us the clock speeds we can expect, but AMD did show off performance running BF4 at medium quality settings and 1080P. The Kaveri chip was running between 24 and 30 fps, compared to an Intel i7-4770K paired with a GT 630 graphics card. The Intel/NVIDIA combination was providing 12 to 14 fps in the same scene with the same quality settings.
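As a rough sketch of the arithmetic behind the stream unit count: GCN groups 64 stream processors into each compute unit, and each stream processor can retire one fused multiply-add (2 FLOPs) per clock. The function names below are made up for illustration, and the clock passed to the peak throughput helper is a placeholder, since AMD did not share GPU clock speeds.

```python
# GCN packs 64 stream processors into each compute unit (4 SIMD units x 16 lanes).
STREAM_PROCESSORS_PER_CU = 64
# One fused multiply-add per stream processor per clock counts as 2 FLOPs.
FLOPS_PER_SP_PER_CLOCK = 2


def stream_processors(compute_units: int) -> int:
    """Total stream processors for a given number of GCN compute units."""
    return compute_units * STREAM_PROCESSORS_PER_CU


def peak_gpu_gflops(compute_units: int, clock_mhz: float) -> float:
    """Theoretical single-precision peak in GFLOPS.

    clock_mhz is an assumption supplied by the caller; no GPU clock was announced.
    """
    return stream_processors(compute_units) * FLOPS_PER_SP_PER_CLOCK * clock_mhz / 1000.0


print(stream_processors(8))  # 512
```

Plugging any candidate clock into peak_gpu_gflops gives a theoretical ceiling only; real game performance depends heavily on memory bandwidth.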

Kaveri will be the first HSA enabled part to be released (though the specification is not yet finalized).  It will contain the hUMA and HQ specifications that we have been made aware of over the past few years.  Kaveri will also be the first APU to support shared physical and virtual memory spaces between the CPU and GPU.  In addition, this chip will feature the TrueAudio functionality introduced with the latest AMD standalone GPUs (Hawaii and Bonaire).  This DSP technology will accelerate audio functions when combined with the necessary middleware.  From all indications, adding this functionality entails a very small die size hit.

AMD also talked about Mantle support for Kaveri. This low-level rendering API will give Kaveri a nice boost in performance with games that support Mantle technology. While it will not make the APU faster than a standalone budget or midrange card in all situations, it will be a free performance boost for people who game on their APU.

Kaveri will also be the first PCI-E 3.0 APU from AMD.  While rumor had it that Trinity supported PCI-E 3.0, it was never certified. AMD is pushing PCI-E 3.0 with Kaveri and the latest FM2+ motherboards.

Software support for heterogeneous computing is on the rise. AMD detailed the progress they have made and what products are coming up that will help programmers and developers integrate heterogeneous features into their products. Throughout the next year, more tools and features will be released so that implementing parallel processing in appropriate scenarios becomes more and more transparent to programmers.

Kaveri at the top end will feature around 856 GFlops of processing power. This is well up from the 779 GFlops of the A10-6800K. Kaveri also throws in support for up to 32 GB of memory that is shared between the CPU and GPU. There should be a smaller latency hit for parallel loads as compared to the previous memory setup with Trinity/Richland.
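To put the two theoretical figures in perspective, here is a minimal sketch of the relative uplift. The helper name is made up for illustration, and the inputs are simply the GFlops numbers quoted above, not measured results.

```python
def percent_uplift(new: float, old: float) -> float:
    """Relative increase of new over old, expressed in percent."""
    return (new - old) / old * 100.0


kaveri_gflops = 856.0    # top-end Kaveri, theoretical figure quoted by AMD
richland_gflops = 779.0  # A10-6800K, theoretical figure

print(f"{percent_uplift(kaveri_gflops, richland_gflops):.1f}%")  # 9.9%
```

These are peak numbers, so the gap in real applications may well differ from the roughly 10% shown here.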

Kaveri is going to be a very big product for AMD. While it likely will not compete in CPU performance with the i5 and i7 4000 series from Intel, it will certainly be a big jump up in terms of graphics performance. Nobody is sure where CPU performance will land, but it will certainly be a big improvement over the current iterations present in Vishera and Richland based products. All indications point to it eclipsing the very competent Intel Iris Pro graphics component that is found only in certain Ultrathin notebooks, including Apple's new 15" MacBook Pro. It is also the first shot across the bow of the industry when it comes to serious heterogeneous computing. AMD is doing their best to make sure the software ecosystem is there, and the HSA group has gained a lot of momentum with the addition of Qualcomm and Samsung to the group (not to mention ARM and MediaTek being some of the original founders).

Lisa was pretty emphatic that they would be shipping product in 2013.  This may be true, but we really have no idea exactly how much product will be shipped.  They could certainly ship a couple thousand APUs at the very end of December and they would be keeping their word.  Launch looks to be January 14, 2014, but availability is not known at this time.  Press samples will probably be available some 3 weeks before launch. 

This is a very significant moment for AMD, and one very much akin to what the original Athlon 64 meant for the company. They are treading on new territory here, and their implementation is logical and open to the industry. Kaveri may not be an Intel killer, but it will certainly ensure the survival of AMD if it performs as expected.


November 11, 2013 | 10:03 PM - Posted by AMDbumlover (not verified)

can't wait for a "balanced" review from PcPer...

November 11, 2013 | 11:45 PM - Posted by AMDbumlover (not verified)

Because, clearly, they cannot report on something without the anti-fanboys complaining.
Here's a cup of water.
Now shut the full cup.

November 12, 2013 | 12:02 AM - Posted by AMDbumlover (not verified)

I am so bipolar, here I am having a discussion with myself!

November 12, 2013 | 12:48 AM - Posted by Josh Walrath

But at least it is entertaining!

November 12, 2013 | 04:07 AM - Posted by JohnGR (not verified)

Who is winning?

November 12, 2013 | 01:43 PM - Posted by Anonymous (not verified)

The consumer.

November 11, 2013 | 10:04 PM - Posted by derz

You did not disappoint Josh. Nice recap article.

November 11, 2013 | 10:09 PM - Posted by jatrias (not verified)

Thank you,Josh!

November 11, 2013 | 10:25 PM - Posted by johnc (not verified)

is this a FX CPU replacement? If it is, will we see some SLI boards?

November 11, 2013 | 10:48 PM - Posted by Josh Walrath

I am guessing we will find out more about those soon.  They won't necessarily be FX replacements, especially since the core count does not go above 2 modules.  It will be interesting to see where AMD places these, as well as if they will get a SLI license for the processor from NVIDIA.

November 13, 2013 | 09:17 AM - Posted by nabokovfan87

is it at all possible for you guys to have someone on from AMD to go over this and possibly peg them about fx parts or a 8370 part?

November 13, 2013 | 10:45 AM - Posted by Josh Walrath

I'll mention it to Ryan and see what we can dig up.

November 12, 2013 | 12:03 AM - Posted by snook

nice write up josh. you always break it down so that I can get my head around it. thanks

November 12, 2013 | 12:56 AM - Posted by SteeloYangster

I'm very very excited and interested in the Kaveri chips. I've had a lot of experience with the first and second gen APU's and can't wait to see programs utilizing HSA/hUMA. I hope they catch on!

November 12, 2013 | 02:43 AM - Posted by capawesome9870

Crystal ball time (dramatic sounding) **dun - dun - Duuunnn**
how long before they start pushing out chips similar to the Xbox One and PlayStation 4 to the PC market? 3-4 modules with 16+ GCN Compute Units.

November 12, 2013 | 04:33 AM - Posted by Melvar

How many Jaguar cores will fit in the same die area as a steamroller module?

I had assumed that they needed to go with the much weaker Jaguar cores on the Xbone & PS4 in order to have enough transistors left for that level of graphics.

November 12, 2013 | 10:24 AM - Posted by Josh Walrath

We do not yet know the die size of Kaveri, but it will be interesting to compare/contrast size vs. performance on the XBOne and PS4 units.  My gut feeling here is that MS and Sony were hoping to go really wide on the CPU but run it at lower speeds, so TDP headroom can be afforded to the graphics portion.

November 13, 2013 | 01:55 AM - Posted by PsiAmp

Theoretically it will make sense as soon as 20nm is available. So 1H 2015 is quite possible for a new APU with XBO chip performance.

CPU wise 4 cores @3.7 GHz Kaveri is faster than 8 cores @1.8 GHz Jaguar in PS4/XBO.

November 12, 2013 | 03:11 AM - Posted by Gadgety

"BF4 at medium quality settings and 1080P. The Kaveri chip was running between 24 and 30 fps, compared to an Intel i7-4770K paired with the GT 630 graphics card. The Intel/NVIDIA combination was providing 12 to 14 fps in performance in the same scene with the same quality settings"

Wow! Better than I expected. The most interesting chip launch in a long while to me. Depending on price, it'll hopefully be great for building a value for money entry gaming rig for my kid, and of course for laptops.

November 12, 2013 | 03:27 AM - Posted by capawesome9870

what is the memory bandwidth for these new Kaveri chips?

i am thinking it is dual channel DDR3-2133 which would give 34GB/s (17GB/s per channel).

when are these APUs going to move to dedicated DDR5 soldered to the board to give 70+GB/s? Or even better DDR5 memory sticks made by AMD for their APUs.

fun fact the 7770 (640 stream processors 1.24TFlops) and 7750 (512 stream processors at 819GFlops) have a 72GB/s memory bandwidth.

November 12, 2013 | 10:29 AM - Posted by Josh Walrath

Memory bandwidth is at a premium with integrated graphics.  We do not yet know all the details behind this APU, but it seems like AMD did spend a lot of time on the memory controller to squeeze every ounce of bandwidth out of it.  Also, hopefully low latency as well.

There was talk and some whitepapers at one time about Kaveri and GDDR-5.  They even released a specification for a DIMM design for GDDR-5 that was nearly identical to DDR-4.  It would be a great boon for both CPU and GPU performance on this platform, but getting the rest of the industry behind it was apparently troublesome.

November 12, 2013 | 11:21 AM - Posted by Principle (not verified)

Yeah, there may be an embedded and perhaps mobile variant with GDDR5 where 4GB is the normal amount of RAM and never gets upgraded. I don't think you will see it for desktops at all, and the next iteration will just use DDR4.

The bandwidth was much improved with Kaveri, and with DDR3-1600 it was able to achieve something like 17GB/s compared to a Richland APU with DDR3-2133 only getting about 12GB/s. So I cannot wait to see what DDR3-2133 does for Kaveri. I would also assume it comes standard with an 1866 controller, but haven't seen any benches on engineering samples with 1866.

November 12, 2013 | 01:35 PM - Posted by capawesome9870

here is to hoping that AMD does do DDR5 on the APUs. the faster memory the APUs get the better games play.

November 12, 2013 | 05:46 AM - Posted by ET3D (not verified)

856 GFlops is around 10% more than 779. That's not a huge difference.

November 12, 2013 | 10:31 AM - Posted by Josh Walrath

It isn't a huge difference in theoretical performance, but in real world applications it is going to be a lot more efficient.  I think perf on graphics is going to exceed that 10% difference.

November 12, 2013 | 11:17 AM - Posted by Principle (not verified)

Yes, those are theoretical numbers, the A10-6800K never actually hit that. Kaveri with Steamroller, HUMA and GCN will come closer to the actual theoretical value.

November 12, 2013 | 06:09 AM - Posted by Ploutonas (not verified)

I have a bad feeling about it, I hope it's not a bulldozer 2

I am in the process to upgrade (with 4770k now), but I will give them a chance by waiting 1 month for some reviews... Only if its true, I may buy it.

November 12, 2013 | 10:32 AM - Posted by Josh Walrath

It will perform better than current Richland/Vishera products per clock.  The question is... how much?  I don't think that AMD is going to move past Intel in terms of IPC, but it will be a pretty big improvement over the previous AMD parts.  4770K is still a really strong CPU, especially if you are going to use standalone graphics.

November 12, 2013 | 08:15 AM - Posted by Anonymous (not verified)

How many execution ports per core (integer, floating point, etc.) on the CPU? How deep is the execution pipeline (per core)? How many integer and floating point instructions can each CPU core retire per clock? Does the CPU have 4 fat cores, with no shared execution units, and a fat on-die bus between the CPU and the GPU? If so, then this could be the start of something good. Great about the Mantle support, but what about the OpenCL and OpenGL support? What is AMD's commitment to continued support of OpenGL/OpenCL (as many open source software apps use OpenGL, and I hope soon OpenCL)? Will AMD help the open source community integrate Mantle support into open source applications such as Blender/Gimp/etc.? I am looking forward to PCPER's review, and I hope it reads like the best whitepapers from the Hot Chips symposium, and please, include any links to any white papers and processor data sheets that you may come across in your research. Please, in the future, do more Blender benchmarks, as well as other open source graphics software benchmarks! I think AMD is on to something great with HSA, and combining CPUs with GPUs for gaming as well as graphics.

November 12, 2013 | 10:42 AM - Posted by Josh Walrath

That's a lot of questions... many of the answers are still unknown, even after fun events like Hotchips where information on Steamroller was presented.  What we do know is this...

It will have fewer shared resources than previous Vishera/Trinity processors.  Dual decode units is the really big deal here, and will more effectively feed the int pipes.  FMACs are again shared, but flow is supposedly a lot better with dual decode and retire.  The memory controller and crossbars are all new and improved, so communication should be a lot better.  AMD expects to see a big boost in IPC and multi-threaded efficiency.  Remember, with previous cores, they could really only handle one thread per clock per module due to single decode and other decisions in the pipeline.  This updated design *should* allow for two threads per clock per module, so again we will see a nice boost in multi-core efficiency.

This is very similar to what AMD did in forging ahead with 64 bit computing with AMD64 (that Intel adopted and called EM64T- crosslicensing it from AMD).  AMD has a good thing going with the HSA Foundation with some really large companies behind it in the Android/Linux world.

November 12, 2013 | 11:24 AM - Posted by Principle (not verified)

Wow, are you on crack? This was opening day at APU13, not a benchmarkfest. Why would you even question AMD's commitment to OpenCL/GL?? It has always been the best, and this article even says they continue their top notch support for it, for heterogeneous support across all platforms, or did you not read the article?

November 12, 2013 | 03:39 PM - Posted by Anonymous (not verified)

Supply the links to any info you have on this APU from APU13, and any AMD statements concerning OpenCL/GL, as "heterogeneous support across all platforms" is not specific enough! Do you mean that OpenCL/GL calls are going to be converted into HSAIL and executed on GPUs as well as CPUs? My current laptop has an Intel CPU and an AMD discrete GPU. Blender will peg out all 8 threads on the CPU for rendering, but Blender is very slow with high polygon meshes in 3D edit mode, especially mesh relaxation/editing on high polygon count meshes, and I would love to see mesh editing/refactoring take advantage of all the CPU and GPU power, simultaneously, through HSA aware hardware. It is great that AMD is working with and leading the development of HSA hardware CPU/GPU, but how long will it take the HSA tools and developments to be implemented in the open source codebase?

---I DO NOT SMOKE HUBBA ROCKS, in high polygon mesh Hell---

November 12, 2013 | 04:08 PM - Posted by Anonymous (not verified)

P.S. Intel, Nvidia and AMD, etc. are corporations, and not salvation religions, and I will always question their commitment to maintaining open and standard OpenGL/CL drivers, with all this talk of proprietary Cuda/Mantle/whatever drivers/APIs!!!! Now let the NDAs be damned, and the benchmarkfest begin, you dogmatic fanboy, high priest of the unholy, do not question my right to question!!!

November 12, 2013 | 12:13 PM - Posted by Anonymous (not verified)

Josh, how would this CPU do as a workhorse? Specifically Lightroom and Photoshop compared to current i7's?

I'm delaying upgrading after reading this, this CPU looks great on paper.

Did AMD happen to mention pricing?

November 12, 2013 | 01:01 PM - Posted by Josh Walrath

You are going to be limited to 4 threads on this APU in most traditional workloads.  So the i7 4770K will beat it.  Now, do you know if those applications will utilize HSA in any significant way?  OpenCL?  Java 8/9?  If you are ready to buy now, I don't think you will be doing a disservice to yourself for not waiting on Kaveri, because there is simply a lot of work to be done to get HSA/hUMA mainstream in terms of programming.

No pricing was mentioned, but I really doubt that it will move far from what we are seeing currently from APUs.

November 12, 2013 | 12:19 PM - Posted by Anonymous (not verified)

By the way, who won the Halloween give away contest?

November 12, 2013 | 01:05 PM - Posted by Coupe

These beans are good Josh, but I demand more of them!

PS: Great write up as usual.

PPS: beans and Java announcing contributor status, coincidence or is Josh an oracle. Kappa

November 12, 2013 | 01:26 PM - Posted by Josh Walrath

Purely a coincidence on the beans.  I'm not that smart.  Or clever.

November 12, 2013 | 07:01 PM - Posted by Anonymous (not verified)

AMD's top bin Kaveri parts max out at 856 GFLOPS, compared to AMD's 2012 target of 1050 GFLOPS, so what do you think led to the reduction? Is this due to yield problems at AMD's fab partner/s or other technical issues? (1)

(1)

http://semiaccurate.com/2013/11/12/amd-misses-expectations-kaveri/

November 12, 2013 | 08:17 PM - Posted by Anonymous (not verified)

If I had only pressed the reload button while I was typing this, Scott already has this covered!

November 13, 2013 | 06:38 AM - Posted by Anonymous (not verified)

BSN has some benchmarks of Kaveri ES and it performs as a Core i5

http://www.brightsideofnews.com/news/2013/11/4/what-to-expect-from-kaver...

November 14, 2013 | 04:09 PM - Posted by Daniel Meier Nielsen (not verified)

So a small question here to you fine folks. Does AMD have a strong Desktop CPU in the pipeline?

I'm ordering a i7 4770k, and just wanted some input if its a bad time to do so. And if AMD has anything coming out soon that would rival it.

November 15, 2013 | 12:44 AM - Posted by meganerd

This is still a fine time to buy an Intel. None of this matters if you are looking at putting together a desktop PC, with an i7 4770k paired with a discrete GPU. The i7 is still a better CPU. This will show up on the low end system leaderboard and possibly the mid range one as long as you can pair a moderately powerful discrete (AMD) GPU with it.

Even if HSA performs as announced, it will still be some time before the software catches up in supporting it.

November 15, 2013 | 06:47 AM - Posted by Daniel Nielsen (not verified)

Right now i have a i5 2500k. And with the coming consoles and hopefully more multithread heavy games, i feel that the 2500k won't be cutting it in the long run.

So the 4770k seems like a nice step up. I have a GTX 680 right now, might switch it for a third party R9 290/290x. So the integrated graphics won't be used in my case.

The main reason why I'm hesitant about it, though, is that AMD's 8 core FX processors have shown some really good benchmarks in Battlefield 4, utilizing all the cores, and are only a few digits shy of beating the more expensive Intel parts. And I'm not sure if this will be a trend now.

November 15, 2013 | 07:40 PM - Posted by ericore (not verified)

"This is a very significant moment for AMD"

No, it's not as significant as you media outlets make it out to be. HSA has not proved itself, yet you welcome it like a king. Let's not act so desperate for competition; beggars.
If you knew anything about APUs, you'd know the introduction of DDR4 is far more important than HSA for AMD. That and their next FX CPU which will be in 2015 Q3-Q4.

January 4, 2014 | 06:11 AM - Posted by StewartGraham (not verified)

Um... Josh knows quite a bit about APU's. The FX line will likely fade into obscurity as AMD has made their focus on HSA/hUMA/Mantle very clear.

November 16, 2013 | 03:13 PM - Posted by Anonymous (not verified)

I'd love for these apu's to become suitable for use in a gaming desktop paired with discrete graphics cards...but here's what I think is necessary first.

1, game engines have to become HSA enabled so that they can run complex code like game Ai containing both serial and parallel on the apu.

2, Amd's cpu cores have to become more powerful in terms of IPC or the Mantle api has to become widely adopted so games are less heavy on the cpu.

3, I'd like to see a hybrid of the xbox one and ps4 apu's...the hybrid would have the embedded RAM that the xbox apu has but this would be used exclusively as a large low latency cache by the cpu portion, and I'd like the apu to have access to GDDR5 as with the ps4 apu to provide the bandwidth for the gpu portion.

In fact I'm going to send this wish list to santa.

November 19, 2013 | 08:39 AM - Posted by ezjohny

Question: what graphics card could you team up with Kaveri? Hope you could team up with an AMD 7870 or AMD Radeon HD 8950 series card.

November 23, 2013 | 02:25 AM - Posted by ezjohny

sorry I posted already

December 14, 2013 | 03:29 AM - Posted by Anonymous (not verified)

I can imagine a 6 core Carizo with 832 SP units
on 20 nm
that is called the X FX
