AMD Plans Two GPUs in 2016

Subject: Graphics Cards | November 16, 2015 - 09:34 PM |
Tagged: amd, radeon, GCN

Late last week, Forbes published an editorial by Patrick Moorhead, who spoke with Raja Koduri about AMD's future in the GPU industry. Patrick was a Corporate Vice President at AMD until late 2011. He then created Moor Insights and Strategy, which provides industry analysis. He regularly publishes editorials to Forbes and CIO. Raja Koduri is the head of the Radeon Technologies Group at AMD.


I'm going to focus on a brief mention a little more than halfway through, though. According to the editorial, Raja stated that AMD will release two new GPUs in 2016. “He promised two brand new GPUs in 2016, which are hopefully going to both be 14nm/16nm FinFET from GlobalFoundries or TSMC and will help make Advanced Micro Devices more power and die size competitive.”

We have been expecting AMD's Arctic Islands to arrive at some point in 2016, competing with NVIDIA's Pascal architecture at the high end. AMD's product stack has been relatively stale for a while, with most of the innovation occurring at the top end and pushing the previous top end down a bit. Two brand new GPUs almost certainly means that the second will focus on the lower end of the market, using the smaller process to make products that are more power efficient, cheaper per unit, and equipped with newer features.

Add the recent report of the Antigua architecture, which I assume is in addition to the two GPUs Raja mentioned, and AMD's product stack could look much less familiar next year.

Source: Forbes


November 16, 2015 | 09:46 PM - Posted by Anonymous (not verified)

Are we going to get mid and low-end parts first and then the big die part later? At 14 or 16 nm, even a mid-range part may compete with the high-end 28 nm parts. How big would a ~15 billion transistor part be at 14 nm? It would be very interesting if they could fit two of them on an interposer. 2016 definitely should be an interesting year.
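The ~15-billion-transistor question above can be ballparked with a minimal sketch. The density figures are my assumptions (roughly 25-30 million transistors per mm² for early 14/16 nm FinFET GPU-class logic), not published specs:

```python
# Ballpark die area for a hypothetical ~15-billion-transistor GPU.
# Density values are assumptions for early 14/16 nm FinFET logic,
# roughly 25-30 million transistors per mm^2.

def die_area_mm2(transistors: float, mtr_per_mm2: float) -> float:
    """Estimated die area in mm^2 from a transistor count and density."""
    return transistors / (mtr_per_mm2 * 1e6)

if __name__ == "__main__":
    for density in (25.0, 30.0):
        print(f"{density:.0f} MTr/mm^2 -> ~{die_area_mm2(15e9, density):.0f} mm^2")
```

At those assumed densities, a 15-billion-transistor part lands around 500-600 mm², i.e. near the practical reticle limit, which is why fitting two smaller dies on an interposer is an interesting alternative.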

November 16, 2015 | 11:32 PM - Posted by anonymous (not verified)

this was/is my line of thinking as well, and not only with big red. pascal's release cadence also gives me pause. i wouldn't be too stoked if the first chips on these new process nodes are similar to what nvidia did with gm107. it would be nice n all, dont get me wrong, but i want to get straight down to business with these new chips, no foreplay necessary

November 16, 2015 | 11:21 PM - Posted by Anonymous (not verified)

AMD needs one discrete mobile SKU that can make its way into more laptops, unless AMD expects that APUs will fill that need! An interposer-based laptop APU SKU with HBM would really provide the bandwidth necessary for an APU to replace the need for a discrete GPU in laptops! So Zen on an interposer with HBM for laptops, or get a line of discrete GCN mobile GPUs that can get AMD more laptop presence.

November 17, 2015 | 12:42 AM - Posted by Anonymous (not verified)

interposers and HBM Zen... same regurgitated shit you mention in every article... fuck off already

November 17, 2015 | 11:59 AM - Posted by Anonymous (not verified)

Why should I, especially if it irritates you so much? That in itself makes it worth mentioning every single time. Man, even Apple could go for Zen based desktop/laptop SKUs on an interposer with HBM memory, with lots of motherboard space saved as well as power-saving HBM! That HBM alone could feed any GPU's bandwidth needs, so an interposer-based APU will allow Zen CPUs (fabricated on the process node that best suits desktop CPUs) and GPUs (fabricated on the process node that best suits GPUs) to be joined together on a silicon interposer and wired up with uber-wide parallel traces CPU to GPU, as well as GPU and CPU to HBM. All low-clocked, high-effective-bandwidth connection fabrics traced out on the interposer's substrate! Those interposer-based Zen APUs will eventually be used for all of AMD's desktop and mobile APUs!

Yes, even for low power APUs that may see the CPU/GPU portion fabricated together using high density design libraries and connected on an interposer to HBM, allowing AMD to pack more circuitry onto a 14nm node than Intel can using its standard low density CPU design libraries! It's not as if Intel will be clocking its mobile CPU/SoC parts higher, so what's the use of low density high performance design libraries for low power parts anyway! AMD can get the same extra 30% CPU die area savings out of the 14nm node as it got on 28nm with Carrizo's CPU cores, allowing AMD to use more GPU ACE units on its mobile SKUs! AMD could fit more CPU cores on the 14nm node, or extra GPU ACE units, just by utilizing the high density design libraries normally used for GPU cores, applied in a very innovative way to CPU cores to get more space savings on any process node. For low power mobile parts, high density design libraries are the way to go!

November 17, 2015 | 03:39 PM - Posted by Anonymous (not verified)

if apple ever goes for another cpu it's going to be their own. AMD is dead sooner or later; accept it

November 17, 2015 | 04:27 PM - Posted by Anonymous (not verified)

Not if Apple wants an x86 based CPU for its Mac Pro or its high end MacBooks. Apple sells to some people who dual boot OS X with Windows, and Apple could probably replace its lowest end MacBook Airs with its custom ARM SoC, but Apple's custom ARM SKUs are not ready to take on more powerful computing just yet (software reasons, mostly)! Apple can't get an x86 license from Intel, or the x86 64-bit license from AMD, but Apple can certainly commission a custom x86 based Zen APU on an interposer from AMD's semi-custom APU division.

Look at Sony's and M$'s console SKUs: the development was paid for by Sony for the PS4 and by M$ for the XBONE! Both Sony and M$ have exclusive rights to use the designs that they commissioned! But neither Sony nor M$ can sell the chips to others or make them on their own; they have to go through AMD by contract, as AMD holds an x86 (32/64-bit) ISA license. Apple merely needs to reach down into its petty cash drawer and commission AMD's semi-custom division for AMD Zen APUs on an interposer built to Apple's exact specifications, and Apple can easily fund any extra R&D necessary for AMD to get the job done! Apple could get the level of product control it needs via a contract with AMD, with the same sorts of exclusivity that M$ and Sony have with their console SKUs, and AMD would be happy to supply Apple's every need for the commissioned parts.

Oh, the motherboard space saved using some Zen cores on an interposer with HBM, and that APU's graphics getting all the bandwidth it needs to eat all of Intel's lunch! Apple has a bit more cash than even Intel's large Wad-O-Cash! Intel will have to lick Apple's boots hard, and Intel still lacks AMD's graphics IP and GCN ACE units that can happily accelerate GPU compute as well as graphics! Zen based APUs on an interposer with HBM are a very attractive proposition for Apple's thin and light obsession in laptops, and for workstation SKUs!

November 17, 2015 | 04:53 PM - Posted by Anonymous (not verified)

I don't want to feed the mania here, but I wouldn't assume that Apple will stay x86. They have switched architectures several times in the past. Apple probably wants total control over their hardware, and x86 (really AMD64) is blocking them from having that control. They cannot build an AMD64 processor themselves, and they can only get an AMD64 processor from Intel, AMD, or VIA. If they switch totally to ARM, then they can make the processors themselves, or purchase them from any of a large number of ARM license holders. AMD happens to be making an ARM processor based on Zen, which may offer similar performance to Zen. Apple could use this as a temporary solution until their in-house, high-performance ARM processors are ready. This would still be good for AMD.

November 17, 2015 | 10:47 PM - Posted by Anonymous (not verified)

Yes, but AMD is not going to let Apple have its custom K12 micro-architecture, and Apple already has a top-tier ARMv8-A architectural license from ARM Holdings. Apple's custom A9s are only using the licensed ARMv8-A ISA; everything else about the Apple A9 cores is custom, designed by Apple's P.A. Semi folks via Apple's acquisition of P.A. Semi some years back. Apple owns the A7/A8/A9 custom micro-architecture, but ARM Holdings owns the rights to the ARMv8 ISA that Apple's custom A-series micro-architecture runs!

AMD has cooked up a custom K12 micro-architecture of its own, but AMD has to license the ARMv8-A ISA from ARM Holdings. Why should Apple spend the billions, and the years, to shift away from x86 when Apple can commission a custom x86 Zen based APU from AMD without having to hold an x86 license? And AMD invented the 64-bit extensions to the x86 ISA, not Intel; Intel only invented the 16/32-bit x86 ISA, tried its 64-bit Itanium ISA, and when that flopped was forced to license AMD's 64-bit x86 extensions! Both Intel and AMD would block Apple from using their ISA/ISA extensions. Apple does not need complete control over the x86 ISA to get a custom part made by AMD under AMD's x86 license! Hell, Jim Keller worked for Apple, and Jim Keller has already designed Zen, so Apple does not have to worry about designing its own x86 core; that's already done with AMD's Zen.

Incidentally, Apple beat ARM Holdings (the inventor of the ARMv8-A ISA) to market with the first 64-bit ARMv8-A CPU core, the Apple Cyclone core. But ARM Holdings did not worry about getting beaten to market by Apple, because ARM Holdings does not build SoCs; it only designs/licenses reference ARM cores, or licenses the ISA alone, and lets others build the actual SoCs (custom or reference). ARM Holdings makes licensing revenue from every Apple SoC sold that uses the ARMv8-A ISA.

Apple would be better off, sales wise, sticking to x86 and AMD x86 based products, and AMD will build a custom Zen based x86 APU to Apple's exact specifications if Apple commissions one. Apple does not need an x86 license to get a custom designed Zen x86 APU for Apple's exclusive use, just like the console makers get from AMD. Sure, AMD's K12 is going to be more powerful than Apple's A9s if AMD includes SMT capabilities in its custom ARMv8-A K12 cores, but Apple has billions invested in its x86 based software ecosystem, and it takes a good few years to transfer over to another ISA. Besides, it's AMD's graphics that make for a better APU, even more than Zen or K12 CPU cores; those ACE units can crunch more than graphics workloads. So Apple would do better switching to AMD's x86 Zen based CPU cores and getting the GCN HSA graphics/compute ACE units added to the deal.

Sure, K12 is going to be a nice custom core running the ARMv8-A ISA, but the software for OS X would take a few years to convert, and why bother right away when a custom Zen based APU on an interposer can be commissioned by Apple from AMD, for Apple's exclusive use!

November 17, 2015 | 05:15 PM - Posted by Anonymous (not verified)

Things are looking up for AMD; they have almost nowhere to go but up. If they actually declared bankruptcy, their assets would probably be purchased by another company, similar to what happened with SGI. SGI went out, but Rackable (I think; no time to look up details right now) bought their assets, changed their name to SGI, and continued to do business. For the enthusiast, there would probably be little change. AMD's assets would be very valuable, contrary to what a lot of idiots on forums seem to think. How does the performance of AMD chips compare with any others? They are in second place to Intel, but compared to the rest of the industry, they perform very well. There are probably a lot of ARM makers who would want AMD's K12 design. Apple buying them would be a disaster for the PC space, though.

Intel, on the other hand, has almost nowhere to go but down. They don't seem to have tech good enough to compete with silicon interposers. They make massive margins off their large die parts, which are large due to a lot of cache and a lot of cores. How will a large on-die cache compete with some HBM L4 cache? The other issue is that CPUs just are not that important anymore. If you look at gaming, a lower performance CPU is probably fine; most games where people actually care about the performance are GPU limited. Intel will have to go into the GPU business. Does Intel want to move into the GPU business? Not really. If you look at the die sizes, Intel probably gets thousands of dollars for a CPU the size of a high-end GPU. The high-end GPUs, along with some expensive memory and the card they go on, sell for ~$700. Intel would sell a similar amount of silicon for more like $7000.
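To make the margin comparison concrete, here is a hedged sketch; every number in it (die areas, the $4000 CPU price, treating the $700 as full-board price) is an illustrative assumption, not a quoted figure:

```python
# Revenue-per-mm^2 sketch for the CPU-vs-GPU margin comparison.
# All prices and die areas are illustrative assumptions.

def usd_per_mm2(price_usd: float, die_area_mm2: float) -> float:
    """Revenue per mm^2 of silicon, ignoring packaging and board costs."""
    return price_usd / die_area_mm2

cpu = usd_per_mm2(4000.0, 600.0)  # hypothetical large server CPU die
gpu = usd_per_mm2(700.0, 600.0)   # similar-size GPU die, whole-card price
print(f"CPU ~${cpu:.2f}/mm^2 vs GPU ~${gpu:.2f}/mm^2 (~{cpu / gpu:.1f}x)")
```

Under those assumptions the CPU earns several times more per unit of silicon area, which is the asymmetry the comment is pointing at.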

This means that Intel will lose their high margins on their server products because of HBM on silicon interposers. On an interposer, you could even place a separate SRAM cache chip; it would be a lot cheaper than a single monolithic die. They will essentially be forced into the much lower margin GPU market at the same time. Intel doesn't seem to be able to force their way into the mobile market either. All together, it looks like Intel will probably lose their dominant market position and their high margins. Companies like Apple and other computer makers do not like having no choice in suppliers, which means they are not going to like x86/AMD64. ARM may be the architecture to finally kill x86.

November 19, 2015 | 05:45 PM - Posted by Anonymous (not verified)

Probably one of the dumbest comments ever.

January 18, 2016 | 11:51 PM - Posted by Anonymous (not verified)

Not to be rude, but this is a pretty naive understanding of Intel as a company.

Not to mention the Pentium II had an interposer so that's hardly new technology.

November 17, 2015 | 12:59 AM - Posted by anonymous (not verified)

i love you?

November 17, 2015 | 03:01 AM - Posted by renz (not verified)

The problem is not the product but AMD themselves. No matter how good an APU with HBM is, if AMD refuses OEM offers to use their products in laptops, then we can't do anything about it. Take Pitcairn, for example. Efficiency-wise, Pitcairn is not that bad; in discrete GPUs the 7800 series was about on par with Nvidia's 660 in terms of average power consumption (look at TPU's charts for info). If they had reworked Pitcairn for mobile, AMD would have had an offering that could compete with GK106 solutions in laptops. So the problem is not 'not having the right product' in the laptop segment; it is AMD's decision. When Rory Read became AMD's CEO, one of the changes he made was to refuse an OEM's offer if the OEM did not want to sell the product in a certain volume, because he deemed it not worth the design R&D if the return was small. Nvidia actually had the same dilemma, but they did it anyway, because back then they needed to gain market share versus AMD, since the mobile segment was dominated by AMD. What they did not expect was Rory Read's decision regarding OEM laptops.

Maybe things will change under Lisa Su, but they might not. After all, Lisa Su was supposed to continue what Rory had laid out for her, and AMD was supposed to focus more on their semi-custom business, like those console deals.

November 17, 2015 | 01:00 PM - Posted by Anonymous (not verified)

AMD needs some 35-watt-only high performance laptop APUs that are tuned to run at 35 watts, not 15 watts! AMD could make such a Zen based part on an interposer for gaming laptops, with the CPU cores fabricated on low density high performance design libraries, the GPU separately fabricated on the normal high density GPU design libraries, and HBM supplying the memory.

The interposer technology will allow for this type of SKU, and the CPU can be wired up to the GPU via the interposer with thousands of parallel traces! Enough parallel traces to allow for full cache coherency and sharing of cache resources between CPU and GPU, in addition to data transfer directly between CPU and GPU, in a manner that PCIe could never achieve. PCIe does not have enough parallel communication links to ever match what can be traced out in silicon on an interposer, and PCIe will never be able to match the effective bandwidth that interposer based parallel traces can achieve at much lower clock rates! The same goes for DDR memory channels, with not enough parallel traces on a PCB to DDR memory.
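The width-versus-clock trade-off described above can be sketched with real HBM1 and PCIe 3.0 figures (a 1024-bit stack interface at 500 MHz double data rate, and 16 lanes at 8 GT/s with 128b/130b encoding); treating a CPU-to-GPU interposer fabric as HBM-like in width is my assumption:

```python
# Wide-and-slow (interposer/HBM-style) vs narrow-and-fast (PCIe) links.
# HBM1 and PCIe 3.0 x16 figures are published; assuming a CPU<->GPU
# interposer fabric would be HBM-like in width is illustrative.

def bus_bandwidth_gb_s(width_bits: int, clock_mhz: float,
                       transfers_per_clock: int) -> float:
    """Raw bandwidth of a parallel bus in GB/s."""
    return width_bits / 8 * clock_mhz * 1e6 * transfers_per_clock / 1e9

hbm1_stack = bus_bandwidth_gb_s(1024, 500, 2)  # one HBM1 stack
pcie3_x16 = 16 * 8e9 * (128 / 130) / 8 / 1e9   # 8 GT/s/lane, 128b/130b
print(f"HBM1 stack: {hbm1_stack:.0f} GB/s at 500 MHz")
print(f"PCIe 3.0 x16: {pcie3_x16:.2f} GB/s at 8 GT/s")
```

One HBM1 stack at a 500 MHz clock already offers roughly eight times the raw bandwidth of a PCIe 3.0 x16 link running sixteen times faster per pin, which is the commenter's point about width beating clock rate.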

Having a unified memory address space could be sufficient for most CPU-to-GPU transfers, with just some pointer passing going on, but having a wide parallel direct connection between CPU and GPU could eventually let the CPU directly dispatch floating point and other work to the GPU, along with blocks of data for the GPU to work on and return directly to the CPU, for the fastest-response high priority tasks when there are not enough CPU based FP or other resources available. Do not forget that AMD's GPU ACE units will be doing more compute tasks in addition to graphics tasks, so the CPU will be less overwhelmed with compute workloads. For VR the fastest response is necessary, or VR will not work properly and cookies could be hurled!

It was the OEMs that mostly shoved Carrizo into 15-watt-limited thin and light laptops, and the OEMs were again using AMD for the lowest end, cheaply done laptop SKUs. The OEMs could have allowed users the option of varying the wattage from 15 to 35 watts, and provided the necessary laptop cooling to allow the Carrizo FX-8800P to run at 35 watts for maximum performance. Even at 15 watts, the FX-8800P still outperforms some of Intel's U/M series parts that cost much more and have poorer graphics performance than AMD's GCN APU graphics.

November 17, 2015 | 04:45 PM - Posted by Anonymous (not verified)

This stuff that you are going on and on about doesn't happen overnight. AMD is certainly working on a Zen CPU for placement on an interposer for their HPC products. This CPU will have to have a different design from the CPU made to work with DDR memory; the Zen that taped out recently will be the version made to work with DDR memory. The HPC APU probably will not arrive until 2017 or later. HBM is still going to be an expensive product for a while, and laptops with an integrated GPU are actually mostly lower-end products. They could make a high end APU with HBM for a gaming laptop eventually, but I suspect it will be quite a while.

I suspect we will see an APU with directly connected GDDR before we get HBM. That could be done quite easily, but it would require a different die, since it would need a different memory controller, and AMD probably doesn't have the resources for putting out so many die variants. Once silicon interposers become more common and cheaper, they can solve some of these issues: you can make modular dies and mix and match different devices to get the features you want. That requires a lot of design work, though, and taping out many different dies. I don't think AMD or Nvidia will switch 100% to interposers. This means they still need a lower end product with a GDDR variant, which means multiple different dies. The high end devices will be on interposers while the mid and low end will be GDDR variants.

November 17, 2015 | 05:22 PM - Posted by Anonymous (not verified)

No APUs with that power hog GDDR5! GDDR5 is Nvidia's problem now that AMD and Hynix came up with the HBM process, and AMD/Hynix were even nice enough to release the HBM standard to JEDEC so others could develop HBM processes of their own. Nvidia can get the leftovers from Hynix, if there are any remaining, while Nvidia gets the rest later from Samsung!

We are talking about HBM coming from AMD/Hynix and their fab partners: enough interposer based HBM to keep up with AMD's needs before any third party can get HBM from AMD/Hynix. If Apple funds a commissioned Zen part via AMD, then AMD/Hynix will be sure to meet that need also. Apple has the funds to guarantee AMD/Hynix more fab space for its HBM/interposer needs for anything that Apple commissions. GDDR5 still requires PCB space-hogging memory chips, to go along with GDDR5's massive power appetite. No, it's HBM and the space/power savings that will entice Apple over the next few years to make its decision.

Maybe not all HBM yet for discrete GPUs, but if Apple wants to commission a custom Zen based APU on an interposer, well, Apple has the cash to make AMD/Hynix's supply of extra HBM and interposers assured for any of Apple's custom needs! AMD and Hynix only need to worry about the engineering; the rest will be funded by Apple if Apple decides to order a custom Zen APU/HBM interposer based part!

November 17, 2015 | 10:45 AM - Posted by Anonymous (not verified)

Will we ever see Hawaii (Grenada) or Fiji shrunk down to 14/16nm? I must say that Tonga has never been very impressive, especially in price/performance. A cooler, more efficient Hawaii/Fiji would be much more exciting than Antigua.

November 17, 2015 | 01:30 PM - Posted by Anonymous (not verified)

ooohhhh fury x is already obsolete

November 17, 2015 | 04:27 PM - Posted by Anonymous (not verified)

If you look at how well AMD GPUs are doing under DX12, it seems that AMD's architecture, even in older chips like Hawaii, does very well. AMD created a very forward looking architecture. Given Nvidia's possible problems with asynchronous compute, it seems like Nvidia's chips are the more likely ones to become obsolete.

November 18, 2015 | 01:06 AM - Posted by renz (not verified)

Did Nvidia really need async compute? I mean, looking at the benchmarks, even Nvidia's DX11 is much faster than AMD's DX12.

November 18, 2015 | 04:33 AM - Posted by ppi (not verified)

Not really a good comparison.

First: the Fury X is a screwed-up card. AMD:
1) made a HUUUGE shader array that is pretty much useless, because nobody is going to use this as an HPC card, as AMD does not have the SW support; and in the process they
2) did not give the card enough ROPs and texture units to go with the shaders, so the card is now unbalanced for games; and
3) to give it sufficient performance to not lose badly vs the 980 Ti, they had to OC it from Nano-style frequencies.

As someone generally in favor of AMD, I have to say I cannot remember a benchmark where the Fury X beat the 980 Ti in a non-marginal way. At the same price, it's a real no-brainer.

More interesting are the 390/290(X) vs 980/970 comparisons. For me, also because I am not going to shell out $650 for a toy that will last 2-3 years.

The 980 is missing there; that would have been a very interesting comparison point.

Second: these are OC results, and not factory OC, but OC of factory OC'd cards. Maybe the guy has a methodology somewhere for how he got there, but OCing is always a silicon lottery.

November 17, 2015 | 10:24 PM - Posted by Pixy Misa (not verified)

The R9 Nano is pretty much a tech demo of what we can expect at the 14/16nm node. With die sizes and costs cut by 50% or so, the new midrange cards will match the Nano (though probably with GDDR5 rather than HBM), and the high-end cards should double it.

And that's without any further power savings from the new process node.

I'm hanging on to my 7950 for a little while longer.
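The "cut by 50% or so" estimate in the comment above can be checked against naive area scaling, assuming an ideal linear feature shrink (real node transitions deliver less than this):

```python
# Ideal-shrink area scaling between process nodes. Marketing node names
# overstate real density gains, so these are upper-bound estimates.

def area_scale(old_nm: float, new_nm: float) -> float:
    """Fraction of the original die area after an ideal linear shrink."""
    return (new_nm / old_nm) ** 2

print(f"28 -> 16 nm: {area_scale(28, 16):.2f}x area")  # ideal ~0.33x
print(f"28 -> 20 nm: {area_scale(28, 20):.2f}x area")  # ideal ~0.51x
```

An ideal 28-to-16 nm shrink would cut area to about a third; the comment's 50% figure amounts to assuming the 14/16 nm FinFET nodes behave more like an ideal 20 nm shrink, which is a reasonable hedge.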
