Author:
Subject: Processors
Manufacturer: AMD

Lower Power, Same Performance

AMD is in a strange position in that there is a lot of excitement about their upcoming Zen architecture, but we are still many months away from that introduction.  AMD obviously needs to keep the dollars flowing in, and part of that means that we get refreshes now and then of current products.  The “Kaveri” products that have been powering the latest APUs from AMD have received one of those refreshes.  AMD has done some redesigning of the chip and tweaked the process technology used to manufacture them.  The resulting product is the “Godavari” refresh that offers slightly higher clockspeeds as well as better overall power efficiency as compared to the previous “Kaveri” products.

7860K_01.jpg

One of the first refreshes was the A8-7670K that hit the ground in November of 2015.  This is a slightly cut down part that features 6 GPU compute units vs. the 8 that a fully enabled Godavari chip has.  This continues to be a FM2+ based chip with a 95 watt TDP.  The clockspeed of this part goes from 3.6 GHz to 3.9 GHz.  The GPU portion runs at the same 757 MHz that the original A10-7850K ran at.  It is interesting to note that it is still a 95 watt TDP part with essentially the same clockspeeds as the 7850K, but with two fewer GPU compute units.

The other product being covered here is a bit more interesting.  The A10-7860K looks to be a larger improvement from the previous 7850K in terms of power and performance.  It shares the same CPU clockspeed range as the 7850K (3.6 GHz to 3.9 GHz), but improves upon the GPU clockspeed by hitting around 800 MHz.  At first this seems underwhelming until we realize that AMD has lowered the TDP from 95 watts down to 65 watts.  Less power consumed and less heat produced for the same performance from the CPU side and improved performance from the GPU seems like a nice advance.
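To put rough numbers on that trade-off, here is a quick back-of-the-envelope sketch using only the figures quoted above. It treats TDP as a stand-in for actual power draw, which is an assumption (TDP is a design rating, not a measurement), and it takes "around 800 MHz" as exactly 800.

```python
# Figures quoted in the text. TDP is only a rough proxy for real power draw,
# and 800 MHz is the article's approximate GPU clock for the A10-7860K.
parts = {
    "A10-7850K": {"tdp_w": 95, "gpu_mhz": 757},
    "A10-7860K": {"tdp_w": 65, "gpu_mhz": 800},
}

old, new = parts["A10-7850K"], parts["A10-7860K"]

# Fraction of the TDP budget that was cut (0.0 = no change).
tdp_reduction = 1 - new["tdp_w"] / old["tdp_w"]

# Relative increase in GPU clock.
gpu_clock_gain = new["gpu_mhz"] / old["gpu_mhz"] - 1

# GPU MHz per rated watt, new part relative to old part.
clock_per_watt_ratio = (new["gpu_mhz"] / new["tdp_w"]) / (old["gpu_mhz"] / old["tdp_w"])

print(f"TDP reduction:        {tdp_reduction:.1%}")
print(f"GPU clock gain:       {gpu_clock_gain:.1%}")
print(f"GPU clock per watt:   {clock_per_watt_ratio:.2f}x")
```

Clock per rated watt is a crude metric, but it shows why a modest GPU clock bump looks much better once the lower TDP is taken into account.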

amd_cool_stack.png

AMD continues to utilize GLOBALFOUNDRIES' 28 nm Bulk/HKMG process for their latest APUs and will continue to do so until Zen is released late this year.  This is not the same 28 nm process that we were introduced to over four years ago.  Over that time the process has been refined to improve yields and bins, as well as to optimize power and clockspeed.  GF also can adjust the process on a per batch basis to improve certain aspects of a design (higher speed, more leakage, lower power, etc.).  They cannot produce miracles though.  Do not expect 22 nm FinFET performance or density with these latest AMD products.  Those kinds of improvements will show up with Samsung/GF’s 14nm LPP and TSMC’s 16nm FF+ lines.  While AMD will be introducing GPUs on 14nm LPP this summer, the Zen launch in late 2016 will be the first AMD CPU to utilize that advanced process.

Click here to read the entire AMD A10-7860K and A8-7670K Review!

AMD Pre-Announces 7th Gen A-Series SOC

Subject: Processors | April 5, 2016 - 06:30 AM |
Tagged: mobile, hp, GCN, envy, ddr4, carrizo, Bristol Ridge, APU, amd, AM4

Today AMD is “pre-announcing” their latest 7th generation APU.  Codenamed “Bristol Ridge”, this new SOC is based off of the Excavator architecture featured in the previous Carrizo series of products.  AMD provided very few details as to what is new and different in Bristol Ridge as compared to Carrizo, but they did offer a few nice hints.

br_01.png

They were able to provide a die shot of the new Bristol Ridge APU, and at first glance there appear to be some interesting differences between it and the previous Carrizo. Unfortunately, there really are no changes that we can see from this shot. Those new functional units that you are tempted to speculate about? For some reason AMD decided to widen out the shot of this die. Those extra units around the border? They are the adjacent dies on the wafer. I was bamboozled at first, but happily Marc Sauter pointed it out to me. No new functional units for you!

carrizo_die.jpg

This is the Carrizo shot. It is functionally identical to what we see with Bristol Ridge.

AMD appears to be using the same 28 nm HKMG process from GLOBALFOUNDRIES.  This is not going to give AMD much of a jump, but from information in the industry GLOBALFOUNDRIES and others have put an impressive amount of work into several generations of 28 nm products.  TSMC is on their third iteration which has improved power and clock capabilities on that node.  GLOBALFOUNDRIES has continued to improve their particular process and likely Bristol Ridge is going to be the last APU built on that node.

br_02.png

All of the competing chips are rated at 15 watts TDP. Intel has the compute advantage, but AMD is cleaning up when it comes to graphics.

The company has also continued to improve upon their power gating and clocking technologies to keep TDPs low, yet performance high.  AMD recently released the Godavari APUs to the market, which exhibit better clocking and power characteristics than the previous Kaveri.  Little was done on the actual design; rather, it was improved process tech as well as better clock control algorithms that achieved these advances.  It appears as though AMD has continued this trend with Bristol Ridge.

We likely are not seeing per clock increases, but rather higher and longer sustained clockspeeds providing the performance boost that we are seeing between Carrizo and Bristol Ridge.  In these benchmarks AMD is using 15 watt TDP products.  These are mobile chips and any power improvements will show off significant gains in overall performance.  Bristol Ridge is still a native quad core part with what looks to be an 8 module GCN unit.

br_03.png

Again with all three products at a 15 watt TDP we can see that AMD is squeezing every bit of performance it can with the 28 nm process and their Excavator based design.

The basic core and GPU design look relatively unchanged, but obviously there were a lot of tweaks applied to give the better performance at comparable TDPs.  

AMD is announcing this along with the first product that will feature this APU: the HP Envy X360.  This convertible tablet offers some very nice features and looks to be one of the better implementations that AMD has seen using its latest APUs.  Carrizo had some wins, but taking marketshare back from Intel in the mobile space has been tortuous at best. AMD obviously hopes that Bristol Ridge in the sub-35 watt range will continue to show fight for the company in this important market.  Perhaps one of the more interesting features is the option for a PCIe SSD.  Hopefully AMD will send out a few samples so we can see what a more “premium” type convertible can do with the AMD silicon.

br_04.png

The HP Envy X360 convertible in all of its glory.

Bristol Ridge will be coming to the AM4 socket infrastructure in what appears to be a Computex timeframe.  These parts will of course feature higher TDPs than what we are seeing here with the 15 watt unit that was tested.  It seems at that time AMD will announce the full lineup from top to bottom and start seeding the market with AM4 boards that will eventually house the “Zen” CPUs that will show up in late 2016.

Source: AMD

Rumor: Polaris Is the next AMD Radeon Core Architecture

Subject: Graphics Cards | December 31, 2015 - 01:41 PM |
Tagged: rumor, report, radeon, Polaris, graphics card, gpu, GCN, amd

A report claims that Polaris will succeed GCN (Graphics Core Next) as the next AMD Radeon GPU core, which will power the 400-series graphics cards.

AMD-Polaris.jpg

Image via VideoCardz.com

As these rumors go, this is about as convoluted as it gets. VideoCardz has published the story, sourced from WCCFtech, who was reporting on a post with supposedly leaked slides at HardwareBattle. The primary slide in question has since been pulled, and appears below:

slide.png

Image via HWBattle.com

Of course the name does nothing to provide architectural information on this presumptive GCN replacement, and a new core for the 400-series GPUs was expected anyway after the 300-series was largely a rebranded 200-series (that's a lot of series). Let's hope actual details emerge soon, but for now we can speculate on mysterious tweets from certain interested parties:

 

Source: VideoCardz

AMD GPU Architectures pre-GCN Are Now Legacy

Subject: Graphics Cards | November 26, 2015 - 03:09 PM |
Tagged: amd, graphics drivers, GCN, terascale

The Graphics Core Next (GCN) architecture is now a minimum requirement for upcoming AMD graphics drivers. If your graphics card (or APU) uses the TeraScale family of microarchitectures, then your last expected WHQL driver is AMD Catalyst 15.7.1 for Windows 7, 8.x, and 10. You aren't entirely left out of Radeon Software Crimson Edition, however. The latest Crimson Edition Beta driver is compatible with TeraScale, but the upcoming certified one will not be.

AMD-Catalyst.jpg

GCN was introduced with the AMD Radeon HD 7000 series, although it was only used in the Radeon HD 7700 series GPUs and above. The language doesn't seem to rule out an emergency driver release, such as if Microsoft breaks something in a Windows 10 update that causes bluescreens and fire on older hardware, but they also don't say that they will either. NVIDIA made a similar decision to deprecate pre-Fermi architectures back in March of 2014, which applied to the release of GeForce 343 Drivers in September of that year. Extended support for NVIDIA's old cards ends on April 1st, 2016.

I wonder why AMD chose a beta driver to stop with, though. If AMD intended to support TeraScale with Crimson, then why wouldn't they keep it supported until at least the first WHQL-certified version? If they didn't intend to support TeraScale, then why go through the effort of supporting it with the beta driver? This implies that AMD reached a hurdle with TeraScale that they didn't want to overcome. That may not be the case, but it's the first thing that comes to my mind nonetheless. Probably the best way to tell is to see how Radeon HD 6000-series (or lower-end 7000/8000-series) cards work with Radeon Software Crimson Beta.

Likely the last drivers that users with Radeon HD 6000-series graphics need are 15.7.1 or Radeon Software Crimson Edition Beta. We will soon learn which of the two will be best long-term.

Or, of course, you can buy a newer GPU / APU when you get a chance.

Source: AMD

AMD Plans Two GPUs in 2016

Subject: Graphics Cards | November 16, 2015 - 09:34 PM |
Tagged: amd, radeon, GCN

Late last week, Forbes published an editorial by Patrick Moorhead, who spoke with Raja Koduri about AMD's future in the GPU industry. Patrick was a Corporate Vice President at AMD until late 2011. He then created Moor Insights and Strategy, which provides industry analysis. He regularly publishes editorials to Forbes and CIO. Raja Koduri is the head of the Radeon Technologies Group at AMD.

amd-gaming-evolved.png

I'm going to be focusing on a brief mention a little more than half-way through, though. According to the editorial, Raja stated that AMD will release two new GPUs in 2016. “He promised two brand new GPUs in 2016, which are hopefully going to both be 14nm/16nm FinFET from GlobalFoundries or TSMC and will help make Advanced Micro Devices more power and die size competitive.”

We have been expecting AMD's Arctic Islands to arrive at some point in 2016, which will compete with NVIDIA's Pascal architecture at the high end. AMD's product stack has been relatively stale for a while, with most of the innovation occurring at the top end and pushing the previous top-end down a bit. Two new GPU architectures almost definitely mean that the second one will focus on the lower end of the market, making more compelling products on smaller processes that are more power efficient, cheaper per unit, and include newer features.

Add the recent report of the Antigua architecture, which I assume is in addition to AMD's two architecture announcement, and AMD's product stack could look much less familiar next year.

Source: Forbes
Manufacturer: PC Perspective

To the Max?

Much of the PC enthusiast internet, including our comments section, has been abuzz with “Asynchronous Shader” discussion. Normally, I would explain what it is and then outline the issues that surround it, but I would like to swap that order this time. Basically, the Ashes of the Singularity benchmark utilizes Asynchronous Shaders in DirectX 12, but they disable it (by Vendor ID) for NVIDIA hardware. They say that this is because, while the driver reports compatibility, “attempting to use it was an unmitigated disaster in terms of performance and conformance”.

epic-2015-ue4-dx12.jpg

AMD's Robert Hallock claims that NVIDIA GPUs, including Maxwell, cannot support the feature in hardware at all, while all AMD GCN graphics cards do. NVIDIA has yet to respond to our requests for an official statement, although we haven't poked every one of our contacts yet. We will certainly update and/or follow up if we hear from them. For now though, we have no idea whether this is a hardware or software issue. Either way, it seems more than just politics.

So what is it?

Simply put, Asynchronous Shaders allows a graphics driver to cram workloads in portions of the GPU that are idle, but not otherwise available. For instance, if a graphics task is hammering the ROPs, the driver would be able to toss an independent physics or post-processing task into the shader units alongside it. Kollock from Oxide Games used the analogy of HyperThreading, which allows two CPU threads to be executed on the same core at the same time, as long as it has the capacity for it.
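A toy model makes the idea concrete. The sketch below is purely illustrative: the unit names, task names, and millisecond figures are made up, and real GPUs schedule work at a far finer grain. Each task lists how long it keeps each unit busy. Run back to back, each task takes as long as its bottleneck unit; with perfect overlap, the frame can be no shorter than the busiest unit overall.

```python
def serial_time(tasks):
    """Tasks run one after another; each is bound by its own bottleneck unit."""
    return sum(max(task.values()) for task in tasks)


def overlapped_lower_bound(tasks):
    """Best case with perfect overlap: limited by the most heavily loaded unit,
    and never shorter than the longest single task."""
    units = {unit for task in tasks for unit in task}
    busiest_unit = max(sum(task.get(unit, 0.0) for task in tasks) for unit in units)
    longest_task = max(max(task.values()) for task in tasks)
    return max(busiest_unit, longest_task)


# Hypothetical per-unit busy times in milliseconds.
graphics = {"rop": 10.0, "shader": 3.0}   # hammering the ROPs, shaders mostly idle
physics = {"shader": 6.0}                 # independent compute work

tasks = [graphics, physics]
serial = serial_time(tasks)
overlapped = overlapped_lower_bound(tasks)
print(f"serial: {serial} ms, ideal overlap: {overlapped} ms, "
      f"speedup ceiling: {serial / overlapped:.2f}x")
```

The overlapped figure is a ceiling, not a prediction. Shared resources such as memory bandwidth will eat into it.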

Kollock also notes that compute is becoming more important in the graphics pipeline, and it is possible to completely bypass graphics altogether. The fixed-function bits may never go away, but it's possible that at least some engines will completely bypass it -- maybe even their engine, several years down the road.

I wonder who would pursue something so silly, whether for a product or even just research.

But, like always, you will not get an infinite amount of performance by reducing your waste. You are always bound by the theoretical limits of your components, and you cannot optimize past that (except for obviously changing the workload itself). The interesting part is: you can measure that. You can absolutely observe how long a GPU is idle, and represent it as a percentage of a time-span (typically a frame).
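As a minimal sketch of that measurement, assume a profiler hands you a list of busy intervals for one frame (the interval data and function names here are hypothetical). Merging the intervals gives the busy time, and the remainder is the idle fraction, which also caps how much extra work could be squeezed in.

```python
def busy_time(intervals):
    """Total time covered by (start, end) intervals, counting overlaps once."""
    total = 0.0
    covered_until = float("-inf")
    for start, end in sorted(intervals):
        start = max(start, covered_until)  # skip any part already counted
        if end > start:
            total += end - start
            covered_until = end
    return total


def idle_fraction(frame_ms, intervals):
    """Share of the frame during which the GPU did nothing."""
    return (frame_ms - busy_time(intervals)) / frame_ms


def max_extra_work(frame_ms, intervals):
    """Upper bound on additional work, relative to the work already done."""
    busy = busy_time(intervals)
    return (frame_ms - busy) / busy


# Hypothetical 20 ms frame with three busy intervals (two of them overlap).
frame_ms = 20.0
busy = [(0.0, 6.0), (5.0, 12.0), (14.0, 18.0)]
print(f"idle: {idle_fraction(frame_ms, busy):.0%}, "
      f"headroom: {max_extra_work(frame_ms, busy):.0%} more work at best")
```

Note that the headroom is measured against the work already being done, not against the whole frame, so it is always the larger of the two percentages.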

And, of course, game developers profile GPUs from time to time...

According to Kollock, he has heard of some console developers getting up to 30% increases in performance using Asynchronous Shaders. Again, this is on console hardware and so this amount may increase or decrease on the PC. I also had an informal chat with a developer at Epic Games, so a massive grain of salt is required. His late night, “totally speculative” ballpark guesstimate is that, on the Xbox One, the GPU could theoretically accept a maximum of ~10-25% more work in Unreal Engine 4, depending on the scene. He also said that memory bandwidth gets in the way, which Asynchronous Shaders would be fighting against. It is something that they are interested in and investigating, though.

AMD-2015-MantleAPI-slide1.png

This is where I speculate on drivers. When Mantle was announced, I looked at its features and said “wow, this is everything that a high-end game developer wants, and a graphics developer absolutely does not”. From the OpenCL-like multiple GPU model taking much of the QA out of SLI and CrossFire, to the memory and resource binding management, this should make graphics drivers so much easier.

It might not be free, though. Graphics drivers might still have a bunch of games to play to make sure that work is stuffed through the GPU as tightly packed as possible. We might continue to see “Game Ready” drivers in the coming years, even though much of that burden has been shifted to the game developers. On the other hand, maybe these APIs will level the whole playing field and let all players focus on chip design and efficient ingestion of shader code. As always, painfully always, time will tell.

AMD is avoiding the heat in Carrizo

Subject: Processors | February 24, 2015 - 06:18 PM |
Tagged: Puma+, Puma, Kaveri, ISSCC 2015, ISSCC, GCN, Excavator, Carrizo-L, carrizo, APU, amd

While it is utterly inconceivable that Josh might have missed something in his look at Carrizo, that hasn't stopped certain Canadians from talking about Gila County, Arizona.  AMD's upcoming processor launch is a little more interesting than just another Phenom II launch, especially for those worried about power consumption.  With Adaptive Voltage and Frequency Scaling the new Excavator based chips will run very well at the sub-15W per core pair range which is perfect for POS, airplane entertainment and even in casinos.  The GPU portion speaks to those usage scenarios though you can't expect an R9 295 at that wattage.  Check out Hardware Canucks' coverage right here.

Carrizo.PNG

"AMD has been working hard on their mobile Carrizo architecture and they're now releasing some details about these Excavator architecture-equipped next generation APUs."

Author:
Subject: Processors
Manufacturer: AMD

AMD Details Carrizo Further

Some months back AMD introduced us to their “Carrizo” product.  Details were slim, but we learned that this would be another 28 nm part that has improved power efficiency over its predecessor.  It would be based on the new “Excavator” core that will be the final implementation of the Bulldozer architecture.  The graphics will be based on the latest iteration of the GCN architecture as well.  Carrizo would be a true SOC in that it integrates the southbridge controller.  The final piece of information that we received was that it would be interchangeable with the Carrizo-L SOC, which is an extremely low power APU based on the Puma+ cores.

car_01.jpg

A few months later we were invited by AMD to their CES meeting rooms to see early Carrizo samples in action.  These products were running a variety of applications very smoothly, but we were not informed of speeds and actual power draw.  All that we knew is that Carrizo was working and able to run pretty significant workloads like high quality 4K video playback.  Details were yet again very scarce other than the expected timeline of release, the TDP ratings of these future parts, and how it was going to be a significant jump in energy efficiency over the previous Kaveri based APUs.

AMD is presenting more information on Carrizo at the ISSCC 2015 conference.  This information dives a little deeper into how AMD has made the APU smaller, more power efficient, and faster overall than the previous 15 watt to 35 watt APUs based on Kaveri.  AMD claims that they have a product that will increase power efficiency in a way not ever seen before for the company.  This is particularly important considering that Carrizo is still a 28 nm product.

Click here to read more about AMD's ISSCC presentation on Carrizo!

Awake Yet? Good! Optimizing Inverse Trig for AMD GPUs.

Subject: General Tech, Graphics Cards | December 2, 2014 - 03:11 AM |
Tagged: amd, GCN, dice, frostbite

Inverse trigonometric functions are difficult to compute, and their use is often avoided like the plague. If, however, the value is absolutely necessary, it will probably be computed with an approximation or, if possible, replaced with easier functions by clever use of trig identities.
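For a taste of what such an approximation looks like, here is a minimal Python sketch of one classic approach: the cubic-times-square-root formula from Abramowitz and Stegun's handbook. It is shown for illustration only and is not necessarily the variant that any particular engine ships. The function name `fast_acos` is made up for this example. Negative inputs are handled with the identity acos(-x) = π - acos(x).

```python
import math

# Polynomial coefficients from Abramowitz & Stegun, formula 4.4.45.
_A0, _A1, _A2, _A3 = 1.5707288, -0.2121144, 0.0742610, -0.0187293


def fast_acos(x):
    """Approximate acos(x) for x in [-1, 1] with one sqrt and a cubic polynomial."""
    ax = abs(x)
    # Horner's rule keeps the polynomial to three multiply-adds.
    poly = ((_A3 * ax + _A2) * ax + _A1) * ax + _A0
    result = math.sqrt(1.0 - ax) * poly
    # Mirror negative inputs using acos(-x) = pi - acos(x).
    return math.pi - result if x < 0 else result


# Compare against the library implementation across the whole domain.
samples = [i / 1000 for i in range(-1000, 1001)]
max_err = max(abs(fast_acos(x) - math.acos(x)) for x in samples)
print(f"max absolute error over [-1, 1]: {max_err:.2e} rad")
```

The appeal on a GPU is that the whole thing is a handful of multiply-adds plus a square root, with no branches beyond the sign fix-up.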

arctrig-examples.png

If you want to see how the experts approach this problem, then Sébastien Lagarde, a senior developer of the Frostbite engine at DICE, goes into detail with a blog post. By detail, I mean you will see some GPU assembly being stepped through by the end of it. What makes this particularly interesting is the diagrams at the end, showing what each method outputs as represented by the shading of a sphere.

If you are feeling brave, take a look.

Author:
Subject: Processors
Manufacturer: AMD

Filling the Product Gaps

In the first several years of my PCPer employment, I typically handled most of the AMD CPU refreshes.  These were rather standard affairs that involved small jumps in clockspeed and performance.  These happened every 6 to 8 months, with the bigger architectural shifts happening some years apart.  We are finally seeing a new refresh of the AMD APU parts after the initial release of Kaveri to the world at the beginning of this year.  This update is different.  Unlike previous years, there are no faster parts than the already available A10-7850K.

a10_7800_01.png

This refresh deals with fleshing out the rest of the Kaveri lineup with products that address different TDPs, markets, and prices.  The A10-7850K is still the king when it comes to performance on the FM2+ socket (as long as users do not pay attention to the faster CPU performance of the A10-6800K).  The initial launch in January also featured another part that never became available until now; the A8-7600 was supposed to be available some months ago, but is only making it to market now.  The 7600 part was unique in that it had a configurable TDP that went from 65 watts down to 45 watts.  The 7850K on the other hand was configurable from 95 watts down to 65 watts.

a10_7800_02.png

So what are we seeing today?  AMD is releasing three parts to address the lower power markets that it hopes to expand into.  The A8-7600, again, was detailed back in January but never released until recently.  The other two parts are brand new.  The A10-7800 is a 65 watt TDP part with a cTDP that goes down to 45 watts.  The other new chip is the A6-7400K, which is unlocked, has a configurable TDP, and looks to compete directly with Intel’s recently released 20th Anniversary Pentium G3258.

Click here to read the entire article!