The Architectural Deep Dive
AMD officially unveiled their brand new Bobcat architecture to the world at CES 2011. This was a very important release for AMD in the low power market. Even though Netbooks were a dying breed at that time, AMD experienced a good uptick in sales due to the good combination of price, performance, and power consumption for the new Brazos platform. AMD was of the opinion that a single CPU design would not be able to span the power consumption spectrum of CPUs at the time, and so Bobcat was designed to fill that space which existed from 1 watt to 25 watts. Bobcat never was able to get down to that 1 watt point, but the Z-60 was a 4.5 watt part with two cores and the full 80 Radeon cores.
The Bobcat architecture was produced on TSMC’s 40 nm process. AMD eschewed the upcoming 32 nm HKMG/SOI process that was being utilized for the upcoming Llano and Bulldozer parts. In hindsight, this was a good idea. Yields took a while to improve on GLOBALFOUNDRIES' new process, while the existing 40 nm process from TSMC was running at full speed. AMD was able to provide the market in fairly short order with good quantities of Bobcat based APUs. The product more than paid for itself, and while not exactly a runaway success that garnered many points of marketshare from Intel, it helped to provide AMD with some stability in the market. Furthermore, it provided a very good foundation for AMD when it comes to low power parts that are feature rich and offer competitive performance.
The original Brazos update did not happen. Instead, AMD introduced Brazos 2.0, a product oriented more toward process improvements, which featured slightly higher speeds but remained in the same TDP range. The uptake of this product was limited, and it was clearly a minor refresh meant to buoy purchases of the aging product. Competition was coming from low power Ivy Bridge based chips, as well as AMD’s new Trinity products, which could reach TDPs as low as 17 watts. Brazos and Brazos 2.0 did find a home in low powered but full sized notebooks that were very inexpensive. Even manufacturers that leaned heavily toward Intel, like Toshiba, released Brazos based products in the sub-$500 market. The combination of good CPU performance and above average GPU performance made this a strong product in this particular market. It was so power efficient that only small batteries were typically needed, thereby further lowering the cost.
All things must pass, and Brazos is no exception. Intel has a slew of 22 nm parts that are encroaching on the sub-15 watt territory, ARM partners have quite a few products that are getting pretty decent in terms of overall performance, and the graphics on all of these parts are seeing some significant upgrades. The 40 nm based Bobcat products are no longer competitive with what the market has to offer. So at this time we are finally seeing the first Jaguar based products. Jaguar is not a revolutionary product, but it improves on nearly every aspect of performance and power usage as compared to Bobcat.
A Reference Platform - But not a great one
Believe it or not, AMD claims that the Brazos platform, along with the "Brazos 2.0" update the following year, were the company's most successful mobile platforms in terms of sales and design wins. When it first hit the scene in late 2010, it was going head to head against the likes of Intel's Atom processor and the combination of Atom + NVIDIA ION, and winning. It was sold in mini-ITX motherboard form factors as well as small clamshell notebooks (gasp, dare we say...NETBOOKS?) and though it might not have gotten the universal attention it deserved, it was a great part.
With Kabini (and Temash as well), AMD is making another attempt to pull in some marketshare in the low power, low cost mobile markets. I have already gone over the details of the mobile platforms that AMD is calling Elite Mobility (Temash) and Mainstream (Kabini) in a previous article that launched today.
This article will quickly focus on the real-world performance of the Kabini platform as demonstrated by a reference laptop I received while visiting AMD in Toronto a few weeks ago. While this design isn't going to be available in retail (and I am somewhat thankful based on the build quality) the key is to look at the performance and power efficiency of the platform itself, not the specific implementation.
Kabini Architecture Overview
The building blocks of Kabini are four Jaguar x86 cores and 128 Radeon cores collected in a pair of Compute Units, similar in many ways to the CUs found in the Radeon HD 7000 series discrete GPUs. Josh has written a very good article that focuses on the completely new Jaguar architecture and compares it to other processors, including the Bobcat core that AMD used in Brazos.
2013 Elite Mobility APU - Temash
AMD has a lot to say today. At an event up in Toronto this month we got to sit down with AMD’s marketing leadership and key engineers to learn about the company’s plans for 2013 mobility processors. This includes a refreshed high performance APU known as Richland that will replace Trinity as well as two brand new APUs based on Jaguar CPU cores and the GCN architecture for low power platforms.
Josh has put together an article that details the Jaguar + GCN design of Temash and Kabini and I have also posted some initial performance results of the Kabini reference system AMD handed me in May. This article will detail the plans that AMD has for each of these three mobile segments, starting with the newest entry, AMD’s Elite Mobility APU platform – Temash.
The goal of the APU, the combination of traditional x86 processing cores and a discrete style graphics system, was to offer unparalleled performance in smaller and more efficient form factors. AMD believes that their leadership on the graphics front will offer them a good sized advantage in areas including performance tablets, hybrids and small screen clamshells that may or may not be touch enabled. They are acknowledging though that getting into the smallest tablets (like the Nexus 7) is not on the table quite yet and that content creation desktop replacements are probably outside the scope of Richland.
2013 Elite Mobility APU – Temash
AMD will have the first x86 quad-core SoC design with Temash and AMD thinks it will make a big splash in a relatively new market known as the “high performance” tablet.
Temash, built around Jaguar CPU cores and the graphics technology of GCN, will be able to offer fully accelerated video playback with transcode support, as well as features like image stabilization and Perfect Picture. Temash will also be the only SoC to offer support for DX11 graphics, and even though some games might not have the ability to show off added effects, there are quite a few performance advantages of DX11 over DX10/9. With a claimed GPU performance upgrade of more than 100%, you’ll be able to drive displays at 2560x1600 for productivity use and even be able to take advantage of wireless display options.
Subject: General Tech | April 30, 2013 - 01:23 PM | Jeremy Hellstrom
Tagged: Steamroller, piledriver, Kaveri, Kabini, hUMA, hsa, GCN, bulldozer, APU, amd
AMD may have united GPU and CPU into the APU but one hurdle had remained until now: the non-uniformity of memory access between the two processors. Today we learned about one of the first successful HSA projects, called Heterogeneous Uniform Memory Access, aka hUMA, which will appear in the upcoming Kaveri chip family. This new technology will allow the on-die CPU and GPU to access the same memory pool, both physical and virtual, and any data passed between the two processors will remain coherent. As The Tech Report mentions in their overview, hUMA will not provide as much of a benefit to discrete GPUs. While they will be able to share address space, the widely differing clock speeds between GDDR5 and DDR3 prevent unification to the level of an APU.
Make sure to read Josh's take as well so you can keep up with him on the Podcast.
"At the Fusion Developer Summit last June, AMD CTO Mark Papermaster teased Kaveri, AMD's next-generation APU due later this year. Among other things, Papermaster revealed that Kaveri will be based on the Steamroller architecture and that it will be the first AMD APU with fully shared memory.
Last week, AMD shed some more light on Kaveri's uniform memory architecture, which now has a snazzy marketing name: heterogeneous uniform memory access, or hUMA for short."
Here is some more Tech News from around the web:
- AMD’s new heterogeneous Uniform Memory Access
- hUMA; AMD’s Heterogeneous Unified Memory Architecture @ Hardware Canucks
- Compro TN50W Cloud Network Camera @ Tweaktown
- Wifi Pineapple project uses updated hardware for man-in-the-middle attacks @ Hack a Day
- New OpenWRT Drops Support For Linux 2.4, Low-Mem Devices @ Slashdot
- HP mashes up ProLiant, Integrity, BladeSystem, and Moonshot server @ The Register
- Acer selling tablet using Intel Y series processor @ The Register
- CERN Celebrates 20 Years of an Open Web (and Rebuilds 1st Web Page) @ Slashdot
- BitFenix 5K YouTube Subscriber Giveaway @ eTeknix
heterogeneous Uniform Memory Access
Several years back we first heard of AMD’s plans to create a uniform memory architecture that would allow the CPU to share address spaces with the GPU. The promise here is a very efficient architecture that will provide excellent performance in a mixed environment of serial and parallel programming loads. When GPU computing came on the scene it was full of great promise. The idea of a heavily parallel processing unit that could accelerate both integer and floating point workloads looked like a potential gold mine in a wide variety of applications. Alas, the results we have seen so far have not met those expectations. There are many problems with combining serial and parallel workloads between CPUs and GPUs, and a lot of this has to do with very basic programming and the communication of data between two separate memory pools.
CPUs and GPUs do not share common memory pools. Instead of using pointers in programming to tell each individual unit where data is stored in memory, the current implementation of GPU computing requires the CPU to write the contents of that address to the standalone memory pool of the GPU. This is time consuming and wastes cycles. It also increases programming complexity to be able to adjust to such situations. Typically only very advanced programmers with a lot of expertise in this subject could program effective operations to take these limitations into consideration. The lack of unified memory between CPU and GPU has hindered the adoption of the technology for a lot of applications which could potentially use the massively parallel processing capabilities of a GPU.
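As a rough illustration of the difference described above, the toy Python sketch below contrasts the two models. It is only an analogy under stated assumptions: a `bytearray` stands in for a memory pool, a `memoryview` stands in for a pointer, the function names are hypothetical, and no real GPU API is involved.

```python
# Toy analogy only: plain Python buffers stand in for CPU and GPU memory.
# All names here are hypothetical; no real GPU or HSA API is used.


def run_with_separate_pools(host_data: bytearray) -> tuple[bytearray, int]:
    """Separate-pool model: copy in, compute, copy the result back."""
    bytes_copied = 0
    device_pool = bytearray(host_data)      # copy host -> "GPU" pool
    bytes_copied += len(device_pool)
    for i in range(len(device_pool)):       # the "GPU" doubles each byte
        device_pool[i] = (device_pool[i] * 2) % 256
    result = bytearray(device_pool)         # copy "GPU" pool -> host
    bytes_copied += len(result)
    return result, bytes_copied


def run_with_shared_pool(host_data: bytearray) -> tuple[bytearray, int]:
    """Shared-pool model: hand over a reference and compute in place."""
    view = memoryview(host_data)            # a "pointer" to the same bytes
    for i in range(len(view)):
        view[i] = (view[i] * 2) % 256
    view.release()
    return host_data, 0                     # nothing was copied
```

In the first function the input is untouched and every byte crosses the boundary twice; in the second the caller's buffer is updated directly and no copies are made. Real hardware adds coherency and paging concerns that this analogy does not model.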
The idea for GPU compute has been around for a long time (comparatively). I still remember getting very excited about the idea of using a high end video card along with a card like the old GeForce 6600 GT to be a coprocessor which would handle heavy math operations and PhysX. That particular plan never quite came to fruition, but the idea was planted years before the actual introduction of modern DX9/10/11 hardware. It seems as if this step with hUMA could actually provide a great amount of impetus to implement a wide range of applications which can actively utilize the GPU portion of an APU.
Subject: General Tech | January 24, 2013 - 03:31 PM | Ken Addison
Tagged: video, titan, ps4, podcast, nvidia, kavari, Kabini, H80i, gk110, GCN, corsair, APU, amd, 200r
PC Perspective Podcast #235 - 01/24/2013
Join us this week as we discuss potential AMD Hardware in the PS4, a GK110 NVIDIA product, Corsair 200R case and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath and Allyn Malventano
Program length: 1:16:39
Podcast topics of discussion:
- Week in Reviews:
- 0:13:35 This Podcast is brought to you by MSI!
- News items of interest:
Hardware / Software Pick of the Week
- Ryan: Pegasus R4 with Thunderbolt
- Jeremy: Not since the Sumosac has there been something more sure to get you the ladies!
- Josh: Just built a machine with one of these
- Allyn: Zip Snip ($20 at Lowes)
- Hardware / Software Pick of the Week
- 1-888-38-PCPER or firstname.lastname@example.org
- http://twitter.com/ryanshrout and http://twitter.com/pcper
We are Still Among the Living
The day after the official AMD presentation we were able to sit down with Leslie Sobon for a good hour and really dig into the products we are expecting throughout this next year. AMD did not officially announce any products, but they revealed more details about products on their roadmaps.
To say that AMD is in a somewhat precarious situation is an understatement. This does not necessarily mean that they won’t survive for some years. This was never mentioned to us by AMD, but we can assume that it is not in ATIC’s best interest to let AMD flounder too much. AMD is still GLOBALFOUNDRIES' largest customer, and ATIC believes that they can become a fabrication giant in the next few years. So, while AMD is hitting some hard times, they will be around for some time to come in spite of their issues.
Believe it or not, AMD is still a CPU company with some relevant products. While Intel has the advantage in x86 performance and process technology, AMD has a distinct advantage in integrated graphics. While Trinity was a big step in the right direction in terms of performance and power consumption, it was not enough to boost their flagging marketshare. Throughout 2013 they are working on several products that will help to change their fortunes.
The first product that we will likely see is the Jaguar core based Kabini APUs. These are the next generation, low power APUs which will replace the Brazos 2.0 products that we currently are seeing. These quad core and dual core parts are manufactured by TSMC on their 28 nm process. Kabini will be the first APU to include the new GCN architecture that we currently see in the HD 7700 series and above. AMD will be breaking new ground in offering a true quad core part at price points unseen so far.