The Architectural Deep Dive
AMD officially unveiled its brand new Bobcat architecture to the world at CES 2011. This was a very important release for AMD in the low power market. Even though netbooks were a dying breed by then, AMD saw a solid uptick in sales thanks to the new Brazos platform's combination of price, performance, and power consumption. AMD believed that no single CPU design could span the power consumption spectrum of the time, so Bobcat was designed to fill the space from 1 watt to 25 watts. Bobcat never quite reached that 1 watt point, but the Z-60 was a 4.5 watt part with two cores and the full 80 Radeon cores.
The Bobcat architecture was produced on TSMC’s 40 nm process; AMD eschewed the 32 nm HKMG/SOI process being readied for the upcoming Llano and Bulldozer parts. In hindsight, this was a good call. Yields took a while to improve on GLOBALFOUNDRIES’ new process, while TSMC’s mature 40 nm line was running at full speed, so AMD was able to supply the market with good quantities of Bobcat-based APUs in fairly short order. The product more than paid for itself, and while not exactly a runaway success that took many points of marketshare from Intel, it helped provide AMD with some stability in the market. Furthermore, it provided a very good foundation for AMD's low power parts that are feature rich and offer competitive performance.
The originally planned Brazos successor never materialized; instead AMD introduced Brazos 2.0, a process-improvement-oriented refresh that offered slightly higher clocks in the same TDP range. Uptake of this product was limited, and it was obviously a minor refresh meant to buoy purchases of an aging product. Competition was coming from low power Ivy Bridge based chips, as well as AMD’s new Trinity products which could reach TDPs of 17 watts. Brazos and Brazos 2.0 did find a home in low powered, but full sized, notebooks that were very inexpensive. Even heavily Intel-leaning manufacturers like Toshiba released Brazos-based products in the sub-$500 market. The combination of good CPU performance and above average GPU performance made this a strong product in that particular market. It was power efficient enough that smaller batteries sufficed, further lowering the cost.
All things must pass, and Brazos is no exception. Intel has a slew of 22 nm parts encroaching on sub-15 watt territory, ARM partners have quite a few products that are getting pretty decent in terms of overall performance, and the graphics on all of these parts are seeing significant upgrades. The 40 nm Bobcat products are no longer competitive with what the market has to offer. So at this time we are finally seeing the first Jaguar-based products. Jaguar is not a revolutionary design, but it improves on nearly every aspect of performance and power usage compared to Bobcat.
2013 Elite Mobility APU - Temash
AMD has a lot to say today. At an event up in Toronto this month we got to sit down with AMD’s marketing leadership and key engineers to learn about the company’s plans for 2013 mobility processors. This includes a refreshed high performance APU known as Richland that will replace Trinity as well as two brand new APUs based on Jaguar CPU cores and the GCN architecture for low power platforms.
Josh has put together an article that details the Jaguar + GCN design of Temash and Kabini and I have also posted some initial performance results of the Kabini reference system AMD handed me in May. This article will detail the plans that AMD has for each of these three mobile segments, starting with the newest entry, AMD’s Elite Mobility APU platform – Temash.
The goal of the APU, the combination of traditional x86 processing cores and a discrete-style graphics system, was to offer unparalleled performance in smaller and more efficient form factors. AMD believes that its leadership on the graphics front will give it a good-sized advantage in areas including performance tablets, hybrids, and small screen clamshells that may or may not be touch enabled. The company acknowledges, though, that getting into the smallest tablets (like the Nexus 7) is not on the table quite yet and that content creation desktop replacements are probably outside the scope of Richland.
AMD will have the first x86 quad-core SoC design with Temash and AMD thinks it will make a big splash in a relatively new market known as the “high performance” tablet.
Temash, built around Jaguar CPU cores and GCN graphics technology, will offer fully accelerated video playback with transcode support, as well as features like image stabilization and Perfect Picture. Temash will also be the only SoC in this class to support DX11 graphics, and even though some games might not show off the added effects, DX11 holds quite a few performance advantages over DX10/9. With a claimed GPU performance uplift of more than 100%, you’ll be able to drive displays at 2560x1600 for productivity use and even take advantage of wireless display options.
Subject: Systems | May 21, 2013 - 08:21 PM | Tim Verry
Tagged: Richland, msi, gx70, gx60, gaming notebook, gaming, APU, amd
MSI announced two new gaming notebooks powered by AMD's latest Richland APUs today called the GX70 and GX60. Both gaming notebooks use AMD A10-5750M processors, a discrete AMD graphics card, 8GB of RAM, and a 750GB (7200 RPM) hard drive. Other shared specifications include a Killer E2200 NIC, Blu-ray drive, THX certified speakers, a headphone amp, and a large 9-cell battery.
The GX70 is the larger of the two gaming notebooks at 8.6 pounds, packing a 17.3” 1920x1080 display with an anti-reflective coating along with a SteelSeries gaming keyboard. It pairs the A10-5750M APU with a Radeon HD 8970M discrete mobile GPU to deliver gaming performance at 1080p. The system is also capable of outputting to multiple displays over HDMI and supports AMD's Eyefinity multi-display technology.
Meanwhile, the MSI GX60 is a 15-inch notebook that weighs 7.7 pounds. This gaming notebook uses an AMD A10-5750M APU and a Radeon 7970M mobile discrete GPU. Further, the GX60 has a 15.6” 1080p anti-reflective display and SteelSeries gaming keyboard.
MSI claims that the new AMD Richland APUs will give its gaming notebooks much better battery life. The new GX70 and GX60 will have up to 40% better graphical performance compared to previous generations thanks to the new APUs and discrete cards. According to MSI VP of Sales Andy Tung, “the GX70 and GX60 deliver the ultimate sensory experience for both professional and amateur gamers.” More information on the new gaming notebooks can be found on this MSI press release.
Subject: General Tech | May 13, 2013 - 10:28 AM | Tim Verry
Tagged: x86, SoC, semi-custom chip, Patent, ip, APU, amd
Advanced Micro Devices (AMD) has an extensive intellectual property (IP) portfolio. The company has a range of products from CPUs and graphics cards to video acceleration hardware. It is also the only other major player to have a license to build chips with the x86 ISA. With the launch of its Semi-Custom Business Unit, AMD plans to take advantage of the engineering experience and patent portfolio to create a new revenue stream. AMD will work with other companies to create customized processors that integrate custom IP cores and technology but use AMD's existing products as a base to cut down on engineering time and R&D costs.
The first such customized chip is the System on a Chip used in Sony's PlayStation 4 gaming console. AMD intends to market its modular SoC technology and custom IP integration services to makers of set top boxes, smart TVs, tablets, PCs, networking hardware, and High Performance Computing applications. AMD argues that a customized SoC from its Semi-Custom Business Unit is cheaper and faster to design and produce than a fully custom design, which makes sense since most of the engineering work is already done. AMD could stand to make quite a bit of extra money here, especially if it can land design wins for governmental and industrial design contracts. The scarcity of x86 licenses may actually benefit AMD here as well.
AMD's Semi-Custom Business Unit consists of an engineering team led by AMD Corporate Vice President and General Manager Saeid Moshkelani. I think doing this is a smart move for the x86 underdog, and it will be interesting to see how well the division does for the company's bottom line.
AMD announced its third annual Developer Summit last week. Dubbed “APU13,” the upcoming summit is the AMD equivalent to NVIDIA’s GTC and is an annual event that brings together industry analysts, researchers, programmers, academics, and software/hardware companies pursuing heterogeneous computing technologies.
In previous years, the AMD Developer Summit has been the launchpad for C++ AMP and the HSA Foundation. This year’s Summit will continue that trend toward heterogeneous computing, as well as look back over the year and provide updates on the HSA member companies' progress toward standards-based heterogeneous computing.
In addition to keynote speeches from AMD and some of its partners, expect a great number of presentations and workshops from researchers and programmers working on new programming models and hardware solutions to efficiently use CPU and GPU processors. More information on hUMA is a likely topic, for example. Discussion about upcoming hardware, process nodes, and products may also be on the table insofar as it relates to the HSA theme. Considering the summit is called “APU13,” I also expect AMD to reveal additional details on the company’s Kaveri APU as well as a look into its future product roadmap.
AMD is currently asking for presentation proposals from researchers in a number of HSA and technology-related fields including heterogeneous computing, cloud computing, web technologies, programming languages, gaming and graphics technologies, and software security. The lineup of presenters for the summit is still being worked out, and proposal papers will be accepted until May 10th with the winners being notified over the summer.
In all, AMD’s APU13 should be an exciting and intellectual event. Last year’s AMD Fusion Developer Summit (AFDS) was an interesting and fun event to cover, and I hope that APU13 will keep up the same momentum and interest in heterogeneous computing that AFDS started.
Subject: General Tech | April 30, 2013 - 01:23 PM | Jeremy Hellstrom
Tagged: Steamroller, piledriver, Kaveri, Kabini, hUMA, hsa, GCN, bulldozer, APU, amd
AMD may have united GPU and CPU into the APU, but one hurdle had remained until now: the non-uniformity of memory access between the two processors. Today we learned about one of the first successful HSA projects, called heterogeneous Uniform Memory Access, aka hUMA, which will appear in the upcoming Kaveri chip family. This new technology will allow the on-die CPU and GPU to access the same memory pool, both physical and virtual, and any data passed between the two processors will remain coherent. As The Tech Report mentions in their overview, hUMA will not provide as much of a benefit to discrete GPUs; while they will be able to share address space, the widely differing clock speeds of GDDR5 and DDR3 prevent unification to the level of an APU.
Make sure to read Josh's take as well so you can keep up with him on the Podcast.
"At the Fusion Developer Summit last June, AMD CTO Mark Papermaster teased Kaveri, AMD's next-generation APU due later this year. Among other things, Papermaster revealed that Kaveri will be based on the Steamroller architecture and that it will be the first AMD APU with fully shared memory.
Last week, AMD shed some more light on Kaveri's uniform memory architecture, which now has a snazzy marketing name: heterogeneous uniform memory access, or hUMA for short."
Here is some more Tech News from around the web:
- AMD’s new heterogeneous Uniform Memory Access
- hUMA; AMD’s Heterogeneous Unified Memory Architecture @ Hardware Canucks
- Compro TN50W Cloud Network Camera @ Tweaktown
- Wifi Pineapple project uses updated hardware for man-in-the-middle attacks @ Hack a Day
- New OpenWRT Drops Support For Linux 2.4, Low-Mem Devices @ Slashdot
- HP mashes up ProLiant, Integrity, BladeSystem, and Moonshot server @ The Register
- Acer selling tablet using Intel Y series processor @ The Register
- CERN Celebrates 20 Years of an Open Web (and Rebuilds 1st Web Page) @ Slashdot
- BitFenix 5K YouTube Subscriber Giveaway @ eTeknix
heterogeneous Uniform Memory Access
Several years back we first heard of AMD’s plans to create a uniform memory architecture that would allow the CPU to share address space with the GPU. The promise is a very efficient architecture that provides excellent performance in a mixed environment of serial and parallel programming loads. When GPU computing came on the scene it was full of great promise. The idea of a heavily parallel processing unit that could accelerate both integer and floating point workloads was a potential gold mine for a wide variety of applications. Alas, the results so far have not lived up to that promise. There are many problems with combining serial and parallel workloads across CPUs and GPUs, and much of it comes down to very basic programming issues and the communication of data between two separate memory pools.
CPUs and GPUs do not share a common memory pool. Instead of simply passing a pointer to tell each unit where data lives in memory, the current implementation of GPU computing requires the CPU to copy the contents at that address into the GPU's standalone memory pool. This is time consuming and wastes cycles, and it increases programming complexity. Typically only very advanced programmers with a lot of expertise in the subject could write effective code that works around these limitations. The lack of unified memory between CPU and GPU has hindered adoption of the technology in many applications that could otherwise use the massively parallel processing capabilities of a GPU.
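The difference between the two models can be sketched in miniature. This is a toy Python illustration, not actual GPU code; the function names are hypothetical, and real implementations go through APIs like OpenCL or DirectCompute:

```python
def discrete_dispatch(host_data):
    """Pre-hUMA model: the CPU must copy data into the GPU's separate
    memory pool before a kernel can touch it, then copy results back."""
    gpu_pool = list(host_data)          # explicit copy over the bus: wasted cycles
    result = [x * 2 for x in gpu_pool]  # the 'kernel' runs on the GPU-side copy
    return list(result)                 # copy the result back to host memory

def huma_dispatch(shared_buffer):
    """hUMA model: CPU and GPU address one coherent pool, so only a
    pointer (here, a Python reference) changes hands and no copies are made."""
    shared_buffer[:] = [x * 2 for x in shared_buffer]  # operate in place
    return shared_buffer

data = [1, 2, 3]
print(discrete_dispatch(data))          # [2, 4, 6], via two copies
buf = [1, 2, 3]
print(huma_dispatch(buf) is buf)        # True: same object, zero copies
```

The point of the sketch is only the data movement: in the first function the payload crosses a boundary twice, while in the second both "processors" simply dereference the same address.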
The idea for GPU compute has been around for a long time (comparatively). I still remember getting very excited about the idea of using a high end video card along with a card like the old GeForce 6600 GT to be a coprocessor which would handle heavy math operations and PhysX. That particular plan never quite came to fruition, but the idea was planted years before the actual introduction of modern DX9/10/11 hardware. It seems as if this step with hUMA could actually provide a great amount of impetus to implement a wide range of applications which can actively utilize the GPU portion of an APU.
Jaguar Hits the Embedded Space
It has long been known that AMD has simply not had a lot of luck going head to head against Intel in the processor market. Some years back they worked on differentiating themselves, and in so doing have been able to stay afloat through hard times. The acquisitions that AMD has made in the past decade are starting to make a difference in the company, especially now that the PC market that they have relied upon for revenue and growth opportunities is suddenly contracting. This of course puts a cramp in AMD’s style, but with better than expected results in their previous quarter, things are not nearly as dim as some would expect.
Q1 was still pretty harsh for AMD, but the company maintained its marketshare in both processors and graphics chips. One area that looks to get a boost is embedded processors. AMD has offered embedded processors for some time, but with the way the market is heading it looks to really ramp up its offerings across a variety of applications and SKUs. The last generation of G-series processors was based upon the Bobcat/Brazos platform. This two-chip design (APU plus controller hub) came in a variety of wattages with good performance from both the CPU and GPU portions. While the setup looked pretty good on paper, it was not widely implemented because of the added complexity of a two-chip design plus thermal concerns vs. performance.
AMD looks to address these problems with one of its first true SoC designs. The latest G-series SoCs are based upon the brand new Jaguar core. Jaguar is the successor to the successful Bobcat core, a low power design that paired dual CPU cores with integrated DX11/VLIW5-based graphics. Jaguar improves CPU performance over Bobcat by 6% to 13% at identical clocks, and because it is manufactured on a smaller process node it does so while using less power. Jaguar comes in both dual core and quad core configurations, and the graphics portion is based on the latest GCN architecture.
AMD has announced that it will be hosting an event for fans in San Francisco this weekend. The AMD Fan Day is free with registration (register here), and the event will give enthusiasts a chance to go hands-on with the company's 2013 hardware lineup, play several newly released (and some not-yet-released) games, talk with industry experts, check out modded PCs, and have a chance to win free hardware and swag from AMD, Corsair, and Gigabyte.
Gamers will get a chance to speak with the developers for Bioshock Infinite, Far Cry 3, Crysis 3, Devil May Cry (DMC), and Tomb Raider as well as AMD representatives. VIZIO, IGN, Ubisoft, Sapphire, and Logitech will also be attending the AMD fan day to show off their latest products.
The event will be held at City View at Metreon (address below) at 5:30pm on Saturday, April 6th. Best of all, the first 1,000 registered attendees in the door will get a free AMD A8-5600K APU. The first 120 attendees will win both an A8-5600K APU and an A85X motherboard.
One of the modded PCs that will be on the event floor.
If you're going to be in the area this weekend and are interested in going, be sure to head over to the AMD site and register. It sounds like it should be a fun time, and the free hardware doesn't hurt!
The AMD Fan Day will be held at the following address:
City View at Metreon
135 4th Street
San Francisco, CA 94103
Will you be checking out the AMD fan day to enjoy some gaming and PC hardware?
Subject: General Tech | March 31, 2013 - 02:21 AM | Tim Verry
Tagged: sony, ps4, playstation eye, playstation 4, gaming, dualshock 4, APU, amd
Sony teased a few more details about its upcoming PlayStation 4 console at the Game Developers Conference earlier this week. While the basic specifications have not changed since the original announcement, we now know more about the x86 console hardware.
The PS4 itself is powered by an AMD Jaguar CPU with eight physical cores and eight threads. Each core gets 32 KB of L1 instruction cache and 32 KB of L1 data cache. Further, each group of four physical cores shares 2 MB of L2 cache, for 4 MB of L2 total. The processor is capable of out-of-order execution, as are AMD's other processor offerings. The console also reportedly features 8 GB of GDDR5 memory that is shared by the CPU and GPU. It offers 176 GB/s of bandwidth, a step above the PS3, which did not use a unified memory design. The system will also sport a much faster GPU rated at 1.843 TFLOPS and clocked at 800 MHz. The PS4 will have a high-capacity hard drive and a new Blu-ray drive that is up to three times faster. Interestingly, the console also has a co-processor that handles the video streaming features and allows Remote Play game streaming to the PlayStation Vita at its native resolution of 960x544.
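The quoted bandwidth and compute figures can be sanity-checked with a little arithmetic. The underlying configuration used here, a 256-bit GDDR5 interface at 5.5 Gbps per pin and 1,152 GCN shaders (18 compute units of 64), is the widely reported one rather than something stated above, so treat these inputs as assumptions:

```python
# Memory bandwidth: bus width (bytes) x per-pin data rate (GT/s)
bus_width_bits = 256        # assumed GDDR5 interface width
data_rate_gtps = 5.5        # assumed per-pin transfer rate, billions of transfers/s
bandwidth_gbs = (bus_width_bits / 8) * data_rate_gtps
print(bandwidth_gbs)        # 176.0 GB/s, matching the quoted figure

# Compute throughput: shaders x 2 FLOPs per clock (fused multiply-add) x clock
shaders = 1152              # assumed GCN shader count (18 CUs x 64)
clock_ghz = 0.8             # 800 MHz, as quoted
tflops = shaders * 2 * clock_ghz / 1000
print(round(tflops, 3))     # 1.843 TFLOPS, matching the quoted rating
```

Both numbers fall out exactly, which is a good sign the quoted specs are internally consistent.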
The PlayStation Eye has also been upgraded for the PS4 to include two cameras, four microphones, and a 3-axis accelerometer. The Eye cameras have an 85-degree field of view and can record video at 1280x800 and 60 Hz with 12 bits per pixel, or at 640x480 and 120 Hz. The new PS4 Eye is a noteworthy upgrade over the current generation model, which is limited to either 640x480 at 60 Hz or 320x240 at 120 Hz. The extra resolution should allow developers to track objects more accurately. The DualShock 4 controllers sport a light bar that can be tracked by the new Eye camera, for example. The light bar uses an RGB LED that changes to blue, red, pink, or green for players 1 through 4, respectively.
Speaking of the new DualShock 4, Sony has reportedly ditched the analog face buttons and D-pad in favor of digital buttons. With the DualShock 3 and the PS3, the analog face buttons and D-pad came in handy for racing games, but otherwise they are not likely to be missed. The controllers will now charge even when the console is in standby mode, and the L2 and R2 triggers are more resistant to accidental presses. The analog sticks have been slightly modified and feature a reduced dead zone. The touchpad, a completely new feature for the DualShock lineup, can track two points at a resolution of 1920x900, which is pretty good.
While Sony has still not revealed what the actual PS4 console will look like, most of the internals are now officially known. It will be interesting to see just where Sony prices the new console, and where game developers are able to take it. With a DX11.1+ feature set, developers can use many of the same tools used to program PC titles, but they also get additional debugging tools and low level access to the hardware. A new low level API, sitting below DirectX but above the driver level, gives developers deeper access to the shader pipeline. I'm curious to see how PC ports will turn out; with the consoles now running x86 hardware, I'm hoping that the usual crop of bugs common to console-to-PC ports will decrease. A gamer can dream, right?