Heterogeneous Uniform Memory Access
Several years back we first heard of AMD’s plans to create a uniform memory architecture that would allow the CPU to share address spaces with the GPU. The promise here is a very efficient architecture that provides excellent performance in a mixed environment of serial and parallel programming loads. When GPU computing came on the scene it was full of great promise. The idea of a heavily parallel processing unit that accelerates both integer and floating point workloads could be a potential gold mine in a wide variety of applications. Alas, the results we have seen so far have not met expectations. There are many problems with combining serial and parallel workloads between CPUs and GPUs, and a lot of this has to do with very basic programming and the communication of data between two separate memory pools.
CPUs and GPUs do not share common memory pools. Instead of using pointers in programming to tell each individual unit where data is stored in memory, the current implementation of GPU computing requires the CPU to write the contents of that address to the standalone memory pool of the GPU. This is time consuming and wastes cycles. It also increases programming complexity to be able to adjust to such situations. Typically only very advanced programmers with a lot of expertise in this subject could program effective operations to take these limitations into consideration. The lack of unified memory between CPU and GPU has hindered the adoption of the technology for a lot of applications which could potentially use the massively parallel processing capabilities of a GPU.
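As a rough analogy for the two models described above, the sketch below contrasts copying a buffer into a separate pool with handing a worker a reference to the same memory. It runs entirely on the CPU and uses no GPU API; the function names `run_with_copy` and `run_shared` are made up for illustration and do not come from any AMD toolkit.

```python
# Analogy only: models "separate pools" vs "shared address space" on the CPU.
# Function names are hypothetical; no GPU API is involved.

def run_with_copy(host_data: bytearray) -> bytearray:
    """Separate-pool model: data is copied in, processed, and copied back."""
    device_pool = bytearray(host_data)        # copy host -> "device" pool
    for i in range(len(device_pool)):
        device_pool[i] = (device_pool[i] * 2) % 256
    return bytearray(device_pool)             # copy "device" -> host


def run_shared(host_data: bytearray) -> None:
    """Shared-memory model: the worker gets a view of the same bytes."""
    view = memoryview(host_data)              # no copy, same underlying memory
    for i in range(len(view)):
        view[i] = (view[i] * 2) % 256


data = bytearray([1, 2, 3, 4])
copied = run_with_copy(data)   # data itself is untouched; result is a new buffer
run_shared(data)               # data is modified in place through the view
```

The first path pays for two full copies on every call, which is the overhead the paragraph above complains about; the second passes only a reference, which is the behavior a shared address space is meant to allow.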
The idea for GPU compute has been around for a long time (comparatively). I still remember getting very excited about the idea of using a high end video card along with a card like the old GeForce 6600 GT to be a coprocessor which would handle heavy math operations and PhysX. That particular plan never quite came to fruition, but the idea was planted years before the actual introduction of modern DX9/10/11 hardware. It seems as if this step with hUMA could actually provide a great amount of impetus to implement a wide range of applications which can actively utilize the GPU portion of an APU.
AM3+ Last Gasp?
Over the past several years I have reviewed quite a few Asus products. The ones that typically grab my attention are the ROG based units. These are usually the most interesting, over the top, and expensive products in their respective fields. Ryan has reviewed the ROG graphics cards, and they have rarely disappointed. I have typically taken a look at the Crosshair series of boards that support AMD CPUs.
Crosshair usually entails the “best of the best” when it comes to features and power delivery. My first brush with these boards was the Crosshair IV. That particular model was only recently taken out of my primary work machine. It proved itself to be an able performer and lasted for years (even overclocked). The Crosshair IV Extreme featured the Lucid Hydra chip to allow multi-GPU performance without going to pure SLI or Crossfire. The Crosshair V got rid of Lucid, added official SLI support, and incorporated the Supreme FX II X-Fi audio. All of these boards have some things in common. They are fast, they overclock well, and they are among the most expensive motherboards ever for the AMD platform.
So what is there left to add? The Crosshair V is a very able platform for Bulldozer and Piledriver based parts. AMD is not updating the AM3+ chipsets, so we are left with the same 990FX northbridge and the SB950 southie (both of which are essentially the same as the 890FX/SB850). It should be a simple refresh, right? We had Piledriver released a few months ago and there should be some power and BIOS tweaks that can be implemented and then have a rebranded board. Sounds logical, right? Well, thankfully for us, Asus did not follow that path.
The Asus Crosshair V Formula Z is a fairly radical redesign of the previous generation of products. The amount of extra features, design changes, and power characteristics make it a far different creature than the original Crosshair V. While both share many of the same style features, under the skin this is a very different motherboard. I am rather curious why Asus did not brand this as the “Crosshair VI”. Let’s explore, shall we?
Subject: Processors | March 12, 2013 - 02:52 PM | Jeremy Hellstrom
Tagged: VLIW4, trinity, Richland, piledriver, notebook, mobile, hd 8000, APU, amd, A10-5750
The differences between Richland and Trinity are not earth shattering but there are certainly some refinements implemented by AMD in the A10-5750. One very noticeable one is support for DDR3-1866, along with better power management for both the CPU and GPU; new temperature balancing algorithms and measurements improve the ability to balance the load properly compared with Trinity. Many AMD users will be more interested in the GPU portion of the die than the CPU, as that is where AMD actually has a lead on Intel. This particular chip contains the HD8650G, with clocks of 720MHz boost and 533MHz base, an increase over the previous generation of 35MHz and 37MHz respectively. You can read more about the other three models that will be released over at The Tech Report.
"AMD has formally introduced the first members of its Richland APU family. We have the goods on the chips and Richland's new power management tech, which combines temperature-based inputs with bottleneck-aware clock boosting."
Here are some more Processor articles from around the web:
- AMD Richland APU Preview: Trinity Gets a Facelift @ Hardware Canucks
- 2013 AMD Mobile APU (Richland) @ Bjorn3D
- Westmere-EP to Sandy Bridge-EP: The Scientist Potential Upgrade @ AnandTech
- AMD Phenom II X4 955, Phenom II X4 960T, Phenom II X6 1075T and Intel Pentium G2120, Core i3-3220, Core i5-3330 @ ixbt.com
- AMD FX-8350 @ iXBT Labs
- The new Opteron 6300: Finally Tested! @ AnandTech
- Intel Core i5-3570K vs. i7-3770K Ivy Bridge @ techPowerUp
AMD Exposes Richland
When we first heard about “Richland” last year, there was a little bit of excitement from people. Not many were sure what to expect other than a faster “Trinity” based CPU with a couple extra goodies. Today we finally get to see what Richland is. While interesting, it is not necessarily exciting. While an improvement, it will not take AMD over the top in the mobile market. What it actually brings to the table is better competition and a software suite that could help to convince buyers to choose AMD instead of a competing Intel part.
From a design standpoint, it is nearly identical to the previous Trinity. That being said, a modern processor is not exactly simple. A lot of software optimizations can be applied to these products to increase performance and efficiency. It seems that AMD has done exactly that. We had heard rumors that the graphics portion was in fact changed, but it looks like it has stayed the same. Process improvements have been made, but that is about the extent of actual hardware changes to the design.
The new Richland APUs are branded the A-5000 series of products. The top end is the A10-5750M with HD-8650 integrated graphics. This is still the VLIW-4 based graphics unit seen in the previous Trinity products, but enough changes have been made with software that it can enable Dual Graphics with the new Solar System based GPUs (GCN). The speeds of these products have received a nice boost. Compared to the previous top end A10-4600, the 5750 takes the base speed from 2.3 GHz to 2.5 GHz. Boost goes from 3.2 GHz up to 3.5 GHz. The graphics portion takes the base clock from 496 MHz up to 533 MHz, while turbo mode improves over the 4600 from 685 MHz to 720 MHz. These are not staggering figures, but it all still fits within the 35 watt TDP of the previous product.
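To put those clock bumps in perspective, here is a quick sketch that turns the figures quoted above into relative gains. The numbers come straight from the paragraph; the helper name `uplift_percent` is just an illustrative choice.

```python
# Clock figures quoted above: (previous A10-4600, new A10-5750M).
clocks = {
    "CPU base (GHz)":  (2.3, 2.5),
    "CPU boost (GHz)": (3.2, 3.5),
    "GPU base (MHz)":  (496, 533),
    "GPU turbo (MHz)": (685, 720),
}


def uplift_percent(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


uplifts = {name: uplift_percent(old, new) for name, (old, new) in clocks.items()}
for name, pct in uplifts.items():
    print(f"{name}: +{pct:.1f}%")
```

Every entry works out to a single-digit percentage gain, which fits the "not staggering" description.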
One other important improvement is the ability to utilize DDR3-1866 memory. Throughout the past year we have seen memory densities increase fairly dramatically without impacting power consumption. This goes for speed as well. While we would expect to see lower power DIMMs used in the thin and light categories, expect to see faster DDR3-1866 in the larger notebooks that will soon be heading our way.
Podcast #226 - Dual GTX 690 System from Origin, Intel's new SATA6 controller, Piledriver-based Opterons and more!
Subject: General Tech | November 8, 2012 - 01:33 PM | Ken Addison
Tagged: ssd, sata6, podcast, piledriver, pcper, origin, opeteron, nvidia, Intel, genesis, corsair, amd, 690
PC Perspective Podcast #226 - 11/08/2012
Join us this week as we talk about a Dual GTX 690 System from Origin, Intel's new SATA6 controller, Piledriver-based Opterons and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, and Allyn Malventano
Program length: 1:21:17
Podcast topics of discussion:
- Join us for the MoH Game Stream!
- Week in Reviews:
- 0:19:30 This podcast is brought to you by MSI
News items of interest:
- 0:20:25 Intel Crystal Forest Communications Platform
- 0:23:30 Google Nexus 10 tablet
- 0:27:00 Corsair Hydro H100i and H80i coolers
- 0:34:00 New Corsair AXi series power supplies
- 0:36:30 Intel DC S3700 Enterprise SSD
- 0:46:30 AMD Launches Piledriver based Opteron 6300 chips
- 0:51:10 Get Assassin's Creed III for Samsung SSD
- 0:52:45 Limited Linux Steam Beta starts
- 0:56:15 Zotac AD06 with new AMD APU
- 0:58:30 Mouse.. DRM!?
- 1-888-38-PCPER or firstname.lastname@example.org
- http://twitter.com/ryanshrout and http://twitter.com/pcper
Subject: General Tech, Processors | November 6, 2012 - 01:30 PM | Jeremy Hellstrom
Tagged: piledriver, opteron 6300, amd, Abu Dhabi
Low power, high density server designs are very important but it is nice to see updates on the more powerful server processors as well, something quite rare so far in 2012. AMD has finally released their Opteron 6300 family, with ten members bearing between 4 and 16 cores and all running at over 3GHz. We don't have any reviews to offer, so the only performance benchmarks are from AMD's press releases, but you can expect more change than just an increase in frequency as this is a Piledriver based chip. The Register has put together a high level overview of the new Opterons, or you can head on over to AMD to check out the information on offer there. Cray is already shipping servers based on these chips, with Dell and HP releasing a variety of servers in the near future.
"Customers using big ol' fat x86 servers didn't have much to jump for joy about this year. There just isn't a lot going on. But to make things interesting, AMD is now goosing the performance of its top-end parts with the launch of its "Abu Dhabi" Opteron 6300s, which sport the "Piledriver" cores that already debuted in the FX Series of high-end desktop chips."
Here is some more Tech News from around the web:
- Canon PowerShot G15 Review @ TechReviewSource
- Logitech TV Cam HD review: couch Skyping @ Hardware.Info
- The Thomson / Technicolor TG784n Port Forwarding Guide @ TechARP
- ARM and Imagination take over MIPS for $350m @ The Inquirer
- Microsoft integrates Kinect Fusion project into SDK @ The Inquirer
- Microsoft to replace Windows Live Messenger with Skype @ The Inquirer
- Windows 8 Review – Part Two: The Things I Love @ Techgage
- Rosewill RCM-3640HD 3.0 MegaPixel Webcam Review @ Hi Tech Legion
- Get Ready For The Holidays @ Bjorn3D/Kingston
Subject: Processors | November 6, 2012 - 01:15 PM | Tim Verry
Tagged: server, piledriver, opteron, datacenter, cpu, amd
AMD announced new server processors on Monday based on the same Piledriver architecture used in the Trinity APUs and Vishera desktop CPUs we recently reviewed. With the release of the Opteron 6300 series, AMD is bringing Piledriver to the server room.
The new chips, similar to their desktop counterparts, bring several performance improvements over the previous generation 6200 series Opterons based on the Bulldozer architecture. AMD is positioning the chips as an upgrade path for existing servers and on the merits of performance-per-dollar efficiency. As is AMD's fashion, the new chips are competitively priced and "good enough" performance-wise. With the 6300 series, AMD has stated the goal is to reduce the TCO, or Total Cost of Ownership, for servers used in data centers, supercomputers, and enterprises by being compatible with existing AMD server platforms with a BIOS upgrade and by offering efficiency improvements over previous chips.
The Opteron 6300 series CPUs themselves build upon the Vishera desktop parts by adding more cores and more L3 cache. The server parts will have up to 16 cores clocked at 2.8GHz base and 3.2GHz turbo. They will have TDP ratings between 85W and 140W and will feature prices from $500 to $1,400. On the cache front, the chips have a 16KB L1 data cache per core, 64KB L1 instruction cache per module, 1MB L2 cache per core, and a shared 16MB cache per socket. AMD has included a quad channel memory controller that supports DDR3 up to 1866 MHz and 1.5TB per server in 4P configurations. AMD has rounded out the chips with four x16 HyperTransport 3.0 links rated at 6.4 GT/s per link. Up to 4 processors per server will be supported, which means a maximum of 64 cores.
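The per-core and per-module figures above add up quickly, so here is a small sketch that totals them for a 16-core part and a fully populated 4P server. It assumes two cores per Piledriver module and 1TB = 1024GB; neither is stated in the paragraph, and the variable names are illustrative.

```python
# Figures quoted above for a top-end Opteron 6300 part.
CORES_PER_SOCKET = 16
SOCKETS_PER_SERVER = 4
L1D_KB_PER_CORE = 16
L1I_KB_PER_MODULE = 64
L2_MB_PER_CORE = 1
SHARED_CACHE_MB_PER_SOCKET = 16
MAX_MEMORY_TB_PER_SERVER = 1.5

CORES_PER_MODULE = 2   # assumption: two cores share one Piledriver module
GB_PER_TB = 1024       # assumption: binary terabytes

modules = CORES_PER_SOCKET // CORES_PER_MODULE
l1_kb = CORES_PER_SOCKET * L1D_KB_PER_CORE + modules * L1I_KB_PER_MODULE
l2_mb = CORES_PER_SOCKET * L2_MB_PER_CORE
total_cache_mb = l1_kb / 1024 + l2_mb + SHARED_CACHE_MB_PER_SOCKET

max_cores = CORES_PER_SOCKET * SOCKETS_PER_SERVER
mem_gb_per_socket = MAX_MEMORY_TB_PER_SERVER * GB_PER_TB / SOCKETS_PER_SERVER

print(f"Cache per socket: {total_cache_mb} MB")
print(f"Cores per 4P server: {max_cores}")
print(f"Max memory per socket: {mem_gb_per_socket:.0f} GB")
```

The core count matches the 64-core maximum quoted above; the cache and memory totals are derived values and depend on the two labeled assumptions.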
With Piledriver, AMD added a number of new instructions including FMA3, BMI, and F16C. The company has also implemented several tweaks to the Bulldozer design to improve branch prediction, instructions per clock, and scheduling, and has reduced the power draw at higher clockspeeds, allowing the chips to clock higher while staying within the same power envelope as the Bulldozer-based Opteron 6200 series.
AMD is using the same socket as the 6200 series processors, and the new chips can be deployed as an upgrade to the old servers without needing a new motherboard.
When pitting the new Opteron 6380 against the previous-generation 6278, AMD is claiming a number of performance increases, including a 24-percent and 40-percent improvement in SPECjbb2005 and SPECpower_ssj2008 respectively.
Further, the company is claiming competitive performance in server workloads with the Intel competition. AMD offers up benchmarks showing the Opteron 6380 and Xeon E5-2690 trading wins, with the AMD part being slower in the STREAM benchmark, but being slightly faster in LAMMPS and NAMD. The allure of the Opteron, according to AMD, is that the AMD part is almost half the price of the Intel processor, and the company is hoping the lower priced parts will encourage adoption. AMD argues that the money saved could easily go towards more RAM or more storage (or simply be saved of course).
The company has announced that its first major design win is the Big Red II supercomputer at Indiana University. Built by Cray, the Big Red II will feature 21,000+ Opteron 6300-series CPU cores paired with NVIDIA GPUs. It represents a massive increase in computing power over IU’s previous Big Red supercomputer with 4,100 CPU cores, and will be used for medical, physics, chemistry, and climate research. Beyond that, AMD has stated that more than 30 hardware vendors are slated to introduce servers based on the new Piledriver-based Opteron processors, including HP, Dell, Cray, SGI, Supermicro, Sugon, and (of course) SeaMicro. On the software side of things, AMD is working with Microsoft, VMware, Xen, Red Hat, and OpenStack. The company also stated that it is leaning on the experience and knowledge gained from the HSA Foundation to improve software support and guide the future direction of Opteron development.
The Opteron 6300 series is an interesting release that brings several improvements to the company’s server chip offerings. At launch, there are 10 processors to choose from, ranging from the quad core 6308 clocked at 3.5GHz for $501 to the top-end 6386 SE with 16 cores (2.8GHz base, 3.5GHz max turbo) and a $1,392 price tag. The 6366HE is an interesting part as well. It is the same price as the 12-core, 115W TDP Opteron 6348, but it has 16 lower-clocked cores and an 85W TDP. With the non-HE edition processors with 16 cores starting at $703, the 6366HE for $575 is a decent deal if you need multi-threading more than a smaller number of higher clocked cores.
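The "decent deal" argument is easy to check with a price-per-core comparison. The sketch below uses only the launch prices and core counts mentioned above (the 6348 price follows from it being "the same price" as the 6366HE); it says nothing about clock speed or per-core performance, so treat it as a rough value indicator only.

```python
# Launch prices (USD) and core counts quoted above.
parts = {
    "6308":    {"cores": 4,  "price": 501},
    "6348":    {"cores": 12, "price": 575},
    "6366 HE": {"cores": 16, "price": 575},
    "6386 SE": {"cores": 16, "price": 1392},
}

# Dollars per core for each model.
price_per_core = {
    model: spec["price"] / spec["cores"] for model, spec in parts.items()
}

# Model with the lowest cost per core among the four listed.
cheapest = min(price_per_core, key=price_per_core.get)

for model, dollars in sorted(price_per_core.items(), key=lambda kv: kv[1]):
    print(f"Opteron {model}: ${dollars:.2f} per core")
```

Among these four parts the 6366 HE comes out lowest per core, which lines up with the point about multi-threaded value.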
Another bit that I found intriguing is that in a few years, AMD will (likely, if all goes according to plan) be offering processors for just about every type of server. They will have low cost, low power ARM Cortex-A57 based chips, Accelerated Processing Units (APUs) well suited to mixed workloads including GPU-accelerated tasks, and CPU-only chips with lots of traditional x86-64 cores. It seems that Intel will continue to hold the high end on pure performance, but AMD and its SeaMicro server division have not given up competing in the server room by a long shot.
See also: The Piledriver architecture and Vishera desktop CPU review and The future of AMD: Vishera and Beyond at PC Perspective.
Subject: Processors | October 23, 2012 - 02:44 PM | Jeremy Hellstrom
Tagged: vishera, Steamroller, piledriver, FX-8350, fx-8150, FX-6300, FX-6200, bulldozer, amd
The FX-8350 Vishera processor from AMD has finally arrived with 8 fully unlocked cores of polished Piledriver processing power. With Piledriver there are no huge changes to the existing Bulldozer architecture; this is more a matter of polishing and optimizing what was already there, and [H]ard|OCP's testing bears that out. While faster than the previous generation FX-8150, it still lags behind Intel's Ivy Bridge processors, which is disappointing but certainly expected. The unlocked cores do lend themselves somewhat to overclocking, with [H] hitting a stable 4.6GHz with all cores enabled, a 10% jump in frequency. At that speed it does better when competing with Intel's offerings, until you overclock them as well, at which point the comparative performance suffers somewhat.
Make sure to catch Josh's review, covering both the 8 core FX-8350 and the $132 FX-6300, which has a disabled module, bringing back memories of older AMD chips whose modules could be brought back to life.
"AMD's new Piledriver core technology should not be a surprise to any enthusiast as much of its "embargoed" information has already been exposed on the Net. Today we take the AMD FX series model 8350 desktop variant, code named Vishera, and look at it in an enthusiast way as we expose its IPC at 4GHz, and a bit of overclocking."
Here are some more Processor articles from around the web:
- AMD's FX-8350 processor @ The Tech Report
- AMD FX-8350 "Vishera" Linux Benchmarks @ Phoronix
- AMD FX-8350 8-Core Black Edition Processor Review @ Legit Reviews
- AMD Vishera FX-8350 Review @ OCC
- The Vishera Review: AMD FX-8350, FX-8320, FX-6300 and FX-4300 Tested @ AnandTech
- AMD FX-8350: Piledriver @ Bjorn3D
- AMD FX-8350 @ Overclockers.com
- AMD FX-8350 vs Intel Core i7-3770K @ 4.8GHz - Multi-GPU Gaming Performance @ VR-Zone
- FX-8350 vs. Core i5-3470 CPU Review @ Hardware Secrets
- AMD FX-8350 (AM3+) Piledriver Processor Review @ eTeknix
- AMD FX-8350 Unlocked "Vishera" Octal Core CPU Review @ Hi Tech Legion
- AMD FX-8350 Vishera Desktop Processor @ Benchmark Reviews
- AMD FX-8350 and FX-6300 @ Legion Hardware
- AMD Piledriver FX Review - FX 8350, 8320, 6300 vs Intel Core i5 and i3 @ hardCOREware
- AMD FX-8350 Processor Review @ HardwareHeaven
- AMD FX-8350 and FX-6300 Piledriver @ TechSpot
- FX-8350 CPU Review; AMD's Vishera Arrives @ Hardware Canucks
- AMD FX8350 BE / Gigabyte HD7970 / ASUS Sabretooth 990FX R2 @ Kitguru
- AMD FX 8350 @ Guru of 3D
- AMD FX-8350 - "Piledriver" for AMD Socket AM3+ @ techPowerUp
Bulldozer to Vishera
Bulldozer is the word. Ok, perhaps it is not “the” word, but it is “a” word. When AMD let that little codename slip some years back, AMD enthusiasts and tech journalists started to salivate about the possibilities. Here was a unique and very new architecture that promised excellent single thread performance and outstanding multi-threaded performance all in a package that was easy to swallow and digest. Probiotics for the PC. Some could argue that the end product for Bulldozer and probiotics are the same, but I am not overly fond of writing articles containing four letter colorful metaphors.
The long and short of Bulldozer is that it was a product that was pushed out too fast, it had specifications that were too aggressive for the time, and it never delivered on the promise of the architecture. Logically there are some very good reasons behind the architecture, but implementing these ideas into a successful product is another story altogether. The chip was never able to reach the GHz range it was supposed to and stay within reasonable TDP limits. To get the chip out in a timely manner, timings had to be loosened internally so the chip could even run. Performance per clock was pretty dismal, and the top end FX-8150 was only marginally faster than the previous top end Phenom II X6 1100T. In some cases, the X6 was still faster and a more competent “all around” processor.
There really was not a whole lot for AMD to do about the situation. It had to have a new product, and it just did not turn out as nicely as they had hoped. The reasons for this are legion, but simply put AMD is competing with a company that is over ten times the size, with the resulting R&D budgets that such a size (and margins) can afford. Engineers looking for work are a dime a dozen, and Intel can hire as many as they need. So, instead of respinning Bulldozer ad nauseam and releasing new speed grades throughout the year by tweaking the process and metal layer design, AMD let the product line sit and stagnate at the top end for a year (though they did release higher TDP models based on the dual module FX-4000 and triple module FX-6000 series). Engineers were pushed into more forward looking projects. One of these is Vishera.
Subject: Processors | October 2, 2012 - 04:56 PM | Jeremy Hellstrom
Tagged: vishera, trinity, Steamroller, piledriver, bulldozer, amd, a8, a6, A4, a10, 5800K, 5600K
The NDA is over and we can finally tell you all about the new generation of Trinity, especially the compute portion which we were not allowed to discuss in the controversial preview. Part of the good news is the price: Legit Reviews found the highest MSRP is $122 for the A10-5800K, and it is currently available, though at $130. The performance increase from the previous generation is decent for multicore applications though not so much for single threaded applications; overall you can expect general computing performance in line with Core i3 but not Core i5. Gaming on the other hand did show much improvement, especially when you compare the built in HD7660D to Intel's current HD4000 and HD2500. You can catch Josh's review right here.
"The internal testing from AMD that we can see above shows a 37% increase in the 3DMark 11 score between the first generation A-Series Llano and this generation of A-Series Trinity. While our numbers don't match their numbers exactly, our Llano system scored 1115 3Dmarks while the AMD internal testing showed 1150 3DMarks. Our AMD A10-5800K scored 1521 3DMarks while they scored 1570. The overall difference was remarkably similar, AMD is boasting an increase of 37% and we saw a difference of 36.4%..."
Here are some more Processor articles from around the web:
- AMD’s Trinity Faces Off With Intel’s Ivy Bridge @ SemiAccurate
- AMD “Virgo” Platform: 2nd Generation APU @ Bjorn3D
- AMD A10-5800K APU Performance Review @ HardwareHeaven
- AMD A10-5800K and A8-5600K APUs for Socket FM2 @ techPowerUp
- AMD A10-5800K Trinity APU Review @ TechwareLabs
- Asus F2A85-V Pro & AMD A10 5800K (w/ HD7660D) @ Kitguru
- AMD A10-5800K & A8-5600K Review: Trinity on the Desktop, Part 2 @ AnandTech
- AMD A10 5800K APU processor review and MSI FM-2 A85XA-G65 @ Guru of 3D
- AMD A10-5800K Unlocked "Trinity" Quad Core APU Review @ Hi Tech Legion
- AMD A8-3850 CPU review @ Rbmods
- Gigabyte F2A85X-UP4 & AMD A10 5800K @ Kitguru
- AMD A10-5800K / A8-5600K full review: Trinity for desktops @ Hardware.info
- AMD Trinity for Desktops. Part 1: Graphics Core @ X-bit Labs
- Workstation & Server CPU Comparison Guide @ TechARP
- All Core i3 Models @ Hardware Secrets
- Intel Core i3 3225 and 3220 review: entry-level Ivy Bridge @ Hardware.info