HSA 1.1 Released

Subject: Graphics Cards, Processors, Mobile | June 6, 2016 - 11:11 AM |
Tagged: hsa 1.1, hsa

The HSA Foundation released version 1.1 of their specification, which focuses on “multi-vendor” compatibility. In this case, multi-vendor doesn't refer to companies that refused to join the HSA Foundation, namely Intel and NVIDIA, but rather multiple types of vendors. Rather than aligning with AMD's focus on CPU-GPU interactions, HSA 1.1 includes digital signal processors (DSPs), field-programmable gate arrays (FPGAs), and other accelerators. I can see this being useful in several places, especially on mobile, where cameras, sound processors, CPU cores, and a GPU regularly share video buffers.

That said, the specification also mentions “more efficient interoperation with non-HSA compliant devices”. I'm not quite sure what that specifically refers to, but it could be important to keep an eye on for future details -- whether it is relevant for Intel and NVIDIA hardware (and so forth).

Charlie, down at SemiAccurate, notes that HSA 1.1 will run on all HSA 1.0-compliant hardware. This makes sense, but I can't see where this is explicitly mentioned in their press release. I'm guessing that Charlie was given some time on a conference call (or face-to-face) regarding this, but it's also possible that he may be mistaken. It's also possible that it is explicitly mentioned in the HSA Foundation's press blast and I just fail at reading comprehension.

If so, I'm sure that our comments will highlight my error.

AMD HSA Patches Hoping for GCC 6

Subject: Graphics Cards, Processors | December 8, 2015 - 01:07 PM |
Tagged: hsa, GCC, amd

Phoronix, the Linux-focused hardware website, highlighted patches for the GNU Compiler Collection (GCC) that implement HSA. This will allow newer APUs, such as AMD's Carrizo, to accelerate chunks of code (mostly loops) that have been tagged with a compiler directive as worth running on the GPU. While I have done some GPGPU development, many of the low-level specifics of HSA aren't areas that I have much experience with.
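
As a rough illustration of what "tagging a loop" can look like, the sketch below marks a SAXPY loop with an OpenMP-style offload directive. That the HSA patches key off directives of this kind is an assumption on my part, not something the Phoronix piece spells out, and the function name `saxpy` and the exact clauses are purely illustrative. A compiler built without offload support ignores the pragma and runs the loop on the CPU, so the result is the same either way.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for every i in [0, n).
 * The pragma asks an offload-capable compiler to run the loop on an
 * accelerator; a compiler without that support ignores it and the loop
 * runs on the CPU unchanged. The directive shown is an assumption used
 * for illustration, not taken from the patches themselves. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);

    #pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Calling `saxpy(4, 2.0f, x, y)` on two four-element arrays updates `y` in place. The appeal of the directive approach is that the loop body stays ordinary C, so the same source still builds and runs where no HSA hardware is present.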


The patches have been managed by Martin Jambor of SUSE Labs. You can see a slideshow presentation of their work on the GNU website. Even though features froze about a month ago, they are apparently hoping that this will make it into the official GCC 6 release. If so, many developers around the world will be able to target HSA-compatible hardware in the first half of 2016. Technically, anyone can do so regardless, but they would need to specifically use the unofficial branch on the GCC Subversion repository. This probably means compiling it themselves, and it might even be behind on a few features in other branches that were accepted into GCC 6.

Source: Phoronix

Meet the Boltzmann Initiative, AMD's answer to HPC

Subject: General Tech | November 18, 2015 - 05:35 PM |
Tagged: amd, firepro, boltzmann, HPC, hsa

AMD announced the Boltzmann Initiative this week at SC15 to compete against Intel and NVIDIA in the HPC market.  It is not a physical product but rather a new way to unite the processing power of HSA-compliant AMD APUs and FirePro GPUs.  They have announced several new projects, including the Heterogeneous Compute Compiler (HCC) and the Heterogeneous-compute Interface for Portability (HIP), which can automatically convert CUDA code into C++ for CUDA-based apps.  They also announced a headless Linux driver and an HSA runtime infrastructure interface for managing clusters, which utilizes their InfiniBand fabric interconnect to interface system memory directly to GPU memory, as well as P2P GPU support and numerous other enhancements.  Check out more at DigiTimes.


"The Boltzmann Initiative leverages HSA's ability to harness both central processing units (CPU) and AMD FirePro graphics processing units (GPU) for maximum compute efficiency through software."

Here is some more Tech News from around the web:

Tech Talk

Source: DigiTimes

HSA Version 1.0 arrived today

Subject: General Tech | March 17, 2015 - 05:18 PM |
Tagged: hsa foundation, hsa, amd, arm, Samsung, Imagination Technologies, HSAIL

We have been talking about the HSA Foundation since 2013, a cooperative effort by AMD, ARM, Imagination, Samsung, Qualcomm, MediaTek and TI to design a heterogeneous memory architecture to allow GPUs, DSPs and CPUs to all directly access the same physical memory.  The release of the official specifications today is a huge step forward for these companies, especially for garnering future mobile market share as physical hardware apart from Carrizo becomes available.

Programmers will be able to use C, C++, Fortran, Java, and Python to write HSA-compliant code which is then compiled into HSAIL (Heterogeneous System Architecture Intermediate Language) and from there to the actual binary executables which will run on your devices.  HSA currently supports x86 and x64 and there are Linux kernel patches available for those who develop on that OS.  Intel and NVIDIA are not involved in this project at all; they have chosen their own solutions for mobile devices, and while Intel certainly has pockets deep enough to experiment, NVIDIA might not.  We shall soon see if Pascal and improvements to Maxwell's performance and efficiency through future generations can compete with the benefits of HSA.

The current problem is of course hardware: Bald Eagle and Carrizo are scheduled to arrive on the market soon, but they are not available yet.  Sea Islands GPUs and Kaveri have some HSA enhancements, but with limited hardware to work with it will be hard to convince developers to focus on programming HSA-optimized applications.  The release of the official specs today is a great first step; if you prefer an overview to reading through the official documents, The Register has a good article right here.


"The HSA Foundation today officially published version 1.0 of its Heterogeneous System Architecture specification, which (if we were being flippant) describes how GPUs, DSPs and CPUs can share the same physical memory and pass pointers between each other. (A provisional 1.0 version went live in August 2014.)"

Source: The Register

Subject: Processors
Manufacturer: AMD

Filling the Product Gaps

In the first several years of my PCPer employment, I typically handled most of the AMD CPU refreshes.  These were rather standard affairs that involved small jumps in clockspeed and performance.  These happened every 6 to 8 months, with the bigger architectural shifts happening some years apart.  We are finally seeing a new refresh of the AMD APU parts after the initial release of Kaveri to the world at the beginning of this year.  This update is different.  Unlike previous years, there are no faster parts than the already available A10-7850K.


This refresh deals with fleshing out the rest of the Kaveri lineup with products that address different TDPs, markets, and prices.  The A10-7850K is still the king when it comes to performance on the FM2+ socket (as long as users do not pay attention to the faster CPU performance of the A10-6800K).  The initial launch in January also featured another part that never became available until now; the A8-7600 was supposed to be available some months ago, but is only making it to market now.  The 7600 part was unique in that it had a configurable TDP that went from 65 watts down to 45 watts.  The 7850K on the other hand was configurable from 95 watts down to 65 watts.


So what are we seeing today?  AMD is releasing three parts to address the lower power markets that it hopes to expand into.  The A8-7600 was detailed back in January, but never released until recently.  The other two parts are brand new.  The A10-7800 is a 65 watt TDP part with a cTDP that goes down to 45 watts.  The other new chip is the A6-7600K which is unlocked, has a configurable TDP, and looks to compete directly with Intel’s recently released 20th Anniversary Pentium G3258.

HSA on Linux

Subject: General Tech | July 11, 2014 - 06:36 PM |
Tagged: linux, hsa, amd, open source

Open source HSA has arrived for the Linux kernel with a newly released set of patches which will allow Sea Islands and newer GPUs to share hardware resources.  These patches are both for a sample driver for any HSA-compatible hardware and the driver for Radeon GPUs.  As the debut of the Linux 3.16 kernel is so close, you shouldn't expect to see these patches included until 3.17, which should be released in the not too distant future.  Phoronix and Linux users everywhere give a big shout of thanks to AMD's John Bridgman for his work on this project.


"AMD has just published a massive patch-set for the Linux kernel that finally implements a HSA (Heterogeneous System Architecture) in open-source. The set of 83 patches implement a Linux HSA driver for Radeon family GPUs and serves too as a sample driver for other HSA-compatible devices. This big driver in part is what well known Phoronix contributor John Bridgman has been working on at AMD."

Source: Phoronix

Fully Enabling the A10-7850K while Utilizing a Standalone GPU

Subject: Processors | July 9, 2014 - 09:42 PM |
Tagged: nvidia, msi, Luxmark, Lightning, hsa, GTX 580, GCN, APU, amd, A88X, A10-7850K

When I first read many of the initial AMD A10 7850K reviews, my primary question was how the APU would act if a different GPU were installed in the system and the CrossFire X functionality that AMD talked about was not used.  Typically when a user installs a standalone graphics card on the AMD FM2/FM2+ platform, they disable the graphics portion of the APU.  They also have to uninstall the AMD Catalyst driver suite.  So this then leaves the APU as a CPU only, and all of that graphics silicon is left silent and dark.


Who in their right mind would pair a high end graphics card with the A10-7850K? This guy!

Does this need to be the case?  Absolutely not!  The GCN based graphics unit on the latest Kaveri APUs is pretty powerful when used in GPGPU/OpenCL applications.  The 4 cores/2 modules and 8 GCN cores can push out around 856 GFlops when fully utilized.  We also must consider that the APU is the first fully compliant HSA (Heterogeneous System Architecture) chip, and it handles memory accesses much more efficiently than standalone GPUs.  The shared memory space with the CPU gets rid of a lot of the workarounds typically needed for GPGPU type applications.  It makes sense that users would want to leverage the performance potential of a fully functioning APU while upgrading their overall graphics performance with a higher end standalone GPU.
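
The 856 GFlops figure can be roughly reconstructed from the usual peak-throughput formula: execution units times FLOPs per cycle times clock speed. The sketch below relies on numbers that are not stated above and should be read as assumptions: 512 shader ALUs across the 8 GCN units running at 720 MHz with 2 FLOPs per cycle (one fused multiply-add), plus 4 CPU cores at 3.7 GHz counted at 8 single-precision FLOPs per cycle each. The helper names are hypothetical as well.

```c
#include <assert.h>

/* Theoretical peak in GFLOPS: execution units x FLOPs per unit per cycle
 * x clock in GHz. Real workloads land well below this number. */
double peak_gflops(int units, int flops_per_cycle, double clock_ghz)
{
    assert(units > 0 && flops_per_cycle > 0 && clock_ghz > 0.0);
    return (double)units * (double)flops_per_cycle * clock_ghz;
}

/* Combined peak using ASSUMED A10-7850K figures (not from the article). */
double a10_7850k_peak_gflops(void)
{
    /* GPU: 8 GCN units x 64 ALUs = 512 ALUs, FMA counted as 2 FLOPs, 720 MHz */
    double gpu = peak_gflops(512, 2, 0.720);
    /* CPU: 4 cores, 8 single-precision FLOPs per cycle each, 3.7 GHz */
    double cpu = peak_gflops(4, 8, 3.7);
    return gpu + cpu;
}
```

Under those assumptions the GPU contributes about 737 GFLOPS and the CPU about 118, which lands on roughly 856 combined. The split also shows why the GPU portion is the part worth keeping awake: it accounts for the bulk of the chip's theoretical throughput.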

Getting this to work is very simple.  Assuming that the user has been using the APU as their primary graphics controller, they should update to the latest Catalyst drivers.  If the user is going to use an AMD card, then it would behoove them to totally uninstall the Catalyst driver and re-install only after the new card is installed.  After this is completed, restart the machine, go into the UEFI, and change the primary video boot device from the integrated unit to PEG (PCI-Express Graphics).  Save the setting and shut down the machine.  Insert the new video card and attach the monitor cable(s) to it.  Boot the machine and either re-install the Catalyst suite if an AMD card is used, or install the latest NVIDIA drivers if that is the graphics choice.

Windows 7 and Windows 8 allow users to install multiple graphics drivers from different vendors.  In my case I utilized a last generation GTX 580 (the MSI N580GTX Lightning) along with the AMD A10 7850K.  These products coexist happily together on the MSI A88X-G45 Gaming motherboard.  The monitor is attached to the NVIDIA card and all games are routed through that since it is the primary graphics adapter.  Performance seems unaffected with both drivers active.


I find it interesting that the GPU portion of the APU is named "Spectre".  Who owns those 3dfx trademarks anymore?

When I load up Luxmark I see three entries: the APU (CPU and GPU portions), the GPU portion of the APU, and then the GTX 580.  Luxmark defaults to the GPUs.  We see these GPUs listed as “Spectre”, which is the GCN portion of the APU, and the NVIDIA GTX 580.  Spectre supports OpenCL 1.2 while the GTX 580 is an OpenCL 1.1 compliant part.

With both GPUs active I can successfully run the Luxmark “Sala” test.  The two units perform better together than when they are run separately.  Adding in the CPU does increase the score, but not by very much (my guess here is that the APU is going to be very memory bandwidth bound in such a situation).  Below we can see the results of the different units separate and together.


These results make me hopeful about the potential of AMD’s latest APU.  It can run side by side with a standalone card, and applications can leverage the performance of this unit.  Now all we need is more HSA aware software.  More time and more testing are needed for setups such as this, and we need to see if HSA enabled software really does see a boost from using the GPU portion of the APU as compared to a pure CPU piece of software or code that will run on the standalone GPU.

Personally I find the idea of a heterogeneous solution such as this appealing.  The standalone graphics card handles the actual graphics portions, the CPU handles that code, and the HSA software can then fully utilize the graphics portion of the APU in a very efficient manner.  Unfortunately, we do not have hard numbers on the handful of HSA aware applications out there, especially when used in conjunction with standalone graphics.  We know in theory that this can work (and should work), but until developers get out there and really optimize their code for such a solution, we simply do not know if having an APU will really net the user big gains as compared to something like the i7 4770 or 4790 running pure x86 code.


In the meantime, at least we know that these products work together without issue.  The mixed mode OpenCL results make a nice case for improving overall performance in such a system.  I would imagine with more time and more effort from developers, we could see some really interesting implementations that will fully utilize a system such as this one.  Until then, happy experimenting!

Source: AMD

AMD's Bald Eagle; 4K casino games anyone?

Subject: General Tech | May 21, 2014 - 07:06 PM |
Tagged: amd, Bald Eagle, embedded, hsa

AMD has just introduced their powerful new embedded chip called Bald Eagle.  Depending on the model of processor you purchase you get two or four Steamroller CPU cores, and up to eight GCN GPU cores based on the HD 9000 series.  That gives the higher end chips enough juice to power up to four independent 3D, 4K, or HD displays, which you can bump up to nine if you include an embedded Radeon E8860 discrete GPU in your system.  The cores are all fully HSA compliant and will support ECC and non-ECC DDR3 at speeds of up to 2133MHz, along with PCIe Gen3 x16, PCIe Gen2 2x4, USB and SATA.  Check out more at The Register.


"Bald Eagle also enables heterogeneous system architecture (HSA), which first appeared in AMD chippery in its desktop Kaveri APUs this January, and which allows the CPU and GPU to share the same system memory, vastly simpifying the programming challenge of getting GPUs to shoulder the parallel-processing chores that they excel at far better then CPUs."

Source: The Register

Berlin invades San Francisco; meet the new HSA enabled Opteron

Subject: General Tech | April 17, 2014 - 05:07 PM |
Tagged: amd, hsa, berlin, Opteron X-series, Red Hat

Next Wednesday we will get our first look at the HSA enabled Opteron X Series, otherwise known as Berlin.  AMD will be unveiling the processor at the Red Hat Summit in San Francisco with an X2100 Opteron running on a Linux environment that is based on the Fedora Project.  We have very recently had a chance to see the desktop equivalent, Kaveri, in action but this will be the first example of AMD's heterogeneous computing on a server.  Keep your eyes peeled for our coverage; in the meantime you can get a preview at The Register.


"AMD will give the first public demo of its second-generation Opteron X-Series server processor, code-named "Berlin", at the Red Hat Summit in San Francisco on Wednesday."

Source: The Register

AMD announces Garlic and Onion flavours on their first HSA chips

Subject: Processors | January 14, 2014 - 07:52 PM |
Tagged: a10-6700, a8-6500, a8-7600, amd, APU, hsa, i3-4330, Kaveri

Not only are the first Kaveri reviews arriving today, the A10-7850K is up for sale on both NewEgg and Amazon and the A10-7700K is available on NewEgg.  This new part, at 45W, competes favourably with the previous 100W Trinity APU in most tests, and when Ryan boosted it to 65W it gained a little more.  The Steamroller cores have been updated but not in a way that has a huge effect on CPU performance; on the other hand, the 384 SIMD units composing the GPU portion of this chip are quite impressive.  1080p gaming of current generation titles is possible on this chip and we haven't seen its big brother with 512 SIMD units yet.  In the Tech Report's review you can see that BF4 is playable on this chip, and this is not the Mantle version optimized for AMD's new architecture.  It is also a pity that Thief was unavailable to see just what TrueAudio is capable of.  Unfortunately this chip will not find its home in gamers' dream machines; that is simply not where AMD is targeting its CPUs.  However, for SFF systems that need to be energy efficient and where a discrete GPU is too big to fit, Kaveri will usher in a new level of performance.


"AMD's next-generation APU packs in a ton of innovation, including updated "Steamroller" CPU cores, GCN graphics, and advanced HSA features. But is it enough to restore AMD's competitiveness in desktop processors?"

Here are some more Processor articles from around the web: