AMD Introduces Radeon Pro SSG: A Professional GPU Paired With Low Latency Flash Storage (Updated)

Subject: Graphics Cards | July 27, 2016 - 01:56 AM |
Tagged: solid state, radeon pro, Polaris, gpgpu, amd

UPDATE (July 27th, 1am ET): More information on the Radeon Pro SSG has surfaced since the original article. According to AnandTech, the prototype graphics card actually uses an AMD Fiji GPU. The Fiji GPU is paired with onboard PCI-E based storage using the same PEX8747 bridge chip used in the Radeon Pro Duo. Storage is handled by two PCI-E 3.0 x4 M.2 slots that can accommodate up to 1TB of NAND flash storage. As I mention below, having the storage on board the graphics card vastly reduces latency by reducing the number of hops and not having to send requests out to the rest of the system. AMD had more numbers to share following their demo, however.

From the 8K video editing demo, the dual Samsung 950 Pro PCI-E SSDs (in RAID 0) on board the Radeon Pro SSG hit 4GB/s while scrubbing through the video. That same video source stored on a Samsung 950 Pro attached to the motherboard had throughput of only 900MB/s. In theory, reaching out to system RAM still has raw throughput advantages: DDR4 @ 3200 MHz on a Haswell-E platform is theoretically capable of 62 GB/s reads and 47 GB/s writes, though that would be bottlenecked by the graphics card having to go over the PCI-E 3.0 x16 link and its maximum of 15.754 GB/s. Of course, if you can hold the data in (much smaller) GDDR5 (300+GB/s depending on clocks and memory bus width) or HBM (1TB/s) and not have to go out to any other storage tier, that's ideal, but it is not always feasible, especially in the HPC world.
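For anyone wondering where the 15.754 GB/s figure comes from, here is a rough back-of-the-envelope sketch. It assumes the standard PCI-E 3.0 parameters (8 GT/s per lane with 128b/130b encoding) and ignores protocol overhead, so real-world numbers will be lower. The function name is my own and purely illustrative.

```python
# Theoretical one-way PCI-E 3.0 bandwidth (ignores packet/protocol overhead).
GT_PER_S = 8.0        # PCI-E 3.0 signalling rate per lane, in gigatransfers/s
ENCODING = 128 / 130  # 128b/130b line encoding: 128 payload bits per 130 sent


def pcie3_gb_per_s(lanes: int) -> float:
    """Return the theoretical bandwidth in GB/s for a PCI-E 3.0 link."""
    gigabits = GT_PER_S * ENCODING * lanes
    return gigabits / 8  # bits -> bytes


if __name__ == "__main__":
    x16 = pcie3_gb_per_s(16)      # GPU <-> rest of the system
    x4 = pcie3_gb_per_s(4)        # a single M.2 slot on the card
    print(f"x16 link:           {x16:.3f} GB/s")
    print(f"one x4 M.2 slot:    {x4:.3f} GB/s")
    print(f"two x4 slots total: {2 * x4:.3f} GB/s")
```

By the same math, the two x4 M.2 slots top out a little under 8 GB/s combined on paper, so the 4GB/s seen in the demo sits comfortably within what the links allow.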

However, having storage on the same board as the GPU, only a single "hop" away, vastly reduces latency and offers much more total storage space than most systems have in DDR3 or DDR4. In essence, the solid state storage on the graphics card (which developers will need to specifically code for) acts as a massive cache for streaming in assets for data sets and workloads that are highly impacted by latency. This storage is not the fastest, but it is the next best thing for holding active data outside of GDDR5/X or HBM. For throughput intensive workloads, reaching out to system RAM will be better. Finally, reaching out to system attached storage should be the last resort, as it will be the slowest and most latent. Several commenters mentioned using a PCI-E based SSD in a second slot on the motherboard, accessed much like GPUs in CrossFire communicate now (DMA over the PCI-E bus), which is an interesting idea that I had not considered.

Per my understanding of the situation, I think that the on board SSG storage would still be slightly more beneficial than this setup but it would get you close (I am assuming the GPU would be able to directly interact and request data from the SSD controller and not have to rely on the system CPU to do this work but I may well be mistaken. I will have to look into this further and ask the experts heh). On the prototype Radeon Pro SSG the M.2 slots are actually able to be seen as drives by the system and OS so it is essentially acting as if there was a PCI-E adapter card in a slot on the motherboard holding those drives but that may not be the case should this product actually hit the market. I do question their choice to go with Fiji rather than Polaris, but it sounds like they built the prototype off of the Radeon Pro Duo platform so I suppose it would make sense there.

Hopefully the final versions in 2017 or beyond use at least Vega though :).
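To put the throughput figures quoted in this update side by side, here is a small illustrative sketch. The rates are simply the numbers mentioned above; the 1 TB working set is a hypothetical I picked for the example, and GDDR5 and HBM obviously cannot hold that much, so their rows are only a rate comparison. It also models raw throughput only, not latency, which is the card's main selling point.

```python
# Rough time to stream a working set once at each tier's quoted throughput.
DATASET_GB = 1000  # hypothetical 1 TB working set (assumption for illustration)

# Throughput figures as quoted in the article, in GB/s.
TIERS_GB_PER_S = {
    "SSD on motherboard (demo)": 0.9,
    "SSDs on Radeon Pro SSG (demo)": 4.0,
    "System RAM via PCI-E 3.0 x16": 15.754,
    "GDDR5": 300.0,
    "HBM": 1000.0,
}


def seconds_to_stream(size_gb: float, gb_per_s: float) -> float:
    """Time in seconds to move size_gb at a sustained gb_per_s."""
    return size_gb / gb_per_s


if __name__ == "__main__":
    for tier, rate in TIERS_GB_PER_S.items():
        secs = seconds_to_stream(DATASET_GB, rate)
        print(f"{tier:32s} {rate:9.3f} GB/s  {secs:9.1f} s")
```

Treat the output as a sanity check on orders of magnitude rather than a benchmark.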

Alongside the launch of new Radeon Pro WX (workstation) series graphics cards, AMD teased an interesting new Radeon Pro product: the Radeon Pro SSG. This new professional graphics card pairs a Polaris GPU with up to a terabyte of on board solid state storage and seeks to solve one of the biggest hurdles in GPGPU performance when dealing with extremely large datasets: latency.

AMD Radeon Pro SSG.jpg

One of the core focuses of AMD's HSA (heterogeneous system architecture) is unified memory and the ability of various processors (CPU, GPU, specialized co-processors, etc.) to work together efficiently by being able to access and manipulate data from the same memory pool without having to copy data back and forth between CPU-accessible memory and GPU-accessible memory. With the Radeon Pro SSG, this idea is not fully realized (it is more of a sidestep), but it will move performance further. It does not eliminate the need to copy data to the GPU before it can work on it, but once copied the GPU will be able to work on data stored in what AMD describes as a one terabyte frame buffer. This memory will be solid state and very fast, but more importantly the GPU will be able to get at the data with much lower latency than previous methods. AMD claims the solid state storage (likely NAND but they have not said) will link with the GPU over a dedicated PCI-E bus. I suppose that if you can't bring the GPU to the data, you bring the data to the GPU!

Considering AMD's previous memory champ – the Radeon W9100 – maxed out at 32GB of GDDR5, the teased Radeon Pro SSG with its 1TB of purportedly low latency onboard flash storage opens up a slew of new possibilities for researchers and professionals in media, medical, and scientific roles working with massive datasets for imaging, creation, and simulations! I expect that there are many professionals out there eager to get their hands on one of these cards! They will be able to as well thanks to a beta program launching shortly, so long as they have $10,000 for the hardware!

AMD gave a couple of examples in its PR of the potential benefits of "solid state graphics," including the ability to image a patient's beating heart in real time, allowing medical professionals to examine and spot issues as early as possible, and using the Radeon Pro SSG to edit and scrub through 8K video in real time at 90 FPS versus 17 FPS with current offerings. On the scientific side of things, being able to load entire models into the new graphics memory (not as low latency as GDDR5 or HBM, certainly) will be a boon, as will being able to get data sets as close to the GPU as possible in servers using GPU accelerated databases powering websites accessed by millions of users.

It is not exactly the HSA future I have been waiting for ever so impatiently, but it is a nice advancement and an intriguing idea. I am very curious to see how well it pans out and whether developers and researchers will truly take advantage of it to further their projects. I suspect something like this could be great for deep learning tasks as well (such as powering the "clouds" behind self driving cars perhaps).

Stay tuned to PC Perspective for more information as it develops.

This is definitely a product that I will be watching and I hope that it does well. I am curious what Nvidia's and Intel's plans are here as well! What are your thoughts on AMD's "Solid State Graphics" card? All hype or something promising?

Source: AMD

Seagate Introduces SSHD Lineup with Dual Mode NAND Cache

Subject: Storage | March 8, 2013 - 09:20 AM |
Tagged: sshd, solid state, Seagate, Intel SRT, cache, adaptive memory

Following the announcement that the company would be axing 7200 rpm notebook drives, Seagate has introduced its third generation hybrid hard drives. The new Seagate Solid State Hybrid Drives (SSHD) will initially launch with two notebook drives and a single desktop-sized drive. The hybrid drives will combine a spinning platter drive with 8GB of NAND flash with Seagate’s Adaptive Memory tech that will reportedly cache reads as well as writes.

The 2.5” notebook SSHDs include a 7mm model that combines 500GB of mechanical storage and 8GB of Adaptive Memory cache. This model will retail for around $80. There will also be a slightly thicker 9.5mm model with 8GB of cache and 1TB of mechanical hard drive capacity. The 1TB model utilizes two 500GB, 5400 RPM platters and will retail for just under $100.

Seagate SSHD.jpg

The desktop SSHDs come in a 3.5” form factor and will initially use 7200 RPM platters. Seagate will offer up to 2TB of mechanical storage with its SSHDs and 8GB of NAND flash for caching. Seagate claims that its desktop SSHD is up to four times faster than other mechanical hard drives (according to PCMark Vantage), which is likely due to the Adaptive Memory technology caching frequently used data on the flash memory and the use of 1TB platters. The 1TB and 2TB SSHDs will cost around $100 and $150 respectively. Naturally, the SSHDs will carry a small premium over traditional mechanical hard drives. They will still be much more price-efficient than Solid State Drives for the storage offered (though I would still like to see a larger NAND cache).

Interestingly, Tech Report was able to glean a few more details about Seagate’s third generation hybrid drives. Reportedly, the drives will be capable of writing to as well as reading from the NAND cache. That is a major step up from previous generations, which limited the drive’s flash storage to a read-only cache. Seagate has reportedly built the drives such that they will have enough capacitance to flush the write cache in the event of a power failure (so that you will not lose any data). The dual mode NAND term stems from Seagate’s ability to use SLC for boot data and the write cache and address the remaining NAND as MLC flash. Unfortunately, details are scarce on how Seagate is doing this.

The SSHDs will come with three year warranties, but Seagate has rated the NAND flash at a lifespan of at least five years. In a neat twist, Seagate is also allegedly working on another SSHD implementation that will combine a mechanical hard drive and a larger NAND cache. However, the flash memory will be managed by Intel’s Smart Response Technology instead of Seagate’s own Adaptive Memory tech (which doesn't need additional drives, unlike SRT). Using the port multiplexing aspect of the SATA spec, Seagate will be able to put both drives into a single 3.5” form factor hybrid drive. Admittedly, this is the Seagate SSHD that I am most excited about, despite the fact that it’s also the drive I know the least about. I’m interested to see what kind of performance Seagate can wring out of the larger cache!

Source: Seagate