Subject: Graphics Cards | April 5, 2016 - 02:13 AM | Tim Verry
Tagged: HPC, hbm, gpgpu, firepro s9300x2, firepro, dual fiji, deep learning, big data, amd
Earlier this month AMD launched a dual Fiji powerhouse for VR gamers it is calling the Radeon Pro Duo. Now, AMD is bringing its latest GCN architecture and HBM memory to servers with the dual GPU FirePro S9300 x2.
The new server-bound professional graphics card packs an impressive amount of computing hardware into a dual-slot card with passive cooling. The FirePro S9300 x2 combines two full Fiji GPUs clocked at 850 MHz for a total of 8,192 cores, 512 texture units, and 128 ROPs. Each GPU is paired with 4GB of non-ECC HBM memory on package with 512GB/s of memory bandwidth, which AMD combines to advertise this as the first professional graphics card with 1TB/s of memory bandwidth.
Due to its lower clock speeds, the S9300 x2 has less peak single precision compute performance than the consumer Radeon Pro Duo: 13.9 TFLOPS versus 16 TFLOPS on the desktop card. Businesses will be able to cram more cards into their rack-mounted servers, though, since they do not need to worry about mounting locations for the sealed-loop water cooling of the Radeon card.
| | FirePro S9300 x2 | Radeon Pro Duo | R9 Fury X | FirePro S9170 |
|---|---|---|---|---|
| GPU | Dual Fiji | Dual Fiji | Fiji | Hawaii |
| GPU Cores | 8192 (2 x 4096) | 8192 (2 x 4096) | 4096 | 2816 |
| Rated Clock | 850 MHz | 1050 MHz | 1050 MHz | 930 MHz |
| Texture Units | 2 x 256 | 2 x 256 | 256 | 176 |
| ROP Units | 2 x 64 | 2 x 64 | 64 | 64 |
| Memory | 8GB (2 x 4GB) | 8GB (2 x 4GB) | 4GB | 32GB ECC |
| Memory Clock | 500 MHz | 500 MHz | 500 MHz | 5000 MHz |
| Memory Interface | 4096-bit (HBM) per GPU | 4096-bit (HBM) per GPU | 4096-bit (HBM) | 512-bit |
| Memory Bandwidth | 1TB/s (2 x 512GB/s) | 1TB/s (2 x 512GB/s) | 512 GB/s | 320 GB/s |
| TDP | 300 watts | ? | 275 watts | 275 watts |
| Peak Compute | 13.9 TFLOPS | 16 TFLOPS | 8.60 TFLOPS | 5.24 TFLOPS |
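The memory bandwidth row in the table follows from the bus width and the transfer rate. Here is a minimal Python sketch of that arithmetic. It assumes HBM transfers data on both clock edges (so 500 MHz becomes 1000 MT/s) and that the 5000 MHz listed for the S9170 is already an effective transfer rate; the function name is made up for illustration.

```python
def bandwidth_gb_s(bus_width_bits: int, mega_transfers_per_sec: float) -> float:
    """Peak bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9

# Assumption: HBM is double data rate, so a 500 MHz clock gives 1000 MT/s.
fiji_per_gpu = bandwidth_gb_s(4096, 500 * 2)

# Assumption: the S9170's 5000 MHz is the effective GDDR5 transfer rate.
hawaii = bandwidth_gb_s(512, 5000)

print(f"Fiji, per GPU: {fiji_per_gpu:.0f} GB/s")
print(f"S9300 x2, both GPUs: {2 * fiji_per_gpu:.0f} GB/s")
print(f"S9170: {hawaii:.0f} GB/s")
```

Under those assumptions the results line up with the 512 GB/s per GPU and 320 GB/s figures in the table.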
AMD is aiming this card at datacenter and HPC users working on "big data" tasks that do not require the accuracy of double precision floating point calculations. Deep learning tasks, seismic processing, and data analytics are all examples AMD says the dual GPU card will excel at. These are all tasks that can be greatly accelerated by the massively parallel nature of a GPU but do not need to be as precise as stricter mathematics, modeling, and simulation work that depends on FP64 performance. In that respect, the FirePro S9300 x2 has only 870 GFLOPS of double precision compute performance.
Further, this card supports a GPGPU optimized Linux driver stack called GPUOpen and developers can program for it using either OpenCL (it supports OpenCL 1.2) or C++. AMD PowerTune and the return of FP16 support are also features. AMD claims that its new dual GPU card is twice as fast as the NVIDIA Tesla M40 (1.6x the K80) and 12 times as fast as the latest Intel Xeon E5 in peak single precision floating point performance.
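The peak compute figures quoted for these cards can be reproduced from core count and clock speed. The sketch below is only a back-of-the-envelope check, not AMD's own method: it assumes each core retires one fused multiply-add (2 FLOPs) per clock, and it assumes Fiji runs FP64 at 1/16 of its FP32 rate, a ratio inferred here from 13.9 TFLOPS versus 870 GFLOPS.

```python
def peak_tflops(cores: int, clock_mhz: float, flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak in TFLOPS. The default of 2 assumes one FMA per core per clock."""
    return cores * flops_per_core_per_clock * clock_mhz * 1e6 / 1e12

s9300_x2 = peak_tflops(8192, 850)
fury_x = peak_tflops(4096, 1050)
s9170 = peak_tflops(2816, 930)

# Assumption: FP64 runs at 1/16 of the FP32 rate on Fiji.
s9300_x2_fp64_gflops = s9300_x2 / 16 * 1000

print(f"FirePro S9300 x2: {s9300_x2:.1f} TFLOPS FP32, {s9300_x2_fp64_gflops:.0f} GFLOPS FP64")
print(f"R9 Fury X: {fury_x:.2f} TFLOPS FP32")
print(f"FirePro S9170: {s9170:.2f} TFLOPS FP32")
```

The Radeon Pro Duo is left out on purpose: under this formula its 16 TFLOPS rating corresponds to a clock of roughly 977 MHz rather than the 1050 MHz listed in the table.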
The double slot card is powered by two PCI-E power connectors and is rated at 300 watts. This is a bit more palatable than the triple 8-pin needed for the Radeon Pro Duo!
The FirePro S9300 x2 comes with a 3 year warranty and will be available in the second half of this year for $6000 USD. You are definitely paying a premium for the professional certifications and support. Here's hoping developers come up with some cool uses for the dual 8.9 billion transistor GPUs and their included HBM memory!
Subject: Graphics Cards | February 3, 2016 - 02:37 AM | Tim Verry
Tagged: virtual machines, virtual graphics, mxgpu, gpu virtualization, firepro, amd
AMD made an interesting enterprise announcement today with the introduction of new FirePro S-Series graphics cards that integrate hardware-based virtualization technology. The new FirePro S1750 and S1750 x2 are aimed at virtualized workstations, render farms, and cloud gaming platforms where each virtual machine has direct access to the graphics hardware.
The new graphics cards use a GCN-based Tonga GPU with 2,048 stream processors paired with 8GB of ECC GDDR5 memory on the single slot FirePro S1750. The dual slot FirePro S1750 x2, as the name suggests, is a dual GPU card that features a total of 4,096 shaders (2,048 per GPU) and 16 GB of ECC GDDR5 (8 GB per GPU). The S1750 has a TDP of 150W while the dual-GPU S1750 x2 variant is rated at 265W and either can be passively cooled.
Where the graphics cards get niche is the inclusion of what AMD calls MxGPU (Multi-User GPU) technology which is derived from the SR-IOV (Single Root Input/Output Virtualization) PCI-Express standard. According to AMD, the new FirePro S-Series allows virtual machines direct access to the full range of GPU hardware (shaders, memory, etc.) and OpenCL 2.0 support on the software side. The S1750 supports up to 16 simultaneous users and the S1750 x2 tops out at 32 users. Each virtual machine is allocated an equal slice of the GPU, and as you add virtual machines the equal slices get smaller. AMD’s solution to that predicament is to add more GPUs to spread out the users and allocate each VM more hardware horsepower. It is worth noting that AMD has elected not to charge companies any per-user licensing fees for all these VMs the hardware supports which should make these cards more competitive.
The graphics cards use ECC memory to correct errors when dealing with very large numbers and calculations and every VM is reportedly protected and isolated such that one VM cannot access any data of a different VM stored in graphics memory.
I am interested to see how these stack up compared to NVIDIA’s GRID and VGX GPU virtualization specialized graphics cards. The difference between the software versus hardware-based virtualization may not make much difference, but AMD’s approach may be ever so slightly more efficient with the removal of a layer between the virtual machine and hardware. We’ll have to wait and see, however.
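To make the "equal slices get smaller" point concrete, here is a rough Python sketch of how each VM's share shrinks as users are added. It is purely illustrative: the function is hypothetical, and it treats shaders and memory as if they were statically divided, whereas the real hardware may well share the GPU by time-slicing instead.

```python
def equal_slice(total_shaders: int, total_memory_gb: float, users: int, max_users: int):
    """Return each VM's (shaders, memory in GB) share under a naive equal split.

    Hypothetical model for illustration only; not AMD's actual scheduling scheme.
    """
    if not 1 <= users <= max_users:
        raise ValueError(f"users must be between 1 and {max_users}")
    return total_shaders / users, total_memory_gb / users

# FirePro S1750: 2,048 stream processors, 8GB, up to 16 users (figures from the article).
for users in (1, 4, 16):
    shaders, memory_gb = equal_slice(2048, 8, users, max_users=16)
    print(f"S1750 with {users:2d} users: {shaders:.0f} shaders and {memory_gb:.2f} GB per VM")

# FirePro S1750 x2: 4,096 stream processors, 16GB, up to 32 users.
x2_shaders, x2_memory_gb = equal_slice(4096, 16, 32, max_users=32)
```

In this simplified model a fully loaded S1750 x2 gives each VM the same share as a fully loaded S1750, which matches the idea that adding GPUs spreads the users out.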
Enterprise users will be able to pick up the new cards installed in systems from server manufacturers sometime in the first half of 2016. Pricing for the cards themselves appears to be $2,399 for the single GPU S1750 and $3,999 for the dual GPU S1750 x2.
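Since there are no per-user licensing fees, the hardware price divided by the maximum user count gives a rough floor on the GPU cost per seat. The quick Python calculation below uses only the prices and user limits quoted above and ignores the server, power, and software costs that a real deployment would add.

```python
# (price in USD, maximum simultaneous users), as quoted in the article
cards = {
    "FirePro S1750": (2399, 16),
    "FirePro S1750 x2": (3999, 32),
}

def price_per_user(price_usd: float, max_users: int) -> float:
    """Card price spread evenly across the maximum number of users."""
    return price_usd / max_users

for name, (price, users) in cards.items():
    print(f"{name}: ${price_per_user(price, users):.2f} per user at {users} users")
```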
Needless to say, this is all a bit more advanced (and expensive!) than the somewhat finicky 3D acceleration option desktop users can turn on in VMware and VirtualBox! Are you experimenting with remote workstations and virtual machines for thin clients that can utilize GPU muscle? Does AMD’s MxGPU approach seem promising?
Subject: General Tech | November 18, 2015 - 12:35 PM | Jeremy Hellstrom
Tagged: amd, firepro, boltzmann, HPC, hsa
AMD has announced the Boltzmann Initiative to compete against Intel and NVIDIA in the HPC market this week at SC15. It is not a physical product but rather a new way to unite the processing power of HSA compliant AMD APUs and FirePro GPUs. They have announced several new projects including the Heterogeneous Compute Compiler (HCC) and Heterogeneous-compute Interface for Portability (HIP) for CUDA based apps which can automatically convert CUDA code into C++. They also announced a headless Linux driver and HSA runtime infrastructure interface for managing clusters which utilizes their InfiniBand fabric interconnect to interface system memory directly to GPU memory as well as adding P2P GPU support and numerous other enhancements. Check out more at DigiTimes.
"The Boltzmann Initiative leverages HSA's ability to harness both central processing units (CPU) and AMD FirePro graphics processing units (GPU) for maximum compute efficiency through software."
Here is some more Tech News from around the web:
- Microsoft Open-Sources Visual Studio Code @ Slashdot
- Microsoft's gamble pays off as half of enterprises pledge Windows 10 in 2016 @ The Inquirer
- Microsoft chief Satya drops an S bomb in Windows 10, cloud talk @ The Register
- Trend Micro warns of Ashley Madison fallout and rise in data breaches @ The Inquirer
- How to Test-Drive OpenStack @ Linux.com
- Adobe releases out-of-band security patches – amazingly not for Flash @ The Register
- Asus RP-AC56 802.11ac wireless extender @ Kitguru
Subject: General Tech, Graphics Cards | September 3, 2014 - 06:15 PM | Scott Michaud
Tagged: Matrox, firepro, cape verde xt gl, cape verde xt, cape verde, amd
Matrox, along with S3, develops GPU ASICs for use with desktop add-in boards, alongside AMD and NVIDIA. Last year, they sold fewer than 7000 units in a quarter according to my math (rounding to 0.0% market share implies < 0.05% of the total market, which works out to 7000 units that quarter). Today, Matrox Graphics Inc. announced that they will use an AMD GPU in their upcoming product line.
While they do not mention a specific processor, they note that "the selected AMD GPU" will be manufactured on a 28nm process with 1.5 billion transistors. It will support DirectX 11.2, OpenGL 4.4, and OpenCL 1.2. It will have a 128-bit memory bus.
Basically, it kind of has to be Cape Verde XT (or XT GL) unless it is a new, unannounced GPU.
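The rounding argument can be spelled out in a few lines of Python. This is a sketch of the reasoning only: the total market size is not stated in the post, so it is back-calculated here from the 7000 unit and 0.05% figures, and the function name is invented for illustration.

```python
def max_units_when_rounded_to_zero(total_units: float, decimals: int = 1) -> float:
    """Upper bound on units sold if the share rounds to 0.0% at the given precision."""
    half_step_percent = 0.5 * 10 ** -decimals  # 0.05 percentage points for one decimal
    return total_units * half_step_percent / 100

# Back-calculated assumption: if 0.05% of the market is 7000 units,
# the whole market that quarter was about 14 million units.
implied_total_units = 7000 / (0.05 / 100)

print(f"Implied total market: {implied_total_units:,.0f} units")
print(f"Upper bound for a 0.0% share: {max_units_when_rounded_to_zero(implied_total_units):,.0f} units")
```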
If it is Cape Verde XT, it would have about 1.0 to 1.2 TFLOPs of single precision performance (depending on the chosen clock rate). Whatever clock rate is chosen, the chip contains 640 shader processors. It was first released in February 2012 with the Radeon HD 7770 GHz Edition. Again, this is assuming that AMD will not release a GPU refresh for that category.
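The 1.0 to 1.2 TFLOPs range can be turned back into a clock range with the usual peak-throughput formula. The Python sketch below assumes each of the 640 shader processors performs one fused multiply-add (2 FLOPs) per clock, and the 1000 MHz used for the Radeon HD 7770 GHz Edition is an assumption based on its name.

```python
SHADERS = 640
FLOPS_PER_SHADER_PER_CLOCK = 2  # assumption: one fused multiply-add per clock

def clock_mhz_for(tflops: float, shaders: int = SHADERS) -> float:
    """Clock speed in MHz needed to reach a given single precision peak."""
    return tflops * 1e12 / (shaders * FLOPS_PER_SHADER_PER_CLOCK) / 1e6

def peak_tflops(clock_mhz: float, shaders: int = SHADERS) -> float:
    """Single precision peak in TFLOPS at a given clock speed."""
    return shaders * FLOPS_PER_SHADER_PER_CLOCK * clock_mhz * 1e6 / 1e12

low_clock = clock_mhz_for(1.0)
high_clock = clock_mhz_for(1.2)
ghz_edition = peak_tflops(1000)  # assumption: "GHz Edition" means a 1000 MHz clock

print(f"1.0 TFLOPs needs about {low_clock:.0f} MHz")
print(f"1.2 TFLOPs needs about {high_clock:.0f} MHz")
print(f"At 1000 MHz the chip would peak at {ghz_edition:.2f} TFLOPs")
```

Under these assumptions the quoted range implies a clock somewhat below 1 GHz for the Matrox part.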
Matrox will provide their PowerDesk software to configure multiple monitors. It will work alongside AMD's professional graphics drivers. It is sad to see a GPU ASIC manufacturer throw in the towel, at least temporarily, but hopefully they can use AMD's technology to remain in the business with competitive products. Who knows: maybe they will make a return when future graphics APIs reduce the burden of driver and product development?
Subject: General Tech | August 7, 2014 - 12:45 PM | Jeremy Hellstrom
Tagged: HPC, amd, firepro, S9150, S9050, opencl
The new cooling on the 290X tends to have it at the top of the gaming charts, and with the impending release of two new FirePro HPC cards AMD looks to take the productivity title away from the Tesla K40. The higher end S9150 boasts 16GB of GDDR5 memory with a 512-bit memory interface and 44 GCN compute units with 64 stream processors each, for a total of 2816 stream processors on board. That equates to 5.07 TFLOPS peak single-precision and 2.53 TFLOPS peak double-precision performance, with theoretical memory bandwidth of 320GB per second. AMD expects the S9150 to have support for OpenCL 2.0 drivers by the end of the year, which the lower priced and specced S9050 will not, though both will support AMD Stream technology and OpenCL 1.2. Check them out at The Register.
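The S9150 figures hang together arithmetically, as this short Python check shows. The clock speed is not given in the post, so it is back-calculated here, assuming one fused multiply-add (2 FLOPs) per stream processor per clock.

```python
COMPUTE_UNITS = 44
STREAM_PROCESSORS_PER_CU = 64
FLOPS_PER_SP_PER_CLOCK = 2  # assumption: one fused multiply-add per clock

stream_processors = COMPUTE_UNITS * STREAM_PROCESSORS_PER_CU

# Clock implied by the quoted 5.07 TFLOPS single precision peak.
implied_clock_mhz = 5.07e12 / (stream_processors * FLOPS_PER_SP_PER_CLOCK) / 1e6

# Ratio of the quoted double precision peak to the single precision peak.
fp64_ratio = 2.53 / 5.07

print(f"Stream processors: {stream_processors}")
print(f"Implied clock: about {implied_clock_mhz:.0f} MHz")
print(f"FP64 to FP32 ratio: about {fp64_ratio:.2f}")
```

The ratio comes out at roughly one half, which is what makes this card attractive for FP64-heavy HPC work.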
"The company's new big gun is the FirePro S9150 card, which maxes out at a blistering 5.07 TFLOPS peak single-precision floating-point performance and 2.53 TFLOPS peak double-precision performance."
Here is some more Tech News from around the web:
- How to Choose the Best Linux Desktop for You @ Linux.com
- nCrypted Cloud brings client side integration to Dropbox, Microsoft Onedrive @ The Inquirer
- IBM can't give away its chip business: report @ The Register
- Testing VR Limits with a Raspberry Pi @ Hack a Day
- Google Will Give a Search Edge To Websites That Use Encryption @ Slashdot
- OpenSSL receives nine post-Heartbleed critical bug fixes @ The Inquirer
- Now even Internet Explorer will throw lousy old Java into the abyss @ The Register
- Striker Capsule Task Light @ Benchmark Reviews
- Almost $1K worth of prizes up for grabs in our haiku contest @ The Tech Report
Subject: General Tech, Graphics Cards | March 26, 2014 - 05:43 PM | Scott Michaud
Tagged: amd, firepro, W9100
The AMD FirePro W9100 has been announced, bringing the Hawaii architecture to non-gaming markets. First seen in the Radeon R9 series of graphics cards, it has the capacity for 5 TeraFLOPs of single-precision (32-bit) performance and 2 TeraFLOPs of double-precision (64-bit). The card also has 16GB of GDDR5 memory to support it. From the raw numbers, this is slightly more capacity than either the Titan Black or Quadro K6000 in all categories. It will also support six 4K monitors (or three at 60Hz), per card. AMD supports up to four W9100 cards in a single system.
Professional users can be looking for several things in their graphics cards: compute performance (either directly or through licensed software such as Photoshop, Premiere, Blender, Maya, and so forth), several high-resolution monitors (or digital signage units), and/or a lot of graphics performance. The W9100 is basically the top of the stack which covers all three of these requirements.
AMD also announced a system branding initiative called "AMD FirePro Ultra Workstation". They currently have five launch partners, Supermicro, Boxx, Tarox, Silverdraft, and Versatile Distribution Services, which will have workstations available under this program. The list of components for a "Recommend" certification is: two eight-core 2.6 GHz CPUs, 32GB of RAM, four PCIe 3.0 x16 slots, a 1500W Platinum PSU, and a case with nine expansion slots (to allow four W9100 GPUs along with one SSD or SDI interface card).
Also, while the company has heavily discussed OpenCL in their slide deck, they have not mentioned specific versions. As such, I will assume that the FirePro W9100 supports OpenCL 1.2, like the R9-series, and not OpenCL 2.0 which was ratified back in November. This is still a higher conformance level than NVIDIA, which is at OpenCL 1.1.
Currently no word about pricing or availability.
Subject: Processors | January 7, 2014 - 04:52 AM | Josh Walrath
Tagged: amd, CES, 2014, Kaveri, A10 7850K, A10 7700K, APU, firepro, hsa
This year’s AMD CES event was actually more interesting than I was expecting. The details were well known going in, as most Kaveri details have been revealed over the past few months, so I was unsure what Lisa Su and the gang would go over.
This past year has been a big one for AMD. They seem to be doing a lot better than others expected them to, especially with all of the delayed product launches on the CPU side for quite a few years. This year saw the APU take a pretty prominent place in the industry with the launch of the latest generation consoles from Sony and Microsoft. AMD made inroads with mobile form factors with a variety of APUs. The HSA Foundation members have grown and HSA members ship two out of every three connected, smart devices. Apple also includes FirePro graphics cards with all of their new Mac Pros.
Kaveri is of course the big news here. AMD feels that this is the best APU yet. The combination of Steamroller CPU cores, GCN graphics compute cores, HSA, hUMA, hQ, TrueAudio, Mantle support, PCI-E 3.0 support, and a configurable TDP makes for a pretty compelling product. AMD has shuffled some nomenclature about by saying that Kaveri, at the top end, is comprised of 12 compute cores. These include 4 Steamroller cores and 8 GCN compute clusters. Each compute cluster matches the historical definition of a core, but of course it looks quite a bit different than a traditional x86 core.
We have gone over Kaveri pretty extensively in the past. The CPU is clocked at 3.7 GHz with a 4 GHz boost. The graphics portion clocks in at 720 MHz. It can support up to DDR3-2400 memory, which is really needed to extract as much performance as possible out of this new APU. Benchmarks provided by AMD show this product to be a big jump from the previous Richland, and in these particular benchmarks it is quite a bit faster than the competing i5 4670K.
Gaming performance is also improved. This APU can run most current applications at 1080P resolutions with low to medium quality settings. Older titles can be run at 1080P with Medium to High/Extreme settings. While this processor is rated at around 867 GFLOPS, which is around 110 GFLOPS greater than the previous top end Richland, it is more efficient at delivering that theoretical performance. It looks to be a significant improvement all around.
Software support is improving with applications from companies like Adobe, The Document Foundation, and Nuance. These cover HSA applications and in Nuance’s case, using the TrueAudio portion to clean up and accelerate voice recognition. TrueAudio is also being supported in five upcoming games. This is not a huge amount, but it is a decent start for this new technology.
Mantle is gaining a lot more momentum with support from 3 engines, 5 developers, and 20+ games in development. They showed off Battlefield 4 running Mantle on a Kaveri APU for the first time publicly. They mentioned that it ran 45% faster than Direct3D at the same quality levels on the same hardware. The display showed frame rates up in the low 50 fps area.
AMD is continuing to move forward on their low power offerings based on Beema and Mullins. Lisa claims that these parts are outperforming the Intel Bay Trail offerings in both CPU performance and graphics. Unfortunately, she mentioned nothing about the power consumption associated with these results. They showed off the Discovery tablet as well as a fully functional PC that was the size of a large cellphone.
They closed up the event by talking about the Surround House 2. This demo looks significantly better than the previous iteration we saw last year. This features something like a 34.2 speaker setup in a projected dome. It is much more complex than the House from last year, but the hardware running it all is rather common: a single high end FirePro card running on a single A10 7850K. The demo is also one of the first shows of a 360 degree gesture recognition setup.
AMD has come a long way since hitting rock bottom a few years back. They continue to claw their way back to relevance, and they hope that Kaveri will help them regain a foothold in the computing market. They are certainly doing well in the graphics market, but the introduction of Kaveri should help them gain more momentum in the CPU/APU market. We have yet to test Kaveri on our own, but initial results look promising. It is a better APU, but we just don’t know how much better so far.
Follow all of our coverage of the show at http://pcper.com/ces!
Subject: General Tech, Graphics Cards | December 19, 2013 - 07:23 PM | Scott Michaud
Tagged: amd, firepro, SPECviewperf
SPECviewperf 12 is a benchmark for workstation components that attempts to measure performance expected for professional applications. It is basically synthetic but is designed to quantify how your system can handle Maya, for instance. AMD provided us with a press deck of benchmarks they ran, which show many strong FirePro results in the entry-level to mid-range segments.
They did not include high-end results which they justify with the quote, "[The] Vast majority of CAD and CAE users purchase entry and mid-range Professional graphics boards". That slide, itself, was titled, "Focusing Where It Matters Most". I will accept that but I assume they did the benchmarks and wonder if it would have just been better to include them.
The cards AMD compared are:
- Quadro 410 ($105) vs FirePro V3900 ($105)
- Quadro K600 ($160) vs FirePro V4900 ($150)
- Quadro K2000 ($425) vs FirePro W5000 ($425)
- Quadro K4000 ($763) vs FirePro W7000 ($750)
In each of the pairings, about as equally-priced as possible, AMD held a decent lead throughout the eight tests included in SPECviewperf 12. You could see the performance gap leveling off as prices began to rise, however.
Obviously a single benchmark suite should be just one data-point when comparing two products. Still, these are pretty healthy performance numbers.
Subject: General Tech, Graphics Cards | October 23, 2013 - 07:30 PM | Scott Michaud
Tagged: amd, firepro
Currently AMD holds 18% market share with their FirePro line of professional GPUs. This compares to NVIDIA, which owns 81% with Quadro. I assume the "other" category is the sum of S3 and Matrox who, together, command 1% of the professional market (just the professional market).
According to Jon Peddie of JPR, as reported by X-Bit Labs, AMD intends to wrestle back revenue left unguarded for NVIDIA. "After years of neglect, AMD’s workstation group, under the tutorage of Matt Skyner, has the backing and commitment of top management and AMD intends to push into the market aggressively." They have already gained share this year.
During AMD's 3rd Quarter (2013) earnings call, CEO Rory Read outlined the importance of the professional graphics market.
"We also continue to make steady progress in another of growth businesses in the third quarter as we delivered our fifth consecutive quarter of revenue and share growth in the professional graphics area. We believe that we can continue to gain share in this lucrative part of the GPU market based on our product portfolio, design wins in flight, and enhanced channel programs."
On the same conference call (actually before and after the professional graphics sound bite), Rory noted their renewed push into the server and embedded SoC markets with 64-bit x86 and 64-bit ARM processors. They will be the only company manufacturing both x86 and ARM solutions which should be an interesting proposition for an enterprise in need of both. Why deal with two vendors?
Either way, AMD will probably be refocusing on the professional and enterprise markets for the near future. For the rest of us, this hopefully means that AMD has a stable (and confident) roadmap in the processor and gaming markets. If that is the case, a profitable Q3 is definitely a good start.
Subject: General Tech, Systems | September 9, 2013 - 09:00 AM | Tim Verry
Tagged: Xeon Phi, workstation, quadro, micron, LSI, k6000, Ivy Bridge-EP, firepro, dell
Along with the release of new mobile workstations, Dell announced three new desktop workstations. Specifically, Dell is launching the T3610, T5610, and T7610 PC workstations under its Precision series. The new systems reside in redesigned cases with improved cable management, removable power supplies (tool-less, removable by sliding out from rear panel), and in the case of the T7610 removable hard drives. All of the new Precision workstations have been outfitted with Intel's latest Ivy Bridge-EP based Xeon processors, ECC memory, workstation-class graphics cards from AMD and NVIDIA, Xeon Phi accelerator card options, LSI hardware RAID controllers, and updated software solutions from Intel and Dell.
The new Precision workstations side-by-side. From left to right: T3610, T5610, and T7610.
Dell's Precision T3610 is the mid-tower system of the group, powered by single socket Xeon E5-2600 v2 hardware that further supports up to 128GB DDR3 ECC memory, two graphics cards, three 3.5” hard drives, and four 2.5” SSDs.
The Precision T3610, a new single socket, mid-range workstation.
The Precision T5610 ups the ante to a dual socket IVB-EP processor system that can be configured with up to 128GB DDR3 ECC memory, two AMD FirePro or NVIDIA Quadro (e.g. Quadro K5000) graphics cards, a Tesla K20C accelerator card, three 3.5” hard drives, and four 2.5” solid state drives.
Finally, the T7610 workstation supports dual Intel Ivy Bridge-EP Xeon E5-2600 v2 series processors (up to 24 cores per system), up to 512GB DDR3 ECC memory, three graphics cards (including two NVIDIA Quadro K6000 cards), four 3.5” hard drives, and eight 2.5” SSDs.
Dell's Precision T5610 dual socket workstation.
The new Precision workstations can also be configured with an Intel Xeon Phi 3120A accelerator card in lieu of a Tesla card. The choice will mainly depend on the applications being used and the development resources and expertise available. Both options are designed to accelerate highly parallel workloads in applications that have been compiled to support them. Further, users can add an LSI hardware RAID card with 1GB of onboard memory to the systems. Dell further offers a Micron P320h PCI-E SSD that, while not bootable, offers up 350GB of high performance storage that excels at high sequential reads and writes.
On the software front, Dell is including the Dell Precision Performance Optimizer and the Intel Cache Acceleration Software. The former automatically configures and optimizes the workstation for specific applications based on profiles that are reportedly regularly updated. The latter works to optimize systems that use both hard drives and SSDs, with the SSDs acting as a cache for the mechanical storage. The Intel Cache Acceleration Software configures the caching algorithms to favor caching very large files on the solid state storage. It is a different approach from consumer caching strategies, but one that works well with businesses that use these workstations to process large data sets.
The Dell Precision T7610 workstation.
The Dell workstations are aimed at businesses doing scientific analysis, professional engineering, and complex 3D modeling. The T7610 in particular is aimed at the oil and gas industry for use in simulations and modeling as companies search for new oil deposits.
All three systems will be available for purchase worldwide beginning September 12th. Some of the options, such as 512GB of ECC memory and the NVIDIA Quadro K6000 on the T7610, will not be available until next month, however. The T3610 has a starting price of $1,099 while the T5610 and T7610 have starting prices of $2,729 and $3,059 respectively.
What are your thoughts on Dell's new mid-tower workstations?