Subject: Systems | March 9, 2017 - 07:01 AM | Scott Michaud
Tagged: nvidia, microsoft, hgx-1, GP100, dgx-1
When NVIDIA announced the Pascal architecture at last year’s GTC, they started with the GP100 chip, which was to be initially available in their $129,000 DGX-1 PC. In fact, this device contained eight of those “Big Pascal” GPUs, connected together by the NVLink interconnect.
Now, almost a full year later, Microsoft, NVIDIA, and Ingrasys have announced the HGX-1 system. It, too, will contain eight GP100 GPUs, in the form of eight Tesla P100 accelerators. On the CPU side of things, Microsoft plans to use the next generation of x86 processors, Intel Skylake (which we assume means Skylake-X) and AMD Naples, in these "Project Olympus" servers. Future versions could also integrate Intel FPGAs for an extra level of acceleration. ARM64 is another goal of theirs, but in the more distant future.
At the same time, NVIDIA has also announced, through a single-paragraph statement, that they are joining the Open Compute Project. This organization includes several massive players in the data center market, ranging from Facebook to Rackspace to Bank of America.
Whenever it arrives, the HGX-1 will be intended for cloud-based AI computations. Four of these machines are designed to be clustered together at high bandwidth. I estimate that such a cluster would have north of 160 TeraFLOPs of double-precision (FP64) or 670 TeraFLOPs of half-precision (FP16) performance in the GPUs alone, depending on final clocks.
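For anyone who wants to check that estimate, here is a rough sketch of the arithmetic in Python. The GPU and chassis counts come from the announcement, but the per-GPU peak figures (5.3 TeraFLOPs FP64 and 21.2 TeraFLOPs FP16) are my assumption, based on NVIDIA's published boost-clock numbers for the NVLink version of the Tesla P100. The HGX-1's final clocks could differ, and the function and variable names are purely illustrative.

```python
# Back-of-the-envelope peak throughput for a cluster of HGX-1 chassis.
# Counts are from the announcement; per-GPU figures are ASSUMED
# (NVLink Tesla P100 peak at boost clock), not confirmed for the HGX-1.

GPUS_PER_CHASSIS = 8      # eight Tesla P100 accelerators per HGX-1
CHASSIS_PER_CLUSTER = 4   # four machines clustered together

# Assumed per-GPU peak throughput, in TeraFLOPs.
P100_PEAK_TFLOPS = {"fp64": 5.3, "fp16": 21.2}


def cluster_peak_tflops(per_gpu_tflops,
                        gpus_per_chassis=GPUS_PER_CHASSIS,
                        chassis=CHASSIS_PER_CLUSTER):
    """Return aggregate GPU-only peak TeraFLOPs for the whole cluster."""
    return per_gpu_tflops * gpus_per_chassis * chassis


estimates = {precision: cluster_peak_tflops(tflops)
             for precision, tflops in P100_PEAK_TFLOPS.items()}

for precision, total in estimates.items():
    print(f"{precision}: {total:.1f} TeraFLOPs across "
          f"{GPUS_PER_CHASSIS * CHASSIS_PER_CLUSTER} GPUs")
```

With those assumed inputs, the 32 GPUs work out to roughly 170 TeraFLOPs of FP64 and 678 TeraFLOPs of FP16. These are theoretical peaks for the GPUs alone, so they say nothing about sustained performance or scaling across the cluster.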