Intel Sheds More Light On Benefits of Nervana Neural Network Processor

Subject: General Tech, Processors | December 12, 2017 - 04:52 PM |
Tagged: training, nnp, nervana, Intel, flexpoint, deep learning, asic, artificial intelligence

Intel recently provided a few insights into its upcoming Nervana Neural Network Processor (NNP) on its blog. Built in partnership with deep learning startup Nervana Systems which Intel acquired last year for over $400 million, the AI-focused chip previously codenamed Lake Crest is built on a new architecture designed from the ground up to accelerate neural network training and AI modeling.

new_nervana_chip-fb.jpg

The full details of the Intel NNP are still unknown, but it is a custom ASIC with a Tensor-based architecture placed on a multi-chip module (MCM) along with 32GB of HBM2 memory. The Nervana NNP supports optimized and power efficient Flexpoint math and interconnectivity is huge on this scalable platform. Each AI accelerator features 12 processing clusters (with an as-yet-unannounced number of "cores" or processing elements) paired with 12 proprietary inter-chip links that 20-times faster than PCI-E, four HBM2 memory controllers, a management-controller CPU, as well as standard SPI, I2C, GPIO, PCI-E x16, and DMA I/O. The processor is designed to be highly configurable and to meet both mode and data parallelism goals.

The processing elements are all software controlled and can communicate with each other using high speed bi-directional links at up to a terabit per second. Each processing element has more than 2MB of local memory and the Nervana NNP has 30MB in total of local memory. Memory accesses and data sharing is managed with QOS software which controls adjustable bandwidth over multiple virtual channels with multiple priorities per channel. Processing elements can talk to and send/receive data between each other and the HBM2 stacks locally as well as off die to processing elements and HBM2 on other NNP chips. The idea is to allow as much internal sharing as possible and to keep as much data stored and transformed in local data as possible in order to save precious HBM2 bandwidth (1TB/s) for pre-fetching upcoming tensors, reduce the number of hops and resulting latency by not having to go out to the HBM2 memory and back to transfer data between cores and/or processors, and to save power. This setup also helps Intel achieve an extremely parallel and scalable platform where multiple Nervana NNP Xeon co-processors on the same and remote boards effectively act as a massive singular compute unit!

Intel Lake Crest Block Diagram.jpg
 

Intel's Flexpoint is also at the heart of the Nervana NNP and allegedly allows Intel to achieve similar results to FP32 with twice the memory bandwidth while being more power efficient than FP16. Flexpoint is used for the scalar math required for deep learning and uses fixed point 16-bit multiply and addition operations with a shared 5-bit exponent. Unlike FP16, Flexpoint uses all 16-bits of address space for the mantissa and passes the exponent in the instruction. The NNP architecture also features zero cycle transpose operations and optimizations for matrix multiplication and convolutions to optimize silicon usage.

Software control allows users to dial in the performance for their specific workloads, and since many of the math operations and data movement are known or expected in advance, users can keep data as close to the compute units working on that data as possible while minimizing HBM2 memory accesses and data movements across the die to prevent congestion and optimize power usage.

Intel is currently working with Facebook and hopes to have its deep learning products out early next year. The company may have axed Knights Hill, but it is far from giving up on this extremely lucrative market as it continues to push towards exascale computing and AI. Intel is pushing for a 100x increase in neural network performance by 2020 which is a tall order but Intel throwing its weight around in this ring is something that should give GPU makers pause as such an achievement could cut heavily into their GPGPU-powered entries into this market that is only just starting to heat up.

You won't be running Crysis or even Minecraft on this thing, but you might be using software on your phone for augmented reality or in your autonomous car that is running inference routines on a neural network that was trained on one of these chips soon enough! It's specialized and niche, but still very interesting.

Also read:

Source: Intel
Author:
Manufacturer: PC Perspective

Why?

Astute readers of the site might remember the original story we did on Bitcoin mining in 2011, the good ole' days where the concept of the blockchain was new and exciting and mining Bitcoin on a GPU was still plenty viable.

gpu-bitcoin.jpg

However, that didn't last long, as the race for cash lead people to developing Application Specific Integrated Circuits (ASICs) dedicated solely to Bitcoin mining quickly while sipping power. Use of the expensive ASICs drove the difficulty of mining Bitcoin to the roof and killed any sort of chance of profitability from mere mortals mining cryptocurrency.

Cryptomining saw a resurgence in late 2013 with the popular adoption of alternate cryptocurrencies, specifically Litecoin which was based on the Scrypt algorithm instead of AES-256 like Bitcoin. This meant that the ASIC developed for mining Bitcoin were useless. This is also the period of time that many of you may remember as the "Dogecoin" era, my personal favorite cryptocurrency of all time. 

dogecoin-300.png

Defenders of these new "altcoins" claimed that Scrypt was different enough that ASICs would never be developed for it, and GPU mining would remain viable for a larger portion of users. As it turns out, the promise of money always wins out, and we soon saw Scrypt ASICs. Once again, the market for GPU mining crashed.

That brings us to today, and what I am calling "Third-wave Cryptomining." 

While the mass populous stopped caring about cryptocurrency as a whole, the dedicated group that was left continued to develop altcoins. These different currencies are based on various algorithms and other proofs of works (see technologies like Storj, which use the blockchain for a decentralized Dropbox-like service!).

As you may have predicted, for various reasons that might be difficult to historically quantify, there is another very popular cryptocurrency from this wave of development, Ethereum.

ETHEREUM-LOGO_LANDSCAPE_Black.png

Ethereum is based on the Dagger-Hashimoto algorithm and has a whole host of different quirks that makes it different from other cryptocurrencies. We aren't here to get deep in the woods on the methods behind different blockchain implementations, but if you have some time check out the Ethereum White Paper. It's all very fascinating.

Continue reading our look at this third wave of cryptocurrency!

Seasonic PSUs Will Power HashFast Bitcoin Miners

Subject: General Tech, Cases and Cooling, Systems | October 22, 2013 - 07:10 PM |
Tagged: seasonic, Power Supplies, mining, bitcoin, asic

Seasonic (Sea Sonic Electronics) has announced a design win that will see its power supplies used in HashFast’s bitcoin mining rigs. The upcoming HashFast mining rigs feature the company’s “Golden Nonce” ASIC(s) and all-in-one water coolers. HashFast has a single ASIC Baby Jet and multi-ASIC Sierra rig. Both units will be available December 15 starting at $2,250 and $6,300 respectively.

The Seasonic power supplies are high efficiency models with Japanese capacitors and at least 80 PLUS Bronze. On the high end, Seasonic has PSUs that are up to 93% efficient. HashFast stated that it chose Seasonic for its mining rigs because of the build quality and efficiency. The Baby Jet and Sierra mining rigs allow users to overclock the ASICs, and the systems can be rather demanding on PSUs.

HashFast Baby Jet BTC Miner.jpg

The Golden Nonce ASIC is a 28nm chip that is rated at 400 GHash/s and 0.65 Watts per Gigahash.

Beyond that, the companies have not gone into specifics. It is good news for Seasonic, and should mean a stable system for bitcoin miners (the 93% efficiency rating is nice as well, as it means less wasted electricity and slightly more bitcoin mining profit).

The full press blast is below for reference.

Read more about Bitcoin @ PC Perspective!

Source: Seasonic