Subject: General Tech, Processors | December 12, 2017 - 04:52 PM | Tim Verry
Tagged: training, nnp, nervana, Intel, flexpoint, deep learning, asic, artificial intelligence
Intel recently provided a few insights into its upcoming Nervana Neural Network Processor (NNP) on its blog. Built in partnership with deep learning startup Nervana Systems, which Intel acquired last year for over $400 million, the AI-focused chip previously codenamed Lake Crest is built on a new architecture designed from the ground up to accelerate neural network training and AI modeling.
The full details of the Intel NNP are still unknown, but it is a custom ASIC with a tensor-based architecture placed on a multi-chip module (MCM) along with 32GB of HBM2 memory. The Nervana NNP supports optimized, power-efficient Flexpoint math, and interconnectivity is a major focus of this scalable platform. Each AI accelerator features 12 processing clusters (with an as-yet-unannounced number of "cores" or processing elements) paired with 12 proprietary inter-chip links that are 20 times faster than PCI-E, four HBM2 memory controllers, a management-controller CPU, as well as standard SPI, I2C, GPIO, PCI-E x16, and DMA I/O. The processor is designed to be highly configurable and to meet both model and data parallelism goals.
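For readers unfamiliar with the two terms, here is a minimal sketch of the difference between data and model parallelism for a single matrix multiply. It is plain Python and not Intel's software stack: the function names and the "devices" (simulated here by a loop) are hypothetical, and only the forward pass is shown. A real training setup would also need to exchange gradients between chips.

```python
def matmul(x, w):
    """Plain matrix multiply: x is (batch x n_in), w is (n_in x n_out)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)]
            for row in x]


def data_parallel(x, w, n_devices):
    """Data parallelism: each simulated device gets a slice of the batch
    and a full copy of the weights; results are stacked row-wise."""
    chunk = -(-len(x) // n_devices)  # ceiling division
    out = []
    for d in range(n_devices):
        out.extend(matmul(x[d * chunk:(d + 1) * chunk], w))
    return out


def model_parallel(x, w, n_devices):
    """Model parallelism: each simulated device gets the full batch and a
    slice of the weight columns; results are joined column-wise."""
    chunk = -(-len(w[0]) // n_devices)  # ceiling division
    parts = []
    for d in range(n_devices):
        w_slice = [row[d * chunk:(d + 1) * chunk] for row in w]
        parts.append(matmul(x, w_slice))
    return [sum((p[i] for p in parts), []) for i in range(len(x))]


x = [[1, 2], [3, 4], [5, 6]]   # batch of 3 inputs
w = [[1, 0, 2], [0, 1, 3]]     # 2 inputs -> 3 outputs
reference = matmul(x, w)
print(data_parallel(x, w, 2) == reference)
print(model_parallel(x, w, 2) == reference)
```

Both splits reproduce the single-device result. The trade-off is in what has to move between chips: data parallelism replicates the weights, while model parallelism replicates the inputs, which is where fast inter-chip links come into play.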
The processing elements are all software controlled and can communicate with each other using high speed bi-directional links at up to a terabit per second. Each processing element has more than 2MB of local memory and the Nervana NNP has 30MB in total of local memory. Memory accesses and data sharing are managed with QoS software which controls adjustable bandwidth over multiple virtual channels with multiple priorities per channel. Processing elements can exchange data with each other and with the HBM2 stacks locally, as well as off die with processing elements and HBM2 on other NNP chips. The idea is to allow as much internal sharing as possible and to keep as much data stored and transformed in local memory as possible. Doing so saves precious HBM2 bandwidth (1TB/s) for pre-fetching upcoming tensors, reduces the number of hops and resulting latency by not having to go out to the HBM2 memory and back to transfer data between cores and/or processors, and saves power. This setup also helps Intel achieve an extremely parallel and scalable platform where multiple Nervana NNP Xeon co-processors on the same and remote boards effectively act as a massive singular compute unit!
Intel's Flexpoint is also at the heart of the Nervana NNP and allegedly allows Intel to achieve similar results to FP32 with twice the memory bandwidth while being more power efficient than FP16. Flexpoint is used for the scalar math required for deep learning and uses fixed point 16-bit multiply and addition operations with a shared 5-bit exponent. Unlike FP16, Flexpoint uses all 16 bits for the mantissa and passes the exponent in the instruction. The NNP architecture also features zero cycle transpose operations and optimizations for matrix multiplication and convolutions to optimize silicon usage.
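To make the shared-exponent idea concrete, here is a rough sketch in plain Python of a Flexpoint-like format: every value in a tensor is stored as a signed 16-bit integer mantissa, and one exponent is shared by the whole tensor. This is an illustration under stated assumptions, not Intel's implementation. The function names are hypothetical, the 5-bit exponent is assumed to be a signed range of -16 to 15, and the exponent is picked from the largest value in the tensor, which may differ from how the hardware manages it.

```python
import math

MANTISSA_MAX = 2 ** 15 - 1   # largest signed 16-bit mantissa (32767)
EXP_MIN, EXP_MAX = -16, 15   # assumed signed 5-bit exponent range


def flex_encode(values):
    """Quantize floats to (integer mantissas, one shared exponent)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0] * len(values), 0
    # Smallest exponent whose scale lets the largest value fit in 16 bits.
    exp = math.ceil(math.log2(max_abs / MANTISSA_MAX))
    exp = max(EXP_MIN, min(EXP_MAX, exp))
    scale = 2.0 ** exp
    mantissas = [max(-MANTISSA_MAX, min(MANTISSA_MAX, round(v / scale)))
                 for v in values]
    return mantissas, exp


def flex_decode(mantissas, exp):
    """Recover approximate floats from mantissas and the shared exponent."""
    return [m * 2.0 ** exp for m in mantissas]


def flex_dot(a, b):
    """Dot product using only integer multiplies and adds on the mantissas;
    the two shared exponents are simply added once at the end."""
    (ma, ea), (mb, eb) = a, b
    acc = sum(x * y for x, y in zip(ma, mb))
    return acc * 2.0 ** (ea + eb)


t = flex_encode([0.5, -0.25, 1.0])
print(t)               # mantissas plus the single shared exponent
print(flex_decode(*t))
print(flex_dot(t, t))
```

The inner loop of `flex_dot` never touches an exponent, which is the point: the per-element work is fixed point, and the dynamic range is handled once per tensor rather than once per value as in FP16.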
Software control allows users to dial in the performance for their specific workloads, and since many of the math operations and data movement are known or expected in advance, users can keep data as close to the compute units working on that data as possible while minimizing HBM2 memory accesses and data movements across the die to prevent congestion and optimize power usage.
Intel is currently working with Facebook and hopes to have its deep learning products out early next year. The company may have axed Knights Hill, but it is far from giving up on this extremely lucrative market as it continues to push towards exascale computing and AI. Intel is pushing for a 100x increase in neural network performance by 2020, which is a tall order. Still, Intel throwing its weight around in this ring should give GPU makers pause, as such an achievement could cut heavily into their GPGPU-powered entries in a market that is only just starting to heat up.
You won't be running Crysis or even Minecraft on this thing, but you might be using software on your phone for augmented reality or in your autonomous car that is running inference routines on a neural network that was trained on one of these chips soon enough! It's specialized and niche, but still very interesting.
- Intel Launches Stratix 10 FPGA With ARM CPU and HBM2
- Intel's Nervana chip targets Nvidia on artificial intelligence
- New AI products will Crest Computex
- Intel to Ship FPGA-Accelerated Xeons in Early 2016
- Intel Kills Knights Hill, Will Launch Xeon Phi Architecture for Exascale Computing @ ExtremeTech
- NVIDIA Discusses Multi-Die GPUs
Subject: General Tech | October 6, 2016 - 11:37 PM | Tim Verry
Tagged: supercomputer, microsoft, deep neural network, azure, artificial intelligence, ai
Microsoft recently announced it would be restructuring 5,000 employees as it focuses its efforts on artificial intelligence with a new AI and Research Group. The Redmond giant is pulling computer scientists and engineers from Microsoft Research, the Information Platform, Bing, and Cortana groups, and the Ambient Computing and Robotics teams. Led by 20-year Microsoft veteran Harry Shum (who has worked in both research and engineering roles at Microsoft), the new AI team promises to "democratize AI" and be a leader in the field with intelligent products and services.
It seems that "democratizing AI" is less about free artificial intelligence and more about making the technology accessible to everyone. The AI and Research Group plans to develop artificial intelligence to the point where it will change how humans interact with their computers (read: Cortana 2.0) with services and commands being conversational rather than strict commands, new applications baked with AI such as office and photo editors that are able to proofread and suggest optimal edits respectively, and new vision, speech, and machine analytics APIs that other developers will be able to harness for their own applications. (Wow that's quite the long sentence - sorry!)
Further, Microsoft wants to build the world's fastest AI supercomputer using its Azure cloud computing service. The Azure-powered AI will be available to everyone for their applications and research needs (for a price, of course!). Microsoft certainly has the money, brain power, and computing power to throw at the problem, and this may be one of the major areas where looking to "the cloud" for a company's computing needs is a smart move, as the upfront capital needed for hardware, engineers, and support staff to do something like this in-house would be prohibitive. It remains to be seen whether Microsoft will beat its competitors to being first, but it is certainly staking its claim and does not want to be left out completely.
“Microsoft has been working in artificial intelligence since the beginning of Microsoft Research, and yet we’ve only begun to scratch the surface of what’s possible,” said Shum, executive vice president of the Microsoft AI and Research Group. “Today’s move signifies Microsoft’s commitment to deploying intelligent technology and democratizing AI in a way that changes our lives and the world around us for the better. We will significantly expand our efforts to empower people and organizations to achieve more with our tools, our software and services, and our powerful, global-scale cloud computing capabilities.”
Interestingly, this announcement comes shortly after a previous announcement that industry giants Amazon, Facebook, Google-backed DeepMind, IBM, and Microsoft founded the not-for-profit Partnership On AI organization that will collaborate and research best practices on AI development and exploitation (and hopefully how to teach them not to turn on us heh).
I am looking forward to the future of AI and the technologies it will enable!