NVIDIA in the news

Subject: General Tech | May 31, 2018 - 01:41 PM |
Tagged: jen-hsun huang, GTC, HPC, nvswitch, tesla v100

Jen-Hsun Huang has a busy dance card right now, with several interesting tidbits hitting the news recently, including his statement in this DigiTimes post that GPU development is outstripping Moore's law. The GPU Technology Conference Taiwan 2018 kicked off yesterday, with NVIDIA showing off their brand new HGX-2 platform, which contains both AIs and HPCs, with Deep Learnings a sure bet as well. Buzzwords aside, the new accelerator is made up of 16 Tesla V100 GPUs, a mere half terabyte of memory and NVIDIA's NVSwitch. Specialized products from Lenovo and Supermicro, to name a few, as well as cloud providers will also be picking up this newest piece of kit from NVIDIA.

For those less interested in HPC, there is an interesting tidbit about an event at Hot Chips: on August 20th, Stuart Oberman will be talking about NVIDIA’s Next Generation Mainstream GPU, with other sessions dealing with their IoT and fabric connections.

asdasd.PNG

"But demand for that power is "growing, not slowing," thanks to AI, Huang said. "Before this time, software was written by humans and software engineers can only write so much software, but machines don't get tired," he said, adding that every single company in the world that develops software will need an AI supercomputer."

Source: NVIDIA

Eight-GPU SLI in Unreal Engine 4 (Yes There Is a Catch)

Subject: Graphics Cards | March 29, 2018 - 09:52 PM |
Tagged: nvidia, GTC, gp102, quadro p6000

At GTC 2018, Walt Disney Imagineering unveiled a work-in-progress clip of their upcoming Star Wars: Galaxy’s Edge attraction, which is expected to launch next year at Disneyland and Walt Disney World Resort. The cool part about this ride is that it will be using Unreal Engine 4 with eight GP102-based Quadro P6000 graphics cards. NVIDIA also reports that Disney has donated the code back to Epic Games to help them with their multi-GPU scaling in general – a win for us consumers… in a more limited fashion.

nvidia-2018-GTC-starwars-8-way-sli.jpg

See? SLI doesn’t need to be limited to two cards if you have a market cap of $100 billion USD.

Another interesting angle to this story is how typical PC components are contributing to these large experiences. Sure, Quadro hardware isn’t exactly cheap, but it can be purchased through typical retail channels and it allows the company to focus their engineering time elsewhere.

Ironically, this also comes about two decades after location-based entertainment started to decline… but, you know, it’s Disneyland and Disney World. They’re fine.

Source: NVIDIA

GTC 2018: Nvidia and ARM Integrating NVDLA Into Project Trillium For Inferencing at the Edge

Subject: General Tech | March 29, 2018 - 03:10 PM |
Tagged: project trillium, nvidia, machine learning, iot, GTC 2018, GTC, deep learning, arm, ai

During GTC 2018, NVIDIA and ARM announced a partnership that will see ARM integrate NVIDIA's NVDLA deep learning inferencing accelerator into the company's Project Trillium machine learning processors. The NVIDIA Deep Learning Accelerator (NVDLA) is an open source modular architecture that is specifically optimized for inferencing operations such as object and voice recognition. Bringing that acceleration to the wider ARM ecosystem through Project Trillium will enable a massive number of smarter phones, tablets, Internet-of-Things, and embedded devices that can do inferencing at the edge, which is to say without the complexity and latency of having to rely on cloud processing. This means potentially smarter voice assistants (e.g. Alexa, Google), doorbell cameras, lighting, and security around the home, as well as better AR, natural translation, and assistive technologies on your phone when you are out and about.

NVIDIAandARM_NVDLA.jpg

Karl Freund, lead analyst for deep learning at Moor Insights & Strategy, was quoted in the press release as stating:

“This is a win/win for IoT, mobile and embedded chip companies looking to design accelerated AI inferencing solutions. NVIDIA is the clear leader in ML training and Arm is the leader in IoT end points, so it makes a lot of sense for them to partner on IP.”

ARM's Project Trillium was announced back in February and is a suite of IP for processors optimized for parallel, low latency workloads. It includes a Machine Learning processor, an Object Detection processor, and neural network software libraries. NVDLA is a hardware and software platform based upon the Xavier SoC. The hardware is highly modular and configurable, and can feature a convolution core, single data processor, planar data processor, channel data processor, and data reshape engines. The NVDLA can be configured with all or only some of those elements, and chip designers can independently scale them up or down depending on what processing acceleration they need for their devices. NVDLA connects to the main system processor over a control interface and through two AXI memory interfaces (one optional) that connect to system memory and (optionally) dedicated high bandwidth memory (not necessarily HBM, but just its own SRAM, for example).

arm project trillium integrates NVDLA.jpg

NVDLA is presented as a free and open source architecture that promotes a standard way to design deep learning inferencing hardware, accelerating the operations that infer results from trained neural networks (with the training being done on other devices, perhaps by the DGX-2). The project, which hosts the code on GitHub and encourages community contributions, goes beyond the Xavier-based hardware and includes things like drivers, libraries, TensorRT support (upcoming) for Google's TensorFlow acceleration, testing suites and SDKs, as well as a deep learning training infrastructure (for the training side of things) that is compatible with the NVDLA software and hardware, and system integration support.

Bringing the "smarts" of smart devices to the local hardware and closer to the users should mean much better performance, and using specialized accelerators will reportedly offer the performance levels needed without blowing away low power budgets. Internet-of-Things (IoT) and mobile devices are not going away any time soon, and the partnership between NVIDIA and ARM should make it easier for developers and chip companies to offer smarter (and please tell me more secure!) smart devices.

Source: NVIDIA

GTC 2018: NVIDIA Announces Volta-Powered Quadro GV100

Subject: General Tech | March 27, 2018 - 03:30 PM |
Tagged: nvidia, GTC, quadro, gv100, GP100, tesla, titan v, v100, votla

One of the big missing markets for NVIDIA with their slow rollout of the Volta architecture was professional workstations. Today, NVIDIA announced they are bringing Volta to the Quadro family with the Quadro GV100 card.

27-gv100-gpu.jpg

Powered by the same GV100 GPU that was announced at last year's GTC in the Tesla V100, and late last year in the Titan V, the Quadro GV100 represents a leap forward in computing power for workstation-level applications. While these users could currently be using the TITAN V for similar workloads, as we've seen in the past, Quadro drivers generally provide big performance advantages in these sorts of applications. That said, we'd love to see NVIDIA repeat their move of bringing these optimizations to the TITAN lineup as they did with the TITAN Xp.

As it is a Quadro, we would expect this to be NVIDIA's first Volta-powered product that provides certified, professional driver code paths for applications such as CATIA, Solid Edge, and more.

quadro-gv100.png

NVIDIA also heavily promoted the idea of using two of these GV100 cards in one system, utilizing NVLink. Considering the lack of NVLink support for the TITAN V, this is also the first time we've seen a Volta card with display outputs supporting NVLink in more standard workstations.

More importantly, this announcement brings NVIDIA's RTX technology to the professional graphics market. 

With popular rendering applications like V-Ray already announcing and integrating support for NVIDIA's OptiX raytracing denoiser in their beta branch, it seems only a matter of time before we'll see a broad suite of professional applications supporting RTX technology for real-time work, for example raytraced renders of items being designed in CAD and modeling applications.

This sort of speed represents a potentially massive win for professional users, who won't have to waste time waiting for preview renderings to complete before they continue iterating on their projects.

The NVIDIA Quadro GV100 is available now directly from NVIDIA for a price of $8,999, which puts it squarely in the same price range as the previous highest-end Quadro GP100.

Source: NVIDIA
Manufacturer: NVIDIA

93% of a GP100 at least...

NVIDIA has announced the Tesla P100, the company's newest (and most powerful) accelerator for HPC. Based on the Pascal GP100 GPU, the Tesla P100 is built on 16nm FinFET and uses HBM2.

nvidia-2016-gtc-pascal-banner.png

NVIDIA provided a comparison table, to which we added what we know about a full GP100:

Specification             Tesla K40       Tesla M40        Tesla P100      Full GP100
GPU                       GK110 (Kepler)  GM200 (Maxwell)  GP100 (Pascal)  GP100 (Pascal)
SMs                       15              24               56              60
TPCs                      15              24               28              (30?)
FP32 CUDA Cores / SM      192             128              64              64
FP32 CUDA Cores / GPU     2880            3072             3584            3840
FP64 CUDA Cores / SM      64              4                32              32
FP64 CUDA Cores / GPU     960             96               1792            1920
Base Clock                745 MHz         948 MHz          1328 MHz        TBD
GPU Boost Clock           810/875 MHz     1114 MHz         1480 MHz        TBD
FP64 GFLOPS               1680            213              5304            TBD
Texture Units             240             192              224             240
Memory Interface          384-bit GDDR5   384-bit GDDR5    4096-bit HBM2   4096-bit HBM2
Memory Size               Up to 12 GB     Up to 24 GB      16 GB           TBD
L2 Cache Size             1536 KB         3072 KB          4096 KB         TBD
Register File Size / SM   256 KB          256 KB           256 KB          256 KB
Register File Size / GPU  3840 KB         6144 KB          14336 KB        15360 KB
TDP                       235 W           250 W            300 W           TBD
Transistors               7.1 billion     8 billion        15.3 billion    15.3 billion
GPU Die Size              551 mm²         601 mm²          610 mm²         610 mm²
Manufacturing Process     28 nm           28 nm            16 nm           16 nm
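
As a quick sanity check on the table, the per-GPU rows can be reproduced from the per-SM rows and the SM counts. The sketch below is our own illustration in Python, not something NVIDIA provided. The dictionary keys and the helper name `per_gpu_totals` are made up for this example, while the numbers are copied from the table.

```python
# Per-SM figures and SM counts copied from the comparison table.
# The key names are hypothetical labels chosen for this sketch.
SPECS = {
    "Tesla K40":  {"sms": 15, "fp32_per_sm": 192, "fp64_per_sm": 64, "regfile_kb_per_sm": 256},
    "Tesla M40":  {"sms": 24, "fp32_per_sm": 128, "fp64_per_sm": 4,  "regfile_kb_per_sm": 256},
    "Tesla P100": {"sms": 56, "fp32_per_sm": 64,  "fp64_per_sm": 32, "regfile_kb_per_sm": 256},
    "Full GP100": {"sms": 60, "fp32_per_sm": 64,  "fp64_per_sm": 32, "regfile_kb_per_sm": 256},
}


def per_gpu_totals(spec):
    """Multiply each per-SM quantity by the number of SMs on the part."""
    sms = spec["sms"]
    return {
        "fp32_cores": sms * spec["fp32_per_sm"],
        "fp64_cores": sms * spec["fp64_per_sm"],
        "regfile_kb": sms * spec["regfile_kb_per_sm"],
    }


if __name__ == "__main__":
    for name, spec in SPECS.items():
        print(name, per_gpu_totals(spec))
```

For the Tesla P100 this gives 3584 FP32 cores, 1792 FP64 cores, and a 14336 KB register file, which matches the corresponding rows.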

This table is designed for developers who are interested in GPU compute, so a few variables (like ROPs) are still unknown, but it still gives us a huge insight into the “big Pascal” architecture. The jump to 16 nm allows for about twice the number of transistors, 15.3 billion, up from 8 billion with GM200, with roughly the same die area, 610 mm², up from 601 mm².

nvidia-2016-gp100_block_diagram-1-624x368.png

A full GP100 processor will have 60 streaming multiprocessors (SMs), compared to GM200's 24, although each Pascal SM holds half as many FP32 shaders (64 versus 128). The GP100 part that is listed in the table above is actually partially disabled, with four of the sixty SMs cut off. This leads to 3584 single-precision (32-bit) CUDA cores, which is up from 3072 in GM200. (The full GP100 architecture will have 3840 of these FP32 CUDA cores, but we don't know when or where we'll see that.) The base clock is also significantly higher than Maxwell, 1328 MHz versus ~1000 MHz for the Titan X and 980 Ti, although Ryan has overclocked those GPUs to ~1390 MHz with relative ease. This is interesting, because even though 10.6 TeraFLOPs of single-precision compute is amazing, it's only about 20 to 25% more than what GM200 could pull off with an overclock.
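
The throughput figures here can be reproduced with back-of-the-envelope math. The sketch below assumes the usual convention that each CUDA core retires one fused multiply-add (counted as 2 FLOPs) per clock, and that the quoted numbers refer to the boost clock. The function name `peak_gflops` is ours, and real-world throughput will be lower than these theoretical peaks.

```python
def peak_gflops(cores, clock_mhz, flops_per_core_per_clock=2):
    """Theoretical peak in GFLOPS: cores x clock x FLOPs per core per clock.

    The default of 2 assumes one fused multiply-add per core per clock.
    """
    return cores * clock_mhz * flops_per_core_per_clock / 1000.0


# Tesla P100 at its 1480 MHz boost clock (core counts from the table).
p100_fp32 = peak_gflops(3584, 1480)
p100_fp64 = peak_gflops(1792, 1480)

# GM200 at the ~1390 MHz overclock mentioned in the text.
gm200_oc_fp32 = peak_gflops(3072, 1390)

if __name__ == "__main__":
    print(f"P100 FP32:      {p100_fp32 / 1000:.1f} TFLOPS")
    print(f"P100 FP64:      {p100_fp64:.0f} GFLOPS")
    print(f"GM200 OC FP32:  {gm200_oc_fp32 / 1000:.1f} TFLOPS")
    print(f"P100 advantage: {100 * (p100_fp32 / gm200_oc_fp32 - 1):.0f}%")
```

Under those assumptions the P100 lands at about 10.6 TFLOPS FP32 and 5304 GFLOPS FP64 (the table's figure), versus roughly 8.5 TFLOPS for an overclocked GM200, a gap of about 24%.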

Continue reading our preview of the NVIDIA Pascal architecture!!

GTC 2015: NVIDIA Roadmap Shows Pascal with 3D Memory, NVLink and Mixed Precision Compute

Subject: Graphics Cards | March 17, 2015 - 01:47 PM |
Tagged: pascal, nvidia, gtc 2015, GTC, geforce

At the keynote of the GPU Technology Conference (GTC) today, NVIDIA CEO Jen-Hsun Huang disclosed some more updates on the roadmap for future GPU technologies.

GTC-36.jpg

Most of the detail was around Pascal, due in 2016, which will introduce three new features: mixed precision compute, 3D (stacked) memory, and NVLink. Mixed precision is a method of computing in FP16, allowing calculations to run much faster, at lower accuracy, when full single or double precision is not necessary. Keeping in mind that Maxwell doesn't have an implementation with full speed DP compute (today), it would seem that NVIDIA is targeting different compute tasks moving forward. Though details are short, mixed precision would likely indicate processing cores that can handle both data types.
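
To give a feel for the accuracy side of that trade-off, here is a small Python sketch that rounds values to IEEE 754 half precision (FP16) and single precision (FP32) using the standard library's `struct` module. It only illustrates storage size and rounding error; it says nothing about how Pascal implements FP16 or how much faster it runs. The helper names `to_fp16` and `to_fp32` are ours.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


if __name__ == "__main__":
    value = 0.1
    print("bytes per value: fp16 =", struct.calcsize("<e"),
          "fp32 =", struct.calcsize("<f"))
    print("0.1 as fp16:", to_fp16(value), "error:", abs(to_fp16(value) - value))
    print("0.1 as fp32:", to_fp32(value), "error:", abs(to_fp32(value) - value))
    # FP16 carries an 11-bit significand, so not every integer above 2048
    # is representable and small increments start to get lost.
    print("2049 as fp16:", to_fp16(2049.0))
```

Half the storage per value means twice as many values per byte of bandwidth, which is part of the appeal for workloads that can tolerate the larger rounding error.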

3D memory is the ability to put stacked memory on the same package as the GPU to improve overall memory bandwidth. The visual diagram that NVIDIA showed on stage indicated that Pascal would have 750 GB/s of bandwidth, compared to 300-350 GB/s on Maxwell today.

NVLink is a new way of connecting GPUs, improving on bandwidth by more than 5x over current implementations of PCI Express. They claim this will allow for connecting as many as 8 GPUs for deep learning performance improvements (up to 10x). What that means for gaming has yet to be discussed.
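
For a rough sense of what a 5x link means in practice, the sketch below estimates copy times. It assumes that "current implementations of PCI Express" means a PCIe 3.0 x16 slot (8 GT/s per lane with 128b/130b encoding, per direction), takes the 5x claim as a lower bound, and uses a hypothetical 12 GB payload. NVIDIA did not give absolute NVLink numbers in this part of the keynote, so treat the output as illustrative only.

```python
def pcie3_bandwidth_gbs(lanes=16):
    """Per-direction PCIe 3.0 bandwidth in GB/s.

    8 GT/s per lane, 128b/130b line encoding, 8 bits per byte.
    """
    return 8.0 * lanes * (128 / 130) / 8


def transfer_seconds(size_gb, bandwidth_gbs):
    """Idealized transfer time, ignoring latency and protocol overhead."""
    return size_gb / bandwidth_gbs


if __name__ == "__main__":
    pcie = pcie3_bandwidth_gbs()   # assumed baseline: PCIe 3.0 x16
    nvlink = 5 * pcie              # the keynote's "more than 5x", as a floor
    payload_gb = 12                # hypothetical payload size
    print(f"PCIe 3.0 x16: {pcie:.2f} GB/s, "
          f"{transfer_seconds(payload_gb, pcie):.2f} s for {payload_gb} GB")
    print(f"5x link:      {nvlink:.2f} GB/s, "
          f"{transfer_seconds(payload_gb, nvlink):.2f} s for {payload_gb} GB")
```

Under these assumptions a 12 GB copy drops from roughly 0.76 seconds to about 0.15 seconds, which matters most when several GPUs are exchanging data constantly, as in deep learning training.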

GTC-38.jpg

NVIDIA made some other interesting claims as well. Pascal will offer more than twice the performance per watt of Maxwell, even without the three new features listed above. It will also ship (in a compute targeted product) with a 32GB memory system compared to the 12GB of memory announced on the Titan X today. Pascal will also have 4x the performance in mixed precision compute.

Watch NVIDIA Reveal the GTX TITAN X at GTC 2015

Subject: Graphics Cards, Shows and Expos | March 17, 2015 - 10:31 AM |
Tagged: nvidia, video, GTC, gtc 2015

NVIDIA is streaming today's keynote from the GPU Technology Conference (GTC) on Ustream, and we have the embed below for you to take part. NVIDIA CEO Jen-Hsun Huang will reveal the details about the new GeForce GTX TITAN X but there are going to be other announcements as well, including one featuring Tesla CEO Elon Musk.

Should be interesting!

Source: NVIDIA

NVIDIA Proves the Cake is Not a Lie with Portal on SHIELD

Subject: General Tech | May 1, 2014 - 02:47 PM |
Tagged: nvidia, shield, Portal, GTC, Cake, lie

Sometimes I feel like this job just keeps getting stranger and stranger. Today is no exception.

Over the years we have gotten plenty of strange guerilla marketing materials in the mail from both AMD and NVIDIA. Apparently, this is one of those days.

After receiving just a tracking number, and no additional information from NVIDIA earlier this week, the mystery package finally arrived today. Upon initial inspection we had no idea what to expect.

image.jpeg

When we opened the box, we were greeted by a polystyrene cooler with the logo of Bake Me a Wish, which only served to confuse us more.

As we opened the cooler, and the subsequent box inside of it, things started to make more sense.

image_2.jpeg

Inside the box, we were greeted by a chocolate cake, accompanied by a card from NVIDIA.

image_3.png

As you may remember, at this year's GTC NVIDIA announced that they had ported Valve's Portal to Android and would be releasing it for SHIELD. Today we were greeted with a reminder of that, and the message that we should be able to try it for ourselves.

DSC01456.JPG

A teaser from this year's GTC Keynote

While we can't talk about our experiences with Portal just yet, stay tuned to PC Perspective for more coverage of the NVIDIA SHIELD and Portal very soon!

Source: NVIDIA

NVIDIA Will Present Global Impact Award And $150,000 Grant To Researchers At GTC 2015

Subject: General Tech | April 8, 2014 - 05:03 PM |
Tagged: research, nvidia, GTC, gpgpu, global impact award

During the GPU Technology Conference last month, NVIDIA introduced a new annual grant called the Global Impact Award. The grant awards $150,000 to researchers using NVIDIA GPUs to research issues with worldwide impact such as disease research, drug design, medical imaging, genome mapping, urban planning, and other "complex social and scientific problems."

NVIDIA Global Impact Award.png

NVIDIA will be presenting the Global Impact Award to the winning researcher or non-profit institution at next year's GPU Technology Conference (GTC 2015). Individual researchers, universities, and non-profit research institutions that are using GPUs as a significant enabling technology in their research are eligible for the grant. Both third party and self-nominations (.doc form) are accepted, with the nominated candidates being evaluated based on several factors including the level of innovation, social impact, and current state of the research and its effectiveness in approaching the problem. Submissions for nominations are due by December 12, 2014, with the finalists being announced by NVIDIA on March 13, 2015. NVIDIA will then reveal the winner of the $150,000 grant at GTC 2015 (April 28, 2015).

The researcher, university, or non-profit firm can be located anywhere in the world, and the grant money can be assigned to a department, initiative, or a single project. The massively parallel nature of modern GPUs makes them ideal for many types of research with scalable projects, and I think the Global Impact Award is a welcome incentive to encourage the use of GPGPU in applicable research projects. I am interested to see what the winner will do with the money and where the research leads.

More information on the Global Impact Award can be found on the NVIDIA website.

Source: NVIDIA

Podcast #293 - NVIDIA Titan-Z, ASUS ROG Poseidon 780, News from OculusVR and more!

Subject: General Tech | March 27, 2014 - 02:42 PM |
Tagged: W9100, video, titan z, poseidon 780, podcast, Oculus, nvidia, GTC, GDC

PC Perspective Podcast #293 - 03/27/2014

Join us this week as we discuss the NVIDIA Titan-Z, ASUS ROG Poseidon 780, News from OculusVR and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath and Allyn Malventano

 
This podcast is brought to you by Coolermaster, and the CM Storm Pulse-R Gaming Headset!
 
Program length: 1:19:03
  1. Week in Review:
    1. 0:10:45 Microsoft's DirectX 12 (Live Blog)
  2. 0:37:07 This podcast is brought to you by Coolermaster, and the CM Storm Pulse-R Gaming Headset
  3. News items of interest:
  4. Hardware/Software Picks of the Week:
    1. Josh: Certainly not a Skype Connection to the Studio
  5. Closing/outro

Be sure to subscribe to the PC Perspective YouTube channel!!