NVIDIA Will Present Global Impact Award And $150,000 Grant To Researchers At GTC 2015

Subject: General Tech | April 8, 2014 - 05:03 PM |
Tagged: research, nvidia, GTC, gpgpu, global impact award

During the GPU Technology Conference last month, NVIDIA introduced a new annual grant called the Global Impact Award. The $150,000 grant goes to researchers using NVIDIA GPUs to tackle issues with worldwide impact, such as disease research, drug design, medical imaging, genome mapping, urban planning, and other "complex social and scientific problems."

NVIDIA Global Impact Award.png

NVIDIA will present the Global Impact Award to the winning researcher or non-profit institution at next year's GPU Technology Conference (GTC 2015). Individual researchers, universities, and non-profit research institutions that use GPUs as a significant enabling technology in their research are eligible for the grant. Both third-party and self-nominations (.doc form) are accepted, and nominated candidates are evaluated on several factors, including the level of innovation, the social impact, and the current state of the research and its effectiveness in approaching the problem. Nominations are due by December 12, 2014, NVIDIA will announce the finalists on March 13, 2015, and the winner of the $150,000 grant will be revealed at GTC 2015 (April 28, 2015).

The researcher, university, or non-profit firm can be located anywhere in the world, and the grant money can be assigned to a department, an initiative, or a single project. The massively parallel nature of modern GPUs makes them ideal for many types of research with scalable projects, and I think the Global Impact Award is a welcome incentive to encourage the use of GPGPU in applicable research projects. I am interested to see what the winner will do with the money and where the research leads.

More information on the Global Impact Award can be found on the NVIDIA website.

Source: NVIDIA

GTC 2014: NVIDIA Awards Startup Map-D $100,000 In Early Stage Challenge

Subject: General Tech | March 26, 2014 - 08:49 PM |
Tagged: remote graphics, nvidia, GTC 2014, gpgpu, emerging companies summit, ecs 2014, cloud computing

NVIDIA started the Emerging Companies Summit six years ago, and since then the event has grown in size and scope to identify and support technology companies that leverage (or plan to leverage) GPGPU computing to deliver innovative products. The ECS continues to be a platform for new startups to showcase their work at the annual GPU Technology Conference. NVIDIA provides legal, development, and co-marketing support to the companies featured at ECS.

GTC 2014 ECS GPGPU Technologies.jpg

There was an interesting twist this year, though, in the form of the Early Stage Challenge, a new aspect of ECS in addition to the 'One to Watch' award. I attended the Emerging Companies Summit again this year and managed to snag some photos and participate in the Early Stage Challenge (disclosure: I voted for AudioStream TV).

GTC 2014 ECS Early Start Challenge Companies.jpg

The 12 Early Stage Challenge contestants take the stage at once to await the vote tally.

During the challenge, the 12 selected startup companies were each given eight minutes on stage to pitch their company and explain why their innovations deserved the $100,000 grand prize. The on-stage time was divided into a four-minute presentation and a four-minute Q&A session with the panel of judges (unlike last year, the audience was not part of the Q&A session this year due to time constraints).

After all 12 companies had their chance on stage, the panel of judges and the audience submitted their votes for the most innovative startup. The panel of judges included:

  • Scott Budman, Business & Technology Reporter, NBC
  • Jeff Herbst, Vice President of Business Development, NVIDIA
  • Jens Horstmann, Executive Producer & Managing Partner, Crestlight Venture Productions
  • Pat Moorhead, President & Principal Analyst, Moor Insights & Strategy
  • Bill Reichert, Managing Director, Garage Technology Ventures

The 12 companies participating in the challenge were Okam Studio, MyCloud3D, Global Valuation, Brytlyt, Clarifai, Aerys, oMobio, ShiVa Technologies, IGI Technologies, Map-D, Scalable Graphics, and AudioStream TV. Their fields span machine learning, deep neural networks, computer vision, remote graphics, real-time visualization, gaming, and big data analytics.

After all the votes were tallied, Map-D was revealed to be the winner and received a check for $100,000 from NVIDIA Vice President of Business Development Jeff Herbst.

Map-D Wins ECS Early Start Challenge.jpg

Jeff Herbst presenting Map-D's CEO with the Early Stage Challenge grand prize check. From left to right: Scott Budman, Jeff Herbst, and Thomas Graham.

Map-D is a company that specializes in a scalable in-memory GPU database that promises millisecond queries served directly from GPU memory (with GPU memory bandwidth as the bottleneck) and very fast database inserts. The company is working with Facebook and PayPal to analyze data. In the case of Facebook, Map-D is being used to analyze status updates in real time to identify malicious behavior. The software can scale across eight NVIDIA Tesla cards to analyze a billion tweets in real time.
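Map-D's engine is proprietary, but the core idea it describes, keeping data resident in GPU memory and scanning it at memory bandwidth, can be sketched in a few lines of CUDA. Everything below (the column layout, the timestamp predicate) is a hypothetical illustration rather than Map-D's actual code:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical column scan: count rows whose timestamp falls inside a
// window. Because the column stays resident in GPU memory between
// queries, each query is one bandwidth-bound pass plus an atomic tally.
__global__ void countInWindow(const int* timestamps, int n,
                              int lo, int hi, unsigned int* count) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && timestamps[i] >= lo && timestamps[i] <= hi)
        atomicAdd(count, 1u);
}

int main() {
    const int n = 1 << 20;                        // 1M rows for the sketch
    int* d_col; unsigned int* d_count;
    cudaMalloc(&d_col, n * sizeof(int));          // loaded once, queried often
    cudaMalloc(&d_count, sizeof(unsigned int));
    cudaMemset(d_col, 0, n * sizeof(int));        // stand-in for real data
    cudaMemset(d_count, 0, sizeof(unsigned int));

    countInWindow<<<(n + 255) / 256, 256>>>(d_col, n, 0, 100, d_count);

    unsigned int result;
    cudaMemcpy(&result, d_count, sizeof(result), cudaMemcpyDeviceToHost);
    printf("rows in window: %u\n", result);
    cudaFree(d_col);
    cudaFree(d_count);
    return 0;
}
```

The real product layers a SQL front end, multi-GPU partitioning, and far smarter aggregation on top, but the bandwidth argument is the same.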

It is specialized software, but extremely useful within its niche. Hopefully the company puts the prize money to good use in furthering its GPGPU endeavors. Although there was only a single grand prize winner, I found all the presentations interesting and look forward to seeing where they go from here.

Read more about the Emerging Companies Summit (from last year) and keep track of new GTC 2014 articles by following the GTC 2014 tag @ PC Perspective.

Source: PC Perspective

Intel Xeon Phi to get Serious Refresh in 2015?

Subject: General Tech, Graphics Cards, Processors | November 28, 2013 - 03:30 AM |
Tagged: Intel, Xeon Phi, gpgpu

Intel tested the waters with its Xeon Phi co-processor. Based on the architecture designed for the original Pentium processors and fabricated on Intel's 22nm tri-gate technology, it was released in six products ranging from 57 to 61 cores and 6 to 16GB of RAM, which led to double precision performance of between 1 and 1.2 TFLOPS. All of this fell under the Knights Corner initiative.

Intel_Xeon_Phi_Family.jpg

In 2015, Intel plans to have Knights Landing ready for consumption. A modified Silvermont architecture will replace the many simple (basically 15-year-old) cores of the previous generation; up to 72 Silvermont-based cores (each with four threads), in fact. It will also introduce the AVX-512 instruction set, which allows applications to vectorize eight 64-bit (double-precision float or long integer) or sixteen 32-bit (single-precision float or standard integer) values at once.

In other words, packing a bunch of related problems into a single instruction.
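To make that concrete, here is a minimal sketch of AVX-512 vectorization using compiler intrinsics in host-side C++. The function name and the assumption that n is a multiple of 8 are mine, purely for illustration:

```cuda
#include <immintrin.h>  // AVX-512 intrinsics
#include <cstddef>

// Add two arrays of doubles, eight lanes per instruction. Assumes n is
// a multiple of 8 and an AVX-512-capable CPU such as Knights Landing.
void add_avx512(const double* a, const double* b, double* out, size_t n) {
    for (size_t i = 0; i < n; i += 8) {
        __m512d va  = _mm512_loadu_pd(a + i);   // load 8 doubles
        __m512d vb  = _mm512_loadu_pd(b + i);
        __m512d sum = _mm512_add_pd(va, vb);    // one add, 8 results
        _mm512_storeu_pd(out + i, sum);         // store 8 doubles
    }
}
```

The scalar version of that loop retires one addition per iteration; the AVX-512 version retires eight, which is where the claimed throughput gains come from.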

The most interesting part? Two versions will be offered: add-in boards (AIBs) and a standalone CPU. Thanks to its x86 heritage, it will not require a host CPU if your application is entirely suited to a MIC architecture; unlike a Tesla, it is bootable with existing and common OSes. It can also be paired with standard Xeon processors if you would like a few strong threads alongside the 288 (72 x 4) that the Xeon Phi provides.

And, while I doubt Intel would want to cut anyone else in, VR-Zone notes that this opens the door for AIB partners to make non-reference cards and manage some level of customer support. I'll believe a non-Intel branded AIB only when I see it.

Source: VR-Zone
Manufacturer: Scott Michaud

A new generation of Software Rendering Engines.

We have been busy with side projects here at PC Perspective over the last year. Ryan has nearly broken his back rating the frames. Ken, along with running the video equipment and "getting an education", developed a hardware switching device for Wirecast and XSplit.

My project, "Perpetual Motion Engine", has been researching and developing a GPU-accelerated software rendering engine. Now, to be clear, this is just in very early development for the moment. The point is not to draw beautiful scenes. Not yet. The point is to show what OpenGL and DirectX does and what limits are removed when you do the math directly.

Errata: BioShock uses a modified Unreal Engine 2.5, not 3.

In the above video:

  • I show the problems with graphics APIs such as DirectX and OpenGL.
  • I talk about the problem those APIs attempt to solve: finding color values for your monitor.
  • I discuss the advantages of boiling graphics problems down to general mathematics.
  • Finally, I prove the advantages of boiling graphics problems down to general mathematics.

I would recommend watching the video, first, before moving forward with the rest of the editorial. A few parts need to be seen for better understanding.

Click here, after you watch the video, to read more about GPU-accelerated Software Rendering.

JavaOne 2013: GPU Is Coming Whether You Know It or Not

Subject: General Tech, Shows and Expos | September 23, 2013 - 09:38 PM |
Tagged: JavaOne, JavaOne 2013, gpgpu

Are the enterprise users still here? Oh, hey!

GPU acceleration throws a group of many similar calculations at thousands of simple cores. That architecture makes GPUs very cheap and power efficient for the amount of work they accomplish. Gamers obviously enjoy the efficiency at tasks such as calculating pixels on a screen or modifying thousands of vertex positions, but the technology has evolved beyond graphics into more general workloads. Enterprise and research applications have been taking notice over the years.
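The model is easy to see in miniature. The hypothetical CUDA kernel below brightens an image by launching one lightweight thread per pixel, exactly the "many similar calculations on thousands of simple cores" pattern described above:

```cuda
#include <cuda_runtime.h>

// One thread per pixel: thousands of simple GPU cores each apply the
// same small operation (a brightness boost) to their own data element.
__global__ void brighten(unsigned char* pixels, int n, int amount) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int v = pixels[i] + amount;
        pixels[i] = v > 255 ? 255 : v;   // clamp to the valid byte range
    }
}

// Example launch covering every byte of a 1920x1080 RGB frame:
//   int n = 1920 * 1080 * 3;
//   brighten<<<(n + 255) / 256, 256>>>(d_pixels, n, 40);
```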

The GPU discussion, specifically, starts around the 16-minute mark.

Java, a friend of scientific and "big-data" developers, is also evolving in a few directions including "offload".

IBM's CTO of Java, John Duimovich, discussed a few experiments IBM ran while optimizing the platform for new hardware. Sorting arrays, a common task, saw between a 2-fold and a 48-fold increase in performance. Even including the latency of moving data and initializing GPU code, a 32,000-entry array took less than 1.5ms to sort, compared to about 3ms on the CPU. The sample code was written in CUDA.
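IBM has not published the benchmark harness itself, but the shape of such an offload experiment is easy to sketch with CUDA's Thrust library, which wraps the copy-to-GPU, parallel sort, and copy-back that the quoted timings include. The array size matches IBM's figure; the element type and data are my own stand-ins:

```cuda
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <cstdlib>

int main() {
    // 32,000 entries, matching the array size IBM quoted.
    thrust::host_vector<int> h(32000);
    for (size_t i = 0; i < h.size(); ++i) h[i] = rand();

    thrust::device_vector<int> d = h;     // copy to GPU (part of the latency)
    thrust::sort(d.begin(), d.end());     // parallel sort on the device
    thrust::copy(d.begin(), d.end(), h.begin());  // copy the result back
    return 0;
}
```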

The goal of these tests is, as far as I can tell, to (eventually) have Java's many built-in libraries automatically use specialized hardware. The pitch is free performance. Of course, there is only so much you can get for free. Still, optimizing the few usual suspects is an obvious advantage, especially if it simply redirects common calls to existing, better-suited libraries.

Hopefully they choose to support more than just CUDA whenever they take it beyond experimentation. The OpenPOWER Consortium, responsible for many of these changes, currently consists of IBM, Mellanox, TYAN, Google, and NVIDIA.

Source: JavaOne
Manufacturer: Adobe

OpenCL Support in a Meaningful Way

Adobe has had OpenCL support since last year. You would never have benefited from its inclusion unless you ran one of two AMD mobility chips under Mac OS X Lion, but it was there. Creative Cloud, predictably, furthers this trend with additional GPGPU support for applications like Photoshop and Premiere Pro.

This leads to some interesting points:

  • How OpenCL is changing the landscape between Intel and AMD
  • What GPU support is curiously absent from Adobe CC for one reason or another
  • Which GPUs are supported despite not... existing, officially.

adobe-cs-products.jpg

This should be very big news for our readers who do production work, whether professionally or as a hobby. If not, how about a little information about certain GPUs that are designed to compete with the GeForce 700-series?

Read on for our thoughts, after the break.

GTC 2013: Fuzzy Logix Launches Tanay Rx for GPU Accelerating Analytic Models Programmed In R

Subject: General Tech | March 26, 2013 - 11:40 PM |
Tagged: GTC 2013, gpu analytics, gpgpu, fuzzy logix

Fuzzy Logix, a company that specializes in HPC data analytics, recently unveiled Tanay Rx, a new extension to the Tanay Zx library that GPU accelerates analytic models written in R. R is a programming language commonly used by statisticians; it is reportedly relatively easy to program, but suffers from an inherent lack of multi-threading performance as well as memory limitations. With Tanay Rx, Fuzzy Logix is hoping to combine the performance benefits of its Tanay Zx libraries with the simplicity of R programming. According to Fuzzy Logix, Tanay Rx is "the perfect prescription to cure performance issues with R."

FuzzyLogix_at_GTC2013.jpg

Tanay Zx allowed models written in many programming languages to run on the GPU through .net, .dll, or shared object calls, and the new Tanay Rx extension brings that functionality to statistical and analytic models run in R. Supported models include such data-intensive tasks as matrix operations, Monte Carlo simulations, data mining, and financial mathematics (equities, fixed income, and time series analysis). Fuzzy Logix claims that R users can run over 500 analytic models 10 to 100 times faster by harnessing the parallel processing power of graphics and accelerator cards such as NVIDIA's Quadro/Tesla cards, Intel's MIC, and AMD's FirePro cards.

As an example, Fuzzy Logix states that intra-day risk calculations for equity, interest rate, and FX options, amounting to approximately 1 billion future scenarios, can be performed in milliseconds on the GPU. While some conversions may be more involved, certain aspects of R code can be sped up simply by replacing R functions with Fuzzy Logix's own Tanay Rx functions.
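Fuzzy Logix's libraries are closed source, but the Monte Carlo pattern they accelerate is straightforward to sketch in CUDA with cuRAND: every thread walks its own batch of random scenarios in parallel. The payoff function below is a deliberately trivial placeholder, not one of the Tanay models:

```cuda
#include <curand_kernel.h>

// Each thread simulates `perThread` random scenarios and accumulates a
// toy payoff; a real risk model would price an option or a rate path here.
__global__ void monteCarlo(float* partialSums, int perThread,
                           unsigned long long seed) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    curandState state;
    curand_init(seed, tid, 0, &state);    // independent RNG stream per thread

    float sum = 0.0f;
    for (int i = 0; i < perThread; ++i) {
        float x = curand_normal(&state);  // one random market move
        sum += x > 0.0f ? x : 0.0f;       // placeholder payoff: max(x, 0)
    }
    partialSums[tid] = sum;               // host (or a second kernel) reduces these
}

// 4,096 blocks of 256 threads at perThread = 1000 covers roughly a
// billion scenarios per launch, the scale quoted above:
//   monteCarlo<<<4096, 256>>>(d_sums, 1000, 1234ULL);
```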

Fuzzy Logix CUDA Function.jpg

As per Fuzzy Logix's website.

Industry solutions implementing Tanay Rx for the financial, healthcare, internet marketing, pharmaceutical, oil, gas, insurance, and other sectors are available now. More information on the company's approach to GPGPU analytics is available here.

Source: Fuzzy Logix

GTC 2013: Cortexica Vision Systems Talks About the Future of Image Recognition During the Emerging Companies Summit

Subject: General Tech, Graphics Cards | March 20, 2013 - 09:44 PM |
Tagged: video fingerprinting, image recognition, GTC 2013, gpgpu, cortexica, cloud computing

The Emerging Companies Summit is a series of sessions at NVIDIA's GPU Technology Conference (GTC) that gives the floor to CEOs from several up-and-coming technology startups. Earlier today, the CEO of Cortexica Vision Systems took the stage to talk briefly about the company's products and future direction, and to answer questions from a panel of industry experts.

If you tuned into NVIDIA's keynote presentation yesterday, you may have noticed the company showing off a new image recognition technology. That technology is being developed by a company called Cortexica Vision Systems. While it cannot perform facial recognition, it is capable of identifying just about everything else, according to the company's CEO, Ian McCready. Currently, Cortexica employs a cluster of approximately 70 NVIDIA graphics cards, but the system is capable of scaling beyond that. McCready estimates that about 100 GPUs and a CPU would be required by a company like eBay, should it want to implement Cortexica's image recognition technology in-house.

20130320_047.jpg

The Cortexica technology uses images captured by a camera (such as the one in your smartphone), which are then sent to Cortexica's servers for processing. The GPUs in the Cortexica cluster handle fingerprint creation while the CPU performs the actual lookup in the database of known fingerprints, either finding an exact match or returning similar image results. According to Cortexica, fingerprint creation takes only 100ms, and as more powerful GPUs make it into mobile devices, it may become possible to create the fingerprint on the device itself, reducing the time between taking a photo and getting relevant results back.
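Cortexica's actual fingerprinting algorithm is proprietary, but the division of labor described above, where the GPU reduces an image to a compact signature and the CPU searches the database, can be illustrated with something as simple as a color histogram. This is a deliberately naive stand-in for the real feature extraction:

```cuda
#include <cuda_runtime.h>

// Naive stand-in for fingerprint creation: reduce an RGB image to a
// 64-bin color histogram on the GPU. The host would then compare this
// small signature against a database of stored fingerprints on the CPU.
__global__ void colorHistogram(const unsigned char* rgb, int numPixels,
                               unsigned int* bins /* 64 entries, zeroed */) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < numPixels) {
        int r = rgb[3 * i]     >> 6;   // quantize each channel to 2 bits
        int g = rgb[3 * i + 1] >> 6;
        int b = rgb[3 * i + 2] >> 6;
        atomicAdd(&bins[(r << 4) | (g << 2) | b], 1u);  // 4x4x4 = 64 bins
    }
}
```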

20130320_051.jpg

The image recognition technology is currently being used by eBay Motors in the US, UK, and Germany. Cortexica hopes to find a home with many of the fashion companies that would use the technology to let people identify, and ultimately purchase, clothing they photograph on television or in public. The technology can also perform 360-degree object recognition, identify logos as small as 0.4% of the screen, and identify videos. In the future, Cortexica hopes to reduce latency, improve recognition accuracy, and add more search categories. Cortexica is also working on enabling an "always on" mobile device that constantly identifies everything around it, which is both cool and a bit creepy. With mobile chips like Logan and Parker coming in the future, Cortexica hopes to be able to do on-device image recognition, which would greatly reduce latency and allow the recognition technology to work without an internet connection.

20130320_054.jpg

The number of photos taken is growing rapidly; as many as 10% of all photos stored "in the cloud" were taken last year alone. Even Facebook, with its massive data centers, is moving to a cold-storage approach to save on the electricity costs of storing and serving those photos. And while some of these photos carry relevant metadata, the majority do not; Cortexica claims its technology can get around that issue by identifying photos, as well as finding similar photos, using its algorithms.

20130320_055.jpg

Stay tuned to PC Perspective for more GTC coverage!

Additional slides are available after the break:

Too good to be true; bad coding versus GPGPU compute power

Subject: General Tech | November 23, 2012 - 01:03 PM |
Tagged: gpgpu, amd, nvidia, Intel, phi, tesla, firepro, HPC

The skeptics were right to question the huge improvements seen when using GPGPUs in a system for heavy parallel computing tasks. The cards do help a lot, but the 100x improvements reported by some companies and universities had more to do with poorly optimized CPU code than with the processing power of GPGPUs. This news comes from someone you might not expect to burst this particular bubble: Sumit Gupta, GM of NVIDIA's Tesla team, who might be trying to mitigate any possible disappointment from future customers who already have optimized CPU code and won't see the huge improvements enjoyed by academics and other current customers. The Inquirer does point out a balancing benefit: it is obviously much easier to optimize code in CUDA, OpenCL, and other GPGPU languages than it is to code for multicore CPUs.

bubble-burst.jpg

"Both AMD and Nvidia have been using real-world code examples and projects to promote the performance of their respective GPGPU accelerators for years, but now it seems some of the eye popping figures including speed ups of 100x or 200x were not down to just the computing power of GPGPUs. Sumit Gupta, GM of Nvidia's Tesla business told The INQUIRER that such figures were generally down to starting with unoptimised CPU."

Here is some more Tech News from around the web:

Tech Talk

Source: The Inquirer

AMD Launches Dual Tahiti FirePro S10000 Graphics Card

Subject: Graphics Cards | November 13, 2012 - 04:15 PM |
Tagged: tahiti, HPC, gpgpu, firepro s10000, firepro

On Monday, AMD launched its latest graphics card aimed at the server and workstation market. Called the AMD FirePro S10000 (for clarity, that’s FirePro S10,000), it is a dual GPU Tahiti graphics card that offers up some impressive performance numbers.

No, unfortunately, this is not the (at this point) mythical dual-7970 AMD HD 7990 graphics card. Rather, the FirePro S10,000 is essentially two Radeon 7950 GPUs on a single PCB along with 6 GB of GDDR5 memory. Specifications include 3,584 stream processors, an 825 MHz GPU clock speed, and 6 GB of GDDR5 with a total of 480 GB/s of memory bandwidth; that is 1,792 stream processors and 3 GB of memory per GPU. Interestingly, this is a dual slot card with an active cooler. At 375W, a passive cooler is simply not possible in a form factor that fits into a server rack. Therefore, AMD has equipped the FirePro S10,000 GPGPU card with a triple fan cooler reminiscent of, though not as large as, the setup PowerColor uses on its custom (2x7970) Devil 13. The FirePro card has three red fans (shrouded by a black cover) over a heatpipe and aluminum fin heatsink. The card does include display outputs for workstation use: one DVI and four mini DisplayPort ports.

gpgpu.png

AMD is claiming 1.48 TFLOPS in double precision work and 5.91 TFLOPS in single precision workloads. Those are impressive numbers, and the card even manages to beat NVIDIA's new Tesla K20X, with its big Kepler GK110, and the company's dual GPU GK104-based Tesla K10 by notable margins. Additionally, the new FirePro S10000 handily beats its FirePro S9000 predecessor, which is rated at 0.806 TFLOPS for double precision calculations and 3.23 TFLOPS for single precision work. The S9000 is a single GPU card, equivalent to the Radeon 7950 on the consumer side, with 1,792 shader cores. AMD has essentially taken two S9000 cards, put them on a single PCB, and managed almost twice the potential performance without needing twice the power.

Efficiency and calculations per watt were numbers that AMD did not dive into too deeply, but the company did share that the new FirePro S10000 achieves 3.94 GFLOPS/W. That figure lines up with the double precision rating: 1.48 TFLOPS divided by 375W works out to roughly 3.95 GFLOPS/W. AMD compares this to NVIDIA's Fermi-based Tesla M2090 at 2.96 GFLOPS/W. Unfortunately, NVIDIA has not shared a GFLOPS/W rating for its new K20X cards.

                      AMD S10000    AMD S9000     NVIDIA K20X   NVIDIA K10
  Double Precision    1.48 TF       0.806 TF      1.31 TF       0.19 TF
  Single Precision    5.91 TF       3.23 TF       3.95 TF       4.58 TF
  Architecture        Tahiti (x2)   Tahiti (x1)   GK110         GK104 (x2)
  TDP                 375W          225W          235W          225W
  Memory Bandwidth    480 GB/s      264 GB/s      250 GB/s      320 GB/s
  Memory Capacity     6 GB          6 GB          6 GB          8 GB
  Stream Processors   3,584         1,792         2,688         3,072
  Core Clock Speed    825 MHz       900 MHz       732 MHz       745 MHz
  MSRP                $3,599        $2,499        $3,199        ~$2,500

Other features of the AMD FirePro S10000 include support for OpenCL, Microsoft RemoteFX, direct GPU pass-through, and (shared) virtualized graphics. AMD envisions businesses using these FirePro cards to provide GPU hardware acceleration for virtualized desktops and thin clients. With XenServer, multiple users can tap into the hardware acceleration offered by the FirePro S10000 to speed up their desktops and the programs that support it.

Operating systems in particular have begun tapping into GPU acceleration to speed up the user interface and run features like the Aero desktop in Windows 7. High end workstation software also has a high GPU acceleration adoption rate, so there are real benefits to be had, and AMD is continuing to offer them with its latest FirePro card.

AMD FirePro Market Position_Aimed at Server Graphics.png

AMD is offering up a card that can be used for a mix of compute and graphics output, making it an interesting choice for workstations. The FirePro S10000's major fault lies with its 375W TDP; while the peak performance is respectable, the card is going to use more power while providing that compute muscle.

The cards are available now with an MSRP of $3,599. It is neat to finally see AMD come out with a dual GPU card with Tahiti chips, and it will be interesting to see what kind of design wins the company is able to get for its beastly FirePro S10000.