NVIDIA's Tesla K10 offers serious single-precision performance

Subject: General Tech | June 19, 2012 - 03:04 PM |
Tagged: nvidia, tesla, K10, GK104, HPC

One of NVIDIA's line of Tesla HPC cards, the Tesla K10, has actually been seen in the wild.  The new Tesla series is split between the GK104-based K10, designed specifically for single-precision tasks, and the GK110-based Tesla K20, which is optimized for double-precision tasks.  The K10 is capable of 4.58 teraflops thanks to a pair of GK104s with 8GB of GDDR5, whereas the K20 should in theory double Intel's Xeon Phi at 2 teraflops of double-precision performance, but that has yet to be demonstrated.  The K10 that was demonstrated also showed off another benefit of NVIDIA's new architecture: even with two GPUs the card remains within a 225W thermal envelope, something that is incredibly important if you are building a cluster.  The Register has gathered together some of the benchmarks and slides from NVIDIA's release, which you can see here.
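To put the 225W figure in cluster terms, here is a rough sketch of the performance-per-watt arithmetic using only the two numbers quoted above. The 10 kW accelerator power budget is a made-up value for illustration and does not come from NVIDIA or The Register.

```python
# Figures quoted in the post: 4.58 TFLOPS peak single precision, 225 W envelope.
PEAK_SP_TFLOPS = 4.58
BOARD_POWER_W = 225.0

# Hypothetical power budget reserved for accelerators in one rack (assumption).
RACK_ACCELERATOR_BUDGET_W = 10_000.0


def gflops_per_watt(peak_tflops: float, watts: float) -> float:
    """Peak GFLOPS delivered per watt of board power."""
    return peak_tflops * 1000.0 / watts


def cards_in_budget(budget_watts: float, watts_per_card: float) -> int:
    """Whole number of cards that fit inside a power budget."""
    return int(budget_watts // watts_per_card)


k10_efficiency = gflops_per_watt(PEAK_SP_TFLOPS, BOARD_POWER_W)
k10_cards = cards_in_budget(RACK_ACCELERATOR_BUDGET_W, BOARD_POWER_W)
rack_peak_tflops = k10_cards * PEAK_SP_TFLOPS

print(f"K10 efficiency: {k10_efficiency:.1f} peak SP GFLOPS per watt")
print(f"{k10_cards} cards fit in the budget, {rack_peak_tflops:.1f} peak SP TFLOPS total")
```

These are peak numbers only; sustained throughput on real workloads will be lower.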

elreg_nvidia_isc_tesla_k10_benchmarks.jpg

"The Top 500 supercomputer ranking is based on the performance of machines running the Linpack Fortran matrix math benchmark using double-precision floating point math, but a lot of applications will do just fine with single-precision math. And it is for these workloads, graphics chip maker and supercomputing upstart Nvidia says, that it designed the new Tesla K10 server coprocessors."

Source: The Register

Podcast #202 - GTX 670, NVIDIA's GK110 Tesla card, our AMD Trinity Mobile review and more!

Subject: General Tech | May 17, 2012 - 03:16 PM |
Tagged: trinity, tesla, podcast, nvidia, kepler, gtx670, GTC 2012, gk110, GK104, dv nation, a10

PC Perspective Podcast #202 - 05/17/2012

Join us this week as we talk about the GTX 670, NVIDIA's GK110 Tesla card, our AMD Trinity Mobile review and more!

If you want even more PC Perspective this week, check out our "aftershow" event as well.  Event might be an overstatement though...

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Jeremy Hellstrom, Josh Walrath, and Allyn Malventano

Program length: 1:05:16

Program Schedule:

  1. 0:00:21 Introduction
  2. 1-888-38-PCPER or podcast@pcper.com
  3. http://pcper.com/podcast
  4. http://twitter.com/ryanshrout and http://twitter.com/pcper
  5. 0:01:15 NVIDIA GeForce GTX 670 2GB Graphics Card Review - Kepler for $399
    1. GeForce GTX 670 vs GTX 570 Performance Update
    2. The GTX 670 and the Case of the Missing (and Returning) 4-Way SLI Support
  6. 0:11:20 Graphics Card (GPU) Stock Check - May 10th, 2012
    1. Hard to make a profit when no one can find Kepler cards for sale, NVIDIA
  7. 0:14:25 NVIDIA Reveals GK110 GPU - Kepler at 7.1B Transistors, 15 SMX Units
  8. 0:20:20 Lenovo IdeaCentre Q180: Atom's Wake
  9. 0:24:30 AMD A10-4600M Trinity For Mobile Review: Trying To Cut The Ivy
  10. 0:33:40 Just Delivered: DV Nation RAMRod PC - Sandy Bridge-E, 64GB DDR3, 480GB RevoDrive 3 X2
  11. 0:35:42 Plug and Pray PCIe SSD that you can upgrade; OWC's Mercury Accelsior
  12. 0:40:40 GTC 2012: NVIDIA Announces GeForce GRID Cloud Gaming Platform
    1. NVIDIA Pioneers New Standard for High Performance Computing with Tesla GPUs
    2. NVIDIA Introduces World's First Virtualized GPU, Accelerating Graphics for Cloud Computing
  13. 0:53:00 ZOTAC announces ZOTAC GeForce GT 630, GT 620 and GT 610 series
  14. 0:55:00 Hardware / Software Pick of the Week
    1. Jeremy: Only to be used for evil
    2. Josh: Since NV doesn't have an answer yet at this price range...
    3. Allyn: If you need your files secure - without the destruction
  15. 1-888-38-PCPER or podcast@pcper.com
  16. http://pcper.com/podcast   
  17. http://twitter.com/ryanshrout and http://twitter.com/pcper
  18. Closing

NVIDIA Pioneers New Standard for High Performance Computing with Tesla GPUs

Subject: Shows and Expos | May 15, 2012 - 03:43 PM |
Tagged: tesla, nvidia, GTC 2012, kepler, CUDA

SAN JOSE, Calif.—GPU Technology Conference—May 15, 2012—NVIDIA today unveiled a new family of Tesla GPUs based on the revolutionary NVIDIA Kepler GPU computing architecture, which makes GPU-accelerated computing easier and more accessible for a broader range of high performance computing (HPC) scientific and technical applications.

GTC_horizontal_376_large.jpg

The new NVIDIA Tesla K10 and K20 GPUs are computing accelerators built to handle the most complex HPC problems in the world. Designed with an intense focus on high performance and extreme power efficiency, Kepler is three times as efficient as its predecessor, the NVIDIA Fermi architecture, which itself established a new standard for parallel computing when introduced two years ago.

“Fermi was a major step forward in computing,” said Bill Dally, chief scientist and senior vice president of research at NVIDIA. “It established GPU-accelerated computing in the top tier of high performance computing and attracted hundreds of thousands of developers to the GPU computing platform. Kepler will be equally disruptive, establishing GPUs broadly into technical computing, due to their ease of use, broad applicability and efficiency.”

servers-workstations-on.png

The Tesla K10 and K20 GPUs were introduced at the GPU Technology Conference (GTC), as part of a series of announcements from NVIDIA, all of which can be accessed in the GTC online press room.

NVIDIA developed a set of innovative architectural technologies that make the Kepler GPUs high performing and highly energy efficient, as well as more applicable to a wider set of developers and applications. Among the major innovations are:

  • SMX Streaming Multiprocessor – The basic building block of every GPU, the SMX streaming multiprocessor was redesigned from the ground up for high performance and energy efficiency. It delivers up to three times more performance per watt than the Fermi streaming multiprocessor, making it possible to build a supercomputer that delivers one petaflop of computing performance in just 10 server racks. SMX’s energy efficiency was achieved by increasing its number of CUDA architecture cores by four times, while reducing the clock speed of each core, power-gating parts of the GPU when idle and maximizing the GPU area devoted to parallel-processing cores instead of control logic.
  • Dynamic Parallelism – This capability enables GPU threads to dynamically spawn new threads, allowing the GPU to adapt dynamically to the data. It greatly simplifies parallel programming, enabling GPU acceleration of a broader set of popular algorithms, such as adaptive mesh refinement, fast multipole methods and multigrid methods.
  • Hyper-Q – This enables multiple CPU cores to simultaneously use the CUDA architecture cores on a single Kepler GPU. This dramatically increases GPU utilization, slashing CPU idle times and advancing programmability. Hyper-Q is ideal for cluster applications that use MPI.

“We designed Kepler with an eye towards three things: performance, efficiency and accessibility,” said Jonah Alben, senior vice president of GPU Engineering and principal architect of Kepler at NVIDIA. “It represents an important milestone in GPU-accelerated computing and should foster the next wave of breakthroughs in computational research.”

NVIDIA Tesla K10 and K20 GPUs
The NVIDIA Tesla K10 GPU delivers the world’s highest throughput for signal, image and seismic processing applications. Optimized for customers in oil and gas exploration and the defense industry, a single Tesla K10 accelerator board features two GK104 Kepler GPUs that deliver an aggregate performance of 4.58 teraflops of peak single-precision floating point and 320 GB per second memory bandwidth.
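As a rough illustration of what those two aggregate figures imply, the sketch below works out the board's ratio of peak compute to memory bandwidth. It assumes both numbers split evenly across the two GK104 GPUs, which the release does not state explicitly.

```python
# Aggregate board figures quoted in the press release.
PEAK_SP_TFLOPS = 4.58
MEMORY_BANDWIDTH_GB_S = 320.0
GPUS_PER_BOARD = 2
BYTES_PER_SP_FLOAT = 4

# Assumption: compute and bandwidth are divided evenly between the two GPUs.
per_gpu_tflops = PEAK_SP_TFLOPS / GPUS_PER_BOARD
per_gpu_bandwidth_gb_s = MEMORY_BANDWIDTH_GB_S / GPUS_PER_BOARD

# Peak floating point operations available per byte moved from memory.
flops_per_byte = PEAK_SP_TFLOPS * 1000.0 / MEMORY_BANDWIDTH_GB_S
# The same ratio expressed per single-precision value loaded.
flops_per_float = flops_per_byte * BYTES_PER_SP_FLOAT

print(f"Per GPU: {per_gpu_tflops:.2f} TFLOPS, {per_gpu_bandwidth_gb_s:.0f} GB/s")
print(f"Balance: {flops_per_byte:.1f} flops per byte, {flops_per_float:.1f} per float")
```

A kernel that performs fewer operations per byte than this ratio will be limited by memory bandwidth rather than by the peak teraflops figure.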

The NVIDIA Tesla K20 GPU is the new flagship of the Tesla GPU product family, designed for the most computationally intensive HPC environments. Expected to be the world’s highest-performance, most energy-efficient GPU, the Tesla K20 is planned to be available in the fourth quarter of 2012.

The Tesla K20 is based on the GK110 Kepler GPU. This GPU delivers three times more double precision compared to Fermi architecture-based Tesla products and it supports the Hyper-Q and dynamic parallelism capabilities. The GK110 GPU is expected to be incorporated into the new Titan supercomputer at the Oak Ridge National Laboratory in Tennessee and the Blue Waters system at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign.

“In the two years since Fermi was launched, hybrid computing has become a widely adopted way to achieve higher performance for a number of critical HPC applications,” said Earl C. Joseph, program vice president of High-Performance Computing at IDC. “Over the next two years, we expect that GPUs will be increasingly used to provide higher performance on many applications.”

Preview of CUDA 5 Parallel Programming Platform
In addition to the Kepler architecture, NVIDIA today released a preview of the CUDA 5 parallel programming platform. Available to more than 20,000 members of NVIDIA’s GPU Computing Registered Developer program, the platform will enable developers to begin exploring ways to take advantage of the new Kepler GPUs, including dynamic parallelism.

The CUDA 5 parallel programming model is planned to be widely available in the third quarter of 2012. Developers can get access to the preview release by signing up for the GPU Computing Registered Developer program on the CUDA website.

Source: NVIDIA

What's up at the GTC? Check out this BOXX!

Subject: General Tech | May 14, 2012 - 03:31 PM |
Tagged: tesla, quadro, nvidia, maximus, GTC 2012, BOXX

There are many professional-level products to be seen at this year's GPU Technology Conference, one of the more impressive being NVIDIA Maximus technology, which takes the power of a Quadro and couples it with the new Tesla GPUs for impressive live rendering and CAD applications.  These products are not for gamers so much as for game designers and graphic artists, but the technology itself is still something to keep your eyes on. 

maximus_bnr.jpg

One of the vendors you will see is BOXX, with several different lines of computers designed for 3ds Max, CATIA V6 Live Rendering, SolidWorks and other professional-level HPC applications.  With an NVIDIA Quadro 6000 6 GB, a Tesla C2075 6 GB and a 240GB SSD for cache and programs you will be rendering like never before.

Ryan will be at the GTC, so keep an eye on the page for news from the show when it begins in the middle of this week.  NVIDIA's Maximus technology is sure to feature in some of these stories, but do keep in mind this is the GTC and not the GDC, so new game previews are unlikely, though new benchmark software and proof-of-concept game engines might appear.

gdcBOXX.png

"3DBOXX workstations featuring NVIDIA Maximus technology combine the visualization and interactive design capability of NVIDIA Quadro GPUs with the high-performance computing power of NVIDIA Tesla C2075 GPUs into a single system."

Source: BOXX
Subject: Editorial
Manufacturer: NVIDIA

Quarter Down but Year Up

Yesterday NVIDIA released their latest financial results for Q4 2012 and FY2012.  There was some good and bad mixed in the results, but overall it was a very successful year for NVIDIA.

Q4 saw gross revenue top $953.2 million US with a net income of $116 million US.  This is about $53 million less in gross revenue and $62 million down in net income as compared to last quarter.  There are several reasons as to why this happened, but the majority of it appears to be due to the hard drive shortage affecting add-in sales.  Simply put, the increase in hard drive prices caused most OEMs to take a good look at the price points of the entire system, and oftentimes would cut out the add-in graphics and just use integrated.

tegra3.jpg

Tegra 3 promises a 50% increase in revenue for NVIDIA this coming year.

Two other reasons for the lower-than-expected quarter stem from the start of the transition to 28 nm products based on Kepler.  They are ramping up production on 28 nm and slowing down 40 nm.  Yields on 28 nm are not where they expected them to be, and there is also a shortage of wafer starts for that line.  This had a pretty minimal effect overall on Q4, but it will be one of the prime reasons why revenue looks like it will be down in Q1 2013. 

Read the rest of the article here.

NVIDIA Reports Q3 2012 Results

Subject: Editorial | November 16, 2011 - 09:08 PM |
Tagged: tesla, tegra, Results, Q3 2012, nvidia, income, fermi

Late last week NVIDIA reported their Q3 2012 results (they have an unconventional reporting calendar), and they were overwhelmingly positive for the once-struggling company.  Throughout 2010 NVIDIA struggled with the poor results of their 400 series of graphics cards as compared to the relatively smooth sailing that AMD had going into the DirectX 11 marketplace.  NVIDIA was also struggling to get the original Tegra to be accepted by the marketplace, which never occurred with that particular generation of products.

633889_NVLogo_3D_H_DarkType.jpg

NVIDIA reported gross revenues of $1.07 billion for the previous quarter, with a net income (GAAP) of $178.3 million.  Margins improved to a respectable 52.5%, which is generally considered high for a fabless semiconductor company.  When we compare these results to AMD which had reported earnings a few weeks ago, we see that while NVIDIA had less revenue (AMD reported $1.7 billion) the company had nearly double the overall profit (AMD reported around $97 million).  AMD has a strong CPU business, which is something that NVIDIA is working on.  AMD reported margins in the 45% range, but they also have a larger workforce and larger capital expenditures at this time.
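For a quick sanity check of the "nearly double" comparison, the sketch below uses only the rounded figures quoted above. The net income share it computes is simply net income divided by revenue, which is a different measure from the 52.5% and 45% margins mentioned in the paragraph.

```python
# Rounded figures from the paragraph above, in millions of US dollars.
NVIDIA_REVENUE_M = 1070.0
NVIDIA_NET_INCOME_M = 178.3
AMD_REVENUE_M = 1700.0
AMD_NET_INCOME_M = 97.0  # the post says "around $97 million"


def net_income_share(net_income: float, revenue: float) -> float:
    """Net income as a percentage of revenue."""
    return 100.0 * net_income / revenue


profit_ratio = NVIDIA_NET_INCOME_M / AMD_NET_INCOME_M
nvidia_share = net_income_share(NVIDIA_NET_INCOME_M, NVIDIA_REVENUE_M)
amd_share = net_income_share(AMD_NET_INCOME_M, AMD_REVENUE_M)

print(f"NVIDIA net income is {profit_ratio:.2f}x AMD's")
print(f"Net income share of revenue: NVIDIA {nvidia_share:.1f}%, AMD {amd_share:.1f}%")
```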

Read the rest of the article here.

Source: NVIDIA

NVIDIA Outlines Multi-GPU and Cloud Graphics With Project Maximus and Virtual Graphics Technologies

Subject: Graphics Cards | August 9, 2011 - 05:08 PM |
Tagged: virtual graphics, tesla, quadro, project maximus, nvidia

It's that time of year again: SIGGRAPH is upon us. The same graphics showcase that brings oohs and ahhs over the latest in ray-traced graphics each year has seen NVIDIA bring multi-GPU, scalable Tesla computing power and professional graphics for mobile devices delivered over the Internet to SIGGRAPH 2011. The multi-GPU and cloud-based graphics technologies have been dubbed Project Maximus and Virtual Graphics respectively.

nvidia-siggraph-20111091.jpg

According to Engadget, Project Maximus sees NVIDIA opting to recommend a lower end Quadro card and combining it with an almost infinitely scalable Tesla powered cluster. The light Quadro card would handle all of the graphics duties in displaying the desktop and applications' output while the attached Tesla processors would be responsible for handling all of the underlying computationally intensive calculations. This option will be especially interesting for businesses and professional designers as they will be able to allocate to each user only the power they need to get the job done, and future upgrade-ability would improve by allowing more Tesla processors to be added as opposed to a whole graphics system overhaul. Engadget quoted NVIDIA in further clarifying that in some programs, "better performance is achieved by adding a Tesla companion processor, as opposed to scaling up the primary Quadro graphics. Users still require as much graphics as possible."

Virtual Graphics, on the other hand, is NVIDIA's technology preview that aims to bring quality graphics to numerous devices so long as they have a solid internet connection. Much like OnLive is able to stream games to low-end computers, NVIDIA's Virtual Graphics technology seems to be pushing professional-level graphics to mobile devices by using graphics card clusters based in the cloud to deliver much more graphical prowess than the mobile SoC (System on a Chip) graphics processors can provide alone. Branching off from Virtual Graphics technology is Project Monterrey, which is an initiative to bring NVIDIA Quadro level graphics on an application-agnostic basis to any device capable of maintaining a solid internet connection.

Adobe and Autodesk have already signed on as software partners, and HP will be delivering a three GPU workstation later this year.  More photos of the NVIDIA presentation are available over at Engadget.

Source: Engadget

Cray Announces AMD Bulldozer CPU and NVIDIA Tesla GPU Supercomputer Capable of 50 Petaflops

Subject: Systems | May 24, 2011 - 09:07 PM |
Tagged: tesla, supercomputer, petaflop, HPC, bulldozer

Cray has been a huge name in the supercomputer market for years, and with the new XK6 they are promising to deliver a supercomputer capable of 50 thousand trillion operations per second. Powered by AMD Opteron CPUs and NVIDIA GPUs, each XK6 blade comprises two Gemini interconnects pairing four AMD Opteron CPUs with four NVIDIA Tesla X2090 embedded graphics cards. The graphics cards in each blade have access to 6GB of GDDR5 memory, and are connected via PCI-E 2.0 links to the Opteron processors. The CPUs have access to four DDR3 memory slots “running at 1.6GHz for every G34 socket,” according to The Register. This amounts to 32GB per two-socket node when using 4GB sticks.
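The memory figure follows directly from the slot count. A minimal sketch of that arithmetic, using only the numbers in the paragraph above:

```python
# Figures from the paragraph above.
DDR3_SLOTS_PER_SOCKET = 4
STICK_CAPACITY_GB = 4
SOCKETS_PER_NODE = 2

# Capacity hanging off a single G34 socket, then off a two-socket node.
memory_per_socket_gb = DDR3_SLOTS_PER_SOCKET * STICK_CAPACITY_GB
memory_per_node_gb = memory_per_socket_gb * SOCKETS_PER_NODE

print(f"{memory_per_socket_gb} GB per socket, {memory_per_node_gb} GB per two-socket node")
```

Swapping in a different stick capacity scales both results linearly.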

cray-xk6.jpg

Cray plans to wait until AMD releases the 16-core 32nm Opteron CPUs, dubbed the Opteron 6200s, in Q3. The Register quotes AMD’s CEO Thomas Seifert as promising that the processors, which are based on the new Bulldozer cores (and would be compatible with the current G34 sockets), “would ship by summer.”

Further, they claim that Cray’s goal with the XK6 was to keep the new blades within the same thermal boundaries as their predecessors, despite the inclusion of GPUs in the mix. Cray has indicated that, because the XK6 stays within that thermal envelope, customers will be able to use XE6 and XK6 blades interchangeably and customize their supercomputer load-out to meet the demands of their specific computing workloads.

XK6_Blade.PNG

Each cabinet is capable of storing up to 24 blades, and can deliver up to 50 kilowatts of power. Each of the Tesla X2090 GPUs is capable of 665 gigaflops during double-precision floating point operations, something that GPUs excel at. As each XK6 blade contains 4 GPUs, and each cabinet can hold 24 blades, customers are looking at 63.8 teraflops of computing power solely from the graphics cards. On the CPU side of things, Cray is not able to release specifications on the processors as AMD has yet to deliver the chips in question. The Register estimates that each XK6 blade will provide 3.5 teraflops of floating point computing power from its CPUs, which amounts to approximately 84 teraflops per cabinet.

With a claimed capability to utilize up to 300 cabinets full of XK6 blades, customers are looking at approximately 44 petaflops of computing horsepower, with GPUs delivering 19.14 petaflops, and the CPUs estimated to provide 25.2 petaflops of floating point computational power.
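The cabinet and system totals above can be reproduced with a few lines of arithmetic. This sketch uses only the figures quoted in the post; the 3.5 teraflops per blade is The Register's estimate for the CPUs, not a Cray specification.

```python
# Figures quoted in the post.
GPU_DP_GFLOPS = 665.0          # per Tesla X2090
GPUS_PER_BLADE = 4
BLADES_PER_CABINET = 24
CPU_TFLOPS_PER_BLADE = 3.5     # The Register's estimate
MAX_CABINETS = 300

# Per-cabinet throughput in teraflops.
gpu_tflops_per_cabinet = GPU_DP_GFLOPS * GPUS_PER_BLADE * BLADES_PER_CABINET / 1000.0
cpu_tflops_per_cabinet = CPU_TFLOPS_PER_BLADE * BLADES_PER_CABINET

# Full 300-cabinet system in petaflops.
gpu_pflops = gpu_tflops_per_cabinet * MAX_CABINETS / 1000.0
cpu_pflops = cpu_tflops_per_cabinet * MAX_CABINETS / 1000.0
total_pflops = gpu_pflops + cpu_pflops

print(f"Per cabinet: {gpu_tflops_per_cabinet:.2f} TF GPU, {cpu_tflops_per_cabinet:.1f} TF CPU")
print(f"300 cabinets: {gpu_pflops:.2f} PF GPU + {cpu_pflops:.1f} PF CPU = {total_pflops:.1f} PF")
```

The unrounded GPU total comes out slightly above the 19.14 petaflops in the text, which was derived from the rounded 63.8 teraflops per cabinet. Either way the sum lands near 44 petaflops.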

The first customer of this system will be the Swiss National Supercomputing Centre. According to the Seattle Times, the center’s director Professor Thomas Schulthess stated that they chose the Cray XK6 based supercomputer not for its raw performance, but because “the Cray XK6 promises to be the first general-purpose supercomputer based on GPU technology, and we are very much looking forward to exploring its performance and productivity on real applications relevant to our scientists.”

Source: The Register

Fermi, Fermi, Fermi! Nobody pays attention to Tesla and the M2090 GPU Coprocessor

Subject: General Tech | May 18, 2011 - 11:39 AM |
Tagged: nvidia, gpu coprocessor, tesla

It is always the flashy brother that everyone notices, even if you've never met them ... say the GTX 590.  However, the other brother shouldn't be ignored, because it turns out Tesla is pretty cool among the server crowd.  Where once the humble math coprocessor went, the M2090 GPU coprocessor races past, with a specially made, not bin-sorted, 40nm Fermi GPU running at 1.3GHz, GDDR5 at 1.85GHz which can pull some interesting ECC tricks, and of course a full 512 CUDA Cores.  If you think that is a lot of power, NVIDIA told The Register they are recommending one M2090 per CPU core, not per physical CPU.  
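For a sense of scale, peak throughput can be estimated from the core count and clock quoted above. The sketch below rests on two assumptions not stated in the post: each CUDA core retires one fused multiply-add (two floating point operations) per clock, and double precision runs at half the single-precision rate on Fermi-based Tesla parts.

```python
# Figures quoted in the post.
CUDA_CORES = 512
CORE_CLOCK_GHZ = 1.3

# Assumptions (not from the post): one fused multiply-add per core per clock,
# and double precision at half the single-precision rate.
FLOPS_PER_CORE_PER_CLOCK = 2
DP_TO_SP_RATIO = 0.5

peak_sp_gflops = CUDA_CORES * FLOPS_PER_CORE_PER_CLOCK * CORE_CLOCK_GHZ
peak_dp_gflops = peak_sp_gflops * DP_TO_SP_RATIO

print(f"Peak single precision: {peak_sp_gflops:.1f} GFLOPS")
print(f"Peak double precision: {peak_dp_gflops:.1f} GFLOPS")
```

Under those assumptions the double-precision estimate lines up with the 665 gigaflops quoted for the closely related Tesla X2090 in the Cray XK6 post.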

ElReg_nvidia_tesla_m2090_gpu.jpg

"GPU chipmaker Nvidia knows that it has to do more to grow its Tesla biz than slap some passive heat sinks on a fanless GPU card and talk up its CUDA parallel-programming tools. It has to keep delivering price/performance improvements, as well.

And that's exactly what it's doing with the new Tesla M2090 GPU coprocessor."

Source: The Register