Subject: Shows and Expos | May 15, 2012 - 03:43 PM | Jeremy Hellstrom
Tagged: tesla, nvidia, GTC 2012, kepler, CUDA
SAN JOSE, Calif.—GPU Technology Conference—May 15, 2012—NVIDIA today unveiled a new family of Tesla GPUs based on the revolutionary NVIDIA Kepler GPU computing architecture, which makes GPU-accelerated computing easier and more accessible for a broader range of high performance computing (HPC) scientific and technical applications.
The new NVIDIA Tesla K10 and K20 GPUs are computing accelerators built to handle the most complex HPC problems in the world. Designed with an intense focus on high performance and extreme power efficiency, Kepler is three times as efficient as its predecessor, the NVIDIA Fermi architecture, which itself established a new standard for parallel computing when introduced two years ago.
“Fermi was a major step forward in computing,” said Bill Dally, chief scientist and senior vice president of research at NVIDIA. “It established GPU-accelerated computing in the top tier of high performance computing and attracted hundreds of thousands of developers to the GPU computing platform. Kepler will be equally disruptive, establishing GPUs broadly into technical computing, due to their ease of use, broad applicability and efficiency.”
The Tesla K10 and K20 GPUs were introduced at the GPU Technology Conference (GTC), as part of a series of announcements from NVIDIA, all of which can be accessed in the GTC online press room.
NVIDIA developed a set of innovative architectural technologies that make the Kepler GPUs high performing and highly energy efficient, as well as more applicable to a wider set of developers and applications. Among the major innovations are:
- SMX Streaming Multiprocessor – The basic building block of every GPU, the SMX streaming multiprocessor was redesigned from the ground up for high performance and energy efficiency. It delivers up to three times more performance per watt than the Fermi streaming multiprocessor, making it possible to build a supercomputer that delivers one petaflop of computing performance in just 10 server racks. SMX’s energy efficiency was achieved by increasing its number of CUDA architecture cores by four times, while reducing the clock speed of each core, power-gating parts of the GPU when idle and maximizing the GPU area devoted to parallel-processing cores instead of control logic.
- Dynamic Parallelism – This capability enables GPU threads to dynamically spawn new threads, allowing the GPU to adapt dynamically to the data. It greatly simplifies parallel programming, enabling GPU acceleration of a broader set of popular algorithms, such as adaptive mesh refinement, fast multipole methods and multigrid methods.
- Hyper-Q – This enables multiple CPU cores to simultaneously use the CUDA architecture cores on a single Kepler GPU. This dramatically increases GPU utilization, slashing CPU idle times and advancing programmability. Hyper-Q is ideal for cluster applications that use MPI.
“We designed Kepler with an eye towards three things: performance, efficiency and accessibility,” said Jonah Alben, senior vice president of GPU Engineering and principal architect of Kepler at NVIDIA. “It represents an important milestone in GPU-accelerated computing and should foster the next wave of breakthroughs in computational research.”
NVIDIA Tesla K10 and K20 GPUs
The NVIDIA Tesla K10 GPU delivers the world’s highest throughput for signal, image and seismic processing applications. Optimized for customers in oil and gas exploration and the defense industry, a single Tesla K10 accelerator board features two GK104 Kepler GPUs that deliver an aggregate performance of 4.58 teraflops of peak single-precision floating point and 320 GB per second memory bandwidth.
The NVIDIA Tesla K20 GPU is the new flagship of the Tesla GPU product family, designed for the most computationally intensive HPC environments. Expected to be the world’s highest-performance, most energy-efficient GPU, the Tesla K20 is planned to be available in the fourth quarter of 2012.
The Tesla K20 is based on the GK110 Kepler GPU. This GPU delivers three times more double-precision performance than Fermi architecture-based Tesla products, and it supports the Hyper-Q and dynamic parallelism capabilities. The GK110 GPU is expected to be incorporated into the new Titan supercomputer at the Oak Ridge National Laboratory in Tennessee and the Blue Waters system at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign.
“In the two years since Fermi was launched, hybrid computing has become a widely adopted way to achieve higher performance for a number of critical HPC applications,” said Earl C. Joseph, program vice president of High-Performance Computing at IDC. “Over the next two years, we expect that GPUs will be increasingly used to provide higher performance on many applications.”
Preview of CUDA 5 Parallel Programming Platform
In addition to the Kepler architecture, NVIDIA today released a preview of the CUDA 5 parallel programming platform. Available to more than 20,000 members of NVIDIA’s GPU Computing Registered Developer program, the platform will enable developers to begin exploring ways to take advantage of the new Kepler GPUs, including dynamic parallelism.
The CUDA 5 parallel programming model is planned to be widely available in the third quarter of 2012. Developers can get access to the preview release by signing up for the GPU Computing Registered Developer program on the CUDA website.
Subject: General Tech | May 14, 2012 - 03:31 PM | Jeremy Hellstrom
Tagged: tesla, quadro, nvidia, maximus, GTC 2012, BOXX
There are many professional-level products to be seen at this year's GPU Technology Conference, one of the more impressive being NVIDIA Maximus technology, which takes the power of a Quadro and couples it with the new Tesla GPUs for impressive live rendering and CAD applications. These products are not for gamers so much as for game designers and graphic artists, but the technology itself is still something to keep your eyes on.
One of the vendors you will see is BOXX, with several different lines of computers designed for 3ds Max, CATIA V6 Live Rendering, SolidWorks and other professional-level HPC applications. With an NVIDIA Quadro 6000 6 GB, a Tesla C2075 6 GB and a 240GB SSD for cache and programs, you will be rendering like never before.
Ryan will be at the GTC, so keep an eye on the page for news from the show when it begins in the middle of this week. NVIDIA's Maximus technology is sure to feature in some of these stories, but do keep in mind this is the GTC and not the GDC, so new game previews are unlikely, though new benchmark software and proof-of-concept game engines might appear.
"3DBOXX workstations featuring NVIDIA Maximus technology combine the visualization and interactive design capability of NVIDIA Quadro GPUs with the high-performance computing power of NVIDIA Tesla C2075 GPUs into a single system."
Here is some more Tech News from around the web:
- GTC 2012: Not your average vendo-loveathon @ The Register
- Ubuntu Developer Summit 12.10 Recap @ Phoronix
- Using a Lenovo All-In-One? Grab a fire extinguisher! @ The Register
- Tenda Portable Wireless AP/Router W300M @ Kitguru
- OC3D @ i45 Spring Event
Quarter Down but Year Up
Yesterday NVIDIA released their latest financial results for Q4 2012 and FY2012. There was some good and bad mixed in the results, but overall it was a very successful year for NVIDIA.
Q4 saw gross revenue top $953.2 million US with a net income of $116 million US. This is about $53 million less in gross revenue and $62 million down in net income as compared to last quarter. There are several reasons why this happened, but the majority of it appears to be due to the hard drive shortage affecting add-in sales. Simply put, the increase in hard drive prices caused most OEMs to take a good look at the price points of the entire system, and oftentimes they would cut out the add-in graphics and just use integrated graphics.
Tegra 3 promises a 50% increase in revenue for NVIDIA this coming year.
Two other reasons for the lower-than-expected quarter stem from the start of the transition to 28 nm products based on Kepler. NVIDIA is ramping up production on 28 nm and slowing down 40 nm. Yields on 28 nm are not where the company expected them to be, and there is also a shortage of wafer starts for that line. This had a pretty minimal effect overall on Q4, but it will be one of the prime reasons why revenue looks like it will be down in Q1 2013.
Subject: Editorial | November 16, 2011 - 09:08 PM | Josh Walrath
Tagged: tesla, tegra, Results, Q3 2012, nvidia, income, fermi
Late last week NVIDIA reported their Q3 2012 results (they have an unconventional reporting calendar), and the results were overwhelmingly positive for the once struggling company. Throughout 2010 NVIDIA struggled with the poor results of their 400 series of graphics cards as compared to the relatively smooth sailing that AMD had going into the DirectX 11 marketplace. NVIDIA was also struggling to get the original Tegra accepted by the marketplace, which never occurred with that particular generation of products.
NVIDIA reported gross revenues of $1.07 billion for the previous quarter, with a net income (GAAP) of $178.3 million. Margins improved to a respectable 52.5%, which is generally considered high for a fabless semiconductor company. When we compare these results to those of AMD, which reported earnings a few weeks ago, we see that while NVIDIA had less revenue (AMD reported $1.7 billion), the company had nearly double the overall profit (AMD reported around $97 million). AMD has a strong CPU business, which is something that NVIDIA is working on. AMD reported margins in the 45% range, but they also have a larger workforce and larger capital expenditures at this time.
Subject: Graphics Cards | August 9, 2011 - 05:08 PM | Tim Verry
Tagged: virtual graphics, tesla, quadro, project maximus, nvidia
It's that time of year again: SIGGRAPH is upon us. The same graphics showcase that brings oohs and ahhs over the latest in ray-traced graphics each year has seen NVIDIA bring two new technologies to SIGGRAPH 2011: multi-GPU, scalable Tesla computing power, and professional graphics delivered to mobile devices over the Internet. The multi-GPU and cloud-based graphics technologies have been dubbed Project Maximus and Virtual Graphics respectively.
According to Engadget, Project Maximus sees NVIDIA opting to recommend a lower end Quadro card and combining it with an almost infinitely scalable Tesla powered cluster. The light Quadro card would handle all of the graphics duties in displaying the desktop and applications' output, while the attached Tesla processors would be responsible for handling all of the underlying computationally intensive calculations. This option will be especially interesting for businesses and professional designers, as they will be able to allocate to each user only the power they need to get the job done, and future upgradeability would improve by allowing more Tesla processors to be added as opposed to a whole graphics system overhaul. Engadget quoted NVIDIA in further clarifying that in some programs, "better performance is achieved by adding a Tesla companion processor, as opposed to scaling up the primary Quadro graphics. Users still require as much graphics as possible."
Virtual Graphics, on the other hand, is NVIDIA's technology preview that aims to bring quality graphics to numerous devices so long as they have a solid internet connection. Much like OnLive is able to stream games to low end computers, NVIDIA's Virtual Graphics technology seems to be pushing professional level graphics to mobile devices by using graphics card clusters based in the cloud to deliver much more graphical prowess than the mobile SoC (System on a Chip) graphics processors can provide alone. Branching off from Virtual Graphics technology is Project Monterey, which is an initiative to bring NVIDIA Quadro level graphics on an application agnostic basis to any device capable of maintaining a solid internet connection.
Adobe and Autodesk have already signed on as software partners, and HP will be delivering a three GPU workstation later this year. More photos of the NVIDIA presentation are available over at Engadget.
Subject: Systems | May 24, 2011 - 09:07 PM | Tim Verry
Tagged: tesla, supercomputer, petaflop, HPC, bulldozer
Cray has been a huge name in the supercomputer market for years, and with the new XK6 they are promising to deliver a supercomputer capable of 50 thousand trillion operations per second. Powered by AMD Opteron CPUs and NVIDIA GPUs, each XK6 blade is comprised of 2 Gemini interconnects pairing four AMD Opteron CPUs with four NVIDIA Tesla X2090 embedded graphics cards. The graphics cards in each blade have access to 6GB of GDDR5 memory, and are connected via PCI-E 2.0 links to the Opteron processors. The CPUs have access to four DDR3 memory slots “running at 1.6GHz for every G34 socket,” according to The Register. This amounts to 32GB per two-socket node when using 4GB sticks.
Cray plans to wait until AMD releases the 16 core 32nm Opteron CPUs in Q3, dubbed the Opteron 6200s. The Register quotes AMD’s CEO Thomas Seifert as promising that the processors, which are based on the new Bulldozer cores and will be compatible with the current G34 sockets, “would ship by summer.”
Further, they claim that Cray’s goal with the XK6 was to keep the new blades within the same thermal boundaries as its predecessor, despite the addition of GPUs to the mix. Cray has indicated that, because it succeeded in remaining within that thermal envelope, its customers will be able to use XE6 and XK6 blades interchangeably, allowing them to customize their supercomputer load-out to meet the demands of their specific computing workloads.
Each cabinet can hold up to 24 blades and can deliver up to 50 kilowatts of power. Each of the Tesla X2090 GPUs is capable of 665 gigaflops during double-precision floating point operations, something that GPUs excel at. As each XK6 blade contains 4 GPUs, and each cabinet can hold 24 blades, customers are looking at 63.8 teraflops of computing power solely from the graphics cards. On the CPU side of things, Cray is not able to release specifications on the processors as AMD has yet to deliver the chips in question. The Register estimates that each XK6 blade will provide 3.5 teraflops of floating point computing power, which amounts to approximately 84 teraflops per cabinet.
With a claimed capability to utilize up to 300 cabinets full of XK6 blades, customers are looking at approximately 44 petaflops of computing horsepower, with GPUs delivering 19.14 petaflops, and the CPUs estimated to provide 25.2 petaflops of floating point computational power.
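The cabinet and system totals quoted above follow from simple multiplication. The short Python sketch below reproduces them from the figures in the article (665 gigaflops per GPU, 4 GPUs per blade, 24 blades per cabinet, 300 cabinets, and The Register's estimate of 3.5 CPU teraflops per blade). The function and constant names are made up for illustration. Working from unrounded values gives a GPU total closer to 19.15 petaflops; the 19.14 figure appears to come from rounding the per-cabinet number to 63.8 teraflops first.

```python
# Figures taken from the article; names are illustrative only.
GPU_GFLOPS = 665            # double-precision gigaflops per Tesla X2090
GPUS_PER_BLADE = 4
BLADES_PER_CABINET = 24
MAX_CABINETS = 300
CPU_TFLOPS_PER_BLADE = 3.5  # The Register's estimate, not an official spec


def cabinet_tflops():
    """Return (gpu, cpu) teraflops for one fully populated cabinet."""
    gpu = GPU_GFLOPS * GPUS_PER_BLADE * BLADES_PER_CABINET / 1000
    cpu = CPU_TFLOPS_PER_BLADE * BLADES_PER_CABINET
    return gpu, cpu


def system_pflops(cabinets=MAX_CABINETS):
    """Return (gpu, cpu, total) petaflops for a system of `cabinets` cabinets."""
    gpu_tf, cpu_tf = cabinet_tflops()
    gpu = gpu_tf * cabinets / 1000
    cpu = cpu_tf * cabinets / 1000
    return gpu, cpu, gpu + cpu


if __name__ == "__main__":
    gpu_tf, cpu_tf = cabinet_tflops()
    print(f"Per cabinet: {gpu_tf:.2f} TF from GPUs, {cpu_tf:.1f} TF from CPUs")
    gpu_pf, cpu_pf, total_pf = system_pflops()
    print(f"{MAX_CABINETS} cabinets: {gpu_pf:.2f} PF GPU, "
          f"{cpu_pf:.1f} PF CPU, {total_pf:.1f} PF total")
```

Passing a smaller cabinet count to `system_pflops` shows how the same blades scale down for more modest installations.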
The first customer of this system will be the Swiss National Supercomputing Centre. According to the Seattle Times, the center’s director Professor Thomas Schulthess stated that they chose the Cray XK6 based supercomputer not for its raw performance, but because “the Cray XK6 promises to be the first general-purpose supercomputer based on GPU technology, and we are very much looking forward to exploring its performance and productivity on real applications relevant to our scientists.”
Subject: General Tech | May 18, 2011 - 11:39 AM | Jeremy Hellstrom
Tagged: nvidia, gpu coprocessor, tesla
It is always the flashy brother that everyone notices, even if you've never met them ... say the GTX590. However, the other brother shouldn't be ignored, because it turns out Tesla is pretty cool among the server crowd. Where once the humble math coprocessor went, the M2090 GPU coprocessor races past, with a specially made, not bin sorted 40nm Fermi GPU running at 1.3GHz, GDDR5 at 1.85GHz which can pull some interesting ECC tricks, and of course a full 512 CUDA Cores. If you think that is a lot of power, NVIDIA told The Register they are recommending one M2090 per CPU core, not per physical CPU.
"GPU chipmaker Nvidia knows that it has to do more to grow its Tesla biz than slap some passive heat sinks on a fanless GPU card and talk up its CUDA parallel-programming tools. It has to keep delivering price/performance improvements, as well.
And that's exactly what it's doing with the new Tesla M2090 GPU coprocessor."
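As a rough sanity check on those specifications, peak throughput can be estimated as cores × clock × 2, counting a fused multiply-add as two floating point operations per core per cycle. The Python sketch below applies that to the 512 CUDA cores and 1.3GHz quoted above. The half-rate double-precision ratio for Fermi-based Tesla parts is an assumption not stated in the post, and the function name is purely illustrative.

```python
def peak_gflops(cuda_cores, clock_ghz, flops_per_cycle=2):
    """Estimate peak gigaflops, assuming one FMA (2 FLOPs) per core per cycle."""
    return cuda_cores * clock_ghz * flops_per_cycle


# Figures from the post: 512 CUDA cores running at 1.3GHz.
single_precision = peak_gflops(512, 1.3)

# Assumption: Fermi-based Tesla cards run double precision at half the
# single-precision rate.
double_precision = single_precision / 2

if __name__ == "__main__":
    print(f"Single precision: ~{single_precision:.1f} gigaflops")
    print(f"Double precision: ~{double_precision:.1f} gigaflops")
```

Under those assumptions the double-precision estimate lands within a gigaflop of the 665 gigaflops quoted for the closely related Tesla X2090 in the Cray XK6 post above.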
Here is some more Tech News from around the web:
- How Windows 7 Knows About Your Internet Connection @ Slashdot
- Intel’s 2011 Investor Meeting - Intel’s Architecture Group: 14nm Airmont Atom In 2014 @ AnandTech
- Otellini: 'Intel won't build ARM chips' @ The Register
- No McAfee technology will appear in Intel chips until 2012 @ The Inquirer
- Intel Sandy Bridge On Ubuntu 11.04 Is Still Troubling @ Phoronix
- Microsoft volume licensing to let you swap iron for clouds @ The Register
- Epson WorkForce 840 All-in-One Printer @ Maximum CPU
- Win A BitFenix Shinobi Window + Full Alchemy Cable Kit @ eTeknix