Frame Pacing for CrossFire
When the Radeon HD 7990 launched in April of this year, we had some not-so-great things to say about it. The HD 7990 depends on CrossFire technology to function and we had found quite a few problems with AMD's CrossFire technology over the last months of testing with our Frame Rating technology, the HD 7990 "had a hard time justifying its $1000 price tag." Right at launch, AMD gave us a taste of a new driver that they were hoping would fix the frame pacing and frame time variance issues seen in CrossFire, and it looked positive. The problem was that the driver wouldn't be available until summer.
As I said then: "But until that driver is perfected, is bug free and is presented to buyers as a made-for-primetime solution, I just cannot recommend an investment this large on the Radeon HD 7990."
Today could be a very big day for AMD - the release of the promised driver update that enables frame pacing on AMD 7000-series CrossFire configurations including the Radeon HD 7990 graphics cards with a pair of Tahiti GPUs.
It's not perfect yet and there are some things to keep an eye on. For example, this fix will not address Eyefinity configurations which includes multi-panel solutions and the new 4K 60 Hz displays that require a tiled display configuration. Also, we found some issues with more than two GPU CrossFire that we'll address in a later page too.
New Driver Details
Starting with 13.8 and moving forward, AMD plans to have the frame pacing fix integrated into all future drivers. The software team has implemented a software based frame pacing algorithm that simply monitors the time it takes for each GPU to render a frame, how long a frame is displayed on the screen and inserts delays into the present calls when necessary to prevent very tightly timed frame renders. This balances or "paces" the frame output to the screen without lowering the overall frame rate. The driver monitors this constantly in real-time and minor changes are made on a regular basis to keep the GPUs in check.
As you would expect, this algorithm is completely game engine independent and the games should be completely oblivious to all that is going on (other than the feedback from present calls, etc).
This fix is generic meaning it is not tied to any specific game and doesn't require profiles like CrossFire can from time to time. The current implementation will work with DX10 and DX11 based titles only with DX9 support being added later with another release. AMD claims this was simply a development time issue and since most modern GPU-bound titles are DX10/11 based they focused on that area first. In phase 2 of the frame pacing implementation AMD will add in DX9 and OpenGL support. AMD wouldn't give me a timeline for implementation though so we'll have to see how much pressure AMD continues with internally to get the job done.
Subject: General Tech | August 1, 2013 - 01:35 PM | Ken Addison
Tagged: video, shield, Samsung, quakecon, podcast, nvidia, frame rating, crossfire, amd, 840 evo, 7990
PC Perspective Podcast #262 - 08/01/2013
Join us this week as we discuss NVIDIA SHIELD, the Samsung 840 EVO, Viewer Q&A, and much more LIVE from QuakeCon 2013!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Josh Walrath, and Allyn Malventano
Program length: 1:19:01
Happy Birthday to Josh!!
It has come to my attention that you are planning on producing and selling a device to be called “NVIDIA SHIELD.” It should be noted that even though it shares the same name, this device has no matching attributes of the super-hero comic-based security agency. Please adjust.
When SHIELD was previewed to the world at CES in January of this year, there were a hundred questions about the device. What would it cost? Would the build quality stand up to expectations? Would the Android operating system hold up as a dedicated gaming platform? After months of waiting a SHIELD unit finally arrived in our offices in early July, giving us plenty of time (I thought) to really get a feel for the device and its strengths and weakness. As it turned out though, it still seemed like an inadequate amount of time to really gauge this product. But I am going to take a stab at it, feature by feature.
NVIDIA SHIELD aims to be a mobile gaming platform based on Android with a flip out touch-screen interface, high quality console design integrated controller, and added features like PC game streaming and Miracast support.
Initial Unboxing and Overview of Product Video
At the heart of NVIDIA SHIELD is the brand new Tegra 4 SoC, NVIDIA’s latest entry into the world of mobile processors. Tegra 4 is a quad-core, ARM Cortex-A15 based SoC that includes a 5th A15 core built on lower power optimized process technology to run background and idle tasks using less power. This is very similar to what NVIDIA did with Tegra 3’s 4+1 technology, and how ARM is tackling the problem with big.LITTLE philosophy.
Subject: General Tech, Graphics Cards, Mobile, Shows and Expos | July 24, 2013 - 05:15 PM | Scott Michaud
Tagged: Siggraph, kepler, mobile, tegra, nvidia, unreal engine 4
SIGGRAPH 2013 is wrapping up in the next couple of days but, now that NVIDIA removed the veil surrounding Mobile Kepler, people are chatting about what is to follow Tegra 4. Tim Sweeney, founder of Epic Games, contributed to NVIDIA Blogs the number of ways that certain attendees can experience Unreal Engine 4 at the show. As it turns out, NVIDIA engineers have displayed the engine both on Mobile Kepler as well as behind closed doors on desktop PCs.
Not from SIGGRAPH, this is a leak from, I believe, GTC late last March.
Also, this is Battlefield 3, not Unreal Engine 4.
Tim, obviously taking the developer standpoint, is very excited about OpenGL 4.3 support within the mobile GPU. In all, he did not say too much of note. They are targeting Unreal Engine 4 at a broad range of platforms: mobile, desktop, console, and, while absent from this editorial, web standards. Each of these platforms are settling on the same set of features, albeit with huge gaps in performance, allowing developers to focus on a scale of performance instead of a flowchart of capabilities.
Unfortunately for us, there have yet to be leaks from the trade show. We will keep you up-to-date if we find any, however.
NVIDIA Finally Gets Serious with Tegra
Tegra has had an interesting run of things. The original Tegra 1 was utilized only by Microsoft with Zune. Tegra 2 had a better adoption, but did not produce the design wins to propel NVIDIA to a leadership position in cell phones and tablets. Tegra 3 found a spot in Microsoft’s Surface, but that has turned out to be a far more bitter experience than expected. Tegra 4 so far has been integrated into a handful of products and is being featured in NVIDIA’s upcoming Shield product. It also hit some production snags that made it later to market than expected.
I think the primary issue with the first three generations of products is pretty simple. There was a distinct lack of differentiation from the other ARM based products around. Yes, NVIDIA brought their graphics prowess to the market, but never in a form that distanced itself adequately from the competition. Tegra 2 boasted GeForce based graphics, but we did not find out until later that it was comprised of basically four pixel shaders and four vertex shaders that had more in common with the GeForce 7800/7900 series than it did with any of the modern unified architectures of the time. Tegra 3 boasted a big graphical boost, but it was in the form of doubling the pixel shader units and leaving the vertex units alone.
While NVIDIA had very strong developer relations and a leg up on the competition in terms of software support, it was never enough to propel Tegra beyond a handful of devices. NVIDIA is trying to rectify that with Tegra 4 and the 72 shader units that it contains (still divided between pixel and vertex units). Tegra 4 is not perfect in that it is late to market and the GPU is not OpenGL ES 3.0 compliant. ARM, Imagination Technologies, and Qualcomm are offering new graphics processing units that are not only OpenGL ES 3.0 compliant, but also offer OpenCL 1.1 support. Tegra 4 does not support OpenCL. In fact, it does not support NVIDIA’s in-house CUDA. Ouch.
Jumping into a new market is not an easy thing, and invariably mistakes will be made. NVIDIA worked hard to make a solid foundation with their products, and certainly they had to learn to walk before they could run. Unfortunately, running effectively entails having design wins due to outstanding features, performance, and power consumption. NVIDIA was really only average in all of those areas. NVIDIA is hoping to change that. Their first salvo into offering a product that offers features and support that is a step above the competition is what we are talking about today.
Subject: Graphics Cards | July 23, 2013 - 09:00 AM | Tim Verry
Tagged: workstation, simulation, quadro k6000, quadro, nvidia, k6000, gk110
Today, NVIDIA announced its flagship Quadro graphics card called the K6000. Back in March of this year, NVIDIA launched a new like of Quadro graphics cards for workstations. Those cards replaced the Fermi-based predecessors with new models based on NVIDIA’s GK-104 “Kepler” GPUs. Notably missing from that new lineup was NVIDIA Quadro K6000, which is the successor to the Quadro 6000.
Contrary to previous rumors, the Quadro K6000 will be based on the full GK110 chip. In fact, it will be the fastest single-GPU graphics card that NVIDIA has to offer.
The Quadro K6000 features a full GK110 GPU, 12GB of GDDR5 memory on a 384-bit bus, and a 225W TDP. The full GK110-based GPU has 2,880 CUDA cores, 256 TMUs, and 48 ROPs. Unfortunately, NVIDIA has not yet revealed clockspeeds for the GPU or memory.
Thanks to the GPU not having any SMX units disabled, the NVIDIA Quadro K6000 is rated for approximately 1.4 TFLOPS of peak double precision floating point performance of and 5.2 TFLOPS of single precision floating point performance.
The chart below illustrates the differences between the new flagship Quadro K6000 with full GK110 GPU and the highest tier Tesla and consumer graphics cards which have at least one SMX unit disabled.
NVIDIA GK110-Based Graphics Cards
|Quadro K6000||Tesla K20X||GTX TITAN|
|Memory Bandwidth||288 GB/s||250 GB/s||288 GB/s|
|Single Precision FP||5.2 TFLOPS||3.95 TFLOPS||4.5 TFLOPS|
|Double Precision FP||~1.4 TFLOPS||1.31 TFLOPS||1.31 TFLOPS|
The NVIDIA GTX TITAN gaming graphics card has 2,688 CUDA cores, 224 TMUs, and 48 ROPs and is rated for peak double and single precision of 1.31 TFLOPS and 4.5 TFLOPS respectively. On the other hand, the lower-clocked Tesla K20X compute accelerator card has 2,688 CUDA cores, 224 TMUs, and 48 ROPs along with lower clockspeeds on the memory and GPU. Because of the lower clockspeeds, the K20X is rated for double and single precision floating point performance of 1.31 TFLOPS and 3.95 TFLOPS and memory bandwidth of 250GB/s versus the 288GB/s bandwidth on the TITAN and K6000.
NVIDIA® Quadro® K6000 GPU
In all, the new K6000 is an impressive card for professional users, and the GK110 chip should perform well in the workstation environment where GK104 was the only option before. NVIDIA claims that the GK110 is up to 3-times the performance of the Quadro 6000 (non K) predecessor. It is also the first Quadro GPU with 12GB of GDDR5 memory, which should lend itself well to high resolutions and artists working with highly detailed models and simulations.
Specifically, NVIDIA is aiming this graphics card at the visual computing market, which includes 3D designers, visual effects artists, 3d animation, and simulations. The company provided several examples in the press release, including using the GK110-based card to render nearly complete photorealistic vehicle models in RTT Deltagen that can run real time during design reviews.
The Quadro K6000 allows for larger and fully populated virtual sets with realistic lighting and scene detail when 3D animators and VFX artists are working with models and movie scenes in real time. Simulation work also takes advantage of the beefy double precision horsepower to support up to 3-times faster simulation run times in Terraspark's InsightEarth simulation. Users can run simulations with wider areas in less time than the previous generation Quardo cards, and is being used by oil companies to determine the best places to drill.
Pixar's Vice President of Software and R&D Guido Quaroni had the following to say regarding the K6000.
"The Kepler features are key to our next generation of real-time lighting and geometryhandling. The added memory and other features allow our artists to see much more of thefinal scene in a real-time, interactive form, which allows many more artistic iterations."
The K6000 is the final piece to the traditional NVIDIA Quadro lineup and is likely to be well recieved by workstation users that need the increased double precision performance that GK110 offers over the existing GK104 chips. Specific pricing and availability are still unknown, but the K6000 will be available from workstation providers, system integrators, and authorized distribution partners beginning this fall.
Subject: General Tech | July 22, 2013 - 03:19 PM | Tim Verry
Tagged: nvidia, shield, project shield, tegra 4, gaming
NVIDIA has announced that its Shield gaming portable will begin shipping on July 31. The portable gaming console was originally slated to launch on June 27th for $349 but due to an unspecified mechanical issue with a third party component (discovered at the last minute) the company delayed the launch until the problem was fixed. Now, it appears the issue has been resolved and the NVIDIA Shield will launch on July 31 for $299 or $50 less than the original MSRP.
As a refresher, Project Shield, or just Shield as it is known now, is a portable game console that is made up of a controller, mobile-class hardware internals, and an integrated 5” 720 touchscreen display that hinges clam shell style from the back of the controller.. It runs the Android Jelly Bean operating system and can play Android games as well as traditional PC games that are streamed from PCs with a qualifying NVIDIA graphics card. On the inside, Shield has a NVIDIA Tegra 4 SoC (quad core ARM Cortex A15-based CPU with NVIDIA’s proprietary GPU technology added in), 2GB RAM, and 16GB of storage. In all, the Shield measures 158mm (W) x 135mm (D) x 57mm (H) and weighs about 1.2 pounds. The controller is reminiscent of an Xbox 360 game pad.
With the third party mechanical issue out of the way, the Shield is ready to ship on July 31 and is already avialble for pre-order. Gamers will be able to test out the Shield at Shield Experience Centers located at certain GameStop, Microcenter, and Canada Computers shops in the US and Canada. The hardware will also be available for purchase at the usual online retailers for $299 (MSRP).
Subject: Graphics Cards, Displays | July 18, 2013 - 08:16 PM | Ryan Shrout
Tagged: pq321q, PQ321, nvidia, drivers, asus, 4k
It would appear that NVIDIA was paying attention to our recent live stream where we unboxed and setup our new ASUS PQ321Q 4K 3840x2160 monitor. During our setup on the AMD and NVIDIA based test beds I noticed (and the viewers saw) some less than desirable results during initial configuration. The driver support was pretty clunky, we had issues with reliability of booting and switching between SST and MST (single and multi stream transport) modes caused the card some issue as well.
Today NVIDIA released a new R326 driver, 326.19 beta, that improves performance in a couple of games but more importantly, adds support for "tiled 4K displays." If you don't know what that means, you aren't alone. A tiled display is one that is powered by multiple heads and essentially acts as multiple screens in a single housing. The ASUS PQ321Q monitor that we have in house, and the Sharp PN-K321, are tiled displays that use DisplayPort 1.2 MST technology to run at 3840x2160 @ 60 Hz.
It is great to see NVIDIA reacting quickly to new technologies and to our issues from just under a week gone by. If you have either of these displays, be sure to give the new driver a shot and let me know your results!
Subject: General Tech, Storage | July 18, 2013 - 04:56 PM | Scott Michaud
Tagged: Raspberry Pi, nvidia, HPC, amazon
Adam DeConinck, high performance computing (HPC) systems engineer for NVIDIA, built a personal computer cluster in his spare time. While not exactly high performance, especially when compared to the systems he maintains for Amazon and his employer, its case is made of Lego and seems to be under a third of a cubic foot in volume.
Image source: NVIDIA Blogs
Raspberry Pi is based on a single-core ARM CPU bundled on an SoC with a 24 GFLOP GPU and 256 or 512 MB of memory. While this misses the cutesy point of the story, I am skeptical of the expected 16W power rating. Five Raspberry Pis, with Ethernet, draw a combined maximum of 17.5W, alone, and even that neglects the draw of the networking switch. My, personal, 8-port unmanaged switch is rated to draw 12W which, when added to 17.5W, is not 16W and thus something is being neglected or averaged. Then again, his device, power is his concern.
Despite constant development and maintenance of interconnected computers, professionally, Adam's will for related hobbies has not been displaced. Even after the initial build, he already plans to graft the Hadoop framework and really reign in the five ARM cores for something useful...
... but, let's be honest, probably not too useful.
Introduction and Design
With the release of Haswell upon us, we’re being treated to an impacting refresh of some already-impressive notebooks. Chief among the benefits is the much-championed battery life improvements—and while better power efficiency is obviously valuable where portability is a primary focus, beefier models can also benefit by way of increased versatility. Sure, gaming notebooks are normally tethered to an AC adapter, but when it’s time to unplug for some more menial tasks, it’s good to know that you won’t be out of juice in a couple of hours.
Of course, an abundance of gaming muscle never hurts, either. As the test platform for one of our recent mobile GPU analyses, MSI’s 15.6” GT60 gaming notebook is, for lack of a better description, one hell of a beast. Following up on Ryan’s extensive GPU testing, we’ll now take a more balanced and comprehensive look at the GT60 itself. Is it worth the daunting $1,999 MSRP? Does the jump to Haswell provide ample and economical benefits? And really, how much of a difference does it make in terms of battery life?
Our GT60 test machine featured the following configuration:
In case it wasn’t already apparent, this device makes no compromises. Sporting a desktop-grade GPU and a quad-core Haswell CPU, it looks poised to be the most powerful notebook we’ve tested to date. Other configurations exist as well, spanning various CPU, GPU, and storage options. However, all available GT60 configurations feature a 1080p anti-glare screen, discrete graphics (starting at the GTX 670M and up), Killer Gigabit LAN, and a case built from metal and heavy-duty plastic. They also come preconfigured with Windows 8, so the only way to get Windows 7 with your GT60 is to purchase it through a reseller that performs customizations.