Podcast #263 - AMD's CrossFire Fix, Carmack Leaving id, Left 4 Dead 3 Rumors and more!

Subject: General Tech | August 8, 2013 - 02:22 PM |
Tagged: podcast, video, amd, nvidia, crossfire, sli, frame rating, 7990, john carmack, Oculus

PC Perspective Podcast #263 - 08/08/2013

Join us this week as we discuss AMD's CrossFire fix, Carmack leaving id, Left 4 Dead 3 rumors, and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, and Allyn Malventano

Program length: 1:13:47

 

NVIDIA CloudLight: White Paper for Lighting in the Cloud

Subject: Editorial, General Tech | August 3, 2013 - 04:03 PM |
Tagged: nvidia, CloudLight, cloud gaming

Trust the cloud... be the cloud.

The executives on stage might as well have waved their hands while reciting that incantation during the announcement of the Xbox One. Why not? The audience would have just assumed Don Mattrick was trying to get some weird Kinect achievement on stage. You know, kill four people with one laser beam while trying to sink your next-generation platform in a ranked keynote. 50 Gamerscore!

cloudlight.png

Microsoft stated, during and after the keynote, that each Xbox One would have access to cloud servers for certain processing tasks. Xbox Live would be receiving enough servers such that each console could access three times its performance, at launch, to do... stuff. You know, things that are hard to calculate but are not too dependent upon latency. You know what we mean, right?

Apparently Microsoft did not realize that was a detail they were supposed to sell us on.

In the meantime, NVIDIA has been selling us on offloading computation to cloud architectures. We have known for a while that Global Illumination (GI) is a very complicated problem; much of the last couple of decades has been spent progressively removing approximations to what light truly does.

CloudLight is their research project, presented at SIGGRAPH Asia and via Williams College, to demonstrate server-processed indirect lighting. In their video, each of the three effects is demonstrated at multiple latencies. The results look pretty good until about 500 ms, at which point the brightest points are noticeably in the wrong locations.

cloudlight-2.jpg

Again, the video is available here.

The three methods used to generate indirect lighting are: irradiance maps, where lightmaps are continuously calculated on a server and streamed to clients as H.264 video; photons, where the server raytraces lighting for the scene as previous rays expire and streams only the most current results to the clients that need them; and voxels, which stream fully computed frames to the clients. The most interesting part is that, in most cases, server-side processing remains fairly constant as you add more users.
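The photon variant in particular boils down to "keep only the freshest results in flight." Here is a minimal C++ sketch of that rolling-buffer idea; everything in it (PhotonBatch, the expiry window, and so on) is an illustrative placeholder rather than CloudLight's actual implementation.

```cpp
// Sketch of "stream only the freshest lighting results": the server keeps a
// short-lived buffer of photon batches and drops expired ones instead of
// retransmitting them. Names and structure are illustrative only.
#include <chrono>
#include <deque>
#include <vector>

struct PhotonBatch {
    std::vector<float> photons;                       // packed position/power data
    std::chrono::steady_clock::time_point computedAt; // when the server finished tracing it
};

class PhotonStreamServer {
public:
    explicit PhotonStreamServer(std::chrono::milliseconds lifetime) : lifetime_(lifetime) {}

    void addBatch(PhotonBatch batch) {
        batches_.push_back(std::move(batch));
        expireOld();
    }

    // Only batches that are still "fresh" are handed to clients; expired rays
    // are simply dropped rather than streamed again.
    std::vector<const PhotonBatch*> currentBatches() const {
        std::vector<const PhotonBatch*> out;
        for (const auto& b : batches_) out.push_back(&b);
        return out;
    }

private:
    void expireOld() {
        const auto now = std::chrono::steady_clock::now();
        while (!batches_.empty() && now - batches_.front().computedAt > lifetime_)
            batches_.pop_front();
    }

    std::chrono::milliseconds lifetime_;
    std::deque<PhotonBatch> batches_;
};
```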

It should be noted, however, that each of these demonstrations only moved the most intense lights slowly. I would expect an effect such as switching on a light in an otherwise dark room to create a "pop-in" effect if the indirect lighting lags too far behind user interaction or the instantaneous dynamic lights.

That said, for a finite number of instant switches, it would be possible for a server to render both results and have the client choose the appropriate lightmap (or the appropriate set of pixels from the same, large, lightmap). For an Unreal Tournament 3 mod, I was experimenting with using a Global Illumination solver to calculate lighting. My intention was to allow users to turn on and off a handful of lights in each team's base. As lights were shot out or activated by a switch, the shader would switch to the appropriate pre-rendered solution. I would expect a similar method to work here.
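As a rough illustration of that switching scheme, here is a small C++ sketch that indexes pre-baked lighting solutions by a bitmask of which lights are currently on. The type and function names are placeholders I made up for the example, not UE3 or CloudLight API.

```cpp
// Illustrative sketch of switching between pre-rendered GI solutions as lights
// are toggled. With N switchable lights there are 2^N possible combinations,
// so the lightmap index is just a bitmask of the current light states.
#include <bitset>
#include <cstdint>
#include <vector>

using TextureHandle = unsigned int; // placeholder for an engine texture handle

class PrecomputedGISelector {
public:
    // One baked lightmap per combination of light states, indexed by bitmask.
    explicit PrecomputedGISelector(std::vector<TextureHandle> bakedLightmaps)
        : lightmaps_(std::move(bakedLightmaps)) {}

    void setLightState(std::size_t lightIndex, bool on) { states_.set(lightIndex, on); }

    // Returns the baked solution matching the current on/off combination.
    TextureHandle currentLightmap() const {
        const std::uint32_t mask = static_cast<std::uint32_t>(states_.to_ulong());
        return lightmaps_.at(mask);
    }

private:
    std::bitset<32> states_;
    std::vector<TextureHandle> lightmaps_;
};
```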

What other effects do you believe can withstand a few hundred milliseconds of latency?

Source: NVIDIA

So you want a second opinion on Frame Pacing, eh?

Subject: General Tech, Graphics Cards | August 2, 2013 - 12:44 PM |
Tagged: video, stutter, radeon, nvidia, hd 7990, frame rating, frame pacing, amd

Scott Wasson from The Tech Report and Ryan have both been investigating the microstuttering present in CrossFire, and while Ryan got his hands on the hardware to capture the raw output first, The Tech Report has been digging into the issue just as deeply as Ryan and Ken have.  Their look at the new Catalyst driver and the effects of Frame Pacing shows the same results you saw yesterday in Ryan's article: for essentially no cost in performance, you get a much smoother experience when using a CrossFire system on a single display.  In their article they have done a great job of splicing together videos of runthroughs of several games with Frame Pacing disabled on one side and enabled on the other, letting you see the difference in gameplay with your own eyes without needing a CrossFire system of your own.

7990-card-close.jpg

"Can a driver fix what ails the Radeon HD 7990? Will the new Catalysts magically transform this baby into the fastest graphics card on the planet? We go inside the second to find out."

Here is some more Tech News from around the web:

Tech Talk

New NVIDIA 326.41 Beta Graphics Drivers Add Shield PC Game Streaming Support

Subject: Graphics Cards | August 2, 2013 - 02:50 AM |
Tagged: graphics drivers, nvidia, shield, pc game streaming, gaming, geforce

NVIDIA recently released a new set of beta GeForce graphics card drivers targeted at the 400, 500, 600, and 700 series GPUs. The new version 326.41 beta drivers feature the same performance tweaks as the previous 326.19 drivers while baking in beta support for PC game streaming to NVIDIA’s Shield gaming portable from a compatible GeForce graphics card (GTX 650 or better). The new beta release is also the suggested version to use for those running the Windows 8.1 Preview.

NVIDIA has included the same performance tweaks as version 326.19. The tweaks offer up to 19% performance increases, depending on the particular GPU setup. For example, users running a GTX 770 will see as much as 15% better performance in Dirt: Showdown and 6% in Tomb Raider. Performance improvements are even higher for GTX 770 SLI setups, with boosts in Dirt: Showdown and F1 2012 of 19% and 11% respectively. NVIDIA has also added SLI profiles for Splinter Cell: Blacklist and Batman: Arkham Origins.

The NVIDIA Shield launched recently and reviews are making the rounds around the Internet. One of the exciting features of the Shield gaming handheld is the ability to stream PC games from a PC with an NVIDIA graphics card to the Shield over Wi-Fi.

The 326.41 drivers improve performance across several games on the GTX 770.

The other major changes are improvements for tiled 4K displays: 4K monitors that are essentially built from two separate panels and show up to the OS as two separate displays, even though they are a single physical monitor. Using DisplayPort MST and tiled displays allows monitor manufacturers to deliver 4K displays with higher refresh rates.
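For the curious, the "two displays in one monitor" behavior is visible from ordinary application code. The short Win32 sketch below simply enumerates the monitors the OS reports; on an MST tiled 4K panel it would list two roughly half-width displays rather than one 3840x2160 surface. Nothing here is specific to NVIDIA's driver.

```cpp
// Sketch: list the displays Windows sees. On an MST "tiled" 4K monitor this
// reports two separate ~1920x2160 displays even though there is only one
// physical panel. Error handling trimmed for brevity.
#include <windows.h>
#include <cstdio>

static BOOL CALLBACK PrintMonitor(HMONITOR hMon, HDC, LPRECT, LPARAM) {
    MONITORINFOEXA info{};
    info.cbSize = sizeof(info);
    if (GetMonitorInfoA(hMon, (LPMONITORINFO)&info)) {
        const RECT& r = info.rcMonitor;
        std::printf("%s: %ldx%ld at (%ld,%ld)\n",
                    info.szDevice,
                    r.right - r.left, r.bottom - r.top,
                    r.left, r.top);
    }
    return TRUE; // continue enumeration
}

int main() {
    EnumDisplayMonitors(nullptr, nullptr, PrintMonitor, 0);
    return 0;
}
```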

Interested GeForce users can grab the latest beta drivers from the NVIDIA website or via the links below:

Source: Tech Spot
Author:
Manufacturer: AMD

Frame Pacing for CrossFire

When the Radeon HD 7990 launched in April of this year, we had some not-so-great things to say about it.  The HD 7990 depends on CrossFire technology to function, and because we had found quite a few problems with AMD's CrossFire over the preceding months of testing with our Frame Rating methodology, the HD 7990 "had a hard time justifying its $1000 price tag."  Right at launch, AMD gave us a taste of a new driver that they hoped would fix the frame pacing and frame time variance issues seen in CrossFire, and it looked positive.  The problem was that the driver wouldn't be available until summer.

As I said then: "But until that driver is perfected, is bug free and is presented to buyers as a made-for-primetime solution, I just cannot recommend an investment this large on the Radeon HD 7990."

Today could be a very big day for AMD - the release of the promised driver update that enables frame pacing on AMD 7000-series CrossFire configurations, including the Radeon HD 7990 graphics card with its pair of Tahiti GPUs.

It's not perfect yet and there are some things to keep an eye on.  For example, this fix does not address Eyefinity configurations, which include multi-panel setups and the new 4K 60 Hz displays that require a tiled display configuration.  We also found some issues with CrossFire configurations of more than two GPUs that we'll address on a later page.

 

New Driver Details

Starting with Catalyst 13.8 and moving forward, AMD plans to have the frame pacing fix integrated into all future drivers.  The software team has implemented a software-based frame pacing algorithm that monitors how long each GPU takes to render a frame and how long each frame remains on screen, then inserts delays into the present calls when necessary to prevent frames from being rendered in very tight succession.  This balances, or "paces," the frame output to the screen without lowering the overall frame rate.  The driver monitors this constantly in real time and makes minor adjustments on a regular basis to keep the GPUs in check.
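As a rough illustration of the concept (and only the concept; this is not AMD's driver code), a frame pacer can be reduced to tracking recent present-to-present intervals and delaying any present call that would otherwise arrive well ahead of that cadence:

```cpp
// Conceptual sketch of frame pacing: keep a short rolling window of recent
// present intervals, derive a target cadence, and sleep before a present that
// would otherwise land too soon after the previous one.
#include <chrono>
#include <deque>
#include <numeric>
#include <thread>

using Clock = std::chrono::steady_clock;

class FramePacer {
public:
    // Call right before a finished frame would be presented.
    void paceThenPresent() {
        const auto now = Clock::now();
        if (lastPresent_ != Clock::time_point{}) {
            recordInterval(now - lastPresent_);
            const auto target = averageInterval();
            const auto sinceLast = Clock::now() - lastPresent_;
            if (sinceLast < target)
                std::this_thread::sleep_for(target - sinceLast); // insert the pacing delay
        }
        lastPresent_ = Clock::now();
        // ...actual flip/present would happen here...
    }

private:
    void recordInterval(Clock::duration d) {
        history_.push_back(d);
        if (history_.size() > 30) history_.pop_front(); // keep a short rolling window
    }

    Clock::duration averageInterval() const {
        if (history_.empty()) return Clock::duration::zero();
        const auto sum = std::accumulate(history_.begin(), history_.end(), Clock::duration::zero());
        return sum / static_cast<long>(history_.size());
    }

    std::deque<Clock::duration> history_;
    Clock::time_point lastPresent_;
};
```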

7990card.JPG

As you would expect, this algorithm is completely game engine independent, and games should be completely oblivious to all that is going on (other than the feedback from present calls, etc.).

This fix is generic, meaning it is not tied to any specific game and doesn't require per-game profiles the way CrossFire itself sometimes does.  The current implementation works with DX10 and DX11 based titles only, with DX9 support being added in a later release.  AMD claims this was simply a development time issue, and since most modern GPU-bound titles are DX10/11 based they focused on that area first.  In phase 2 of the frame pacing implementation AMD will add DX9 and OpenGL support.  AMD wouldn't give me a timeline for that, though, so we'll have to see how much internal pressure AMD keeps up to get the job done.

Continue reading our story of the new AMD Catalyst 13.8 beta driver with frame pacing support!!

Podcast #262 - Live from QuakeCon 2013!

Subject: General Tech | August 1, 2013 - 01:35 PM |
Tagged: video, shield, Samsung, quakecon, podcast, nvidia, frame rating, crossfire, amd, 840 evo, 7990

PC Perspective Podcast #262 - 08/01/2013

Join us this week as we discuss NVIDIA SHIELD, the Samsung 840 EVO, Viewer Q&A, and much more LIVE from QuakeCon 2013!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Josh Walrath, and Allyn Malventano

Program length: 1:19:01

Author:
Subject: Mobile
Manufacturer: NVIDIA

The Hardware

Dear NVIDIA,

It has come to my attention that you are planning on producing and selling a device to be called “NVIDIA SHIELD.”  It should be noted that, even though it shares the name, this device has none of the attributes of the super-hero comic-based security agency.  Please adjust.

 

When SHIELD was previewed to the world at CES in January of this year, there were a hundred questions about the device.  What would it cost?  Would the build quality stand up to expectations?  Would the Android operating system hold up as a dedicated gaming platform?  After months of waiting, a SHIELD unit finally arrived in our offices in early July, giving us plenty of time (I thought) to really get a feel for the device and its strengths and weaknesses.  As it turned out, though, it still seemed like an inadequate amount of time to really gauge this product.  But I am going to take a stab at it, feature by feature.

IMG_9794.JPG

NVIDIA SHIELD aims to be a mobile gaming platform based on Android with a flip-out touchscreen interface, a high quality console-style integrated controller, and added features like PC game streaming and Miracast support.

Initial Unboxing and Overview of Product Video

 

The Hardware

At the heart of NVIDIA SHIELD is the brand new Tegra 4 SoC, NVIDIA’s latest entry into the world of mobile processors.  Tegra 4 is a quad-core, ARM Cortex-A15 based SoC that includes a fifth A15 core built on a low-power-optimized process to run background and idle tasks using less power.  This is very similar to what NVIDIA did with Tegra 3’s 4+1 technology, and to how ARM is tackling the problem with its big.LITTLE philosophy.

t4.jpg

Continue reading our review of the NVIDIA SHIELD Android gaming device!!

Unreal Engine 4 on Mobile Kepler at SIGGRAPH

Subject: General Tech, Graphics Cards, Mobile, Shows and Expos | July 24, 2013 - 05:15 PM |
Tagged: Siggraph, kepler, mobile, tegra, nvidia, unreal engine 4

SIGGRAPH 2013 is wrapping up in the next couple of days but, now that NVIDIA has lifted the veil surrounding Mobile Kepler, people are chatting about what will follow Tegra 4. Tim Sweeney, founder of Epic Games, contributed a post to NVIDIA Blogs describing the ways attendees can experience Unreal Engine 4 at the show. As it turns out, NVIDIA engineers have shown the engine both on Mobile Kepler and, behind closed doors, on desktop PCs.

Not from SIGGRAPH, this is a leak from, I believe, GTC late last March.

Also, this is Battlefield 3, not Unreal Engine 4.

Tim, obviously taking the developer standpoint, is very excited about OpenGL 4.3 support within the mobile GPU. In all, he did not say too much of note. Epic is targeting Unreal Engine 4 at a broad range of platforms: mobile, desktop, console, and, while absent from this editorial, web standards. Each of these platforms is settling on the same set of features, albeit with huge gaps in performance, allowing developers to focus on a scale of performance instead of a flowchart of capabilities.

Unfortunately for us, there have yet to be leaks from the trade show. We will keep you up-to-date if we find any, however.

Source: NVIDIA Blogs
Author:
Manufacturer: NVIDIA

NVIDIA Finally Gets Serious with Tegra

Tegra has had an interesting run of things.  The original Tegra was used only by Microsoft, in the Zune HD.  Tegra 2 saw better adoption, but did not produce the design wins to propel NVIDIA to a leadership position in cell phones and tablets.  Tegra 3 found a spot in Microsoft’s Surface, but that has turned out to be a far more bitter experience than expected.  Tegra 4 so far has been integrated into a handful of products and is being featured in NVIDIA’s upcoming Shield product.  It also hit some production snags that made it later to market than expected.

I think the primary issue with the first three generations of products is pretty simple.  There was a distinct lack of differentiation from the other ARM based products around.  Yes, NVIDIA brought their graphics prowess to the market, but never in a form that distanced itself adequately from the competition.  Tegra 2 boasted GeForce based graphics, but we did not find out until later that it was made up of basically four pixel shaders and four vertex shaders that had more in common with the GeForce 7800/7900 series than with any of the modern unified architectures of the time.  Tegra 3 boasted a big graphical boost, but it came in the form of doubling the pixel shader units and leaving the vertex units alone.

kepler_smx.jpg

While NVIDIA had very strong developer relations and a leg up on the competition in terms of software support, it was never enough to propel Tegra beyond a handful of devices.  NVIDIA is trying to rectify that with Tegra 4 and the 72 shader units that it contains (still divided between pixel and vertex units).  Tegra 4 is not perfect in that it is late to market and the GPU is not OpenGL ES 3.0 compliant.  ARM, Imagination Technologies, and Qualcomm are offering new graphics processing units that are not only OpenGL ES 3.0 compliant, but also offer OpenCL 1.1 support.  Tegra 4 does not support OpenCL.  In fact, it does not support NVIDIA’s in-house CUDA.  Ouch.

Jumping into a new market is not an easy thing, and invariably mistakes will be made.  NVIDIA worked hard to build a solid foundation with their products, and certainly they had to learn to walk before they could run.  Unfortunately, running effectively means winning designs on the strength of outstanding features, performance, and power consumption, and NVIDIA was really only average in all of those areas.  NVIDIA is hoping to change that.  Their first salvo, a product whose features and support are a step above the competition, is what we are talking about today.

Continue reading our article on the NVIDIA Kepler architecture making its way to mobile markets and Tegra!

NVIDIA Launches Flagship Quadro K6000 Graphics Card For Visual Computing Professionals

Subject: Graphics Cards | July 23, 2013 - 09:00 AM |
Tagged: workstation, simulation, quadro k6000, quadro, nvidia, k6000, gk110

Today, NVIDIA announced its flagship Quadro graphics card, the K6000. Back in March of this year, NVIDIA launched a new line of Quadro graphics cards for workstations. Those cards replaced their Fermi-based predecessors with new models based on NVIDIA’s GK104 “Kepler” GPUs. Notably missing from that new lineup was the NVIDIA Quadro K6000, the successor to the Quadro 6000.

NVIDIA Quadro K6000 GK110 GPU.jpg

Contrary to previous rumors, the Quadro K6000 will be based on the full GK110 chip. In fact, it will be the fastest single-GPU graphics card that NVIDIA has to offer.

The Quadro K6000 features a full GK110 GPU, 12GB of GDDR5 memory on a 384-bit bus, and a 225W TDP. The full GK110-based GPU has 2,880 CUDA cores, 256 TMUs, and 48 ROPs. Unfortunately, NVIDIA has not yet revealed clockspeeds for the GPU or memory.

NVIDIA Quadro K6000 GK110 GPU Specifications Comparison.jpg

Thanks to the GPU not having any SMX units disabled, the NVIDIA Quadro K6000 is rated for approximately 1.4 TFLOPS of peak double precision floating point performance and 5.2 TFLOPS of single precision floating point performance.

The chart below illustrates the differences between the new flagship Quadro K6000 with full GK110 GPU and the highest tier Tesla and consumer graphics cards which have at least one SMX unit disabled.

NVIDIA GK110-Based Graphics Cards

                        Quadro K6000    Tesla K20X     GTX TITAN
  CUDA Cores            2,880           2,688          2,688
  TMUs                  256             224            224
  ROPs                  48              48             48
  Memory                12GB            6GB            6GB
  Memory Bus            384-bit         384-bit        384-bit
  Memory Bandwidth      288 GB/s        250 GB/s       288 GB/s
  Single Precision FP   5.2 TFLOPS      3.95 TFLOPS    4.5 TFLOPS
  Double Precision FP   ~1.4 TFLOPS     1.31 TFLOPS    1.31 TFLOPS

The NVIDIA GTX TITAN gaming graphics card has 2,688 CUDA cores, 224 TMUs, and 48 ROPs and is rated for peak double and single precision performance of 1.31 TFLOPS and 4.5 TFLOPS, respectively. The Tesla K20X compute accelerator has the same 2,688 CUDA cores, 224 TMUs, and 48 ROPs, but with lower clockspeeds on both the GPU and memory. Because of those lower clocks, the K20X is rated for double and single precision floating point performance of 1.31 TFLOPS and 3.95 TFLOPS and memory bandwidth of 250 GB/s, versus the 288 GB/s bandwidth of the TITAN and K6000.
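As a quick sanity check on the single precision figure: each CUDA core can retire one fused multiply-add (two FLOPs) per clock, so peak SP throughput is simply cores x 2 x clock. NVIDIA has not published the K6000's clockspeed, so the ~900 MHz used below is back-solved from the 5.2 TFLOPS rating rather than an official spec.

```cpp
// Back-of-the-envelope estimate of peak single precision throughput.
#include <cstdio>

int main() {
    const double cudaCores = 2880.0;
    const double flopsPerCorePerClock = 2.0; // one FMA = multiply + add
    const double assumedClockGHz = 0.9;      // assumption, not an official spec

    const double peakSpTflops = cudaCores * flopsPerCorePerClock * assumedClockGHz / 1000.0;
    std::printf("Estimated peak SP: %.2f TFLOPS\n", peakSpTflops); // prints ~5.18 TFLOPS
    return 0;
}
```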

NVIDIA_Quadro_K6000_workstation_graphics_card_gk110.jpg

NVIDIA® Quadro® K6000 GPU

In all, the new K6000 is an impressive card for professional users, and the GK110 chip should perform well in workstation environments where GK104 was the only option before. NVIDIA claims the new card offers up to three times the performance of its Quadro 6000 (non-K) predecessor. It is also the first Quadro GPU with 12GB of GDDR5 memory, which should lend itself well to high resolutions and to artists working with highly detailed models and simulations.

NVIDIA Quadro K6000 GK110 GPU With 12GB GDDR5.jpg

Specifically, NVIDIA is aiming this graphics card at the visual computing market, which includes 3D design, visual effects, 3D animation, and simulation work. The company provided several examples in the press release, including using the GK110-based card to render nearly complete photorealistic vehicle models in RTT Deltagen in real time during design reviews.

NVIDIA Quadro K6000 GK110 GPU Used To Created Photorealistic Vehicle Models In Real Time.jpg

The Quadro K6000 allows for larger, fully populated virtual sets with realistic lighting and scene detail when 3D animators and VFX artists are working with models and movie scenes in real time. Simulation work also takes advantage of the beefy double precision horsepower: Terraspark's InsightEarth supports up to three times faster simulation run times on the new card. Users can run simulations covering wider areas in less time than with previous generation Quadro cards, and the software is being used by oil companies to determine the best places to drill.

NVIDIA Quadro K6000 GK110 GPU Content Creation.jpg

Pixar's Vice President of Software and R&D Guido Quaroni had the following to say regarding the K6000.

"The Kepler features are key to our next generation of real-time lighting and geometry 
handling. The added memory and other features allow our artists to see much more of the 
final scene in a real-time, interactive form, which allows many more artistic iterations."

The K6000 is the final piece of the traditional NVIDIA Quadro lineup and is likely to be well received by workstation users who need the increased double precision performance that GK110 offers over the existing GK104 chips. Specific pricing and availability are still unknown, but the K6000 will be available from workstation providers, system integrators, and authorized distribution partners beginning this fall.

Source: NVIDIA