NVIDIA Offers Preliminary Settlement To GeForce GTX 970 Buyers In False Advertising Class Action Lawsuit

Subject: Graphics Cards | July 28, 2016 - 07:07 PM |
Tagged: nvidia, maxwell, GTX 970, GM204, 3.5gb memory

A recent post on Top Class Actions suggests that buyers of NVIDIA GTX 970 graphics cards may soon see a payout from a settlement agreement as part of the series of class action lawsuits facing NVIDIA over claims of false advertising. NVIDIA has reportedly offered up a preliminary settlement of $30 to "all consumers who purchased the GTX 970 graphics card," with no cap on the total payout amount, along with a whopping $1.3 million in attorney's fees.

This settlement offer is in response to several class action lawsuits that consumers filed against the graphics giant following the controversy over misadvertised specifications (particularly the number of ROP units and the amount of L2 cache) and the way in which NVIDIA's GM204 GPU addresses its four gigabytes of graphics memory.

Specifically, the graphics card's specifications initially listed 64 ROPs and 2048 KB of L2 cache, but the card was later revealed to have only 56 ROPs and 1792 KB of L2. On the memory front, the "3.5 GB memory controversy" spawned many memes, as well as investigations into how the 3.5 GB and 0.5 GB pools of memory worked and how both real-world and theoretical performance were affected by the memory setup.

EVGA GTX 970 Closeup.JPG

(My opinions follow)

It was quite the PR disaster, and had NVIDIA been upfront with the correct specifications and the details of the new memory implementation, the controversy could have been avoided. As it is, though, buyers were not able to make informed decisions about the card, and at the end of the day that is what matters and why the lawsuits have merit.

As such, I do expect both sides to reach a settlement rather than see this go to a full trial, but it may not be exactly the $30-per-buyer payout, as that amount still needs to be approved by the courts to ensure that it is "fair and reasonable."

For more background on the GTX 970 memory issue (it has been awhile since this all came about after all, so you may need a refresher):

Podcast #410 - Data Recovery, New Titan X Launch, AMD builds a GPU with SSDs and more!!

Subject: Editorial | July 28, 2016 - 01:03 PM |
Tagged: XSPC, wings, windows 10, VR, video, titan x, tegra, Silverstone, sapphire, rx 480, Raystorm, RapidSpar, radeon pro ssg, quadro, px1, podcast, p6000, p5000, nvidia, nintendo nx, MX300, gp102, evga, dg-87, crucial, angelbird

PC Perspective Podcast #410 - 07/28/2016

Join us this week as we discuss the new Pascal based Titan X, an AMD graphics card with 1TB of SSD storage on-board, data recovery with RapidSpar and more!!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

Hosts:  Ryan Shrout, Allyn Malventano, Sebastian Peak, and Josh Walrath

Program length: 1:46:33
  1. Week in Review:
  2. News items of interest:
  3. 1:29:15 Hardware/Software Picks of the Week
    1. Allyn: Wii emulation is absolutely usable now (Dolphin 5)
  4. Closing/outro

Rumor: Nintendo NX Uses NVIDIA Tegra... Something

Subject: Graphics Cards, Systems, Mobile | July 27, 2016 - 07:58 PM |
Tagged: nvidia, Nintendo, nintendo nx, tegra, Tegra X1, tegra x2, pascal, maxwell

Okay, so there are a few rumors going around, mostly from Eurogamer / DigitalFoundry, that claim the Nintendo NX is going to be powered by an NVIDIA Tegra system on a chip (SoC). DigitalFoundry, specifically, cites multiple sources who claim that their Nintendo NX development kits integrate the Tegra X1 design, as seen in the Google Pixel C. That said, the Nintendo NX's March 2017 release date does provide enough time for Nintendo to switch to NVIDIA's upcoming Pascal-based Tegra design, rumored to be called the Tegra X2, which uses NVIDIA's custom-designed Denver CPU cores.

Preamble aside, here's what I think about the whole situation.

First, the Tegra X1 would be quite a small jump in performance over the WiiU. The WiiU's GPU, “Latte”, has 320 shaders clocked at 550 MHz, and it was based on AMD's TeraScale 1 architecture. Because these stream processors have single-cycle multiply-add for floating point values, you can get the chip's FLOPS rating by multiplying 320 shaders by 550,000,000 cycles per second and 2 operations per clock (one multiply and one add). This yields 352 GFLOPS. The Tegra X1 is rated at 512 GFLOPS, which is just 45% more than the previous generation.
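
As a quick sanity check on that arithmetic, here is a small, illustrative JavaScript snippet that reproduces the 352 GFLOPS figure and the roughly 45% uplift; the peakGflops helper is my own name for the calculation, and it uses only the numbers quoted above.

```javascript
// Illustrative only: peakGflops is a made-up helper, not a vendor formula.
const peakGflops = (shaders, clockMHz, opsPerClock = 2) =>
  (shaders * clockMHz * 1e6 * opsPerClock) / 1e9;

const latte = peakGflops(320, 550);          // Wii U "Latte": 352 GFLOPS
const tegraX1 = 512;                         // rated figure quoted above
console.log(latte);                          // 352
console.log(((tegraX1 / latte) - 1) * 100);  // ~45.45% uplift
```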

This is a very tiny jump, unless they indeed use Pascal-based graphics. If that is the case, you will likely see a launch selection of games ported from the WiiU plus a few games that use whatever new feature Nintendo has. One rumor is that the console itself will be kind of like the WiiU controller, but with detachable controllers. If this is true, it's a bit unclear how this will affect games in a revolutionary way, but we might be missing a key bit of info that ties it all together.

nvidia-2016-shieldx1consoles.png

As for the choice of ARM over x86... well. First, this obviously allows Nintendo to choose from a wider selection of manufacturers than AMD, Intel, and VIA, and certainly more than IBM with their previous Power-based chips. That said, it also jibes with Nintendo's interest in the mobile market. They joined The Khronos Group, and I'm pretty sure they've said they are interested in Vulkan, which is becoming the high-end graphics API for Android, supported by Google and others. I'm not sure how many engineers specialize in ARM optimization, as most mobile platforms try to abstract the instruction set as much as possible, but this could be Nintendo's attempt to settle on a standardized instruction set, and they opted for mobile over PC (versus Sony and especially Microsoft, who want consoles to follow high-end gaming on the desktop).

Why? Well that would just be speculating on speculation about speculation. I'll stop here.

SIGGRAPH 2016 -- NVIDIA Announces Pascal Quadro GPUs: Quadro P5000 and Quadro P6000

Subject: Graphics Cards | July 25, 2016 - 04:48 PM |
Tagged: siggraph 2016, Siggraph, quadro, nvidia

SIGGRAPH is the big, professional graphics event of the year, bringing together tens of thousands of attendees. They include engineers from Adobe, AMD, Blender, Disney (including ILM, Pixar, etc.), NVIDIA, The Khronos Group, and many, many others. Not only are new products announced, but many technologies are explained in detail, down to the specific algorithms that are used, so colleagues can advance their own research and share in kind.

But new products will indeed be announced.

nvidia-2016-Quadro_P6000_7440.jpg

The NVIDIA Quadro P6000

NVIDIA, having just launched a few Pascal GPUs to other markets, decided to announce updates to their Quadro line at the event. Two cards have been added, the Quadro P5000 and the Quadro P6000, both at the top end of the product stack. Interestingly, both use GDDR5X memory, meaning that neither will be based on the GP100 design, which is built around HBM2 memory.

nvidia-2016-Quadro_P5000_7460.jpg

The NVIDIA Quadro P5000

The lower-end card, the Quadro P5000, should look somewhat familiar to our readers. Exact clocks are not specified, but the chip has 2560 CUDA cores. This is identical to the GTX 1080, but with twice the memory: 16GB of GDDR5X.

Above it sits the Quadro P6000. This chip has 3840 CUDA cores, paired with 24GB of GDDR5X. We have not seen a GPU with exactly these specifications before. It has the same number of FP32 shaders as a fully unlocked GP100 die, but it doesn't have HBM2 memory. On the other hand, the new Titan X uses GP102, combining 3584 CUDA cores with GDDR5X memory, although only 12GB of it. This means that the Quadro P6000 has 256 more (single-precision) shader units than the Titan X, but otherwise very similar specifications.

Both graphics cards have four DisplayPort 1.4 connectors, as well as a single DVI output. These five connectors can be used to drive up to four 4K 120Hz monitors, or four 5K 60Hz ones. It would be nice if all five connections could be used at once, but what can you do.

nvidia-2016-irayvr.png

Pascal has other benefits for professional users, too. For instance, Simultaneous Multi-Projection (SMP) is used in VR applications to essentially double the GPU's geometry processing ability. NVIDIA will be pushing professional VR at SIGGRAPH this year, also launching Iray VR. This uses light fields, rendered on devices like the DGX-1, with its eight GP100 chips connected by NVLink, to provide accurately lit environments. This is particularly useful for architectural visualization.

No price is given for either of these cards, but they will launch in October of this year.

Source: NVIDIA

SIGGRAPH 2016: NVIDIA Takes Over mental ray for Maya

Subject: General Tech | July 25, 2016 - 04:47 PM |
Tagged: nvidia, mental ray, maya, 3D rendering

NVIDIA purchased Mental Images, the German software developer that makes the mental ray renderer, all the way back in 2007. It has been bundled with every copy of Maya for a very long time now. In fact, my license of Maya 8, which I purchased back in like, 2006, came with mental ray in both plug-in and stand-alone forms.

nvidia-2016-mentalray-benchmark.png

Interestingly, even though nearly a decade has passed since NVIDIA's acquisition, Autodesk has been the middle-person that end-users dealt with. This will end soon, as NVIDIA announced, at SIGGRAPH, that they will “be serving end users directly” with their mental ray for Maya plug-in. The new plug-in will show results directly in the viewport, starting at low quality and increasing until the view changes. They are obviously not the first company to do this, with Cycles in Blender being a good example, but I would expect that it is a welcome feature for users.

nvidia-2016-mentalray-benchmarknums.png

Benchmark results are by NVIDIA

At the same time, they are also announcing GI-Next. This will speed up global illumination in mental ray, and it will also reduce the number of options required to tune the results to just a single quality slider, making it easier for artists to pick up. One of their benchmarks shows a 26-fold increase in performance, although most of that can be attributed to GPU acceleration from a pair of GM200 Quadro cards. CPU-only tests of the same scene show a 4x increase, though, which is still pretty good.

The new version of mental ray for Maya is expected to ship in September, although it has been in an open beta (for existing Maya users) since February. They do say that “pricing and policies will be announced closer to availability” though, so we'll need to see, then, how different the licensing structure will be. Currently, Maya ships with a few licenses of mental ray out of the box, and has for quite some time.

Source: NVIDIA

NVIDIA Release 368.95 Hotfix Driver for DPC Latency

Subject: Graphics Cards | July 22, 2016 - 05:51 PM |
Tagged: pascal, nvidia, graphics drivers

It turns out that Pascal-based GPUs suffer from DPC (Deferred Procedure Call) latency issues, and there has been an ongoing discussion about it for a little over a month. This is not an area that I know a lot about, but it is a mechanism that schedules driver workloads by priority, providing regular windows of time for sound and video devices to update. It can be stalled by long-running driver code, though, which can manifest as stutter, audio hitches, and other performance issues. With a 10-series GeForce device installed, users have reported that this latency increases about 10-20x, from ~20us to ~300-400us, and it can climb to 1000us or more under load. (8333us is about one whole frame at 120 FPS.)
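
To put those microsecond figures in perspective, here is a tiny, illustrative JavaScript calculation (the frameBudgetUs helper name is my own) comparing the reported spikes against the time budget of a single frame:

```javascript
// Illustrative only: frameBudgetUs is a made-up helper name.
const frameBudgetUs = (fps) => 1e6 / fps;

console.log(frameBudgetUs(120));        // ~8333 us available per frame at 120 FPS
console.log(400 / frameBudgetUs(120));  // ~0.05 -- a 400 us spike eats ~5% of a frame
console.log(1000 / frameBudgetUs(120)); // ~0.12 -- a 1000 us spike eats ~12% of a frame
```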

nvidia-2015-bandaid.png

NVIDIA has acknowledged the issue and, just yesterday, released an optional hotfix. After installing the driver, the system felt a lot more responsive, though that could just be psychosomatic. I ran LatencyMon (DPCLat isn't compatible with Windows 8.x or Windows 10) before and after, and the latency measurement did drop significantly. Before the update, the NVIDIA driver was consistently the largest source of latency, spiking into the thousands of microseconds. After the update, it was hidden by other drivers for the first night, although today it seems to have a few spikes again. That said, Microsoft's networking driver is also spiking in the ~200-300us range, so a good portion of it might be the sad state of my current OS install. I've been meaning to do a good system wipe for a while...

nvidia-2016-hotfix-pascaldpc.png

Measurement taken after the hotfix, while running Spotify.
That said, my computer's a mess right now.

That said, some of the post-hotfix driver spikes are reaching ~570us (mostly when I play music on Spotify through my Blue Yeti Pro). Also, Photoshop CC 2015 started complaining about graphics acceleration issues after installing the hotfix, so only install it if you're experiencing problems. As for the latency, if it's not just my machine, NVIDIA might still have some work to do.

It does feel a lot better, though.

Source: NVIDIA

NVIDIA Announces GP102-based TITAN X with 3,584 CUDA cores

Subject: Graphics Cards | July 21, 2016 - 10:21 PM |
Tagged: titan x, titan, pascal, nvidia, gp102

Donning the leather jacket he goes very few places without, NVIDIA CEO Jen-Hsun Huang showed up at an AI meet-up at Stanford this evening to show, for the very first time, a graphics card based on a never before seen Pascal GP102 GPU. 

titanxpascal1.jpg

Source: Twitter (NVIDIA)

Rehashing an old name, NVIDIA will call this new graphics card the Titan X. You know, like the "new iPad," this is the "new Titan X." Here is the data we know about thus far:

|   | Titan X (Pascal) | GTX 1080 | GTX 980 Ti | TITAN X | GTX 980 | R9 Fury X | R9 Fury | R9 Nano | R9 390X |
|---|---|---|---|---|---|---|---|---|---|
| GPU | GP102 | GP104 | GM200 | GM200 | GM204 | Fiji XT | Fiji Pro | Fiji XT | Hawaii XT |
| GPU Cores | 3584 | 2560 | 2816 | 3072 | 2048 | 4096 | 3584 | 4096 | 2816 |
| Rated Clock | 1417 MHz | 1607 MHz | 1000 MHz | 1000 MHz | 1126 MHz | 1050 MHz | 1000 MHz | up to 1000 MHz | 1050 MHz |
| Texture Units | 224 (?) | 160 | 176 | 192 | 128 | 256 | 224 | 256 | 176 |
| ROP Units | 96 (?) | 64 | 96 | 96 | 64 | 64 | 64 | 64 | 64 |
| Memory | 12GB | 8GB | 6GB | 12GB | 4GB | 4GB | 4GB | 4GB | 8GB |
| Memory Clock | 10000 MHz | 10000 MHz | 7000 MHz | 7000 MHz | 7000 MHz | 500 MHz | 500 MHz | 500 MHz | 6000 MHz |
| Memory Interface | 384-bit G5X | 256-bit G5X | 384-bit | 384-bit | 256-bit | 4096-bit (HBM) | 4096-bit (HBM) | 4096-bit (HBM) | 512-bit |
| Memory Bandwidth | 480 GB/s | 320 GB/s | 336 GB/s | 336 GB/s | 224 GB/s | 512 GB/s | 512 GB/s | 512 GB/s | 320 GB/s |
| TDP | 250 watts | 180 watts | 250 watts | 250 watts | 165 watts | 275 watts | 275 watts | 175 watts | 275 watts |
| Peak Compute | 11.0 TFLOPS | 8.2 TFLOPS | 5.63 TFLOPS | 6.14 TFLOPS | 4.61 TFLOPS | 8.60 TFLOPS | 7.20 TFLOPS | 8.19 TFLOPS | 5.63 TFLOPS |
| Transistor Count | 11.0B | 7.2B | 8.0B | 8.0B | 5.2B | 8.9B | 8.9B | 8.9B | 6.2B |
| Process Tech | 16nm | 16nm | 28nm | 28nm | 28nm | 28nm | 28nm | 28nm | 28nm |
| MSRP (current) | $1,200 | $599 | $649 | $999 | $499 | $649 | $549 | $499 | $329 |

Note: entries marked with a ? are educated guesses on our part.

Obviously there is a lot for us still to learn about this new GPU and graphics card, including why in the WORLD it is still being called Titan X, rather than...just about anything else. That aside, GP102 will feature 40% more CUDA cores than GP104 at slightly lower clock speeds. The rated 11 TFLOPS of single-precision compute of the new Titan X is 34% better than that of the GeForce GTX 1080, and I would expect gaming performance to scale in line with that difference.

The new Titan X will feature 12GB of GDDR5X memory, not the HBM2 that the GP100 chip uses, so this is clearly a new chip with a new memory interface. NVIDIA claims it will have 480 GB/s of bandwidth, and I am guessing it is built on a 384-bit memory interface running at the same 10 Gbps per pin as the GTX 1080. It's truly amazing hardware.
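
That guess is easy to check with some quick, illustrative arithmetic (the bandwidthGBs helper below is my own name, not anything from NVIDIA): bus width in bytes multiplied by the per-pin data rate gives the peak bandwidth.

```javascript
// Illustrative only: bus width in bits / 8 = bus width in bytes;
// multiply by the per-pin data rate in Gbps to get GB/s.
const bandwidthGBs = (busWidthBits, gbpsPerPin) => (busWidthBits / 8) * gbpsPerPin;

console.log(bandwidthGBs(384, 10)); // 480 -- matches the claimed Titan X figure
console.log(bandwidthGBs(256, 10)); // 320 -- the GTX 1080, for comparison
```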

titanxpascal2.jpg

What will you be asked to pay? $1200, going on sale on August 2nd, and only on NVIDIA.com, at least for now. Considering the prices of GeForce GTX 1080 cards with such limited availability, the $1200 price tag MIGHT NOT seem so insane. That's higher than the $999 starting price of the Titan X based on Maxwell in March of 2015 - the claims that NVIDIA is artificially raising prices of cards in each segment will continue, it seems.

I am curious about the TDP on the new Titan X - will it hit the 250 watt mark of the previous version? Yes, apparently it will hit that 250 watt TDP - specs above updated. Does this also mean we'll see a GeForce GTX 1080 Ti that falls between the GTX 1080 and this new Titan X? Maybe, but we are likely looking at an $899 or higher SEP - so get those wallets ready.

That's it for now; we'll have a briefing where we can get more details soon, and hopefully a review ready for you on August 2nd when the cards go on sale!

Source: NVIDIA

Podcast #409 - GTX 1060 Review, 3DMark Time Spy Controversy, Tiny Nintendo and more!

Subject: General Tech | July 21, 2016 - 12:21 PM |
Tagged: Wraith, Volta, video, time spy, softbank, riotoro, retroarch, podcast, nvidia, new, kaby lake, Intel, gtx 1060, geforce, asynchronous compute, async compute, arm, apollo lake, amd, 3dmark, 10nm, 1070m, 1060m

PC Perspective Podcast #409 - 07/21/2016

Join us this week as we discuss the GTX 1060 review, controversy surrounding the async compute of 3DMark Time Spy and more!!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

This episode of the PC Perspective Podcast is sponsored by Casper!

Hosts:  Ryan Shrout, Allyn Malventano, Jeremy Hellstrom, and Josh Walrath

Program length: 1:34:57
  1. Week in Review:
  2. 0:51:17 This episode of the PC Perspective Podcast is sponsored by Casper!
  3. News items of interest:
  4. 1:26:26 Hardware/Software Picks of the Week
    1. Ryan: Sapphire Nitro Bot
    2. Allyn: klocki - chill puzzle game (also on iOS / Android)
  5. Closing/outro

Report: NVIDIA GeForce GTX 1070M and 1060M Specs Leaked

Subject: Graphics Cards | July 20, 2016 - 12:19 PM |
Tagged: VideoCardz, rumor, report, nvidia, GTX 1070M, GTX 1060M, GeForce GTX 1070, GeForce GTX 1060, 2048 CUDA Cores

Specifications for the upcoming mobile version of NVIDIA's GTX 1070 GPU may have leaked, and according to the report at VideoCardz.com, this GTX 1070M will have 2048 CUDA cores, 128 more than the desktop version's 1920.

nvidia-geforce-gtx-1070-mobile-specs.jpg

Image credit: BenchLife via VideoCardz

The report comes via BenchLife, with the screenshot of GPU-Z showing the higher CUDA core count (though VideoCardz mentions the TMU count should be 128). The memory interface remains at 256-bit for the mobile version, with 8GB of GDDR5.

VideoCardz reported another GPU-Z screenshot (via PurePC) of the mobile GTX 1060, which appears to offer the same specs as the desktop version, at a slightly lower clock speed.

nvidia-geforce-gtx-1060-mobile-specs.jpg

Image credit: PurePC via VideoCardz

Finally, this chart was provided for reference:

videocardz_chart.PNG

Image credit: VideoCardz

Note the absence of information about a mobile variant of the GTX 1080, details of which are still unknown (for now).

Source: VideoCardz
Manufacturer: Overclock.net

Yes, We're Writing About a Forum Post

Update - July 19th @ 7:15pm EDT: Well that was fast. Futuremark published their statement today. I haven't read it through yet, but there's no reason to wait until I do before linking it.

Update 2 - July 20th @ 6:50pm EDT: We interviewed Jani Joki, Futuremark's Director of Engineering, on our YouTube page. The interview is embedded just below this update.

Original post below

The comments of a previous post notified us of an Overclock.net thread, whose author claims that 3DMark's implementation of asynchronous compute is designed to show NVIDIA in the best possible light. At the end of the linked post, they note that asynchronous compute is a general, blanket term, and that we should better understand what is actually going on.

amd-mantle-queues.jpg

So, before we address the controversy, let's actually explain what asynchronous compute is. The main problem is that it actually is a broad term. Asynchronous compute could describe any optimization that allows tasks to execute when it is most convenient, rather than just blindly doing them in a row.

I will use JavaScript as a metaphor. In this language, you can assign tasks to be executed asynchronously by passing functions as parameters. This allows events to execute code when it is convenient. JavaScript, however, is still only single-threaded (without Web Workers and newer technologies). It cannot run callbacks from multiple events simultaneously, even if you have an available core on your CPU. What it does, however, is allow the browser to manage its time better. Many events can be delayed until the browser has rendered the page, has finished other high-priority tasks, or until the asynchronous code has everything it needs, such as assets loaded from the internet.
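
To make that concrete, here is a minimal, hypothetical sketch of the single-threaded callback model (the URL is just a placeholder): the request is fired off, the main thread keeps running, and the callbacks execute later, whenever the response has arrived and the main thread happens to be free.

```javascript
// Illustrative only: the URL below is a placeholder, not a real asset.
console.log("request sent");

fetch("https://example.com/asset.json")
  .then((response) => response.json())
  .then((data) => console.log("asset ready:", data))
  .catch((err) => console.error("load failed:", err));

// This line runs immediately; the callbacks above run later, on the same
// single thread, once it is no longer busy.
console.log("main thread keeps running while the asset loads");
```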

mozilla-architecture.jpg

This is asynchronous computing.

However, if JavaScript were designed differently, it would have been possible to run callbacks on any available thread, not just the main thread when it becomes available. Again, JavaScript is not designed in this way, but this is where I pull the analogy back to AMD's Asynchronous Compute Engines. In an ideal situation, a graphics driver will be able to see all of the functionality that a task will require, and push that task down to an already-busy GPU, provided the specific resources the task requires are not fully utilized by the existing work.
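
To stretch the JavaScript analogy one step further (this is my own illustration, not anything from AMD or Futuremark), Web Workers are the closest thing the language has to that "run it wherever there is idle capacity" model: a worker can grind through a long compute job on another thread while the main thread stays free to keep rendering.

```javascript
// Illustrative only (browser JavaScript): an inline Web Worker standing in for
// a "compute queue" that runs alongside the main thread's "graphics" work.
const workerSource = `
  self.onmessage = (e) => {
    let sum = 0;
    for (let i = 0; i < e.data; i++) sum += i;  // simulate a long compute job
    self.postMessage(sum);
  };
`;
const blobUrl = URL.createObjectURL(new Blob([workerSource], { type: "text/javascript" }));
const worker = new Worker(blobUrl);

worker.onmessage = (e) => console.log("compute result:", e.data);
worker.postMessage(1e8);  // kick off the long job on another thread

requestAnimationFrame(() => {
  // The main thread is not blocked by the worker, so it can keep "rendering".
  console.log("main thread drew a frame while the compute job ran");
});
```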

Read on to see how this is being implemented, and what the controversy is.