Subject: Graphics Cards | July 16, 2016 - 06:37 PM | Scott Michaud
Tagged: Volta, pascal, nvidia, maxwell, 16nm
For the past few generations, NVIDIA has roughly aimed to release a new architecture on a new process node, followed by a refresh the next year. This plan hit a hitch when Maxwell was delayed a year, apart from the GTX 750 Ti, and then pushed back to the same 28nm process that Kepler used. Pascal caught up with 16nm, although we know that some hard, physical limitations are right around the corner. The lattice spacing of silicon at room temperature is around 0.5nm, so we're already talking about features in the low 30s of atoms in width.
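The back-of-envelope arithmetic behind that figure is simple (a quick sketch; the 0.543 nm value is silicon's conventional lattice constant at room temperature):

```python
# How many silicon lattice cells span a feature at a given process scale?
LATTICE_NM = 0.543  # conventional lattice constant of silicon at room temperature

for feature_nm in (16.0, 10.0):
    cells = feature_nm / LATTICE_NM
    print(f"{feature_nm:>4} nm feature is about {cells:.0f} lattice cells across")
```

A 16nm feature works out to roughly 30 lattice cells wide, which is where the "low 30s of atoms" ballpark comes from; at 10nm it drops below 20.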
This rumor claims that NVIDIA is not trying to go to 10nm for Volta. Instead, it will be built on the same 16nm node that Pascal currently occupies. This is quite interesting, because GPUs scale quite well with complexity: they have many parallel units running at relatively low clock rates, so the only real ways to increase performance are to make the existing architecture more efficient, or to make a larger chip.
That said, GP100 leaves a lot of room on the table for an FP32-optimized, ~600mm2 part to crush its performance at the high end, similar to how GM200 replaced GK110. The rumored GP102, expected in the ~450mm2 range for Titan or GTX 1080 Ti-style parts, has some room to grow. Like GM200, however, it would also be unappealing to GPU compute users who need FP64. If this is what is going on, and we're totally just speculating at the moment, it would signal that enterprise customers should expect a new GPGPU card every second gaming generation.
That is, of course, unless NVIDIA has found ways to make the Maxwell-based architecture significantly more die-space efficient in Volta. Clocks could get higher, or the circuits themselves could get simpler. You would think that, especially in the latter case, they would have integrated those ideas into Maxwell and Pascal already; but, as with HBM2 memory, there might have been a reason why they couldn't.
We'll need to wait and see. The entire rumor could be crap, who knows?
Subject: General Tech | July 14, 2016 - 06:06 PM | Jeremy Hellstrom
Tagged: nvidia, vr funhouse, ansel, vrworks
A while back Scott wrote about NVIDIA's Ansel, a screenshot application on performance-enhancing drugs. Today it arrives, paired with their new driver, adding Mirror's Edge Catalyst to a list of supported games that includes The Witcher 3: Wild Hunt, Unreal Tournament, Tom Clancy's The Division, and No Man's Sky, to name a few. The tool allows you to take 360-degree screen captures, letting you rotate fully around the image on a 2D screen or with VR headsets like the Vive or Rift. Just trigger the capture while you are in game; the game will pause so you can roll, zoom, and position your focus to frame the screenshot you want. From there, hit the Super Resolution button and your screenshot will be of significantly higher quality than the game could ever render in real time. The thumbnail below is available in its original 46080x25920 resolution by visiting NVIDIA's Ansel page; it is a mere 1.7GB in size.
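For a sense of scale, the raw pixel math on that capture (plain arithmetic, nothing assumed beyond the stated resolution):

```python
# Pixel count and uncompressed size of the 46080x25920 Ansel capture
w, h = 46080, 25920
pixels = w * h
print(f"{pixels:,} pixels (~{pixels / 1e9:.2f} gigapixels)")
# 24-bit color = 3 bytes per pixel, before any image compression
print(f"~{pixels * 3 / 2**30:.1f} GiB uncompressed")
```

That is about 1.19 gigapixels, or roughly 3.3 GiB uncompressed at 24-bit color, so the 1.7GB download is already around half the raw size.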
NVIDIA also released their first game today, VR Funhouse, available on Steam for no charge ... apart from the HTC Vive and the minimum hardware requirements of a GTX 1060 and i7 4790, or the recommended GTX 1080 and i7 5930, which are enough of an investment as it is. There are seven games to play; expect skeet shooting, whack-a-mole, and other standard carny games. At the same time, it is a showcase of NVIDIA's VR technology: not just the *Works libraries we are familiar with, but also VR SLI support for those with multiple GPUs and VRWorks Multi-Res Shading, which reduces processing load by rendering the periphery of the image at lower detail than the center of your field of view. If you have the hardware, you should check out the game; it is certainly worth the admission price.
Subject: Graphics Cards | July 13, 2016 - 09:20 PM | Scott Michaud
Tagged: vulkan, R9 Fury X, nvidia, Mantle, gtx 1070, fury x, doom, amd
We haven't yet benchmarked DOOM on Vulkan. Update (immediately after posting): Ryan has just informed me that, apparently, we did benchmark Vulkan on our YouTube page (embed below). I knew we were working on it; I just didn't realize we had published content yet. The original post continues below.
As far as I know, we're still getting our frame time analysis software running on the new API, but other sites have posted framerate-based results. They show that AMD's cards benefit greatly from the new, Mantle-derived interface (versus the OpenGL one). NVIDIA, on the other hand, never sees a decrease of more than about 1%, but it doesn't really get much of a boost, either.
Image Credit: ComputerBase.de
I tweeted at id's lead renderer programmer, Tiago Sousa, to ask whether they take advantage of NVIDIA-specific extensions on the OpenGL path (like command submission queues). I haven't received a response yet, so it's difficult to tell whether this speaks more to NVIDIA's OpenGL performance or to AMD's Vulkan performance. In the end, it doesn't really matter, though. AMD's Fury X (which can be found for as low as $399 with a mail-in rebate) is beating the GTX 1070 (which is in stock for the low $400s) by a fair margin. The Fury X also beats its own OpenGL performance by up to 66% (at 1080p) with the new API.
The API should also make it easier for games to pace their frames, which should allow smoother animation at these higher framerates. That said, we can't know for sure from FPS numbers alone. The gains for AMD are impressive, though.
Subject: Graphics Cards | July 7, 2016 - 10:13 PM | Scott Michaud
Tagged: nvidia, GTX 1080, ea, dice, battlefield, battlefield 1
Battlefield 1 looks pretty good. To compare how it scales between its settings, DigitalFoundry captured short video clips at 4K across all four omnibus graphics presets: Low, Medium, High, and Ultra. These are, as should be expected for a high-end PC game, broken down into more specific categories, like lighting quality and texture filtering, but I can't blame them for not adding that many permutations to a single video. It would just be a mess.
The rendering itself doesn't change too much between settings to my eye. Higher quality settings draw more distant objects than lower ones, and increase the range at which level of detail falls off, too. About a third of the way into the video, they show a house from a moderate distance. At the lowest quality, it was almost completely devoid of shadowing and its windows would not even draw. The lighting then scaled up from there as the settings moved progressively toward Ultra.
Image Credit: DigitalFoundry
While it's still Alpha-level code, a single GTX 1080 was getting between 50 and 60 FPS at 4K. This is a good range to be in for a G-Sync monitor, as the single-card machine doesn't need to deal with multi-GPU issues, like pacing and driver support.
It’s probably not going to come as a surprise to anyone that reads the internet, but NVIDIA is officially taking the covers off its latest GeForce card in the Pascal family today, the GeForce GTX 1060. As the number scheme would suggest, this is a more budget-friendly version of NVIDIA’s latest architecture, lowering performance in line with expectations. The GP106-based GPU will still offer impressive specifications and capabilities and will probably push AMD’s new Radeon RX 480 to its limits.
Let’s take a quick look at the card’s details.
| | GTX 1060 | RX 480 | R9 390 | R9 380 | GTX 980 | GTX 970 | GTX 960 | R9 Nano | GTX 1070 |
|---|---|---|---|---|---|---|---|---|---|
| GPU | GP106 | Polaris 10 | Grenada | Tonga | GM204 | GM204 | GM206 | Fiji XT | GP104 |
| Rated Clock | 1506 MHz | 1120 MHz | 1000 MHz | 970 MHz | 1126 MHz | 1050 MHz | 1126 MHz | up to 1000 MHz | 1506 MHz |
| Texture Units | 80 (?) | 144 | 160 | 112 | 128 | 104 | 64 | 256 | 120 |
| ROP Units | 48 (?) | 32 | 64 | 32 | 64 | 56 | 32 | 64 | 64 |
| Memory Clock | 8000 MHz | 7000 MHz | 6000 MHz | 5700 MHz | 7000 MHz | 7000 MHz | 7000 MHz | 500 MHz | 8000 MHz |
| Memory Interface | 192-bit | 256-bit | 512-bit | 256-bit | 256-bit | 256-bit | 128-bit | 4096-bit (HBM) | 256-bit |
| Memory Bandwidth | 192 GB/s | 224 GB/s | 384 GB/s | 182.4 GB/s | 224 GB/s | 196 GB/s | 112 GB/s | 512 GB/s | 256 GB/s |
| TDP | 120 watts | 150 watts | 275 watts | 190 watts | 165 watts | 145 watts | 120 watts | 275 watts | 150 watts |
| Peak Compute | 3.85 TFLOPS | 5.1 TFLOPS | 5.1 TFLOPS | 3.48 TFLOPS | 4.61 TFLOPS | 3.4 TFLOPS | 2.3 TFLOPS | 8.19 TFLOPS | 5.7 TFLOPS |
The GeForce GTX 1060 will sport 1280 CUDA cores with a GPU Boost clock speed rated at 1.7 GHz. The card will be available only in a 6GB variety; the reference / Founders Edition will ship with 6GB of GDDR5 memory running at 8.0 GHz (8 Gbps). With 1280 CUDA cores, the GP106 GPU is essentially one half of a GP104 in terms of compute capability. NVIDIA decided not to cut the memory interface in half, though, instead going with a 192-bit design compared to the 256-bit option on GP104.
The rated GPU clock speeds paint an interesting picture for the peak performance of the new card. At its rated boost clock speed, the GeForce GTX 1070 produces 6.46 TFLOPS of performance. The GTX 1060, by comparison, will hit 4.35 TFLOPS, a 48% difference. The GTX 1080 offers nearly the same performance delta above the GTX 1070; clearly NVIDIA has settled on the scale of product differentiation for Pascal.
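A quick sketch of where those TFLOPS numbers come from (assuming the usual two FLOPs per CUDA core per clock from fused multiply-add; the GTX 1070's 1920 cores and 1683 MHz boost clock are its published specifications):

```python
def peak_tflops(cuda_cores, boost_mhz):
    # Each CUDA core can retire one fused multiply-add (2 FLOPs) per clock.
    return cuda_cores * boost_mhz * 2 / 1_000_000

gtx_1060 = peak_tflops(1280, 1700)  # ~4.35 TFLOPS
gtx_1070 = peak_tflops(1920, 1683)  # ~6.46 TFLOPS
print(f"GTX 1070 is {gtx_1070 / gtx_1060 - 1:.1%} faster on paper")
```

The same formula against the GTX 1060's 1506 MHz base clock reproduces the 3.85 TFLOPS figure in the table above.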
NVIDIA wants us to compare the new GeForce GTX 1060 to the GeForce GTX 980 in gaming performance, but the peak theoretical numbers don't really match up. The GeForce GTX 980 is rated at 4.61 TFLOPS at its BASE clock speed, while the GTX 1060 doesn't hit that number even at its Boost clock. Pascal does improve performance with memory compression advancements, but the 192-bit memory bus only provides 192 GB/s, compared to the 224 GB/s of the GTX 980. Obviously we'll have to wait for performance results from our own testing to be sure, but it seems possible that NVIDIA's performance claims depend on technology like Simultaneous Multi-Projection and VR gaming to be validated.
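The bandwidth comparison is straightforward arithmetic: bus width in bits times the per-pin data rate, divided by eight bits per byte.

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    # Each pin transfers data_rate_gbps gigabits per second; /8 converts to bytes.
    return bus_width_bits * data_rate_gbps / 8

print(bandwidth_gb_s(192, 8))  # GTX 1060: 192.0 GB/s
print(bandwidth_gb_s(256, 7))  # GTX 980:  224.0 GB/s
```

So despite running its GDDR5 a full 1 Gbps faster, the GTX 1060's narrower bus leaves it about 14% short of the GTX 980's raw bandwidth, before compression is considered.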
Subject: Graphics Cards | July 6, 2016 - 11:56 PM | Scott Michaud
Tagged: titan, pascal, nvidia, gtx 1080 ti, gp102, GP100
Normally, I pose these sorts of rumors as "well, here you go, and here's a grain of salt." This one I'm fairly sure is bogus, at least to some extent. I could be wrong, but the GP100 aspects of it, especially, just don't make sense.
Before I get to that, the rumor is that NVIDIA will announce a GeForce GTX Titan P at Gamescom in Germany. The event occurs mid-August (17th - 21st) and it has been basically Europe's E3 in terms of gaming announcements. It also overlaps with Europe's Game Developers Conference (GDC), which occurs in March for us. The rumor says that it will use GP100 (!?!) with either 12GB of VRAM, 16GB of VRAM, or two variants as we've seen with the Tesla P100 accelerator.
The rumor also acknowledges the previously rumored GP102 die, claims that it will be for the GTX 1080 Ti, and suggests that it will have up to 3840 CUDA cores. This is the same number of CUDA cores as the GP100, which is where I get confused. This would mean that NVIDIA made a special die, which other rumors claim is ~450mm2, for just the GeForce GTX 1080 Ti.
I mean, it's possible that NVIDIA would split the GTX 1080 Ti and the next Titan along similar gaming performance, with the Titan offering better half- and double-precision performance and faster memory for GPGPU developers. That would seem very weird to me, though: developing two different GPU dies for the consumer market with roughly the same gaming performance.
And they would be announcing the Titan P first???
The harder to yield one???
When the Tesla version isn't even expected until Q4???
I can see it happening, but I seriously doubt it. Something may be announced, but I'd have to believe it will be at least slightly different from the rumors that we are hearing now.
Subject: Graphics Cards | July 6, 2016 - 05:10 PM | Scott Michaud
Tagged: VR, Oculus, nvidia, graphics drivers, DiRT Rally
A Game Ready Driver has just launched for DiRT Rally VR. GeForce Drivers 368.69 WHQL increments upon the last release, obviously adding optimizations for DiRT Rally VR, but it also includes a few new SLI profiles (Armored Warfare, Dangerous Golf, iRacing: Motorsport Simulator, Lost Ark, and Tiger Knight) and probably other bug fixes.
The DiRT Rally VR update itself doesn't yet have a release date, but it should be soon. According to NVIDIA's blog post, it sounds like it will come first to the Oculus Store and arrive on Steam later this month. I haven't been following the game too heavily, but there doesn't seem to be any announcement about official HTC Vive support that I can find.
You can pick up the drivers at NVIDIA's website or through GeForce Experience. Thankfully, the GeForce Experience 3 Beta seems to pick up new drivers much more quickly than the previous version.
Subject: Graphics Cards | July 6, 2016 - 07:15 AM | Scott Michaud
Tagged: pascal, nvidia, htc vive, GTX 1080, gtx 1070, GP104
NVIDIA is working on a fix to allow the HTC Vive to be connected to the GeForce GTX 1070 and GTX 1080 over DisplayPort. The HTC Vive offers the choice between HDMI and Mini DisplayPort, but the headset will not be identified when connected over DisplayPort. Currently, the two workarounds are to connect the HTC Vive over HDMI, or to use a DisplayPort to HDMI adapter if your card's HDMI output is already occupied.
It has apparently been an open issue for over a month now. That said, NVIDIA's Manuel Guzman has acknowledged the issue. Other threads claim that other displays have a similar issue, and, within the last 24 hours, some users have had luck modifying their motherboard's settings. I'd expect that it's something they can fix in an upcoming driver, though. For now, plan your monitor outputs accordingly if you were planning on getting the HTC Vive.
Subject: Graphics Cards | July 2, 2016 - 01:25 AM | Scott Michaud
Tagged: nvidia, geforce, geforce experience
GeForce Experience will be getting an updated UI soon, and a beta release is available now. It has basically been fully redesigned, while the NVIDIA Control Panel remains the same as it has been. That said, even though it is the newer of the two, GeForce Experience could benefit from a good overhaul, especially in terms of start-up delay. NVIDIA says it uses half the memory and loads three times faster. It still shows a brief loading bar, but for less than a second.
Interestingly, I noticed that, even though I skipped over Sharing Settings on first launch, Instant Replay was set to On by default. This could have been carried over from my previous instance of GeForce Experience, although I'm pretty sure I left it off. Privacy-conscious folks might want to verify that ShadowPlay isn't running, just in case.
One downside for some of our users is that you now need an NVIDIA account (or to connect your Google account to NVIDIA) to access it. Previously, you could use features like ShadowPlay while logged out, but that no longer appears to be the case. This will no doubt upset some of our audience, but it's not entirely unexpected, given NVIDIA's previous statements about requiring an NVIDIA account for Beta drivers. Extending that to the rest of GeForce Experience isn't too surprising considering that.
We'll now end where we began: installation. For testing (and hopefully providing feedback) during the beta, NVIDIA will be giving away GTX 1080s on a weekly basis. To enter, you apparently just need to install the Beta and log in with your NVIDIA (or Google) account.
Subject: Graphics Cards | June 30, 2016 - 07:54 PM | Scott Michaud
Tagged: amd, nvidia, FinFET, Polaris, polaris 10, pascal
If you're trying to purchase a Pascal or Polaris-based GPU, then you are probably well aware that patience is a required virtue. The problem is that, as a hardware website, we don't really know whether the issue is high demand or low supply. Both are manufactured on a new process node, which could mean that yield is a problem. On the other hand, it's been about four years since the last fabrication node, which means that chips got much smaller for the same performance.
Over time, manufacturing processes mature and yield increases. But what about right now? AMD made a very small chip that delivers roughly GTX 970-level performance. NVIDIA stuck with their typical 3XXmm2 chip, which ended up delivering higher-than-Titan X levels of performance.
It turns out that, according to online retailer Overclockers UK (via Fudzilla), both the RX 480 and GTX 1080 have sold over a thousand units at that location alone. That's quite a bit, especially when you consider that this is just one (large) online retailer in Europe. It's difficult to say how much stock other stores (and regions) received compared to them, but it's still a thousand units in a day.
It's sounding like, for both vendors, pent-up demand might be the dominant factor.