Subject: Graphics Cards | June 21, 2016 - 05:22 PM | Scott Michaud
Tagged: nvidia, fermi, kepler, maxwell, pascal, gf100, gf110, GK104, gk110, GM204, gm200, GP104
TechSpot published an article that compared eight GPUs across six high-end dies in NVIDIA's last four architectures: Fermi to Pascal. Average frame rates were listed across nine games, each measured at three resolutions: 1366x768 (~720p HD), 1920x1080 (1080p FHD), and 2560x1600 (~1440p QHD).
The results are interesting. Comparing GP104 to GF100, mainstream Pascal is typically on the order of four times faster than big Fermi. Over that time, we've had three full generational leaps in fabrication technology, leading to over twice the number of transistors packed into a die that is almost half the size. The article also shows that prices have remained relatively constant, except that the GTX 1080 is sort-of priced in the x80 Ti category despite the die size placing it in the non-Ti class. (They list the 1080 at $600, but you can't really find anything outside the $650-700 USD range.)
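To put rough numbers on the "twice the transistors in almost half the area" point, here is a minimal sketch. The transistor counts and die sizes are approximate, publicly quoted figures that I am assuming for illustration; they do not come from the TechSpot article.

```python
# Approximate, publicly quoted figures (assumptions, not from the article).
GF100 = {"transistors": 3.0e9, "die_mm2": 529.0}  # big Fermi
GP104 = {"transistors": 7.2e9, "die_mm2": 314.0}  # mainstream Pascal


def density(chip):
    """Return transistors per square millimetre for a chip description."""
    return chip["transistors"] / chip["die_mm2"]


transistor_ratio = GP104["transistors"] / GF100["transistors"]
area_ratio = GP104["die_mm2"] / GF100["die_mm2"]
density_ratio = density(GP104) / density(GF100)

print(f"Transistors: {transistor_ratio:.2f}x")
print(f"Die area:    {area_ratio:.2f}x")
print(f"Density:     {density_ratio:.2f}x")
```

Under those assumed figures, the density gain works out to roughly four times, which is in the same ballpark as the performance difference.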
It would be interesting to see this data set compared against AMD. It's informative for an NVIDIA-only article, though.
Subject: Graphics Cards | June 20, 2016 - 01:57 PM | Scott Michaud
Tagged: tesla, pascal, nvidia, GP100
GP100, the “Big Pascal” chip that was announced at GTC, will be coming to PCIe for enterprise and supercomputer customers in Q4 2016. Previously, it was only announced with NVIDIA's proprietary NVLink connection. In fact, they also gave themselves some lead time with their first-party DGX-1 system, which retails for $129,000 USD, although we expect that was more for yield reasons. Josh calculated that each GPU in that system is worth more than the full wafer that its die was manufactured on.
This brings us to the PCIe versions. Interestingly, they have been down-binned from the NVLink version. The boost clock has been dropped to 1300 MHz, from 1480 MHz, although that is matched with a slightly lower TDP (250W versus the NVLink's 300W). This lowers the FP16 performance to 18.7 TFLOPs, down from 21.2, FP32 performance to 9.3 TFLOPs, down from 10.6, and FP64 performance to 4.7 TFLOPs, down from 5.3. This is where we get to the question: did NVIDIA reduce the clocks to hit a 250W TDP and be compatible with the passive cooling technology that previous Tesla cards utilize, or were the clocks dropped to increase yield?
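The throughput figures above follow almost directly from the clock change. As a rough sketch: peak FP32 throughput is cores times two operations per clock (a fused multiply-add) times the boost clock. The 3584 CUDA core count is my assumption (it is not stated in this post), and the 2:1 FP16 and 1:2 FP64 ratios are inferred from the numbers quoted above.

```python
CUDA_CORES = 3584            # assumed core count for Tesla P100 (not in the post)
FLOPS_PER_CORE_PER_CLOCK = 2  # one fused multiply-add counts as two operations


def peak_tflops(boost_mhz, cores=CUDA_CORES):
    """Estimate peak TFLOPs at each precision from the boost clock in MHz."""
    fp32 = cores * FLOPS_PER_CORE_PER_CLOCK * boost_mhz * 1e6 / 1e12
    return {
        "fp16": fp32 * 2,  # half precision at twice the FP32 rate (inferred)
        "fp32": fp32,
        "fp64": fp32 / 2,  # double precision at half the FP32 rate (inferred)
    }


nvlink = peak_tflops(1480)
pcie = peak_tflops(1300)

for name, result in (("NVLink", nvlink), ("PCIe", pcie)):
    print(name, {k: round(v, 2) for k, v in result.items()})
```

With the rounded 1300 MHz clock, FP16 lands a hair under the quoted 18.7 TFLOPs, which suggests the actual boost clock is slightly above 1300 MHz. Either way, the performance drop tracks the clock drop, so nothing else appears to have been cut from the compute side.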
They are also providing a 12GB version of the PCIe Tesla P100. I didn't realize that GPU vendors could selectively disable HBM2 stacks, but NVIDIA disabled 4GB of memory, which also dropped the bus width to 3072-bit. You would think that the simplicity of the circuit would favor dividing work in a power-of-two fashion, but, knowing that they can, it makes me wonder why they did. Again, my first reaction is to question GP100 yield, but you wouldn't think that disabling a chunk of the HBM interface, which is such a small part of the die, would let them reclaim a lot of chips, right? That is, unless the HBM2 stacks themselves have yield issues -- which would be interesting.
There is also still no word on a 32GB version. Samsung claimed the memory technology, 8GB stacks of HBM2, would be ready for products in Q4 2016 or early 2017. We'll need to wait and see where, when, and why it will appear.
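The memory configurations discussed above fall out of simple per-stack arithmetic. This is a sketch assuming each HBM2 stack exposes a 1024-bit interface and that the full P100 carries four stacks; neither figure is stated in this post, and `hbm2_config` is just an illustrative helper.

```python
BITS_PER_STACK = 1024  # assumed HBM2 interface width per stack


def hbm2_config(active_stacks, gb_per_stack):
    """Return (capacity in GB, bus width in bits) for a set of HBM2 stacks."""
    capacity_gb = active_stacks * gb_per_stack
    bus_width_bits = active_stacks * BITS_PER_STACK
    return capacity_gb, bus_width_bits


full_p100 = hbm2_config(active_stacks=4, gb_per_stack=4)
cut_p100 = hbm2_config(active_stacks=3, gb_per_stack=4)
rumored_32gb = hbm2_config(active_stacks=4, gb_per_stack=8)

print("16GB part:", full_p100)
print("12GB part:", cut_p100)
print("32GB part:", rumored_32gb)
```

Seen this way, the 3072-bit bus is not an odd design choice in itself. It is just what remains when one of four stacks is switched off, and a 32GB part would only need the 8GB stacks that Samsung mentioned.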
Subject: Graphics Cards | June 8, 2016 - 08:44 PM | Ryan Shrout
Tagged: sli, pascal, nvidia, GTX 1080, GP104, geforce, 4-way sli, 3-way sli
IMPORTANT UPDATE: After writing this story, but before publication, we went to NVIDIA for comment. As we were getting ready to publish, the company updated me with a shift in its stance on multi-GPU configurations. NVIDIA will no longer require an "enthusiast key" to enable SLI on more than two GPUs. However, NVIDIA will also only be enabling 3-Way and 4-Way SLI for a select few applications. More details are at the bottom of the story!
You'll likely recall that during our initial review of the GeForce GTX 1080 Founders Edition graphics card, we mentioned that NVIDIA was going to be moving people towards the idea that "only 2-Way SLI will be supported" and promoted. There would still be a path for users that wanted 3 and 4 GPU configurations anyway, and it would be called the Enthusiast Key.
As it turns out, after returning from an AMD event focused on its upcoming Polaris GPUs, I happen to have amassed a total of four GeForce GTX 1080 cards.
Courtesy of some friends at EVGA and two readers that were awesome enough to let me open up their brand new hardware for a day or so, I was able to go through the 3-Way and 4-Way SLI configuration process. Once all four were installed, and I must point out how great it is that each card only required a single 8-pin power connector, I installed the latest NVIDIA driver I had on hand, 368.19.
Knowing about the need for the Enthusiast Key, and also knowing that I did not yet have one and that the website that was supposed to be live to enable me to get one is still not live, I thought I might have stumbled upon some magic. The driver appeared to let me enable SLI anyway.
Enthusiasts will note, however, that the green marker under the four GPUs with the "SLI" text is clearly only pointing at two of the GTX 1080s, leaving the remaining two...unused. Crap.
At this point, if you have purchased more than two GeForce GTX 1080 cards, you are simply out of luck and are waiting on NVIDIA to make good on its promise to allow for 3-Way and 4-Way configurations via the Enthusiast Key. Or some other way. It's way too late now to simply say "we aren't supporting it at all."
While I wait...what is there for a gamer with four GeForce GTX 1080 cards to do? Well, you could run Ashes of the Singularity. Its multi-GPU implementation uses MDA mode, which means the game engine itself accesses each GPU on its own, without the need for the driver to handle anything regarding GPU load balancing. Unfortunately, Ashes only supports two GPUs today.
Well...you could run an OpenCL-based benchmark like LuxMark that accesses all the GPUs independently as well.
I did so, and the result is an impressive score of 17,127!!
How does that compare to some other products?
The four GTX 1080 cards produce a score that is 2.57x the result provided by the AMD Radeon Pro Duo and 2.29x the score of SLI GeForce GTX 980 Ti cards. Nice!
So there you go! We are just as eager to get our hands on the ability to test 3-Way and 4-Way SLI with new Pascal GPUs as some of the most extreme and dedicated enthusiasts out there are. With any luck, NVIDIA will finally figure out a way to allow it - no matter how it finally takes place.
IMPORTANT UPDATE: Before going to press with this story I asked NVIDIA for comment directly: when was the community finally going to get the Enthusiast Key website to unlock 3-Way and 4-Way SLI for those people crazy enough to have purchased that many GTX 1080s? The answer was quite surprising: NVIDIA is backing away from the idea of an "Enthusiast Key" and will no longer require it for enabling 3-Way and 4-Way SLI.
Here is the official NVIDIA statement given to PC Perspective on the subject:
With the GeForce 10-series we’re investing heavily in 2-way SLI with our new High Bandwidth bridge (which doubles the SLI bandwidth for faster, smoother gaming at ultra-high resolutions and refresh rates) and NVIDIA Game Ready Driver SLI profiles. To ensure the best possible gaming experience on our GeForce 10-series GPUs, we’re focusing our efforts on 2-way SLI only and will continue to include 2-way SLI profiles in our Game Ready Drivers.
DX12 and NVIDIA VR Works SLI technology also allows developers to directly implement and control multi-GPU support within their games. If a developer chooses to use these technologies then their game will not need SLI profiles. Some developers may also decide to support more than 2 GPUs in their games. We continue to work with all developers creating games and VR applications that take advantage of 2 or more GPUs to make sure they’ll work great on GeForce 10-series GPUs.
For our overclocking community, our Game Ready Drivers will also include SLI profiles for 3- and 4-way configurations for specific OC applications only, including Fire Strike, Unigine and Catzilla.
NVIDIA clearly wants to reiterate that only 2-Way SLI will get the attention that we have come to expect from the GeForce driver dev team. As the next-generation DX12 and Vulkan APIs become more prevalent, game developers will still have the ability to directly access more than two GeForce GTX 10-series GPUs, though I expect that to be a very narrow window of games simply due to development costs and time.
NVIDIA will enable support for three and four card configurations in future drivers (without a key) for specific overclocking/benchmarking tools only, as a way to make sure the GeForce brand doesn't fall off the 3DMark charts. Only those specific applications will be able to operate in the 3-Way and 4-Way SLI configurations that you have come to know. There are no profiles to change manually and even the rare games that might have "just worked" with three or four GPUs will not take advantage of more than two GTX 10-series cards. It's fair to say at this point that except for the benchmarking crowd, NVIDIA 3-Way and 4-Way SLI is over.
We expect the "benchmark only" mode of 3-Way and 4-Way SLI to be ready for consumers with the next "Game Ready" driver release. If you happened to get your hands on more than two GTX 1080s but aren't into benchmarking, then find those receipts and send a couple back.
So there you have it. Honestly, this is what I was expecting from NVIDIA with the initial launch of Pascal and the GeForce GTX 1080/1070 and I was surprised when I first heard about the idea of the "enthusiast key." It took a bit longer than expected, and NVIDIA will get more flak for the iterated dismissal of this very niche, but still pretty cool, technology. In the end, this won't have much impact on the company's bottom line as the quantity of users that were buying 3+ GTX GPUs for a single system was understandably small.
Subject: Graphics Cards | June 7, 2016 - 08:07 PM | Scott Michaud
Tagged: zotac, pascal, nvidia, GTX 1080, GP104, asus
Update @ 10:30pm, June 7th: Annnnnnnnd it's gone.
Update @ 9:45pm, June 7th: ASUS is now out-of-stock, so I crossed out the relevant links. ZOTAC is still around for now.
Update @ 8:45pm, June 7th: Turns out that it's also available on Newegg US. In fact, it's possible that both sites share from the same stock pool, at least for the US ASUS and US ZOTAC cards, given that Newegg Canada says it ships them from the US.
A couple of GeForce GTX 1080s are available at Newegg Canada at the moment. Both models, one from ASUS and one from ZOTAC, are listed at $909. This seems high, but it's actually the current US-to-Canada exchange rate from the $699 MSRP. If you were interested in the Founders Edition cards, then you have a brief moment to pick one up.
That said, it's looking like the custom-cooled versions might be a better bet. The EVGA dual-fan GAMING SC ACX 3.0 version is listed at $824.99 CDN (~$635 USD) and, from what we've seen so far, seems to be quite a bit cooler than the Founders Edition (albeit we haven't tested sound levels yet). Those should be coming out fairly soon, and will apparently lean on the cheaper side of the Founders Edition fence.
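For anyone who wants to check the currency math on the prices above, here is a quick sketch. The exchange rate is not an official quote; it is simply implied by the $909 CDN listing against the $699 USD MSRP.

```python
MSRP_USD = 699.00        # Founders Edition MSRP quoted above
NEWEGG_CA_CAD = 909.00   # Newegg Canada listing quoted above

# Implied exchange rate in CAD per USD (derived from the two prices only).
implied_rate = NEWEGG_CA_CAD / MSRP_USD


def cad_to_usd(cad, rate=implied_rate):
    """Convert a Canadian dollar price to US dollars at the given rate."""
    return cad / rate


evga_usd = cad_to_usd(824.99)

print(f"Implied rate: {implied_rate:.4f} CAD per USD")
print(f"EVGA SC ACX 3.0: ~${evga_usd:.0f} USD")
```

The implied rate comes out to about 1.30, which puts the EVGA card in the $634 to $635 USD range, comfortably below the Founders Edition.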
But, if you don't care, go go go go go.
Subject: Graphics Cards | June 7, 2016 - 03:50 PM | Jeremy Hellstrom
Tagged: geforce, GP104, gtx 1070, nvidia, pascal
With Computex behind us, it is time to catch up on all the reviews which were launched during the show, including [H]ard|OCP's review of the GTX 1070 Founders Edition. Their testing was done using an NVIDIA-provided driver, GeForce 368.19, the same one which Ryan used in his review. They did not have a chance to delve into overclocking or utilizing the new power settings. From their testing they concluded the GTX 1070 is a great upgrade for those using a vanilla GTX 980 or R9 390X. While the card performs faster than an R9 Fury X or GTX 980 Ti, the jump is not quite enough to recommend dumping either of those for anything less than a GTX 1080.
"In our review of the NVIDIA GeForce GTX 1070 Founders Edition video card we will explore the price competitive performance and find out what kind of gameplay advantage the new GeForce GTX 1070 Founders Edition offers over the previous generation cards. We compare both the GTX 980 and Radeon R9 Fury GPUs to the new GTX 1070"
Here are some more Graphics Card articles from around the web:
- Asus Republic Of Gamers Strix GTX 1080 Aura RGB OC @ Kitguru
- Nvidia GTX 1070 Founders Edition @ Kitguru
- NVIDIA GeForce GTX 1080 Overclocking & Best Playable Settings At 4K & Ultrawide @ Techgage
- NVIDIA GeForce GTX 1070 Overclocking Review @ OCC
- NVIDIA GeForce GTX 1080 On Linux: OpenGL, OpenCL, Vulkan Performance @ Phoronix
- Performance & Perf-Per-Watt From NVIDIA's GeForce 9800GTX To GTX 1080 @ Phoronix
- OpenGL Performance & Perf-Per-Watt From The Radeon HD 3850 Through R9 Fury @ Phoronix
Subject: Graphics Cards, Motherboards | June 6, 2016 - 05:14 PM | Scott Michaud
Tagged: pascal, nvidia, motherboard, gtx 1070, GP104, colorful
So here's an interesting bit of news from Colorful, via Videocardz and LG Nilsson. Remember when on-board graphics was a pejorative? Since the GPUs that are attached to many CPUs tend to sufficiently cover everything below a discrete graphics add-in board, there is not a whole lot of mind-share for discrete, on-board GPUs. You get the occasional desktop-style device with a mobile add-in module, but that's about it.
Image Credit: LG Nilsson (via Videocardz)
In this case, it looks like Colorful, the Chinese PC hardware manufacturer, integrated the required components from a GTX 1070 directly onto a motherboard's PCB. We've heard rumors that GP104 would be available in mobile form-factors in a few months, so it's possible that this draws from some laptop initiatives, but it's interesting to see others consider it too. As Videocardz pointed out, this is not an ATX-standard board, so it's possible that Colorful is planning on getting into (or supporting someone getting into) small form factor desktops or is building hardware for all-in-one PCs.
So what's next? A vendor like ASUS making a VRWorks Audio sound card with integrated Pascal?
Subject: Graphics Cards, Mobile | June 4, 2016 - 04:28 PM | Scott Michaud
Tagged: nvidia, GTX 1080, gtx 1070, pascal
Normally, when a GPU developer creates a laptop SKU, they re-use the desktop branding, add an M at the end, but release a very different, significantly slower part. This changed with the GTX 980, as NVIDIA cherry-picked the heck out of their production to find chips that could operate full-speed at a lower-than-usual TDP. With less power (and cooling) to consider, they were sent to laptop manufacturers and integrated into high-end designs.
They still had the lower-performance 980M, though, which was confusing for potential customers. You needed to know to avoid the M, and trust the product page to correctly add the M as applicable. This is where PCGamer's scoop comes into play. Apparently, NVIDIA will stop “producing separate M versions of its desktop GPUs”. Also, they are expected to release their 10-series desktop GPUs to their laptop partners by late summer.
Last time, NVIDIA took almost a year to bin enough GPUs for laptops. While we don't know how long they've been stockpiling GP104 GPUs, this, if the rumors are true, would just be about three months of lead-time for the desktop SKUs. Granted, Pascal is significantly more efficient than Maxwell. Maxwell tried to squeeze extra performance out of an existing fabrication node, while Pascal is a relatively smaller chip, benefiting from the industry's double-shrink in process technology. It's possible that they didn't need to drop the TDP threshold that far below what they accept for desktop.
For us desktop users, this also suggests that NVIDIA is not having too many issues with yield in general. I mean, if they were expecting GPU shortages to persist for months, you wouldn't expect that they would cut their supply further with a new product segment, particularly one that should require both decent volume and well-binned chips. This, again, might mean that we'll see desktop GPUs restock soon. Either that, or NVIDIA significantly miscalculated demand for new GPUs, and they needed to fulfill partner obligations that they made before reality struck.
Call it wishful thinking, but I don't think it's the latter.
Subject: Graphics Cards | May 30, 2016 - 03:50 PM | Jeremy Hellstrom
Tagged: geforce, GP104, gtx 1070, nvidia, pascal
If we missed your favourite game, synthetic benchmark or a specific competitor's card in our review of the new GTX 1070, then perhaps one of the sites below might satisfy your cravings. For instance, if it is Ashes of the Singularity or The Division which you want to see benchmarked, then [H]ard|OCP has you covered. They also had a go at overclocking; with the new software they set the card's fan speed to 100%, the power target to 112%, and the GPU Offset to +230. That resulted in a peak GPU speed of 2113MHz, although the average frequency over a 30 minute gaming session was 2052MHz. They will revisit the card to overclock the memory in the near future. Check out their full review here.
"The second video card in the NVIDIA next generation Pascal GPU architecture is finally here, we will explore the GeForce GTX 1070 Founders Edition video card. In this limited preview today we will look at performance in comparison to GeForce GTX 980 Ti and Radeon R9 Fury X as well as some preview overclocking."
Here are some more Graphics Card articles from around the web:
- GeForce GTX 1070 FCAT Frametime Analysis @ Guru of 3D
- NVIDIA GTX 1070 Review - The Revolution Continues @HiTech Legion
- Nvidia GeForce GTX 1070 @ Legion Hardware
- NVIDIA GeForce GTX 1070 Founders Edition Review @ OCC
- Radeon Linux 4.6 + Mesa 11.3 vs. NVIDIA Linux Performance & Perf-Per-Watt @ Phoronix
- XFX Radeon R9 Fury Triple Dissipation @ [H]ard|OCP
- NVIDIA GeForce GTX 1080 vs Titan X vs R9 Fury vs GTX 980 Ti vs GTX 980 vs R9 390X @HiTech Legion
- NVIDIA GeForce GTX 1080 Overclocking Review @ OCC
GP104 Strikes Again
It’s only been three weeks since NVIDIA unveiled the GeForce GTX 1080 and GTX 1070 graphics cards at a live streaming event in Austin, TX. But it feels like those two GPUs, one of which hasn't even been reviewed until today, have already drastically shifted the landscape of graphics, VR and PC gaming.
Half of the “new GPU” stories are told, with AMD due to follow up soon with Polaris, but it was clear to anyone watching the enthusiast segment with a hint of history that a line was drawn in the sand that day. There is THEN, and there is NOW. Today’s detailed review of the GeForce GTX 1070 completes NVIDIA’s first wave of NOW products, following closely behind the GeForce GTX 1080.
Interestingly, and in a move that is very uncharacteristic of NVIDIA, detailed specifications of the GeForce GTX 1070 were released on GeForce.com well before today’s reviews. With information on the CUDA core count, clock speeds, and memory bandwidth, it was possible to get a solid sense of where the GTX 1070 would perform, and I imagine that many of you already did the napkin math to figure that out. There is no more guessing though - reviews and testing are all done, and I think you'll find that the GTX 1070 is as exciting as the GTX 1080, if not more so, due to the performance and pricing combination that it provides.
Let’s dive in.
First, Some Background
NVIDIA's Rumored GP102
When GP100 was announced, Josh and I were discussing, internally, how it would make sense in the gaming industry. Recently, an article on WCCFTech cited anonymous sources, which should always be taken with a dash of salt, that claimed NVIDIA was planning a second architecture, GP102, between GP104 and GP100. As I was writing this editorial about it, relating it to our own speculation about the physics of Pascal, VideoCardz claims to have been contacted by the developers of AIDA64, seemingly on-the-record, also citing a GP102 design.
I will retell chunks of the rumor, but also add my opinion to it.
In the last few generations, each architecture had a flagship chip that was released in both gaming and professional SKUs. Neither audience had access to a chip that was larger than the other's largest of that generation. Clock rates and disabled portions varied by specific product, with gaming usually getting the more aggressive performance for slightly better benchmarks. Fermi had GF100/GF110, Kepler had GK110/GK210, and Maxwell had GM200. Each of these was available in Tesla, Quadro, and GeForce cards, especially Titans.
Maxwell was interesting, though. NVIDIA was unable to leave 28nm, which Kepler launched on, so they created a second architecture at that node. To increase performance without having access to more feature density, you need to make your designs bigger, more optimized, or more simple. GM200 was giant and optimized, but, to get the performance levels it achieved, also needed to be more simple. Something needed to go, and double-precision (FP64) performance was the big omission. NVIDIA was upfront about it at the Titan X launch, and told their GPU compute customers to keep purchasing Kepler if they valued FP64.