Manufacturer: NVIDIA

GP106 Preview

It’s probably not going to come as a surprise to anyone who reads the internet, but NVIDIA is officially taking the covers off the latest GeForce card in the Pascal family today, the GeForce GTX 1060. As the numbering scheme suggests, this is a more budget-friendly version of NVIDIA’s latest architecture, with performance scaled down in line with expectations. The GP106-based GPU will still offer impressive specifications and capabilities, and will probably push AMD’s new Radeon RX 480 to its limits.

01.jpg

Let’s take a quick look at the card’s details.

|                  | GTX 1060 | RX 480 | R9 390 | R9 380 | GTX 980 | GTX 970 | GTX 960 | R9 Nano | GTX 1070 |
|------------------|----------|--------|--------|--------|---------|---------|---------|---------|----------|
| GPU              | GP106 | Polaris 10 | Grenada | Tonga | GM204 | GM204 | GM206 | Fiji XT | GP104 |
| GPU Cores        | 1280 | 2304 | 2560 | 1792 | 2048 | 1664 | 1024 | 4096 | 1920 |
| Rated Clock      | 1506 MHz | 1120 MHz | 1000 MHz | 970 MHz | 1126 MHz | 1050 MHz | 1126 MHz | up to 1000 MHz | 1506 MHz |
| Texture Units    | 80 (?) | 144 | 160 | 112 | 128 | 104 | 64 | 256 | 120 |
| ROP Units        | 48 (?) | 32 | 64 | 32 | 64 | 56 | 32 | 64 | 64 |
| Memory           | 6GB | 4GB / 8GB | 8GB | 4GB | 4GB | 4GB | 2GB | 4GB | 8GB |
| Memory Clock     | 8000 MHz | 7000 MHz / 8000 MHz | 6000 MHz | 5700 MHz | 7000 MHz | 7000 MHz | 7000 MHz | 500 MHz | 8000 MHz |
| Memory Interface | 192-bit | 256-bit | 512-bit | 256-bit | 256-bit | 256-bit | 128-bit | 4096-bit (HBM) | 256-bit |
| Memory Bandwidth | 192 GB/s | 224 GB/s / 256 GB/s | 384 GB/s | 182.4 GB/s | 224 GB/s | 196 GB/s | 112 GB/s | 512 GB/s | 256 GB/s |
| TDP              | 120 watts | 150 watts | 275 watts | 190 watts | 165 watts | 145 watts | 120 watts | 275 watts | 150 watts |
| Peak Compute     | 3.85 TFLOPS | 5.1 TFLOPS | 5.1 TFLOPS | 3.48 TFLOPS | 4.61 TFLOPS | 3.4 TFLOPS | 2.3 TFLOPS | 8.19 TFLOPS | 5.7 TFLOPS |
| Transistor Count | ? | 5.7B | 6.2B | 5.0B | 5.2B | 5.2B | 2.94B | 8.9B | 7.2B |
| Process Tech     | 16nm | 14nm | 28nm | 28nm | 28nm | 28nm | 28nm | 28nm | 16nm |
| MSRP (current)   | $249 | $199 | $299 | $199 | $379 | $329 | $279 | $499 | $379 |

The GeForce GTX 1060 will sport 1280 CUDA cores with a GPU Boost clock speed rated at 1.7 GHz. The card will be available only in a 6GB variety, and the reference / Founders Edition will ship with 6GB of GDDR5 memory running at 8.0 GHz / 8 Gbps. With 1280 CUDA cores, the GP106 GPU is essentially one half of a GP104 in terms of compute capability. NVIDIA decided not to cut the memory interface in half, though, instead going with a 192-bit design compared to the 256-bit bus on GP104.

The rated GPU clock speeds paint an interesting picture for peak performance of the new card. At its rated boost clock, the GeForce GTX 1070 produces 6.46 TFLOPS; the GTX 1060, by comparison, will hit 4.35 TFLOPS, giving the GTX 1070 roughly a 48% advantage. The GTX 1080 sits above the GTX 1070 by a similar margin; clearly NVIDIA has settled on the scale of product differentiation within the Pascal lineup.
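
Those peak numbers fall straight out of the core counts and rated boost clocks (two FLOPs per CUDA core per clock, thanks to fused multiply-add). A quick sanity-check sketch, using NVIDIA's rated boost figures:

```python
# Peak FP32 throughput = CUDA cores x clock x 2 FLOPs/clock (one FMA per core per clock).
def peak_tflops(cuda_cores: int, clock_mhz: float) -> float:
    return cuda_cores * clock_mhz * 1e6 * 2 / 1e12

print(f"GTX 1060: {peak_tflops(1280, 1700):.2f} TFLOPS")  # ~4.35 at the rated 1.7 GHz boost
print(f"GTX 1070: {peak_tflops(1920, 1683):.2f} TFLOPS")  # ~6.46 at the rated 1683 MHz boost
print(f"GTX 1080: {peak_tflops(2560, 1733):.2f} TFLOPS")  # ~8.87 at the rated 1733 MHz boost
```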

NVIDIA wants us to compare the new GeForce GTX 1060 to the GeForce GTX 980 in gaming performance, but the peak theoretical numbers don’t really match up. The GeForce GTX 980 is rated at 4.61 TFLOPS at its base clock, while the GTX 1060 doesn’t hit that number even at its boost clock. Pascal does improve effective throughput with memory compression advancements, but the 192-bit memory bus provides only 192 GB/s, compared to the 224 GB/s of the GTX 980. Obviously we’ll have to wait for performance results from our own testing to be sure, but it seems possible that NVIDIA’s performance claims depend on technology like Simultaneous Multi-Projection and VR gaming to be validated.
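
That bandwidth comparison is simple arithmetic, if you want to check it yourself: bus width in bytes multiplied by the effective data rate.

```python
# Memory bandwidth = (bus width in bits / 8) bytes x effective data rate in Gbps.
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits / 8 * data_rate_gbps

print(f"GTX 1060: {bandwidth_gb_s(192, 8.0):.0f} GB/s")  # 192 GB/s
print(f"GTX 980:  {bandwidth_gb_s(256, 7.0):.0f} GB/s")  # 224 GB/s
```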

Continue reading our preview of the new NVIDIA GeForce GTX 1060!!

This Has to Be Wrong... GP100 Titan P at Gamescom

Subject: Graphics Cards | July 6, 2016 - 11:56 PM |
Tagged: titan, pascal, nvidia, gtx 1080 ti, gp102, GP100

Normally, I pose these sorts of rumors as “Well, here you go, and here's a grain of salt.” This one I'm fairly sure is bogus, at least to some extent. I could be wrong, but the GP100 aspects of it, especially, just don't make sense.

nvidia-2016-gp100tesla.jpg

Before I get to that, the rumor is that NVIDIA will announce a GeForce GTX Titan P at Gamescom in Germany. The event runs in mid-August (17th - 21st) and has basically become Europe's E3 in terms of gaming announcements. It also overlaps with Europe's Game Developers Conference (GDC), the counterpart of the conference that occurs in March for us. The rumor says that the card will use GP100 (!?!) with either 12GB of VRAM, 16GB of VRAM, or both variants, as we've seen with the Tesla P100 accelerator.

The rumor also acknowledges the previously rumored GP102 die, claims that it will be for the GTX 1080 Ti, and suggests that it will have up to 3840 CUDA cores. This is the same number of CUDA cores as the GP100, which is where I get confused. This would mean that NVIDIA made a special die, which other rumors claim is ~450mm2, for just the GeForce GTX 1080 Ti.

I mean, it's possible that NVIDIA would split the GTX 1080 Ti and the next Titan by similar gaming performance, with the Titan adding better half- and double-precision throughput and faster memory for GPGPU developers. That would be very weird to me, though: developing two different GPU dies for the consumer market with probably the same gaming performance.

And they would be announcing the Titan P first???
The harder to yield one???
When the Tesla version isn't even expected until Q4???

I can see it happening, but I seriously doubt it. Something may be announced, but I'd have to believe it will be at least slightly different from the rumors that we are hearing now.

Source: TechPowerUp

Vive, DisplayPort, and GP104 Apparently Don't Mix For Now

Subject: Graphics Cards | July 6, 2016 - 07:15 AM |
Tagged: pascal, nvidia, htc vive, GTX 1080, gtx 1070, GP104

NVIDIA is working on a fix to allow the HTC Vive to be connected to the GeForce GTX 1070 and GTX 1080 over DisplayPort. The HTC Vive apparently offers a choice between HDMI and Mini DisplayPort, but the headset will not be identified when connected over the latter. Currently, the two workarounds are to connect the HTC Vive over HDMI, or to use a DisplayPort to HDMI adapter if your card's HDMI output is already occupied.

nvidia-2016-dreamhack-1080-stockphoto.png

It has apparently been an open issue for over a month now. That said, NVIDIA's Manuel Guzman has acknowledged the issue. Other threads claim that there are other displays with a similar problem, and, within the last 24 hours, some users have had luck with modifying their motherboard's settings. I'd expect that it's something they can fix in an upcoming driver, though. For now, plan your monitor outputs accordingly if you were planning on getting the HTC Vive.

Source: NVIDIA

Gigabyte Shows Off Bite Sized GTX 1070 Mini ITX OC Graphics Card

Subject: Graphics Cards | July 5, 2016 - 01:49 AM |
Tagged: gigabyte, gtx 1070, pascal, mini ITX, factory overclocked

Custom graphics cards based on NVIDIA’s GTX 1070 GPU have been rolling out from all the usual suspects, and today small form factor enthusiasts have a new option with Gigabyte’s Mini ITX friendly GTX 1070 Mini ITX OC. As the name implies, this is a factory overclocked card that can hit a 1746 MHz boost clock with the right checkboxes ticked in the company’s Xtreme Engine utility.

Gigabyte GTX 1070 Mini ITX OC.png

The new SFF graphics card measures a mere 6.7 inches long and is a dual slot design with a custom single 90mm fan HSF. The custom PCB uses a 5+1 power phase design, which Gigabyte claims is engineered to provide lower temperatures and more stable voltage than NVIDIA’s reference 4+1 setup. The cooler uses an aluminum fin array fed by three direct-touch heatpipes. The 90mm fan is able to spin down to 0 rpm when the card is not under load, which makes it a good candidate for a gaming-capable living room PC that doubles as your media center. Gigabyte further claims that its "3D stripe" ridged fan blade design helps reduce noise and improve cooling performance.

Rear IO on the card includes two dual link DVI connectors, one HDMI, and one DisplayPort output. The graphics card is powered by a single 8-pin PCI-E power connector.

As far as the nitty gritty specifications are concerned, Gigabyte has the GTX 1070 GPU clocked out of the box at 1531 MHz base and 1721 MHz boost. Using the company’s Xtreme Engine utility, users can enable “OC Mode,” which pushes the card further to 1556 MHz base and 1746 MHz boost. OC Mode in particular is a decent factory overclock over the reference clocks of 1506 MHz base and 1683 MHz boost. The 8 GB of GDDR5 memory remains effectively untouched at 8008 MHz.
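
For context, here is a rough sketch of what those factory clocks amount to over reference; rated clocks only, since real-world GPU Boost behavior will vary with card and cooling:

```python
# Percentage uplift of Gigabyte's rated clocks over NVIDIA's reference GTX 1070 clocks.
reference   = {"base": 1506, "boost": 1683}  # MHz, NVIDIA reference
gaming_mode = {"base": 1531, "boost": 1721}  # out-of-the-box clocks
oc_mode     = {"base": 1556, "boost": 1746}  # with "OC Mode" enabled

for name, clocks in [("Gaming", gaming_mode), ("OC", oc_mode)]:
    for kind in ("base", "boost"):
        uplift = (clocks[kind] / reference[kind] - 1) * 100
        print(f"{name} mode {kind}: +{uplift:.1f}%")
# Gaming mode: +1.7% base, +2.3% boost; OC mode: +3.3% base, +3.7% boost.
```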

Unfortunately, as is usually the case with these kinds of launches, pricing and availability have not yet been announced. From a cursory look around Newegg, I would guess the card will land somewhere around $465 (accounting for both the factory overclock and the SFF premium).

Source: Gigabyte

AMD RX 480 (and NVIDIA GTX 1080) Launch Demand

Subject: Graphics Cards | June 30, 2016 - 07:54 PM |
Tagged: amd, nvidia, FinFET, Polaris, polaris 10, pascal

If you're trying to purchase a Pascal or Polaris-based GPU, then you are probably well aware that patience is a required virtue. The problem is that, as a hardware website, we don't really know whether the issue is high demand or low supply. Both chips are manufactured on a new process node, which could mean that yield is a problem. On the other hand, it has been about four years since the last fabrication node, which means chips got much smaller for the same performance; a leap that large tends to unleash pent-up demand from everyone who sat out the long 28nm era.

amd-2016-rx480-candid.jpg

Over time, manufacturing processes will mature and yield will increase. But what about right now? AMD made a very small chip that produces roughly GTX 970-level performance. NVIDIA stuck with its typical 3XXmm² chip, which ended up producing performance beyond Titan X levels.

It turns out that, according to online retailer Overclockers UK (via Fudzilla), the RX 480 and GTX 1080 have each sold over a thousand units at that location alone. That's quite a bit, especially when you consider that the figure covers just one (large) European online retailer. It's difficult to say how much stock other stores (and regions) received compared to them, but it's still a thousand units in a day.

It's sounding like, for both vendors, pent-up demand might be the dominant factor.

Source: Fudzilla

Report: Image of Reference Design NVIDIA GTX 1060 Leaked

Subject: Graphics Cards | June 28, 2016 - 10:26 AM |
Tagged: nvidia, GeForce GTX 1060, GTX1060, rumor, report, leak, pascal, graphics card, video card

A report from VideoCardz.com shows what appears to be an NVIDIA GeForce GTX 1060 graphics card with a cooler similar to the "Founders Edition" GTX 1080/1070 design.

NVIDIA-GeForce-GTX-1060.jpg

Is this the GTX 1060 reference design? (Image via VideoCardz.com)

The image comes via Reddit (original source links in the VideoCardz post), and we cannot verify the validity of the image - though it certainly looks convincing to this writer.

So what does VideoCardz offer as to the specifications of this GTX 1060 card? Quoting from the post:

"NVIDIA GeForce GTX 1060 will most likely use GP106 GPU with at least 1280 CUDA cores. Earlier rumors suggested that GTX 1060 might get 6 GB GDDR5 memory and 192-bit memory bus."

We await official word on the GTX 1060 from NVIDIA, which VideoCardz surmises "is expected to hit the market shortly after Radeon RX 480".

Source: VideoCardz

Frame Time Monday; this time with the GTX 1080

Subject: Graphics Cards | June 27, 2016 - 04:55 PM |
Tagged: pascal, nvidia, GTX 1080, gtx, GP104, geforce, founders edition

You have already seen our delve into the frame times provided by the GTX 1080, but perhaps you would like another opinion. The Tech Report also uses the FCAT process that we depend upon to bring you frame time data; however, they present the data in a slightly different way, which might help you digest it. They also included Crysis 3, to ensure that the card can indeed play it. Check out their full review here.

chip.jpg

"Nvidia's GeForce GTX 1080 is the company's first consumer graphics card to feature its new Pascal architecture, fabricated on a next-generation 16-nm process. We dig deep into the GTX 1080 to see what the confluence of these advances means for the high-end graphics market."


Fermi, Kepler, Maxwell, and Pascal Comparison Benchmarks

Subject: Graphics Cards | June 21, 2016 - 05:22 PM |
Tagged: nvidia, fermi, kepler, maxwell, pascal, gf100, gf110, GK104, gk110, GM204, gm200, GP104

Techspot published an article that compared eight GPUs spanning six high-end dies from NVIDIA's last four architectures: Fermi through Pascal. Average frame rates were listed across nine games, each measured at three resolutions: 1366x768 (~720p HD), 1920x1080 (1080p FHD), and 2560x1600 (slightly above 1440p QHD).

nvidia-2016-dreamhack-1080-stockphoto.png

The results are interesting. Comparing GP104 to GF100, mainstream Pascal is typically on the order of four times faster than big Fermi. Over that time, we've had three full generational leaps in fabrication technology, leading to over twice the number of transistors packed into a die that is almost half the size. The data also shows that prices have remained relatively constant, except that the GTX 1080 is sort-of priced in the x80 Ti category despite a die size that places it in the non-Ti class. (They list the 1080 at $600, but you can't really find one outside the $650-700 USD range.)
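
To put the density claim in numbers, here is a back-of-envelope comparison. The specific transistor counts and die sizes below are commonly cited figures we're supplying for reference, not values from the Techspot article:

```python
# Rough transistor density comparison between big Fermi and mainstream Pascal,
# using commonly cited die figures (our own reference points).
gf100 = {"transistors_b": 3.0, "die_mm2": 529}  # big Fermi (GTX 480)
gp104 = {"transistors_b": 7.2, "die_mm2": 314}  # mainstream Pascal (GTX 1080)

for name, chip in [("GF100", gf100), ("GP104", gp104)]:
    density = chip["transistors_b"] * 1e3 / chip["die_mm2"]  # million transistors per mm^2
    print(f"{name}: {density:.1f} MTr/mm^2")
# GP104 packs roughly 4x the transistors per mm^2: 2.4x the count in ~0.6x the area.
```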

It would be interesting to see this data set compared against AMD. It's informative for an NVIDIA-only article, though.

Source: Techspot

NVIDIA Announces PCIe Versions of Tesla P100

Subject: Graphics Cards | June 20, 2016 - 01:57 PM |
Tagged: tesla, pascal, nvidia, GP100

GP100, the “Big Pascal” chip that was announced at GTC, will be coming to PCIe for enterprise and supercomputer customers in Q4 2016. Previously, it was only announced with NVIDIA's proprietary NVLink connection. In fact, they also gave themselves some lead time with their first-party DGX-1 system, which retails for $129,000 USD, although we expect that was more for yield reasons. Josh calculated that each GPU in that system is worth more than the full wafer its die was manufactured on.
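
A quick back-of-envelope on that claim; the wafer cost below is purely an illustrative assumption, since foundries don't publish pricing:

```python
# Back-of-envelope math on the DGX-1's effective per-GPU pricing. The wafer
# cost is a hypothetical figure for illustration, not a disclosed number.
dgx1_price = 129_000          # USD, as announced
gpus_per_system = 8           # the DGX-1 ships with eight Tesla P100s
wafer_cost_estimate = 10_000  # USD, assumed 16nm FinFET wafer cost

per_gpu = dgx1_price / gpus_per_system
print(f"Effective price per GPU: ${per_gpu:,.0f}")  # $16,125
print(f"Assumed cost per wafer:  ${wafer_cost_estimate:,}")
# Even ignoring the rest of the system, each GPU sells for more than an
# entire wafer under this assumption, which is the point Josh was making.
```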

nvidia-2016-gp100tesla.jpg

This brings us to the PCIe versions. Interestingly, they have been down-binned from the NVLink version. The boost clock drops to 1300 MHz from 1480 MHz, matched with a slightly lower TDP (250W versus the NVLink version's 300W). This lowers FP16 performance to 18.7 TFLOPS, down from 21.2; FP32 performance to 9.3 TFLOPS, down from 10.6; and FP64 performance to 4.7 TFLOPS, down from 5.3. This is where we get to the question: did NVIDIA reduce the clocks to hit a 250W TDP and remain compatible with the passive cooling that previous Tesla cards utilize, or were the clocks dropped to increase yield?
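
Those rated figures scale linearly with clock, which you can verify from the chip's 3584 FP32 cores; on GP100, FP16 runs at twice the FP32 rate and FP64 at half:

```python
# GP100 rated throughput scales linearly with clock (3584 FP32 CUDA cores;
# FP16 at 2x the FP32 rate, FP64 at 1/2 on this chip).
CORES = 3584

def p100_tflops(clock_mhz: float) -> dict:
    fp32 = CORES * clock_mhz * 1e6 * 2 / 1e12
    return {"FP16": round(fp32 * 2, 1), "FP32": round(fp32, 1), "FP64": round(fp32 / 2, 1)}

print("NVLink (1480 MHz):", p100_tflops(1480))  # {'FP16': 21.2, 'FP32': 10.6, 'FP64': 5.3}
print("PCIe   (1300 MHz):", p100_tflops(1300))  # {'FP16': 18.6, 'FP32': 9.3, 'FP64': 4.7}
# NVIDIA rounds the PCIe FP16 figure up to 18.7 TFLOPS.
```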

They are also providing a 12GB version of the PCIe Tesla P100. I didn't realize that GPU vendors could selectively disable HBM2 stacks, but NVIDIA disabled 4GB of memory, which also drops the bus width to 3072-bit. You would think the simplicity of the circuit would favor dividing work in a power-of-two fashion, but, now that we know they can do this, it makes me wonder why they did. Again, my first reaction is to question GP100 yield, but you wouldn't think that HBM, being such a small part of the die, is an area where disabling a chunk reclaims a lot of chips, right? That is, unless the HBM2 stacks themselves have yield issues -- which would be interesting.
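
The 12GB configuration lines up exactly with one of GP100's four HBM2 stacks going dark; each stack contributes 4GB of capacity and a 1024-bit slice of the bus:

```python
# GP100 pairs with four HBM2 stacks; each is 4GB with a 1024-bit interface.
STACK_GB, STACK_BUS_BITS = 4, 1024

for active_stacks in (4, 3):
    print(f"{active_stacks} stacks: {active_stacks * STACK_GB}GB, "
          f"{active_stacks * STACK_BUS_BITS}-bit bus")
# 4 stacks: 16GB, 4096-bit bus  |  3 stacks: 12GB, 3072-bit bus
```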

There is also still no word on a 32GB version. Samsung claimed the memory technology, 8GB stacks of HBM2, would be ready for products in Q4 2016 or early 2017. We'll need to wait and see where, when, and why it will appear.

Source: NVIDIA

GeForce GTX 1080 and 1070 3-Way and 4-Way SLI will not be enabled for games

Subject: Graphics Cards | June 8, 2016 - 08:44 PM |
Tagged: sli, pascal, nvidia, GTX 1080, GP104, geforce, 4-way sli, 3-way sli

IMPORTANT UPDATE: After writing this story, but before publication, we went to NVIDIA for comment. As we were getting ready to publish, the company updated me with a shift in its stance on multi-GPU configurations. NVIDIA will no longer require an "enthusiast key" to enable SLI on more than two GPUs. However, NVIDIA will also only be enabling 3-Way and 4-Way SLI for a select few applications. More details are at the bottom of the story!

You'll likely recall that during our initial review of the GeForce GTX 1080 Founders Edition graphics card, we mentioned that NVIDIA was moving toward a policy where only 2-Way SLI would be supported and promoted. There would still be a path for users that wanted 3 and 4 GPU configurations anyway, and it would be called the Enthusiast Key.

As it turns out, after returning from an AMD event focused on its upcoming Polaris GPUs, I happen to have amassed a total of four GeForce GTX 1080 cards.

01.jpg

Courtesy of some friends at EVGA and two readers that were awesome enough to let me open up their brand new hardware for a day or so, I was able to go through the 3-Way and 4-Way SLI configuration process. Once all four were installed, and I must point out how great it is that each card only required a single 8-pin power connector, I installed the latest NVIDIA driver I had on hand, 368.19.

driver2.jpg

Knowing about the need for the Enthusiast Key, and knowing that I did not yet have one (the website that was supposed to provide them still isn't live), I thought I might have stumbled upon some magic. The driver appeared to let me enable SLI anyway.

driver1.jpg

Enthusiasts will note, however, that the green marker under the four GPUs with the "SLI" text is clearly only pointing at two of the GTX 1080s, leaving the remaining two...unused. Crap.

At this point, if you have purchased more than two GeForce GTX 1080 cards, you are simply out of luck, waiting on NVIDIA to make good on its promise to allow 3-Way and 4-Way configurations via the Enthusiast Key. Or some other way. It's way too late now to simply say "we aren't supporting it at all."

03.jpg

While I wait...what is there for a gamer with four GeForce GTX 1080 cards to do? Well, you could run Ashes of the Singularity. Its multi-GPU mode uses MDA (multi-display adapter) mode, which means the game engine itself addresses each GPU on its own, without the driver handling any GPU load balancing. Unfortunately, Ashes only supports two GPUs today.

Well...you could also run an OpenCL-based benchmark like LuxMark, which accesses all the GPUs independently.

lux2.jpg

I did so, and the result is an impressive score of 17,127!!

lux.jpg

How does that compare to some other products?

luxmarkgraph.jpg

The four GTX 1080 cards produce a score that is 2.57x the result provided by the AMD Radeon Pro Duo and 2.29x the score of SLI GeForce GTX 980 Ti cards. Nice!
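
Working backward from those ratios, for context (our arithmetic, not separately measured scores):

```python
# Implied scores of the comparison setups, derived from the ratios above.
quad_gtx1080 = 17_127
print(f"Radeon Pro Duo: ~{quad_gtx1080 / 2.57:,.0f}")  # ~6,664
print(f"GTX 980 Ti SLI: ~{quad_gtx1080 / 2.29:,.0f}")  # ~7,479
```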

02.jpg

So there you go! We are just as eager to get our hands on the ability to test 3-Way and 4-Way SLI with new Pascal GPUs as some of the most extreme and dedicated enthusiasts out there. With any luck, NVIDIA will figure out a way to allow it, no matter what form that finally takes.

IMPORTANT UPDATE: Before going to press with this story I asked NVIDIA for comment directly: when was the community finally going to get the Enthusiast Key website to unlock 3-Way and 4-Way SLI for those people crazy enough to have purchased that many GTX 1080s? The answer was quite surprising: NVIDIA is backing away from the idea of an "Enthusiast Key" and will no longer require it for enabling 3-Way and 4-Way SLI. 

Here is the official NVIDIA statement given to PC Perspective on the subject:

With the GeForce 10-series we’re investing heavily in 2-way SLI with our new High Bandwidth bridge (which doubles the SLI bandwidth for faster, smoother gaming at ultra-high resolutions and refresh rates) and NVIDIA Game Ready Driver SLI profiles.  To ensure the best possible gaming experience on our GeForce 10-series GPUs, we’re focusing our efforts on 2-way SLI only and will continue to include 2-way SLI profiles in our Game Ready Drivers.
 
DX12 and NVIDIA VR Works SLI technology also allows developers to directly implement and control multi-GPU support within their games.  If a developer chooses to use these technologies then their game will not need SLI profiles.  Some developers may also decide to support more than 2 GPUs in their games. We continue to work with all developers creating games and VR applications that take advantage of 2 or more GPUs to make sure they’ll work great on GeForce 10-series GPUs.
 
For our overclocking community, our Game Ready Drivers will also include SLI profiles for 3- and 4-way configurations for specific OC applications only, including Fire Strike, Unigine and Catzilla.

NVIDIA clearly wants to reiterate that only 2-Way SLI will get the attention we have come to expect from the GeForce driver team. As the next-generation DX12 and Vulkan APIs become more prolific, game developers will still have the ability to directly access more than two GeForce GTX 10-series GPUs, though I expect that to be a very narrow window of games, simply due to development costs and time.

NVIDIA will enable support for three and four card configurations in future drivers (without a key) for specific overclocking/benchmarking tools only, as a way to make sure the GeForce brand doesn't fall off the 3DMark charts. Only those specific applications will be able to operate in the 3-Way and 4-Way SLI configurations you have come to know. There are no profiles to change manually, and even the rare games that might have "just worked" with three or four GPUs will not take advantage of more than two GTX 10-series cards. It's fair to say at this point that, except for the benchmarking crowd, NVIDIA 3-Way and 4-Way SLI is over.

We expect the "benchmark only" mode of 3-Way and 4-Way SLI to be ready for consumers with the next "Game Ready" driver release. If you happened to get your hands on more than two GTX 1080s but aren't into benchmarking, then find those receipts and send a couple back.

So there you have it. Honestly, this is what I was expecting from NVIDIA with the initial launch of Pascal and the GeForce GTX 1080/1070, and I was surprised when I first heard about the idea of the "enthusiast key." It took a bit longer than expected, and NVIDIA will take some flak for this drawn-out dismissal of a very niche, but still pretty cool, technology. In the end, it won't have much impact on the company's bottom line, as the number of users buying three or more GeForce GPUs for a single system was understandably small.