Subject: Graphics Cards | February 14, 2017 - 09:29 PM | Scott Michaud
Tagged: opencl 2.0, opencl, nvidia, graphics drivers
While the headline of the GeForce 378.66 graphics driver release is support for For Honor, Halo Wars 2, and Sniper Elite 4, NVIDIA has snuck something major into the 378 branch: OpenCL 2.0 is now available for evaluation. (I double-checked 378.49 release notes and confirmed that this is new to 378.66.)
OpenCL 2.0 support is not complete yet, but at least NVIDIA is now clearly intending to roll it out to end-users. Among other benefits, OpenCL 2.0 allows kernels (think shaders) to, without the host intervening, enqueue work onto the GPU. This saves one (or more) round-trips to the CPU, especially in workloads where you don’t know which kernel will be required until you see the results of the previous run, like recursive sorting algorithms.
So yeah, that’s good, albeit you usually see big changes at the start of version branches.
Another major addition is Video SDK 8.0. This version allows 10- and 12-bit decoding of VP9 and HEVC video. So... yeah. Applications that want to accelerate video encoding or decoding can now hook up to NVIDIA GPUs for more codecs and features.
NVIDIA’s GeForce 378.66 drivers are available now.
Subject: Graphics Cards | February 14, 2017 - 05:57 PM | Scott Michaud
Tagged: amd, graphics drivers
Just in time for For Honor and Sniper Elite 4, AMD has released a new set of graphics drivers, Radeon Software Crimson ReLive 17.2.1, that target these games. The performance improvements that they quote are in the 4-5% range, when compared to their previous driver on the RX 480, which would be equivalent to saving a whole millisecond per frame at 60 FPS. (This is just for mathematical reference; I don’t know what performance users should expect with an RX 480.)
Beyond driver overhead improvements, you will now be able to utilize multiple GPUs in CrossFire (for DirectX 11) on both titles.
Also, several issues have been fixed with this version. If you have a FreeSync monitor, and some games fail to activate variable refresh mode, then this driver might solve this problem for you. Scrubbing through some videos (DXVA H.264) should no longer cause visible corruption. A couple applications, like GRID and DayZ, should no longer crash under certain situations. You get the idea.
If you have an AMD GPU on Windows, pick up these drivers from their support page.
The new EVGA GTX 1080 FTW2 with iCX Technology
Back in November of 2016, EVGA had a problem on its hands. The company had a batch of GTX 10-series graphics cards using the new ACX 3.0 cooler solution leave the warehouse missing thermal pads required to keep the power management hardware on its cards within reasonable temperature margins. To its credit, the company took the oversight seriously and instituted a set of solutions for consumers to select from: RMA, new VBIOS to increase fan speeds, or to install thermal pads on your hardware manually. Still, as is the case with any kind of product quality lapse like that, there were (and are) lingering questions about EVGA’s ability to maintain reliable product; with features and new options that don’t compromise the basics.
Internally, the drive to correct these lapses was…strong. From the very top of the food chain on down, it was hammered home that something like this simply couldn’t occur again, and even more so, EVGA was to develop and showcase a new feature set and product lineup demonstrating its ability to innovate. Thus was born, and accelerated, the EVGA iCX Technology infrastructure. While this was something in the pipeline for some time already, it was moved up to counter any negative bias that might have formed for EVGA’s graphics cards over the last several months. The goal was simple: prove that EVGA was the leader in graphics card design and prove that EVGA has learned from previous mistakes.
EVGA iCX Technology
Previous issues aside, the creation of iCX Technology is built around one simple question: is one GPU temperature sensor enough? For nearly all of today’s graphics cards, cooling is based around the temperature of the GPU silicon itself, as measured by NVIDIA (for all of EVGA’s cards). This is how fan curves are built, how GPU clock speeds are handled with GPU Boost, how noise profiles are created, and more. But as process technology has improved, and GPU design has weighed towards power efficiency, the GPU itself is often no longer the thermally limiting factor.
As it turns out, converting 12V (from the power supply) to ~1V (necessary for the GPU) is a simple process that creates a lot of excess heat. The thermal images above clearly demonstrate that and EVGA isn’t the only card vendor to take notice of this. As it turns out, EVGA’s product issue from last year was related to this – the fans were only spinning fast enough to keep the GPU cool and did not take into account the temperature of memory or power delivery.
The fix from EVGA is to ratchet up the number of sensors on the card PCB and wrap them with intelligence in the form of MCUs, updated Precision XOC software and user viewable LEDs on the card itself.
EVGA graphics cards with iCX Technology will include 9 total thermal sensors on the board, independent of the GPU temperature sensor directly integrated by NVIDIA. There are three sensors for memory, five for power delivery and an additional sensor for the GPU temperature. Some are located on the back of the PCB to avoid any conflicts with trace routing between critical components, including the secondary GPU sensor.
Subject: Graphics Cards | February 9, 2017 - 02:46 PM | Jeremy Hellstrom
Tagged: amd, nvidia
New graphics drivers are a boon to everyone who isn't a hardware reviewer, especially one who has just wrapped up benchmarking a new card the same day one is released. To address this issue see what changes have been implemented by AMD and NVIDIA in their last few releases, [H]ard|OCP tested a slew of recent drivers from both companies. The performance of AMD's past releases, up to and including the AMD Crimson ReLive Edition 17.1.1 Beta can be found here. For NVIDIA users, recent drivers covering up to the 378.57 Beta Hotfix are right here. The tests show both companies generally increasing the performance of their drivers, however the change is so small you are not going to notice a large difference.
"We take the AMD Radeon R9 Fury X and AMD Radeon RX 480 for a ride in 11 games using drivers from the time of each video card’s launch date, to the latest AMD Radeon Software Crimson ReLive Edition 17.1.1 Beta driver. We will see how performance in old and newer games has changed over the course of 2015-2017 with new drivers. "
Here are some more Graphics Card articles from around the web:
- Intel Celeron/Pentium/Core i3/i5/i7 - NVIDIA vs. AMD Linux Gaming Performance @ Phoronix
- PowerColor Radeon RX 470 Red Devil (4GB) @ Custom PC Review
- GeForce GTX 1080 @ Hardware Secrets
NVIDIA P100 comes to Quadro
At the start of the SOLIDWORKS World conference this week, NVIDIA took the cover off of a handful of new Quadro cards targeting professional graphics workloads. Though the bulk of NVIDIA’s discussion covered lower cost options like the Quadro P4000, P2000, and below, the most interesting product sits at the high end, the Quadro GP100.
As you might guess from the name alone, the Quadro GP100 is based on the GP100 GPU, the same silicon used on the Tesla P100 announced back in April of 2016. At the time, the GP100 GPU was specifically billed as an HPC accelerator for servers. It had a unique form factor with a passive cooler that required additional chassis fans. Just a couple of months later, a PCIe version of the GP100 was released under the Tesla GP100 brand with the same specifications.
Today that GPU hardware gets a third iteration as the Quadro GP100. Let’s take a look at the Quadro GP100 specifications and how it compares to some recent Quadro offerings.
|Quadro GP100||Quadro P6000||Quadro M6000||Full GP100|
|FP32 CUDA Cores / SM||64||64||64||64|
|FP32 CUDA Cores / GPU||3584||3840||3072||3840|
|FP64 CUDA Cores / SM||32||2||2||32|
|FP64 CUDA Cores / GPU||1792||120||96||1920|
|Base Clock||1303 MHz||1417 MHz||1026 MHz||TBD|
|GPU Boost Clock||1442 MHz||1530 MHz||1152 MHz||TBD|
|FP32 TFLOPS (SP)||10.3||12.0||7.0||TBD|
|FP64 TFLOPS (DP)||5.15||0.375||0.221||TBD|
|Memory Interface||1.4 Gbps
|Memory Bandwidth||716 GB/s||432 GB/s||316.8 GB/s||?|
|Memory Size||16GB||24 GB||12GB||16GB|
|TDP||235 W||250 W||250 W||TBD|
|Transistors||15.3 billion||12 billion||8 billion||15.3 billion|
|GPU Die Size||610mm2||471 mm2||601 mm2||610mm2|
There are some interesting stats here that may not be obvious at first glance. Most interesting is that despite the pricing and segmentation, the GP100 is not the de facto fastest Quadro card from NVIDIA depending on your workload. With 3584 CUDA cores running at somewhere around 1400 MHz at Boost speeds, the single precision (32-bit) rating for GP100 is 10.3 TFLOPS, less than the recently released P6000 card. Based on GP102, the P6000 has 3840 CUDA cores running at something around 1500 MHz for a total of 12 TFLOPS.
GP100 (full) Block Diagram
Clearly the placement for Quadro GP100 is based around its 64-bit, double precision performance, and its ability to offer real-time simulations on more complex workloads than other Pascal-based Quadro cards can offer. The Quadro GP100 offers 1/2 DP compute rate, totaling 5.2 TFLOPS. The P6000 on the other hand is only capable of 0.375 TLOPS with the standard, consumer level 1/32 DP rate. Inclusion of ECC memory support on GP100 is also something no other recent Quadro card has.
Raw graphics performance and throughput is going to be questionable until someone does some testing, but it seems likely that the Quadro P6000 will still be the best solution for that by at least a slim margin. With a higher CUDA core count, higher clock speeds and equivalent architecture, the P6000 should run games, graphics rendering and design applications very well.
There are other important differences offered by the GP100. The memory system is built around a 16GB HBM2 implementation which means more total memory bandwidth but at a lower capacity than the 24GB Quadro P6000. Offering 66% more memory bandwidth does mean that the GP100 offers applications that are pixel throughput bound an advantage, as long as the compute capability keeps up on the backend.
Subject: Graphics Cards | February 6, 2017 - 11:43 AM | Sebastian Peak
Tagged: video card, silent, Passive, palit, nvidia, KalmX, GTX 1050 Ti, graphics card, gpu, geforce
Palit is offering a passively-cooled GTX 1050 Ti option with their new KalmX card, which features a large heatsink and (of course) zero fan noise.
"With passive cooler and the advanced powerful Pascal architecture, Palit GeForce GTX 1050 Ti KalmX - pursue the silent 0dB gaming environment. Palit GeForce GTX 1050 Ti gives you the gaming horsepower to take on today’s most demanding titles in full 1080p HD @ 60 FPS."
The specs are identical to a reference GTX 1050 Ti (4GB GDDR5 @ 7 Gb/s, Base 1290/Boost 1392 MHz, etc.), so expect the full performance of this GPU - with some moderate case airflow, no doubt.
We don't have specifics on pricing or availablity just yet.
Subject: Graphics Cards | February 4, 2017 - 03:29 PM | Tim Verry
Tagged: micron, graphics memory, gddr6
This year is shaping up to be a good year for memory with the promise of 3D XPoint (Intel/Micron), HBM2 (SK Hynix and Samsung), and now GDDR6 graphics memory from Micron launching this year. While GDDR6 was originally planned to be launched next year, Micron recently announced its intentions to start producing the memory chips by the later half of 2017 which would put it much earlier than previously expected.
Computer World reports that Micron is citing the rise of e-sports and gaming driving the computer market that now sees three year upgrade cycles rather than five year cycles (I am not sure how accurate that is, however as it seems like PCs are actually lasting longer between upgrade as far as relevance but i digress) as the primary reason for shifting GDDR6 production into high gear and moving up the launch window. The company expects the e-sports market to grow to 500 million fans by 2020, and it is a growing market that Micron wants to stay relevant in.
If you missed our previous coverage, GDDR6 is the successor to GDDR5 and offers twice the bandwidth at 16 Gb/s (gigabits per second) per die. It is also faster than GDDR5X (12 Gb/s) and uses 20% less power which the gaming laptop market will appreciate. HBM2 still holds the bandwidth crown though as it offers 256 GB/s per stack and up to 1TB/s with four stacks connected to a GPU on package.
As such, High Bandwidth Memory (HBM2 and then HBM3) will power the high end gaming and professional graphics cards while GDDR6 will become the memory used for mid range cards and GDDR5X (which is actually capable of going faster but will likely not be pushed much past 12 Gbps after all if GDDR6 does come out this soon) will replace GDDR5 on most if not all of the lower end products.
I am not sure if Micron’s reasoning of e-sports, faster upgrade cycles, and VR being the motivating factor(s) to ramping up production early is sound or not, but I will certainly take the faster memory coming out sooner rather than later! Depending on exactly when in 2017 the chips start rolling off the fabs, we could see graphics cards using the new memory technology as soon as early 2018 (just in time for CES announcements? oh boy I can see the PR flooding in already! hehe).
Will Samsung change course as well and try for a 2017 release for its GDDR6 memory as well?
Are you ready for GDDR6?
Subject: Graphics Cards | February 3, 2017 - 05:58 PM | Scott Michaud
Tagged: nvidia, graphics drivers, vulkan
On February 1st, NVIDIA released a new developer beta driver, which fixes a couple of issues with their Vulkan API implementation. Unlike what some sites have been reporting, you should not download it to play games that use the Vulkan API, like DOOM. In short, it is designed for developers, not end-users. The goal is to provide correct results when software interacts with the driver, not the best gaming performance or anything like that.
In a little more detail, it looks like 376.80 implements the Vulkan 220.127.116.11 SDK. This update addresses two issues with accessing devices and extensions, under certain conditions, when using the 18.104.22.168 SDK. 22.214.171.124 was released on January 23rd, and thus it will not even be a part of current video games. Even worse, it, like most graphics drivers for software developers, is based on the old, GeForce 376 branch, so it won’t even have NVIDIA’s most recent fixes and optimizations. NVIDIA does this so they can add or change the features that Vulkan developers require without needing to roll-in patches every time they make a "Game Ready" optimization or something. There is no reason to use this driver unless you are developing Vulkan applications, and you want to try out the new extensions. It will eventually make it to end users... when it's time.
If you are wishing to develop software using Vulkan’s bleeding-edge features, then check out NVIDIA’s developer portal to pick up the latest drivers. Basically everyone else should use 378.49 or its 378.57 hotfix.
Subject: Graphics Cards | February 2, 2017 - 07:02 AM | Scott Michaud
Tagged: graphics drivers, amd
A few days ago, AMD released their second graphics drivers of January 2017: Radeon Software Crimson ReLive 17.1.2. The main goal of these drivers are to support the early access of Conan Exiles as well as tomorrow’s closed beta for Tom Clancy’s Ghost Recon Wildlands. Optimization that AMD has been working on prior to release, for either game, are targeted at this version.
Beyond game-specific optimizations, a handful of bugs are also fixed, ranging from crashes to rendering artifacts. There was also an issue with configuring WattMan on a system that has multiple monitors, where the memory clock would drop or bounce around. There is driver also has a bunch of known issues, including a couple of hangs and crashes under certain situations.
Subject: Graphics Cards | February 2, 2017 - 07:01 AM | Scott Michaud
Tagged: nvidia, graphics drivers
If you were having issues with Minecraft on NVIDIA’s recent 378.49 drivers, then you probably want to try out their latest hotfix. This version, numbered 378.57, will not be pushed down GeForce Experience, so you will need to grab them from NVIDIA’s customer support page.
Beyond Minecraft, this also fixes an issue with “debug mode”. For some Pascal-based graphics cards, the option in NVIDIA Control Panel > Help > Debug Mode might be on by default. This option will reduce factory-overclocked GPUs down to NVIDIA’s reference speeds, which is useful to eliminate stability issues in testing, but pointlessly slow if you’re already stable. I mean, you bought the factory overclock, right? I’m guessing someone at NVIDIA used it to test 378.49 during its development, fixed an issue, and accidentally commit the config file with the rest of the fix. Either way, someone caught it, and it’s now fixed, even though you should be able to just untick it if you have a factory-overclocked GPU.