Subject: Graphics Cards | February 17, 2017 - 07:42 AM | Scott Michaud
Tagged: nvidia, graphics drivers
Just a couple of days after publishing 378.66, NVIDIA released GeForce 378.72 Hotfix drivers. This fixes a bug encoding video in Steam’s In-Home Streaming, and it also fixes PhysX not being enabled on the GPU under certain conditions. Normally, hotfix drivers solve large-enough issues that were introduced with the previous release. This time, as far as I can tell, is a little different, though. Instead, these fixes seem to be intended for 378.66 but, for one reason or another, couldn’t be integrated and tested in time for the driver to be available for the game launches.
This is an interesting effect of the Game Ready program. There is value in having a graphics driver available on the same day as (or ahead of) a major game release, so that people can enjoy the title as soon as it is available. There is also value in having as many fixes as the vendor can provide. These conditions oppose each other to some extent.
From a user standpoint, driver updates are cumulative, so users can skip a driver or two if they are not affected by any given issue. AMD has taken up a similar structure, sometimes releasing three or four drivers in a month with only, like, one of them being WHQL certified. For these reasons, I tend to lean on the side of “release ‘em as you got them”. Still, I can see people feeling a little uneasy about a driver being released incomplete to hit a due-date.
But, again, that due-date has value.
It’s interesting. I’m personally glad that AMD and NVIDIA are on a rapid-release schedule, but I can see where complaints could arise. What’s your opinion?
Living Long and Prospering
The open fork of AMD’s Mantle, the Vulkan API, was released exactly a year ago with, as we reported, a hard launch. This meant public, but not main-branch, drivers for developers, a few public SDKs, a proof-of-concept patch for The Talos Principle, and, of course, the ratified specification. This set up the API to find success right out of the gate, and we can now look back over the year since.
Thor's hammer, or a tempest in a teapot?
The elephant in the room is DOOM. This game has successfully integrated the API and it uses many of its more interesting features, like asynchronous compute. Because the API is designed in a sort-of “make a command, drop it on a list” paradigm, the driver is able to select commands based on priority and available resources. AMD’s products got a significant performance boost, relative to OpenGL, catapulting their Fury X GPU up to the enthusiast level that its theoretical performance suggested.
Mobile developers have been picking up the API, too. Google, who is known for banishing OpenCL from their Nexus line and challenging OpenGL ES with their Android Extension Pack (later integrated into OpenGL ES with version 3.2), has strongly backed Vulkan. The API was integrated as a core feature of Android 7.0.
On the engine and middleware side of things, Vulkan is currently “ready for shipping games” as of Unreal Engine 4.14. It is also included in Unity 5.6 Beta, which is expected for full release in March. Frameworks for emulators are also integrating Vulkan, often just to say they did, but sometimes to emulate the quirks of these systems’ offbeat graphics co-processors. Many other engines, from Source 2 to Torque 3D, have also announced or added Vulkan support.
Finally, for the API itself, The Khronos Group announced (pg 22 from SIGGRAPH 2016) areas that they are actively working on. The top feature is “better” multi-GPU support. While Vulkan, like OpenCL, allows developers to enumerate all graphics devices and target them, individually, with work, it doesn’t have certain mechanisms, like being able to directly ingest output from one GPU into another. They haven’t announced a timeline for this.
Subject: Graphics Cards | February 16, 2017 - 03:35 PM | Jeremy Hellstrom
Tagged: msi, AERO ITX, gtx 1070, gtx 1060, gtx 1050, GTX 1050 Ti, SFF, itx
MSI have just released their new series of ITX compatible GPUs, covering NVIDIA's latest series of cards from the GTX 1050 through to the GTX 1070; the GTX 1080 is not available in this form factor. The GTX 1070 and 1060 are available in both factory overclocked and standard versions.
All models share a similar design, with a single TORX fan with 8mm Super Pipes and the Zero Frozr feature which stops the fan to give silent operation when temperatures are below 60C. They are all compatible with the Afterburner Overclocking Utility, including recordings via Predator and wireless control from your phone.
The overclocked cards run slightly over reference, from the GTX 1070 at 1721MHz boost, 1531MHz base with the GDDR5 at 8GHz to the GTX 1050 at 1518MHz boost, 1404MHz base and the GDDR5 at 7GHz. The models which do not bear the OC moniker run at NVIDIA's reference clocks even if they are not quite fully grown.
Subject: Graphics Cards | February 14, 2017 - 09:29 PM | Scott Michaud
Tagged: opencl 2.0, opencl, nvidia, graphics drivers
While the headline of the GeForce 378.66 graphics driver release is support for For Honor, Halo Wars 2, and Sniper Elite 4, NVIDIA has snuck something major into the 378 branch: OpenCL 2.0 is now available for evaluation. (I double-checked 378.49 release notes and confirmed that this is new to 378.66.)
OpenCL 2.0 support is not complete yet, but at least NVIDIA is now clearly intending to roll it out to end-users. Among other benefits, OpenCL 2.0 allows kernels (think shaders) to, without the host intervening, enqueue work onto the GPU. This saves one (or more) round-trips to the CPU, especially in workloads where you don’t know which kernel will be required until you see the results of the previous run, like recursive sorting algorithms.
So yeah, that’s good, albeit you usually see big changes at the start of version branches.
Another major addition is Video SDK 8.0. This version allows 10- and 12-bit decoding of VP9 and HEVC video. So... yeah. Applications that want to accelerate video encoding or decoding can now hook up to NVIDIA GPUs for more codecs and features.
NVIDIA’s GeForce 378.66 drivers are available now.
Subject: Graphics Cards | February 14, 2017 - 05:57 PM | Scott Michaud
Tagged: amd, graphics drivers
Just in time for For Honor and Sniper Elite 4, AMD has released a new set of graphics drivers, Radeon Software Crimson ReLive 17.2.1, that target these games. The performance improvements that they quote are in the 4-5% range, when compared to their previous driver on the RX 480, which would be equivalent to saving a little under a millisecond per frame at 60 FPS. (This is just for mathematical reference; I don’t know what performance users should expect with an RX 480.)
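For the curious, the frame-time math can be sketched in a few lines of Python. This assumes the quoted 4-5% is a straight frame-rate increase over a 60 FPS baseline, which is my reading and not something AMD spells out; the function names are made up for the example.

```python
def frame_time_ms(fps: float) -> float:
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def saving_ms(baseline_fps: float, gain: float) -> float:
    """Frame time saved when the frame rate rises by `gain` (0.05 = 5%)."""
    return frame_time_ms(baseline_fps) - frame_time_ms(baseline_fps * (1.0 + gain))


# Assumption: 60 FPS baseline, gain applied to the frame rate.
for gain in (0.04, 0.05):
    print(f"{gain:.0%} faster at 60 FPS saves {saving_ms(60.0, gain):.2f} ms per frame")
```

Under those assumptions the saving lands between roughly 0.6 and 0.8 ms out of a 16.7 ms frame, so it is in the neighborhood of a millisecond.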
Beyond driver overhead improvements, you will now be able to utilize multiple GPUs in CrossFire (for DirectX 11) on both titles.
Also, several issues have been fixed with this version. If you have a FreeSync monitor, and some games fail to activate variable refresh mode, then this driver might solve this problem for you. Scrubbing through some videos (DXVA H.264) should no longer cause visible corruption. A couple of applications, like GRID and DayZ, should no longer crash in certain situations. You get the idea.
If you have an AMD GPU on Windows, pick up these drivers from their support page.
The new EVGA GTX 1080 FTW2 with iCX Technology
Back in November of 2016, EVGA had a problem on its hands. The company had a batch of GTX 10-series graphics cards using the new ACX 3.0 cooler solution leave the warehouse missing thermal pads required to keep the power management hardware on its cards within reasonable temperature margins. To its credit, the company took the oversight seriously and instituted a set of solutions for consumers to select from: an RMA, a new VBIOS to increase fan speeds, or thermal pads to install on your hardware manually. Still, as is the case with any kind of product quality lapse like that, there were (and are) lingering questions about EVGA’s ability to maintain reliable products, with features and new options that don’t compromise the basics.
Internally, the drive to correct these lapses was…strong. From the very top of the food chain on down, it was hammered home that something like this simply couldn’t occur again, and even more so, EVGA was to develop and showcase a new feature set and product lineup demonstrating its ability to innovate. Thus was born, and accelerated, the EVGA iCX Technology infrastructure. While this was something in the pipeline for some time already, it was moved up to counter any negative bias that might have formed for EVGA’s graphics cards over the last several months. The goal was simple: prove that EVGA was the leader in graphics card design and prove that EVGA has learned from previous mistakes.
EVGA iCX Technology
Previous issues aside, the creation of iCX Technology is built around one simple question: is one GPU temperature sensor enough? For nearly all of today’s graphics cards, cooling is based around the temperature of the GPU silicon itself, as measured by NVIDIA (for all of EVGA’s cards). This is how fan curves are built, how GPU clock speeds are handled with GPU Boost, how noise profiles are created, and more. But as process technology has improved, and GPU design has weighed towards power efficiency, the GPU itself is often no longer the thermally limiting factor.
As it turns out, converting 12V (from the power supply) to ~1V (necessary for the GPU) is a simple process that creates a lot of excess heat. The thermal images above clearly demonstrate that, and EVGA isn’t the only card vendor to take notice. EVGA’s product issue from last year was related to this: the fans were only spinning fast enough to keep the GPU cool and did not take into account the temperature of memory or power delivery.
The fix from EVGA is to ratchet up the number of sensors on the card PCB and wrap them with intelligence in the form of MCUs, updated Precision XOC software and user viewable LEDs on the card itself.
EVGA graphics cards with iCX Technology will include 9 total thermal sensors on the board, independent of the GPU temperature sensor directly integrated by NVIDIA. There are three sensors for memory, five for power delivery and an additional sensor for the GPU temperature. Some are located on the back of the PCB to avoid any conflicts with trace routing between critical components, including the secondary GPU sensor.
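To make the idea concrete, here is a minimal Python sketch of fan control that listens to every sensor group instead of the GPU alone. Everything in it is hypothetical: the `FanCurve` class, the temperature thresholds, the linear ramp, and the "hottest sensor wins" rule are invented for illustration and are not EVGA's firmware or Precision XOC logic. Only the sensor counts (one GPU, three memory, five power delivery) come from the description above.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class FanCurve:
    """Linear ramp from a minimum duty cycle to 100% (made-up shape)."""
    ramp_start_c: float
    full_speed_c: float
    min_duty: float = 0.2

    def duty(self, temp_c: float) -> float:
        if temp_c <= self.ramp_start_c:
            return self.min_duty
        if temp_c >= self.full_speed_c:
            return 1.0
        fraction = (temp_c - self.ramp_start_c) / (self.full_speed_c - self.ramp_start_c)
        return self.min_duty + fraction * (1.0 - self.min_duty)


# Hypothetical thresholds per sensor group; not EVGA's values.
CURVES = {
    "gpu": FanCurve(50.0, 85.0),
    "memory": FanCurve(60.0, 95.0),
    "power": FanCurve(60.0, 100.0),
}


def fan_duty(readings: Mapping[str, Sequence[float]]) -> float:
    """Pick the highest duty demanded by the hottest sensor in any group."""
    return max(CURVES[group].duty(max(temps)) for group, temps in readings.items())


# Example readings in degrees C: 1 GPU, 3 memory, 5 power delivery sensors.
readings = {
    "gpu": [62.0],
    "memory": [70.0, 72.0, 68.0],
    "power": [88.0, 91.0, 95.0, 90.0, 86.0],
}
gpu_only = CURVES["gpu"].duty(max(readings["gpu"]))
print(f"GPU-only duty: {gpu_only:.0%}, all-sensor duty: {fan_duty(readings):.0%}")
```

With these invented numbers, a controller watching only the GPU would settle near half speed while the power delivery sensors are asking for much more, which is the failure mode described above.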
Subject: Graphics Cards | February 9, 2017 - 02:46 PM | Jeremy Hellstrom
Tagged: amd, nvidia
New graphics drivers are a boon to everyone who isn't a hardware reviewer, especially one who has just wrapped up benchmarking a new card the same day one is released. To address this issue and see what changes have been implemented by AMD and NVIDIA in their last few releases, [H]ard|OCP tested a slew of recent drivers from both companies. The performance of AMD's past releases, up to and including the AMD Crimson ReLive Edition 17.1.1 Beta, can be found here. For NVIDIA users, recent drivers covering up to the 378.57 Beta Hotfix are right here. The tests show both companies generally increasing the performance of their drivers; however, the change is so small that you are not going to notice a large difference.
"We take the AMD Radeon R9 Fury X and AMD Radeon RX 480 for a ride in 11 games using drivers from the time of each video card’s launch date, to the latest AMD Radeon Software Crimson ReLive Edition 17.1.1 Beta driver. We will see how performance in old and newer games has changed over the course of 2015-2017 with new drivers. "
Here are some more Graphics Card articles from around the web:
- Intel Celeron/Pentium/Core i3/i5/i7 - NVIDIA vs. AMD Linux Gaming Performance @ Phoronix
- PowerColor Radeon RX 470 Red Devil (4GB) @ Custom PC Review
- GeForce GTX 1080 @ Hardware Secrets
NVIDIA P100 comes to Quadro
At the start of the SOLIDWORKS World conference this week, NVIDIA took the cover off of a handful of new Quadro cards targeting professional graphics workloads. Though the bulk of NVIDIA’s discussion covered lower cost options like the Quadro P4000, P2000, and below, the most interesting product sits at the high end, the Quadro GP100.
As you might guess from the name alone, the Quadro GP100 is based on the GP100 GPU, the same silicon used on the Tesla P100 announced back in April of 2016. At the time, the GP100 GPU was specifically billed as an HPC accelerator for servers. It had a unique form factor with a passive cooler that required additional chassis fans. Just a couple of months later, a PCIe version of the GP100 was released under the Tesla GP100 brand with the same specifications.
Today that GPU hardware gets a third iteration as the Quadro GP100. Let’s take a look at the Quadro GP100 specifications and how it compares to some recent Quadro offerings.
| | Quadro GP100 | Quadro P6000 | Quadro M6000 | Full GP100 |
|---|---|---|---|---|
| FP32 CUDA Cores / SM | 64 | 64 | 64 | 64 |
| FP32 CUDA Cores / GPU | 3584 | 3840 | 3072 | 3840 |
| FP64 CUDA Cores / SM | 32 | 2 | 2 | 32 |
| FP64 CUDA Cores / GPU | 1792 | 120 | 96 | 1920 |
| Base Clock | 1303 MHz | 1417 MHz | 1026 MHz | TBD |
| GPU Boost Clock | 1442 MHz | 1530 MHz | 1152 MHz | TBD |
| FP32 TFLOPS (SP) | 10.3 | 12.0 | 7.0 | TBD |
| FP64 TFLOPS (DP) | 5.15 | 0.375 | 0.221 | TBD |
| Memory Interface | 1.4 Gbps | | | |
| Memory Bandwidth | 716 GB/s | 432 GB/s | 316.8 GB/s | ? |
| Memory Size | 16 GB | 24 GB | 12 GB | 16 GB |
| TDP | 235 W | 250 W | 250 W | TBD |
| Transistors | 15.3 billion | 12 billion | 8 billion | 15.3 billion |
| GPU Die Size | 610 mm² | 471 mm² | 601 mm² | 610 mm² |
There are some interesting stats here that may not be obvious at first glance. Most interesting is that despite the pricing and segmentation, the GP100 is not the de facto fastest Quadro card from NVIDIA depending on your workload. With 3584 CUDA cores running at somewhere around 1400 MHz at Boost speeds, the single precision (32-bit) rating for GP100 is 10.3 TFLOPS, less than the recently released P6000 card. Based on GP102, the P6000 has 3840 CUDA cores running at something around 1500 MHz for a total of 12 TFLOPS.
GP100 (full) Block Diagram
Clearly the placement for Quadro GP100 is based around its 64-bit, double precision performance, and its ability to offer real-time simulations on more complex workloads than other Pascal-based Quadro cards can offer. The Quadro GP100 offers a 1/2 DP compute rate, totaling 5.2 TFLOPS. The P6000, on the other hand, is only capable of 0.375 TFLOPS with the standard, consumer level 1/32 DP rate. Inclusion of ECC memory support on GP100 is also something no other recent Quadro card has.
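The TFLOPS ratings follow from the core counts and boost clocks in the table. A rough sketch in Python, assuming each CUDA core retires one fused multiply-add (counted as two floating point operations) per clock at the listed boost clock; the function name is made up for this example.

```python
def peak_tflops(cores: int, boost_mhz: float, flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak: cores x FLOPs per clock x clock rate, in TFLOPS."""
    return cores * flops_per_core_per_clock * boost_mhz * 1e6 / 1e12


# (FP32 cores, FP64 cores, boost clock in MHz), taken from the table above.
cards = {
    "Quadro GP100": (3584, 1792, 1442.0),
    "Quadro P6000": (3840, 120, 1530.0),
    "Quadro M6000": (3072, 96, 1152.0),
}

for name, (fp32_cores, fp64_cores, boost_mhz) in cards.items():
    sp = peak_tflops(fp32_cores, boost_mhz)
    dp = peak_tflops(fp64_cores, boost_mhz)
    print(f"{name}: {sp:.2f} TFLOPS FP32, {dp:.3f} TFLOPS FP64")
```

The results land close to the quoted figures (about 10.3 and 5.2 TFLOPS for the GP100) but not exactly on all of them; the P6000 comes out a little under 12, which suggests the official ratings use slightly different clocks or rounding.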
Raw graphics performance and throughput is going to be questionable until someone does some testing, but it seems likely that the Quadro P6000 will still be the best solution for that by at least a slim margin. With a higher CUDA core count, higher clock speeds and equivalent architecture, the P6000 should run games, graphics rendering and design applications very well.
There are other important differences offered by the GP100. The memory system is built around a 16GB HBM2 implementation, which means more total memory bandwidth but a lower capacity than the 24GB Quadro P6000. With 66% more memory bandwidth, the GP100 gives an advantage to applications that are bound by pixel throughput, as long as the compute capability keeps up on the back end.
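The 66% figure comes straight from the bandwidth numbers in the specification table. A quick Python check, with a made-up helper name:

```python
def percent_more(new: float, old: float) -> float:
    """How much larger `new` is than `old`, as a percentage."""
    return (new / old - 1.0) * 100.0


# Memory bandwidth in GB/s and capacity in GB, from the specification table.
gp100_bw, p6000_bw, m6000_bw = 716.0, 432.0, 316.8
gp100_gb, p6000_gb = 16.0, 24.0

print(f"GP100 vs P6000 bandwidth: +{percent_more(gp100_bw, p6000_bw):.0f}%")
print(f"GP100 vs M6000 bandwidth: +{percent_more(gp100_bw, m6000_bw):.0f}%")
print(f"P6000 vs GP100 capacity:  +{percent_more(p6000_gb, gp100_gb):.0f}%")
```

The trade runs both ways: the GP100 has about two-thirds more bandwidth, while the P6000 has half again as much capacity.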
Subject: Graphics Cards | February 6, 2017 - 11:43 AM | Sebastian Peak
Tagged: video card, silent, Passive, palit, nvidia, KalmX, GTX 1050 Ti, graphics card, gpu, geforce
Palit is offering a passively-cooled GTX 1050 Ti option with their new KalmX card, which features a large heatsink and (of course) zero fan noise.
"With passive cooler and the advanced powerful Pascal architecture, Palit GeForce GTX 1050 Ti KalmX - pursue the silent 0dB gaming environment. Palit GeForce GTX 1050 Ti gives you the gaming horsepower to take on today’s most demanding titles in full 1080p HD @ 60 FPS."
The specs are identical to a reference GTX 1050 Ti (4GB GDDR5 @ 7 Gb/s, Base 1290/Boost 1392 MHz, etc.), so expect the full performance of this GPU - with some moderate case airflow, no doubt.
We don't have specifics on pricing or availability just yet.
Subject: Graphics Cards | February 4, 2017 - 03:29 PM | Tim Verry
Tagged: micron, graphics memory, gddr6
This year is shaping up to be a good one for memory, with the promise of 3D XPoint (Intel/Micron), HBM2 (SK Hynix and Samsung), and now GDDR6 graphics memory from Micron all launching. While GDDR6 was originally planned to launch next year, Micron recently announced its intention to start producing the memory chips by the latter half of 2017, which would put it much earlier than previously expected.
Computer World reports that Micron is citing the rise of e-sports and gaming as the primary reason for shifting GDDR6 production into high gear and moving up the launch window; it says they are driving a computer market that now sees three-year upgrade cycles rather than five-year cycles. (I am not sure how accurate that is, however, as it seems like PCs are actually staying relevant longer between upgrades, but I digress.) The company expects the e-sports market to grow to 500 million fans by 2020, and it is a growing market that Micron wants to stay relevant in.
If you missed our previous coverage, GDDR6 is the successor to GDDR5 and offers twice the bandwidth at 16 Gb/s (gigabits per second) per pin. It is also faster than GDDR5X (12 Gb/s) and uses 20% less power, which the gaming laptop market will appreciate. HBM2 still holds the bandwidth crown, though, as it offers 256 GB/s per stack and up to 1TB/s with four stacks connected to a GPU on package.
As such, High Bandwidth Memory (HBM2 and then HBM3) will power the high-end gaming and professional graphics cards, while GDDR6 will become the memory used for mid-range cards. GDDR5X, which is actually capable of going faster but will likely not be pushed much past 12 Gb/s if GDDR6 does come out this soon, will replace GDDR5 on most if not all of the lower-end products.
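To see how the Gb/s and GB/s figures relate, here is a small Python sketch. It treats the GDDR numbers as per-pin data rates and assumes a 32-bit interface per GDDR chip and a 1024-bit interface per HBM2 stack running at 2 Gb/s per pin. Those widths and the HBM2 pin rate are my assumptions and are not stated above, and the function name is made up.

```python
def bandwidth_gb_per_s(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gb/s) x pins / 8 bits per byte."""
    return pin_rate_gbps * bus_width_bits / 8.0


GDDR_CHIP_WIDTH = 32      # bits per GDDR chip (assumption)
HBM2_STACK_WIDTH = 1024   # bits per HBM2 stack (assumption)

for name, rate in (("GDDR5", 8.0), ("GDDR5X", 12.0), ("GDDR6", 16.0)):
    print(f"{name}: {bandwidth_gb_per_s(rate, GDDR_CHIP_WIDTH):.0f} GB/s per chip")

per_stack = bandwidth_gb_per_s(2.0, HBM2_STACK_WIDTH)
print(f"HBM2: {per_stack:.0f} GB/s per stack, {4 * per_stack:.0f} GB/s with four stacks")
```

HBM2 wins on bandwidth by being very wide instead of very fast per pin, which is why it needs to sit on the package with the GPU.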
I am not sure if Micron’s reasoning of e-sports, faster upgrade cycles, and VR being the motivating factor(s) to ramping up production early is sound or not, but I will certainly take the faster memory coming out sooner rather than later! Depending on exactly when in 2017 the chips start rolling off the fabs, we could see graphics cards using the new memory technology as soon as early 2018 (just in time for CES announcements? oh boy I can see the PR flooding in already! hehe).
Will Samsung change course as well and try for a 2017 release for its GDDR6 memory?
Are you ready for GDDR6?