NVIDIA Releases 358.50 WHQL Game Ready Drivers

Subject: Graphics Cards | October 7, 2015 - 01:45 PM |
Tagged: opengl es 3.2, nvidia, graphics drivers, geforce

The GeForce Game Ready 358.50 WHQL driver has been released so users can perform their updates before the Star Wars Battlefront beta goes live tomorrow (unless you already received a key). As with every “Game Ready” driver, NVIDIA ensures that the essential performance and stability tweaks are rolled into this version, and tests it against the title. It is WHQL certified too, which is a recent priority for NVIDIA. Years ago, “Game Ready” drivers were often classified as Beta, but the company now intends to pass its work through Microsoft for a final sniff test.


Another interesting addition to this driver is support for the OpenGL 2015 ARB extensions and OpenGL ES 3.2. Until now, using OpenGL ES 3.2 on the PC (to develop software with it, for instance) required a separate driver release, and that had been the case ever since the spec was released at SIGGRAPH. It has now been rolled into the main, public driver. The mobile devs who use their production machines to play Battlefront rejoice, I guess. It might also be useful if developers, for instance at Mozilla or Google, want to create pre-release implementations of future WebGL specs.

Source: NVIDIA

Who Decided to Call a Lightweight API "Metal"?

Subject: Graphics Cards | October 7, 2015 - 07:01 AM |
Tagged: opengl, metal, apple

Ars Technica took it upon themselves to benchmark Metal in the latest OS X El Capitan release. Even though OpenGL on Mac OS X is not considered to be on par with its Linux counterparts, which is probably due to the driver situation until recently, it pulls ahead of Metal in many situations.


Image Credit: Ars Technica

Unlike the other graphics APIs, Metal uses the traditional binding model. Basically, you have a GPU object that you attach your data to, then call one of a handful of “draw” functions to signal the driver. DirectX 12, Vulkan, and Mantle, on the other hand, treat work like commands on queues. The latter model works better in multi-core environments, and it aligns with GPU compute APIs, but the former is easier to port OpenGL and DirectX 11 applications to.

Ars Technica notes that faster GPUs, such as the NVIDIA GeForce GTX 680MX, show higher gains than slower ones. Their “best explanation” is that “faster GPUs can offload more work from the CPU”. That is pretty much true, yes. The new APIs are designed to keep GPUs loaded and working as much as possible, because they really do sit around doing nothing a lot. If you are able to keep a GPU loaded, because it can't accept much load in the first place, then there is little benefit to decreasing CPU load or spreading out across multiple cores.
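
The reasoning above can be sketched with a toy frame-time model. It assumes, purely for illustration, that CPU submission and GPU rendering overlap perfectly, so a frame takes as long as the slower of the two. The function names and millisecond figures are made up for the example and are not measurements from the Ars Technica article.

```python
def frame_time_ms(cpu_ms: float, gpu_ms: float) -> float:
    """Toy model: CPU and GPU work overlap, so the slower side sets the pace."""
    return max(cpu_ms, gpu_ms)


def speedup_from_lower_cpu_cost(cpu_before: float, cpu_after: float, gpu_ms: float) -> float:
    """Frame-rate ratio gained by cutting the per-frame CPU (driver/API) cost."""
    return frame_time_ms(cpu_before, gpu_ms) / frame_time_ms(cpu_after, gpu_ms)


# Hypothetical numbers: the lighter API halves CPU cost per frame (10 ms -> 5 ms).
slow_gpu = speedup_from_lower_cpu_cost(10.0, 5.0, gpu_ms=25.0)  # GPU-bound before and after
fast_gpu = speedup_from_lower_cpu_cost(10.0, 5.0, gpu_ms=6.0)   # CPU-bound before the change

print(f"slow GPU speedup: {slow_gpu:.2f}x")
print(f"fast GPU speedup: {fast_gpu:.2f}x")
```

In this model the slow GPU sees no gain at all because it was the bottleneck the whole time, while the fast GPU was waiting on the CPU and benefits directly. Real pipelines overlap less cleanly, so treat the numbers as a direction, not a prediction.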

Granted, there are many ways that benchmarks like these could be used incorrectly. I'll assume that Ars Technica and GFXBench are not making any simple mistakes, but it's good to be critical just in case.

Source: Ars Technica

NVIDIA Announces New "Bullets or Blades" GeForce Bundle

Subject: Graphics Cards | October 7, 2015 - 01:51 AM |

The latest game bundle for NVIDIA GPU customers offers the buyer a choice between Tom Clancy’s Rainbow Six Siege or Assassin’s Creed Syndicate.


To qualify for the free game you need to purchase a GTX 980 Ti, GTX 980, or GTX 970 graphics card. On the mobile side of things, purchasing a laptop with GTX 970M or above graphics earns the game.

"It’s the final few months of the year, and as always that means a rush of new triple-A games that promise to excite and delight over the Holiday season. This year, Ubisoft's Assassin’s Creed Syndicate and Tom Clancy's Rainbow Six Siege are vying for glory. And to ensure the definitive versions are found on PC we’ve teamed up with Ubisoft once again to add NVIDIA GameWorks effects to each, bringing richer, more detailed experiences to your desktop."

The Bullets or Blades bundle has been underway since 10/06/15. To qualify for a game code, the purchase must be made from a retailer that is participating in the program. Full details are available from NVIDIA here.

Source: NVIDIA

4K performance when you can spend at least $1.3K

Subject: Graphics Cards | October 6, 2015 - 02:40 PM |
Tagged: 4k, gtx titan x, fury x, GTX 980 Ti, crossfire, sli

[H]ard|OCP shows off just what you can achieve when you spend over $1000 on graphics cards and have a 4K monitor in their latest review.  In Project Cars you can expect never to see less than 40fps with everything cranked to maximum and if you invested in Titan X's you can even enable DS2X AntiAliasing for double the resolution, before down sampling.  Witcher 3 is a bit more challenging and no card is up for HairWorks without a noticeable hit to performance.  FarCry 4 still refuses to believe in Crossfire and as far as NVIDIA performance goes, if you want to see soft shadows you are going to have to invest in a pair of Titan X's.  Check out the full review to see what the best of the current market is capable of.


"The ultimate 4K battle is about to begin, AMD Radeon R9 Fury X CrossFire, NVIDIA GeForce GTX 980 Ti SLI, and NVIDIA GeForce GTX TITAN X SLI will compete for the best gameplay experience at 4K resolution. Find out what $1300 to $2000 worth of GPU backbone will buy you. And find out if Fiji really can 4K."

Source: [H]ard|OCP

AMD Releases Catalyst 15.9.1 to Fix Several Issues

Subject: Graphics Cards | October 5, 2015 - 07:13 AM |
Tagged: graphics drivers, amd

Apparently users of AMD's Catalyst 15.9 drivers have been experiencing issues. Specifically, “major memory leaks” could be caused by adjusting windows, such as resizing them or snapping them to edges of the desktop. According to PC Gamer, AMD immediately told users to roll back when they found out about the bug.


They have since fixed it with Catalyst 15.9.1 Beta. This subversion driver also fixes crashes and potential “signal loss” problems with a BenQ FreeSync monitor. As such, if you were interested in playing around with the Catalyst 15.9 beta driver, then it should be safe to do so now. I wish I could offer more input, but I just found out about it and it seems pretty cut-and-dry: if you had problems, they should be fixed. The update is available here.

Source: PC Gamer

Report: AMD's Dual-GPU Fiji XT Card Might Be Coming Soon

Subject: Graphics Cards | October 5, 2015 - 02:33 AM |
Tagged: rumor, report, radeon, graphics cards, Gemini, fury x, fiji xt, dual-GPU, amd

The AMD R9 Fury X, Fury, and Nano have all been released, but a dual-GPU Fiji XT card could be on the way soon according to a new report.


Back in June at AMD's E3 event we were shown Project Quantum, AMD's concept for a powerful dual-GPU system in a very small form-factor. It was speculated that the system was actually housing an unreleased dual-GPU graphics card, which would have made sense given the very small size of the system (and mini-ITX motherboard therein). Now a report from WCCFtech is pointing to a manifest that just might be a shipment of this new dual-GPU card, and the code-name is Gemini.


"Gemini is the code-name AMD has previously used in the past for dual GPU variants and surprisingly, the manifest also contains another phrase: ‘Tobermory’. Now this could simply be a reference to the port that the card shipped from...or it could be the actual codename of the card, with Gemini just being the class itself."

The manifest also indicates a cooler from Cooler Master, the maker of the liquid cooling solution for the Fury X. As the Fury X has had its share of criticism for pump whine, it would be interesting to see how a dual-GPU cooling solution would fare in that department, though we could be seeing an entirely new generation of the pump as well. Of course, speculation on an unreleased product like this could be incorrect, and verifiable hard details aren't available yet. Still, if the dual-GPU card is based on a pair of full Fiji XT cores, the specs could be very impressive to say the least:

  • Core: Fiji XT x2
  • Stream Processors: 8192
  • GCN Compute Units: 128
  • ROPs: 128
  • TMUs: 512
  • Memory: 8 GB (4GB per GPU)
  • Memory Interface: 4096-bit x2
  • Memory Bandwidth: 1024 GB/s

In addition to the specifics above the report also discussed the possibility of 17.2 TFLOPS of performance based on 2x the performance of Fury X, which would make the Gemini product one of the most powerful single-card GPU solutions in the world. The card seems close enough to the final stage that we should expect to hear something official soon, but for now it's fun to speculate - unless of course the speculation concerns a high initial retail price, and unfortunately something at or above $1000 is quite likely. We shall see.
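
The 17.2 TFLOPS figure is consistent with the usual peak-throughput arithmetic: stream processors, times two operations per clock (a fused multiply-add), times clock speed. The sketch below assumes the Fury X reference clock of 1050 MHz, which the report does not state, and assumes that both GPUs on the card would run at that full clock.

```python
FURY_X_STREAM_PROCESSORS = 4096  # per Fiji XT GPU (half of the 8192 listed above)
FURY_X_CLOCK_GHZ = 1.05          # assumption: Fury X reference clock, not from the report
FLOPS_PER_SP_PER_CLOCK = 2       # one fused multiply-add counts as two operations


def peak_tflops(stream_processors: int, clock_ghz: float) -> float:
    """Theoretical single-precision peak in TFLOPS."""
    return stream_processors * FLOPS_PER_SP_PER_CLOCK * clock_ghz / 1000.0


single = peak_tflops(FURY_X_STREAM_PROCESSORS, FURY_X_CLOCK_GHZ)
dual = 2 * single  # assumes neither GPU is down-clocked on the dual card

print(f"single Fiji XT: {single:.1f} TFLOPS, dual: {dual:.1f} TFLOPS")
```

This is a theoretical ceiling. A dual-GPU card could ship at lower clocks to stay within its power budget, and games only approach the doubled figure when CrossFire scales well.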

Source: WCCFtech
Manufacturer: NVIDIA

GPU Enthusiasts Are Throwing a FET

NVIDIA is rumored to launch Pascal in early (~April-ish) 2016, although some are skeptical that it will even appear before the summer. The design was finalized months ago, and unconfirmed shipping information claims that chips are being stockpiled, which is typical when preparing to launch a product. It is expected to compete against AMD's rumored Arctic Islands architecture, which will, according to its also rumored numbers, be very similar to Pascal.

This architecture is a big one for several reasons.


Image Credit: WCCFTech

First, it will jump two full process nodes. Current desktop GPUs are manufactured at 28nm, which was first introduced with the GeForce GTX 680 all the way back in early 2012, but Pascal will be manufactured on TSMC's 16nm FinFET+ technology. Smaller features have several advantages, but a huge one for GPUs is the ability to fit more complex circuitry in the same die area. This means that you can include more copies of elements, such as shader cores, and do more in fixed-function hardware, like video encode and decode.

That said, we got a lot more life out of 28nm than we really should have. Chips like GM200 and Fiji are huge, relatively power-hungry, and complex, which is a terrible idea to produce when yields are low. I asked Josh Walrath, who is our go-to for analysis of fab processes, and he believes that FinFET+ is probably even more complicated today than 28nm was in the 2012 timeframe, which was when it launched for GPUs.

It's two full steps forward from where we started, but we've been tiptoeing since then.


Image Credit: WCCFTech

Second, Pascal will introduce HBM 2.0 to NVIDIA hardware. HBM 1.0 was introduced with AMD's Radeon Fury X, and it helped in numerous ways -- from smaller card size to a triple-digit percentage increase in memory bandwidth. The 980 Ti can talk to its memory at about 300GB/s, while Pascal is rumored to push that to 1TB/s. Capacity won't be sacrificed, either. The top-end card is expected to contain 16GB of global memory, which is twice what any console has. This means less streaming, higher resolution textures, and probably even left-over scratch space for the GPU to generate content in with compute shaders. Also, according to AMD, HBM is an easier architecture to communicate with than GDDR, which should mean a savings in die space that could be used for other things.
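
The bandwidth figures come from simple arithmetic: bus width in bits, times the data rate per pin, divided by eight bits per byte. The sketch below uses the GTX 980 Ti's 384-bit GDDR5 at 7 Gbps and the Fury X's four 1024-bit HBM stacks at 1 Gbps. The Pascal line is an assumption for illustration only: four HBM 2.0 stacks at 2 Gbps per pin, a configuration the rumors do not confirm.

```python
def bandwidth_gb_per_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s: pins * Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8.0


# GTX 980 Ti: 384-bit GDDR5 at 7 Gbps per pin.
gtx_980_ti = bandwidth_gb_per_s(384, 7.0)

# Fury X (HBM 1.0): four 1024-bit stacks at 1 Gbps per pin.
fury_x = bandwidth_gb_per_s(4 * 1024, 1.0)

# Assumed HBM 2.0 layout (not confirmed for Pascal): four 1024-bit stacks at 2 Gbps per pin.
hbm2_four_stacks = bandwidth_gb_per_s(4 * 1024, 2.0)

for name, value in [("GTX 980 Ti", gtx_980_ti), ("Fury X", fury_x), ("4x HBM 2.0", hbm2_four_stacks)]:
    print(f"{name}: {value:.0f} GB/s")
```

HBM gets its bandwidth from a very wide, relatively slow interface, where GDDR5 uses a narrow, fast one. That trade is part of why the memory can sit on the package next to the GPU.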

Third, the architecture includes native support for three levels of floating point precision. Maxwell, due to how limited 28nm was, saved on complexity by reducing 64-bit IEEE 754 floating-point performance to 1/32nd of the 32-bit rate, because FP64 values are rarely used in video games. This saved transistors, but was a huge, order-of-magnitude step back from the 1/3rd ratio found on the Kepler-based GK110. While it probably won't be back to the 1/2 ratio that was found in Fermi, Pascal should be much better suited for GPU compute.


Image Credit: WCCFTech

Mixed precision could help video games too, though. Remember how I said it supports three levels? The third one is 16-bit, which is half of the format that is commonly used in video games. Sometimes, that is sufficient. If so, Pascal is said to do these calculations at twice the rate of 32-bit. We'll need to see whether enough games (and other applications) are willing to drop down in precision to justify the die space that these dedicated circuits require, but it should double the performance of anything that does.
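
To get a feel for when 16-bit is "sufficient", the sketch below rounds a few values to half and single precision using Python's standard `struct` module, which supports the IEEE 754 half-precision format. The sample values are arbitrary stand-ins (a color channel, a blend weight, a world coordinate) and are not taken from any real game.

```python
import struct


def to_half(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_single(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision (32-bit) value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


samples = {
    "color channel": 0.5,       # exactly representable in both formats
    "blend weight": 0.1,        # small error in half precision
    "world coordinate": 4097.0, # half precision cannot hold every integer this large
}

for label, value in samples.items():
    print(f"{label}: fp32={to_single(value)!r} fp16={to_half(value)!r}")
```

Half precision keeps about three significant decimal digits, which is plenty for colors and normalized weights but not for large coordinates. That is why a renderer would typically pick the format per calculation, not globally.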

So basically, this generation should provide the massive jump in performance that enthusiasts have been waiting for. GPU memory bandwidth and the number of features that can be printed into the die are two major bottlenecks for most modern games and GPU-accelerated software, and both are set to increase. We'll need to wait for benchmarks to see how the theoretical maps to practical, but it's a good sign.

The fast and the Fury(ous): 4K

Subject: Graphics Cards | September 28, 2015 - 04:45 PM |
Tagged: R9 Fury, asus strix r9 fury, r9 390x, GTX 980, crossfire, sli, 4k

Bring your wallets to this review from [H]ard|OCP, which pits multiple AMD and NVIDIA GPUs against each other at 4K resolution; no matter the outcome, it won't be cheap! They used the Catalyst 15.8 Beta and the GeForce 355.82 WHQL, which were the latest drivers available at the time of writing, and tried out Windows 10 Pro x64 as well. There were some interesting results; for instance, you want an AMD card when driving in the rain in Project Cars, as the GTX 980s immediately slowed down in inclement weather. With Witcher 3, AMD again delivered frames faster, but unfortunately the old spectre of stuttering appeared, the source of which will be familiar to those of you who follow our Frame Rating tests. Dying Light proved to be a game that liked VRAM, with the 390X taking the top spot, though sadly neither AMD card could handle Crossfire in Far Cry 4. There is a lot of interesting information in the review and AMD's cards certainly show their mettle, but the overall winner is not perfectly clear: [H] chose the R9 Fury, with a caveat about Crossfire support.


"We gear up for multi-GPU gaming with AMD Radeon R9 Fury CrossFire, NVIDIA GeForce GTX 980 SLI, and AMD Radeon R9 390X CrossFire and share our head-to-head results at 4K resolution and find out which solution offers the best gameplay experience. How well does Fiji game when utilized in a CrossFire configuration?"

Source: [H]ard|OCP

NVIDIA Publishes DirectX 12 Tips for Developers

Subject: Graphics Cards | September 26, 2015 - 09:10 PM |
Tagged: microsoft, windows 10, DirectX 12, dx12, nvidia

Programming with DirectX 12 (and Vulkan, and Mantle) is a much different process than most developers are used to. The biggest change is how work is submitted to the driver. Previously, engines would bind attributes to a graphics API and issue one of a handful of “draw” commands, which turns the current state of the API into a message. Drivers would play around with queuing and manipulating these messages to optimize how the orders are sent to the graphics device, but the game developer had no control over that.


Now, the new graphics APIs are built more like command lists. Instead of bind, call, bind, call, and so forth, applications request queues to dump work into, and assemble the messages themselves. The APIs even allow these messages to be bundled together and sent as a whole. This gives applications direct control over memory and the ability to distribute a lot of the command control across multiple CPU cores. An application is only as fast as its slowest (relevant) thread, so the ability to spread work out increases actual performance.
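
As a rough conceptual sketch, the command-list model can be mimicked in a few lines of Python. This is not the DirectX 12 or Vulkan API: `CommandList`, `Queue`, and `record` are hypothetical stand-ins, and "executing" a command only appends a string to a log. The point is the shape of the work: each thread records its own list, and the application submits the finished lists together in an order it chooses.

```python
from concurrent.futures import ThreadPoolExecutor


class CommandList:
    """Hypothetical stand-in for an API command list: an ordered record of work."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.commands: list[str] = []

    def draw(self, mesh: str) -> None:
        self.commands.append(f"draw {mesh}")


class Queue:
    """Hypothetical stand-in for a GPU queue; 'executing' only appends to a log."""

    def __init__(self) -> None:
        self.executed: list[str] = []

    def submit(self, command_lists: list[CommandList]) -> None:
        # Whole lists are handed over in one call, in the order the application chose.
        for command_list in command_lists:
            self.executed.extend(command_list.commands)


def record(chunk: tuple[str, list[str]]) -> CommandList:
    """Runs on a worker thread; each thread fills its own list, so no locking is needed."""
    name, meshes = chunk
    command_list = CommandList(name)
    for mesh in meshes:
        command_list.draw(mesh)
    return command_list


scene = [("terrain", ["hill", "river"]), ("props", ["crate", "barrel"]), ("actors", ["player"])]

with ThreadPoolExecutor(max_workers=3) as pool:
    recorded = list(pool.map(record, scene))  # map() returns results in input order

queue = Queue()
queue.submit(recorded)
print(queue.executed)
```

Python threads will not actually speed this up, so the sketch only illustrates the structure. In a real engine the recording step is where the per-draw CPU cost lives, which is why spreading it across cores matters.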

NVIDIA has created a large list of things that developers should do, and others that they should not, to increase performance. Pretty much all of them apply equally, regardless of graphics vendor, but there are a few NVIDIA-specific comments, particularly the ones about NvAPI at the end and a few labeled notes in the “Root Signatures” category.

The tips are fairly diverse, covering everything from how to efficiently use things like command lists, to how to properly handle multiple GPUs, and even how to architect your engine itself. Even if you're not a developer, it might be interesting to look over for clues about what makes the API tick.

Source: NVIDIA

Nintendo Joins the Khronos Group

Subject: Graphics Cards | September 26, 2015 - 03:46 PM |
Tagged: Nintendo, Khronos

Console developers need to use the APIs that are laid out by the system's creator. Nintendo has had its own graphics API, called GX, for the last three generations, although it is rumored to be somewhat like OpenGL. A few days ago, Nintendo's logo appeared on the Khronos Group's website as a Contributor Member. This leads sites like The Register to speculate that Nintendo “pledges allegiance to the Vulkan (API)”.

I wouldn't be so hasty.


There are many reasons why a company would want to become a member of the Khronos Group. Microsoft, for instance, decided that the small, $15,000 USD/year membership fee was worth it to influence the future of WebGL. Nintendo, at least currently, does not make their own web browser; they license NetFront from Access Co. Ltd., but that could change (just like their original choice of Opera Mini did). Even with a licensed browser, they might want to discuss and vote on the specifics. But yes, WebGL is unlikely to be on their minds, let alone a driving reason, especially since they are not involved with the W3C. Another unlikely option is OpenCL, especially if they get into cloud services, but I can't see them caring enough about the API to do anything more than blindly use it.

Vulkan is, in fact, most likely what Nintendo is interested in, but that also doesn't mean that they will support it. The membership fee is quite low for a company like Nintendo, and, even if they don't use the API, their input could benefit them, especially since they rely upon third parties for graphics processors. Pushing for additions to Vulkan could force GPU vendors to adopt it, so it will be available for their own APIs, and so forth. There might even be some learning, up to the limits of the Khronos Group's confidentiality requirements.

Or, of course, Nintendo could adopt the Vulkan API to some extent. We'll see. Either way, the gaming company is beginning to open up with industry bodies. This could be positive.

Source: NeoGAF