GeForce GTX 980M Performance Testing
When NVIDIA launched the GeForce GTX 980 and GTX 970 graphics cards last month, part of the discussion at our meetings also centered around the mobile variants of Maxwell. The NDA was a bit later though, and Scott wrote up a short story announcing the release of the GTX 980M and the GTX 970M mobility GPUs. Both of these GPUs are based on the same GM204 design as the desktop cards though, as you should have come to expect by now, they carry lower specifications than the similarly-named desktop options. Take a look:
| |GTX 980M|GTX 970M| | | |
|Memory|Up to 4GB|Up to 3GB|4GB|4GB|4GB/8GB|
|Memory Rate|2500 MHz|2500 MHz|7.0 (GT/s)|7.0 (GT/s)|2500 MHz|
Just like the desktop models, GTX 980M and GTX 970M are built on the 28nm process technology and are tweaked and built for power efficiency - one of the reasons the mobile release of this product is so interesting.
With a CUDA core count of 1536, the GTX 980M has 25% fewer shader cores than the 2048-core desktop GTX 980 (put another way, the desktop part has 33% more), along with a slightly lower base clock speed. The result is a peak theoretical performance of 3.189 TFLOPs, compared to 4.6 TFLOPs on the desktop GTX 980. In fact, that is only slightly higher than the Kepler-based GTX 880M, which clocks in with the same CUDA core count (1536) but a TFLOP capability of 2.9. Bear in mind that the GTX 880M is using a different architecture design than the GTX 980M; Maxwell's design advantages go beyond just CUDA core count and clock speed.
The GTX 970M is even smaller, with a CUDA core count of 1280 and peak performance rated at 2.365 TFLOPs. Also notice that the memory bus width has shrunk from 256-bit to 192-bit for this part.
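Those peak figures follow from the usual rule of thumb that each CUDA core can retire one fused multiply-add (counted as two floating-point operations) per clock. As a rough sketch, the snippet below works backwards from the quoted TFLOPs to the base clock each figure implies, and checks the core-count gap against the desktop card. The 2048-core count for the desktop GTX 980 is NVIDIA's published spec and is treated here as an assumption, and the function names are purely illustrative.

```python
FLOPS_PER_CORE_PER_CLOCK = 2   # one fused multiply-add counts as two operations
GTX_980_DESKTOP_CORES = 2048   # assumption: NVIDIA's published desktop spec


def peak_tflops(cuda_cores: int, clock_mhz: float) -> float:
    """Theoretical single-precision throughput in TFLOPs."""
    return cuda_cores * FLOPS_PER_CORE_PER_CLOCK * clock_mhz * 1e6 / 1e12


def implied_clock_mhz(cuda_cores: int, tflops: float) -> float:
    """Clock speed in MHz that a quoted TFLOPs figure implies."""
    return tflops * 1e12 / (cuda_cores * FLOPS_PER_CORE_PER_CLOCK) / 1e6


def fraction_fewer(cores: int, reference_cores: int) -> float:
    """How much smaller `cores` is than `reference_cores`, as a fraction."""
    return 1 - cores / reference_cores


# Core counts and TFLOPs figures as quoted in the article.
for name, cores, tflops in [("GTX 980M", 1536, 3.189), ("GTX 970M", 1280, 2.365)]:
    clock = implied_clock_mhz(cores, tflops)
    gap = fraction_fewer(cores, GTX_980_DESKTOP_CORES)
    print(f"{name}: ~{clock:.0f} MHz implied, {gap:.0%} fewer cores than GTX 980")
```

Keep in mind this is a ceiling, not a benchmark: it says nothing about how efficiently Maxwell or Kepler keeps those cores fed.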
As is typically the case with mobile GPUs, the memory speed of the GTX 980M and GTX 970M is significantly lower than the desktop parts. While the GeForce GTX 980 and 970 that install in your desktop PC will have memory running at 7.0 GHz, the mobile versions will run at 5.0 GHz in order to conserve power.
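To put the memory clocks in perspective, peak memory bandwidth is just the bus width in bytes multiplied by the effective data rate. Here is a minimal sketch using the numbers from this article; the 256-bit bus for the GTX 980M is inferred from the comment that the GTX 970M "shrunk from 256-bit to 192-bit", and the helper name is my own.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gtps: float) -> float:
    """Peak bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gtps


# (bus width in bits, effective data rate in GT/s)
configs = {
    "GTX 980 (desktop)": (256, 7.0),
    "GTX 980M": (256, 5.0),   # bus width inferred from the text, see above
    "GTX 970M": (192, 5.0),
}

for name, (bus, rate) in configs.items():
    print(f"{name}: {memory_bandwidth_gbs(bus, rate):.0f} GB/s")
```

By this estimate the narrower bus costs the GTX 970M a quarter of the GTX 980M's bandwidth, on top of the clock reduction both mobile parts share.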
From a feature set standpoint though, the GTX 980M/970M are very much the same as the desktop parts that I looked at in September. You will have support for VXGI, NVIDIA's new custom global illumination technology, Multi-Frame AA and maybe most interestingly, Dynamic Super Resolution (DSR). DSR allows you to render a game at a higher resolution and then use a custom filter to downsample it back to your panel's native resolution. For mobile gamers that are using 1080p screens (as our test sample shipped with) this is a good way to utilize the power of your GPU for less power-hungry games, while getting a surprisingly good image at the same time.
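For a feel of what DSR asks of the GPU, here is a small sketch of the render-then-downsample idea. It assumes the DSR factor multiplies the total pixel count (so each axis scales by its square root), and it uses a plain box filter as a stand-in for NVIDIA's own filter, which is not described in this article. Both function names are hypothetical.

```python
import math


def dsr_render_resolution(native_w: int, native_h: int, factor: float) -> tuple[int, int]:
    """Internal render size for a DSR factor applied to the total pixel count."""
    scale = math.sqrt(factor)
    return round(native_w * scale), round(native_h * scale)


def box_downsample(pixels: list[list[float]], step: int) -> list[list[float]]:
    """Average each step x step block into one output pixel.

    Stand-in filter only; image dimensions must be divisible by `step`.
    """
    height, width = len(pixels), len(pixels[0])
    return [
        [
            sum(pixels[y + dy][x + dx] for dy in range(step) for dx in range(step))
            / (step * step)
            for x in range(0, width, step)
        ]
        for y in range(0, height, step)
    ]


# A 1080p panel with a 4x factor renders four times the pixels per frame.
print(dsr_render_resolution(1920, 1080, 4.0))
```

The practical takeaway is that a 4x factor on a 1080p notebook panel means shading a 4K-sized frame, which is why it suits less demanding titles.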
SLI Setup and Testing Configuration
The idea of multi-GPU gaming is pretty simple on the surface. By adding another GPU into your gaming PC, the game and the driver are able to divide the workload of the game engine, send half of the work to one GPU and half to another, and then combine that work on your screen in the form of successive frames. This should make the average frame rate much higher, improve smoothness and just basically make the gaming experience better. However, the implementation of multi-GPU technologies like NVIDIA SLI and AMD CrossFire is much more difficult than the simple explanation above. We have traveled many steps in this journey and while things have improved in several key areas, there is still plenty of work to be done in others.
As it turns out, support for GPUs beyond two seems to be one of those areas ready for improvement.
When the new NVIDIA GeForce GTX 980 launched last month, my initial review of the product included performance results for GTX 980 cards running in a 2-Way SLI configuration, by far the most common derivative. As it happens though, another set of reference GeForce GTX 980 cards found their way to our office and of course we needed to explore the world of 3-Way and 4-Way SLI support and performance on the new Maxwell GPU.
The dirty secret for 3-Way and 4-Way SLI (and CrossFire for that matter) is that it just doesn't work as well or as smoothly as 2-Way configurations. Much more work is put into standard SLI setups as those are by far the most common, and it doesn't help that optimizing for 3-4 GPUs is more complex. Some games will scale well, others will scale poorly; hell, some even scale in the other direction.
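One simple way to put a number on "scales well" versus "scales the other direction" is to compare the multi-GPU frame rate against the single-card result. The sketch below is only an illustration: the frame rates in it are made-up placeholders, not measurements from our testing, and the function names are my own.

```python
def scaling_factor(single_gpu_fps: float, multi_gpu_fps: float) -> float:
    """Speedup over one card; below 1.0 means adding GPUs made things slower."""
    return multi_gpu_fps / single_gpu_fps


def scaling_efficiency(single_gpu_fps: float, multi_gpu_fps: float, gpu_count: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfect)."""
    return scaling_factor(single_gpu_fps, multi_gpu_fps) / gpu_count


# Hypothetical results for one game, chosen only to show the three outcomes.
single = 50.0
for gpus, fps in [(2, 95.0), (3, 110.0), (4, 45.0)]:
    print(
        f"{gpus}-Way: {scaling_factor(single, fps):.2f}x speedup, "
        f"{scaling_efficiency(single, fps, gpus):.0%} of ideal"
    )
```

Average frame rate is only half the story, of course; a configuration can post a good speedup and still feel worse if frame delivery is uneven.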
Let's see what the current state of high GPU count SLI is with the GeForce GTX 980 and whether or not you should consider purchasing more than one of these new flagship parts.
Quick Performance Comparison
Earlier this week, we posted a brief story that looked at the performance of Middle-earth: Shadow of Mordor on the latest GPUs from both NVIDIA and AMD. Last week also marked the release of the v1.11 patch for Sniper Elite 3 that introduced an integrated benchmark mode as well as support for AMD Mantle.
I decided that this was worth a quick look with the same line up of graphics cards that we used to test Shadow of Mordor. Let's see how the NVIDIA and AMD battle stacks up here.
For those unfamiliar with the Sniper Elite series, it focuses on the impact of an individual sniper on a particular conflict, and Sniper Elite 3 doesn't change up that formula much. If you have ever seen video of a bullet slowly going through a body, allowing you to see the bones/muscle of the particular enemy being killed...you've probably been watching the Sniper Elite games.
Gore and such aside, the game is fun and combines sniper action with stealth and puzzles. It's worth a shot if you are the kind of gamer that likes to use the sniper rifles in other FPS titles.
But let's jump straight to performance. You'll notice that in this story we are not using our Frame Rating capture performance metrics. That is a direct result of wanting to compare Mantle to DX11 rendering paths - since we have no way to create an overlay for Mantle, we have resorted to using Fraps and the integrated benchmark mode in Sniper Elite 3.
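For readers who want to crunch this kind of software-captured data themselves, here is a minimal sketch of turning a frame log into an average frame rate and a worst-case frame time. It assumes the log is a list of cumulative per-frame timestamps in milliseconds, which is my assumption about the capture format rather than something documented here, and the percentile uses a simple nearest-rank rule. The sample timestamps are invented.

```python
import math


def frame_times_ms(timestamps_ms: list[float]) -> list[float]:
    """Per-frame durations from cumulative timestamps."""
    return [later - earlier for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


def average_fps(timestamps_ms: list[float]) -> float:
    """Frames rendered divided by elapsed seconds."""
    elapsed_ms = timestamps_ms[-1] - timestamps_ms[0]
    return (len(timestamps_ms) - 1) * 1000.0 / elapsed_ms


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile, e.g. pct=99 for the 99th percentile."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Invented sample: three frames, the last one a noticeable hitch.
log = [0.0, 16.0, 32.0, 64.0]
times = frame_times_ms(log)
print(f"average: {average_fps(log):.1f} FPS, 99th percentile frame time: {percentile(times, 99):.1f} ms")
```

Remember that timestamps taken at the application level describe when frames were submitted, not when they reached the display, which is exactly the gap our capture-based Frame Rating method exists to close.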
Our standard GPU test bed was used with a Core i7-3960X processor, an X79 motherboard, 16GB of DDR3 memory, and the latest drivers for both parties involved. That means we installed Catalyst 14.9 for AMD and 344.16 for NVIDIA. We'll be comparing the GeForce GTX 980 to the Radeon R9 290X, and the GTX 970 to the R9 290. We will also look at SLI/CrossFire scaling at the high end.
If there is one message that I get from NVIDIA's GeForce GTX 900M-series announcement, it is that laptop gaming is a first-class citizen in their product stack. Before even mentioning the products, the company provided relative performance differences between high-end desktops and laptops. Most of the rest of the slide deck is showing feature-parity with the desktop GTX 900-series, and a discussion about battery life.
First, the parts. Two products have been announced: The GeForce GTX 980M and the GeForce GTX 970M. Both are based on the 28nm Maxwell architecture. In terms of shading performance, the GTX 980M has a theoretical maximum of 3.189 TFLOPs, and the GTX 970M is calculated at 2.365 TFLOPs (at base clock). On the desktop, this is very close to the GeForce GTX 770 and the GeForce GTX 760 Ti, respectively. This metric is most useful when you're compute bandwidth-bound, at high resolution with complex shaders.
The full specifications are:
| |GTX 980M|GTX 970M| | | |
|Memory|Up to 4GB|Up to 3GB|4GB|4GB|4GB/8GB|
|Memory Rate|2500 MHz|2500 MHz|7.0 (GT/s)|7.0 (GT/s)|2500 MHz|
As for the features, they should be familiar to those paying attention to both the desktop 900-series and the laptop 800M-series product launches. From desktop Maxwell, the 900M-series is getting VXGI, Dynamic Super Resolution, and Multi-Frame Sampled AA (MFAA). From the latest generation of Kepler laptops, the new GPUs are getting an updated BatteryBoost technology. From the rest of the GeForce ecosystem, they will also get GeForce Experience, ShadowPlay, and so forth.
For VXGI, DSR, and MFAA, please see Ryan's discussion for the desktop Maxwell launch. Information about these features is basically identical to what was given in September.
BatteryBoost, on the other hand, is a bit different. NVIDIA claims that the biggest change is just raw performance and efficiency, giving you more headroom to throttle. Perhaps more interesting though, is that GeForce Experience will allow separate one-click optimizations for both plugged-in and battery use cases.
The power efficiency demonstrated with the Maxwell GPU in Ryan's original GeForce GTX 980 and GTX 970 review is even more beneficial for the notebook market, where thermal designs are physically constrained. Longer battery life, as well as thinner and lighter gaming notebooks, will see tremendous advantages using a GPU that can run at near peak performance on the maximum power output of an integrated battery. In NVIDIA's presentation, they mention that while notebooks on AC power can use as much as 230 watts of power, batteries tend to peak around 100 watts. The fact that a full-speed, desktop-class GTX 980 has a TDP of 165 watts, compared to the 250 watts of a Radeon R9 290X, translates into notebook GPU performance that will more closely mirror its desktop brethren.
Of course, you probably will not buy your own laptop GPU; rather, you will be buying devices which integrate these. There are currently five designs across four manufacturers that are revealed (see image above). Three contain the GeForce GTX 980M, one has a GTX 970M, and the other has a pair of GTX 970Ms. Prices and availability are not yet announced.
In what can most definitely be called the best surprise of the fall game release schedule, the open-world action game set in the Lord of the Rings world, Middle-earth: Shadow of Mordor has been receiving impressive reviews from gamers and the media. (GiantBomb.com has a great look at it if you are new to the title.) What also might be a surprise to some is that the PC version of the game can be quite demanding on even the latest PC hardware, pulling in frame rates only in the low-60s at 2560x1440 with its top quality presets.
Late last week I spent a couple of days playing around with Shadow of Mordor as well as the integrated benchmark found inside the Options menu. I wanted to get an idea of the performance characteristics of the game to determine if we might include this in our full-time game testing suite update we are planning later in the fall. To get some sample information I decided to run through a couple of quality presets with the top two cards from NVIDIA and AMD and compare them.
Without a doubt, the visual style of Shadow of Mordor is stunning – with the game settings cranked up high the world, characters and fighting scenes look and feel amazing. To be clear, in the build up to this release we had really not heard anything from the developer or NVIDIA (there is an NVIDIA splash screen at the beginning) about the title which is out of the ordinary. If you are looking for a game that is both fun to play (I am 4+ hours in myself) and can provide a “wow” factor to show off your PC rig then this is definitely worth picking up.
Installation and Overview
While once a very popular way to cool your PC, the art of custom water loops tapered off in the early 2000s as the benefits of better cooling, and overclocking in general, met with diminished returns. In its place grew a host of companies offering closed-loop systems: individually sealed coolers for processors and even graphics cards that offered some of the benefits of standard water cooling (noise, performance) without the hassle of setting up a water cooling configuration manually.
A bit of a resurgence has occurred in the last year or two though where the art and styling provided by custom water loop cooling is starting to reassert itself into the PC enthusiast mindset. Some companies never left (EVGA being one of them), but it appears that many of the users are returning to it. Consider me part of that crowd.
During a live stream we held with EVGA's Jacob Freeman, the very first prototype of the EVGA Hydro Copper was shown and discussed. Lucky for us, I was able to coerce Jacob into leaving the water block with me for a few days to do some of our testing and see just how much capability we could pull out of the GM204 GPU and a GeForce GTX 980.
Our performance preview today will look at the water block itself, installation, performance and temperature control. Keep in mind that this is a very early prototype, the first one to make its way to US shores. There will definitely be some changes and updates (in both the hardware and the software support for overclocking) before final release in mid to late October. Should you consider this ~$150 Hydro Copper water block for your GTX 980?
The GM204 Architecture
James Clerk Maxwell's equations are the foundation of our society's knowledge about optics and electrical circuits. It is a fitting tribute from NVIDIA to include Maxwell as a code name for a GPU architecture, and NVIDIA hopes that the features, performance, and efficiency that they have built into the GM204 GPU would be something Maxwell himself would be impressed by. Without giving away the surprise conclusion here in the lead, I can tell you that I have never seen a GPU perform as well as we have seen this week, all while changing the power efficiency discussion in as dramatic a fashion.
To be fair though, this isn't our first experience with the Maxwell architecture. With the release of the GeForce GTX 750 Ti and its GM107 GPU, NVIDIA put the industry on watch and let us all ponder if they could possibly bring such a design to a high end, enthusiast class market. The GTX 750 Ti brought a significantly lower power design to a market that desperately needed it, and we were even able to showcase that with some off-the-shelf PC upgrades, without the need for any kind of external power.
That was GM107 though; today's release is the GM204, indicating that not only are we seeing the larger cousin of the GTX 750 Ti but we also have at least some moderate GPU architecture and feature changes from the first run of Maxwell. The GeForce GTX 980 and GTX 970 are going to be taking on the best of the best products from the GeForce lineup as well as the AMD Radeon family of cards, with aggressive pricing and performance levels to match. And, for those that understand the technology at a fundamental level, you will likely be surprised by how little power it requires to achieve these goals. Toss in support for things like a new AA method, Dynamic Super Resolution, and even improved SLI performance and you can see why doing it all on the same process technology is impressive.
The NVIDIA Maxwell GM204 Architecture
The NVIDIA Maxwell GM204 graphics processor was built from the ground up with an emphasis on power efficiency. As it was stated many times during the technical sessions we attended last week, the architecture team learned quite a bit while developing the Kepler-based Tegra K1 SoC and much of that filtered its way into the larger, much more powerful product you see today. This product is fast and efficient, but it was all done while working on the same TSMC 28nm process technology used on the Kepler GTX 680 and even AMD's Radeon R9 series of products.
The fundamental structure of GM204 is set up like the GM107 product shipped as the GTX 750 Ti. There is an array of GPCs (Graphics Processing Clusters), each comprised of multiple SMs (Streaming Multiprocessors, also called SMMs for this Maxwell derivative), and external memory controllers. The GM204 chip (the full implementation of which is found on the GTX 980) consists of 4 GPCs, 16 SMMs and four 64-bit memory controllers.
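The headline specs of the GTX 980 fall straight out of that block layout. As a rough sketch, the snippet below models the counts given above and derives the totals; the 128 CUDA cores per SMM is NVIDIA's published Maxwell figure rather than something stated in this section, so treat it as an assumption, and the `GpuLayout` class is purely illustrative.

```python
from dataclasses import dataclass

CORES_PER_SMM = 128  # assumption: published Maxwell figure, not stated in this section


@dataclass(frozen=True)
class GpuLayout:
    gpcs: int
    smms: int
    memory_controllers: int
    controller_width_bits: int

    @property
    def smms_per_gpc(self) -> int:
        return self.smms // self.gpcs

    @property
    def bus_width_bits(self) -> int:
        return self.memory_controllers * self.controller_width_bits

    @property
    def cuda_cores(self) -> int:
        return self.smms * CORES_PER_SMM


# Full GM204 as described above: 4 GPCs, 16 SMMs, four 64-bit memory controllers.
GM204 = GpuLayout(gpcs=4, smms=16, memory_controllers=4, controller_width_bits=64)

print(f"{GM204.smms_per_gpc} SMMs per GPC, "
      f"{GM204.bus_width_bits}-bit memory bus, "
      f"{GM204.cuda_cores} CUDA cores")
```

Cut-down parts are built the same way with SMMs or memory controllers disabled, which is where the smaller core counts and narrower buses elsewhere in the lineup come from.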
A few days with some magic monitors
Last month friend of the site and technology enthusiast Tom Petersen, who apparently does SOMETHING at NVIDIA, stopped by our offices to talk about G-Sync technology. A variable refresh rate feature added to new monitors with custom NVIDIA hardware, G-Sync is a technology that has been frequently discussed on PC Perspective.
The first monitor to ship with G-Sync is the ASUS ROG Swift PG278Q - a fantastic 2560x1440 27-in monitor with a 144 Hz maximum refresh rate. I wrote a glowing review of the display here recently with the only real negative to it being a high price tag: $799. But when Tom stopped by to talk about the G-Sync retail release, he happened to leave a set of three of these new displays for us to mess with in a G-Sync Surround configuration. Yummy.
So what exactly is the current experience of using a triple G-Sync monitor setup if you were lucky enough to pick up a set? The truth is that the G-Sync portion of the equation works great but that game support for Surround (or Eyefinity for that matter) is still somewhat cumbersome.
In this quick impressions article I'll walk through the setup and configuration of the system and tell you about my time playing seven different PC titles in G-Sync Surround.
Tonga GPU Features
On December 22, 2011, AMD launched the first 28nm GPU based on an architecture called GCN, on silicon code-named Tahiti. That was the release of the Radeon HD 7970 and it was the beginning of an incredibly long adventure for PC enthusiasts and gamers. We eventually saw the HD 7970 GHz Edition and the R9 280/280X releases, all based on essentially identical silicon, keeping a spot in the market for nearly 3 years. Today AMD is launching the Tonga GPU and Radeon R9 285, a new piece of silicon that shares many traits of Tahiti but adds support for some additional features.
Replacing the Radeon R9 280 in the current product stack, the R9 285 will step in at $249, essentially the same price. Buyers will be treated to an updated feature set though including options that were only previously available on the R9 290 and R9 290X (and R7 260X). These include TrueAudio, FreeSync, XDMA CrossFire and PowerTune.
Many people have been calling this architecture GCN 1.1 though AMD internally doesn't have a moniker for it. The move from Tahiti, to Hawaii and now to Tonga, reveals a new design philosophy from AMD, one of smaller and more gradual steps forward as opposed to sudden, massive improvements in specifications. Whether this change was self-imposed or a result of the slowing of process technology advancement is really a matter of opinion.
The Waiting Game
NVIDIA G-Sync was announced at a media event held in Montreal way back in October, and promised to revolutionize the way the display and graphics card worked together to present images on the screen. It was designed to remove hitching, stutter, and tearing -- almost completely. Since that fateful day in October of 2013, we have been waiting. Patiently waiting. We were waiting for NVIDIA and its partners to actually release a monitor that utilizes the technology and that can, you know, be purchased.
In December of 2013 we took a look at the ASUS VG248QE monitor, the display for which NVIDIA released a mod kit to allow users that already had this monitor to upgrade to G-Sync compatibility. It worked, and I even came away impressed. I noted in my conclusion that, “there isn't a single doubt that I want a G-Sync monitor on my desk” and, “my short time with the NVIDIA G-Sync prototype display has been truly impressive…”. That was nearly 7 months ago and I don’t think anyone at that time really believed it would be THIS LONG before the real monitors began to show in the hands of gamers around the world.
Since NVIDIA’s October announcement, AMD has been on a marketing path with a technology they call “FreeSync” that claims to be a cheaper, standards-based alternative to NVIDIA G-Sync. They first previewed the idea of FreeSync on a notebook device during CES in January and then showed off a prototype monitor in June during Computex. Even more recently, AMD has posted a public FAQ that gives more details on the FreeSync technology and how it differs from NVIDIA’s creation; it has raised something of a stir with its claims on performance and cost advantages.
That doesn’t change the product that we are reviewing today of course. The ASUS ROG Swift PG278Q 27-in WQHD display with a 144 Hz refresh rate is truly an awesome monitor. What did change is the landscape, from NVIDIA's original announcement until now.
Experience with Silent Design
In the time periods between major GPU releases, companies like ASUS have the ability to really dig down and engineer truly unique products. With the expanded time between major GPU releases, from either NVIDIA or AMD, these products have continued evolving to offer better features and experiences than any graphics card before them. The ASUS Strix GTX 780 is exactly one of those solutions – taking a GTX 780 GPU that was originally released in May of last year and twisting it into a new design that offers better cooling, better power and lower noise levels.
ASUS intended, with the Strix GTX 780, to create a card that is perfect for high end PC gamers, without crossing into the realm of bank-breaking prices. They chose to go with the GeForce GTX 780 GPU from NVIDIA at a significant price drop from the GTX 780 Ti, with only a modest performance drop. They double the reference memory capacity from 3GB to 6GB of GDDR5, to assuage any buyer’s thoughts that 3GB wasn’t enough for multi-screen Surround gaming or 4K gaming. And they change the cooling solution to offer a near silent operation mode when used in “low impact” gaming titles.
The ASUS Strix GTX 780 Graphics Card
The ASUS Strix GTX 780 card is a pretty large beast, both in physical size and in performance. The cooler is a slightly modified version of the very popular DirectCU II thermal design used in many of the custom built ASUS graphics cards. It has a heat dissipation area more than twice that of the reference NVIDIA cooler and uses larger fans that allow them to spin slower (and quieter) at the improved cooling capacity.
Out of the box, the ASUS Strix GTX 780 will run at 889 MHz base clock and 941 MHz Boost clock, a fairly modest increase over the 863/900 MHz rates of the reference card. Obviously with much better cooling and a lot of work being done on the PCB of this custom design, users will have a lot of headroom to overclock on their own, but I continue to implore companies like ASUS and MSI to up the ante out of the box! One area where ASUS does impress is with the memory – the Strix card features a full 6GB of GDDR5 running at 6.0 GHz, twice the capacity of the reference GTX 780 (and even GTX 780 Ti) cards. If you had any concerns about Surround or 4K gaming, know that memory capacity will not be a problem. (Though raw compute power may still be.)
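To quantify just how modest that factory overclock is, here is the quick arithmetic on the clocks quoted above. The helper name is my own.

```python
def percent_increase(reference_mhz: float, factory_mhz: float) -> float:
    """Factory overclock expressed as a percentage over the reference clock."""
    return (factory_mhz - reference_mhz) / reference_mhz * 100


# Reference vs. Strix clocks as quoted in the article.
base = percent_increase(863, 889)
boost = percent_increase(900, 941)
print(f"base clock: +{base:.1f}%, Boost clock: +{boost:.1f}%")
```

A bump in the low single digits is unlikely to be something you would notice in a game, which is why the manual overclocking headroom matters more here than the shipping clocks.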
When Magma Freezes Over...
Intel confirms that they have approached AMD about access to their Mantle API. The discussion, despite being clearly labeled as "an experiment" by an Intel spokesperson, was initiated by them -- not AMD. According to AMD's Gaming Scientist, Richard Huddy, via PCWorld, AMD's response was, "Give us a month or two" and "we'll go into the 1.0 phase sometime this year" which only has about five months left in it. When the API reaches 1.0, anyone who wants to participate (including hardware vendors) will be granted access.
AMD inside Intel Inside???
I do wonder why Intel would care, though. Intel has the fastest per-thread processors, and their GPUs are not known to be workhorses that are held back by API call bottlenecks, either. Of course, that is not to say that I cannot see any reason, however...
The Radeon R9 280
Though not really new, the AMD Radeon R9 280 GPU is a part that we really haven't spent time with at PC Perspective. Based on the same Tahiti GPU found in the R9 280X, the HD 7970, the HD 7950 and others, the R9 280 fits at a price point and performance level that I think many gamers will see as enticing. MSI sent along a model that includes some overclocked settings and an updated cooler, allowing the GPU to run at its top speed without much noise.
With a starting price of just $229 or so, the MSI Radeon R9 280 Gaming graphics card has some interesting competition as well. From the AMD side it butts heads with the R9 280X and the R9 270X. The R9 280X costs $60-70 more though and as you'll see in our benchmarks, the R9 280 will likely cannibalize some of those sales. From NVIDIA, the GeForce GTX 760 is priced right at $229 as well, but does it really have the horsepower to keep up with Tahiti?
The Road to 1080p
The stars of the show: a group of affordable GPU options
When preparing to build or upgrade a PC on any kind of a budget, how can you make sure you're extracting the highest performance per dollar from the parts you choose? Even if you do your homework, comparing every combination of components is impossible. As system builders we always end up having to look at various benchmarks here and there and then ultimately make assumptions. It's the nature of choosing products within an industry that's completely congested at every price point.
Another problem is that lower-priced graphics cards are usually benchmarked on high-end test platforms with Core i7 processors - which is actually a necessary thing when you need to eliminate CPU bottlenecks from the mix when testing GPUs. So it seems like it might be valuable (and might help narrow buying choices down) if we could take a closer look at gaming performance from complete systems built with only budget parts, and see what these different combinations are capable of.
With this in mind I set out to see just how much it might take to reach acceptable gaming performance at 1080p (acceptable being 30 FPS+). I wanted to see where the real-world gaming bottlenecks might occur, and get a feel for the relationship between CPU and GPU performance. After all, if there was no difference in gaming performance between, say, a $40 and an $80 processor, why spend twice as much money? The same goes for graphics. We’re looking for “good enough” here, not “future-proof”.
The components in all their shiny boxy-ness (not everything made the final cut)
If money was no object we’d all have the most amazing high-end parts, and play every game at ultra settings with hundreds of frames per second (well, except at 4K). Of course most of us have limits, but the time and skill required to assemble a system with as little cash as possible can result in something that's actually a lot more rewarding (and impressive) than just throwing a bunch of money at top-shelf components.
The theme of this article is good enough, as in, don't spend more than you have to. I don't want this to sound like a bad thing. And if along the way you discover a bargain, or a part that overperforms for the price, even better!
Yet Another AM1 Story?
We’ve been talking about the AMD AM1 platform since its introduction, and it makes a compelling case for a low cost gaming PC. With the “high-end” CPU in the lineup (the Athlon 5350) just $60 and motherboards in the $35 range, it makes sense to start here. (I actually began this project with the Sempron 3820 as well, but it just wasn’t enough for 1080p gaming by a long shot so the test results were quickly discarded.) But while the 5350 is an APU, I didn't end up testing it without a dedicated GPU. (Ok, I eventually did but it just can't handle 1080p.)
But this isn’t just a story about AM1 after all. Jumping right in here, let's look at the result of my research (and mounting credit card debt). All prices were accurate as I wrote this, but are naturally prone to fluctuate:
|Memory|4GB Samsung OEM PC3-12800 DDR3-1600 (~$40 Value)|
|Storage|Western Digital Blue 1TB Hard Drive - $59.99|
|Power Supply|EVGA 430 Watt 80 PLUS PSU - $39.99|
|OS|Windows 8.1 64-bit - $99|
So there it is. I'm sure it won't please everyone, but there is enough variety in this list to support no fewer than 16 different combinations, and you'd better believe I ran each test on every one of those 16 system builds!
A powerful architecture
In March of this year, NVIDIA announced the GeForce GTX Titan Z at its GPU Technology Conference. It was touted as the world's fastest graphics card with its pair of full GK110 GPUs but it came with an equally stunning price of $2999. NVIDIA claimed it would be available by the end of April for gamers and CUDA developers to purchase but it was pushed back slightly and released at the very end of May, going on sale for the promised price of $2999.
The specifications of GTX Titan Z are damned impressive - 5,760 CUDA cores, 12GB of total graphics memory, 8.1 TFLOPs of peak compute performance. But something happened between the announcement and product release that perhaps NVIDIA hadn't accounted for. AMD's Radeon R9 295X2, a dual-GPU card with full-speed Hawaii chips on-board, was released at $1499. I think it's fair to say that AMD took some chances that NVIDIA was surprised to see them take, including going the route of a self-contained water cooler and blowing past the PCI Express recommended power limits to offer a ~500 watt graphics card. The R9 295X2 was damned fast and I think it caught NVIDIA a bit off-guard.
As a result, the GeForce GTX Titan Z release was a bit quieter than most of us expected. Yes, the Titan Black card was released without sampling the gaming media, but that was nearly a mirror of the GeForce GTX 780 Ti, just with a larger frame buffer, and the performance of that GPU was well known. For NVIDIA to release a flagship dual-GPU graphics card, admittedly the most expensive one I have ever seen with the GeForce brand on it, and NOT send out samples, was telling.
NVIDIA is adamant though that the primary target of the Titan Z is not just gamers but the CUDA developer that needs the most performance possible in as small of a space as possible. For that specific user, one that doesn't quite have the income to invest in a lot of Tesla hardware but wants to be able to develop and use CUDA applications with a significant amount of horsepower, the Titan Z fits the bill perfectly.
Still, the company was touting the Titan Z as "offering supercomputer class performance to enthusiast gamers" and telling gamers in launch videos that the Titan Z is the "fastest graphics card ever built" and that it was "built for gamers." So, interest piqued, we decided to review the GeForce GTX Titan Z.
The GeForce GTX TITAN Z Graphics Card
Cost and performance notwithstanding, the GeForce GTX Titan Z is an absolutely stunning looking graphics card. The industrial design that started with the GeForce GTX 690 (the last dual-GPU card NVIDIA released) and continued with the GTX 780 and Titan family lives on with the Titan Z.
The all metal finish looks good and stands up to abuse, keeping that PCB straight even with the heft of the heatsink. There is only a single fan on the Titan Z, center mounted, with a large heatsink covering both GPUs on opposite sides. The GeForce logo up top illuminates, as we have seen on all similar designs, which adds a nice touch.
With the GPU landscape mostly settled for 2014, we have the ability to really dig in and evaluate the retail models that continue to pop up from NVIDIA and AMD board partners. One of our favorite series of graphics cards over the years comes from MSI in the form of the Lightning brand. These cards tend to take the engineering levels to a point other designers simply won't reach - and we love it! Obviously the target of this capability is additional overclocking headroom and stability, but what if the GPU target has issues scaling already?
That is more or less the premise of the Radeon R9 290X Lightning from MSI. AMD's Radeon R9 290X Hawaii GPU is definitely a hot and power hungry part and that caused quite a few issues at the initial release. Since then though, both AMD and its add-in card partners have worked to improve the coolers installed on these cards to improve performance reliability and decrease the LOUD NOISES produced by the stock, reference cooler.
Let's dive into the latest to hit our test bench, the MSI Radeon R9 290X Lightning.
The MSI Radeon R9 290X Lightning
MSI continues to utilize the yellow and black color scheme that many of the company's high end parts integrate and I love the combination. I know that both NVIDIA and AMD disapprove of the distinct lack of "green" and "red" in the cooler and box designs, but good on MSI for sticking to its own thing.
The box for the Lightning card is equal to the prominence of the card itself and you even get a nifty drawer for all of the included accessories.
We originally spotted the MSI R9 290X Lightning at CES in January and the design remains the same. The cooler is quite large (and damn heavy) and is cooled by a set of three fans. The yellow fan in the center is smaller and spins a bit faster, creating more noise than I would prefer. All fan speeds can be adjusted with MSI's included fan control software.
The AMD Argument
Earlier this week, a story was posted in a Forbes.com blog that dove into the idea of NVIDIA GameWorks and how it was doing a disservice not just to the latest Ubisoft title Watch_Dogs but to PC gamers in general. Using quotes from AMD directly, the author claims that NVIDIA is actively engaging in methods to prevent game developers from optimizing games for AMD graphics hardware. This is an incredibly bold statement and one that I hope AMD is not making lightly. Here is a quote from the story:
Gameworks represents a clear and present threat to gamers by deliberately crippling performance on AMD products (40% of the market) to widen the margin in favor of NVIDIA products. . . . Participation in the Gameworks program often precludes the developer from accepting AMD suggestions that would improve performance directly in the game code—the most desirable form of optimization.
The example cited in the Forbes story is the recently released Watch_Dogs title, which appears to show favoritism towards NVIDIA GPUs, with the performance of the GTX 770 ($369) coming close to the performance of a Radeon R9 290X ($549).
It's evident that Watch Dogs is optimized for Nvidia hardware but it's staggering just how un-optimized it is on AMD hardware.
Watch_Dogs is the latest GameWorks title released this week.
I decided to get in touch with AMD directly to see exactly what stance the company was attempting to take with these kinds of claims. No surprise, AMD was just as forward with me as they appeared to be in the Forbes story originally.
The AMD Stance
Central to AMD’s latest annoyance with the competition is the NVIDIA GameWorks program. First unveiled last October during a press event in Montreal, GameWorks combines several NVIDIA built engine functions into libraries that can be utilized and accessed by game developers to build advanced features into games. NVIDIA’s website claims that GameWorks is “easy to integrate into games” while also including tutorials and tools to help quickly generate content with the software set. Included in the GameWorks suite are tools like VisualFX which offers rendering solutions like HBAO+, TXAA, Depth of Field, FaceWorks, HairWorks and more. Physics tools include the obvious like PhysX while also adding clothing, destruction, particles and more.
You need a bit of power for this
PC gamers. We do some dumb shit sometimes. Those on the outside looking in, forced to play on static hardware with fixed image quality and low expandability, turn up their noses and question why we do the things we do. It’s not an unfair reaction, they just don’t know what they are missing out on.
For example, what if you decided to upgrade your graphics hardware to improve performance and allow you to up the image quality on your games to unheard of levels? Rather than using a graphics configuration with performance found in a modern APU you could decide to run not one but FOUR discrete GPUs in a single machine. You could water cool them for optimal temperature and sound levels. This allows you to power not 1920x1080 (or 900p), not 2560x1440 but 4K gaming – 3840x2160.
All for the low, low price of $3000. Well, crap, I guess those console gamers have a right to question the sanity of SOME enthusiasts.
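To put the jump to 4K in perspective, here is a quick back-of-the-envelope sketch of the raw pixel counts involved. The helper names are made up for illustration, and treating "900p" as 1600x900 is my assumption. Pixel count is only a rough proxy for GPU load, not a performance prediction.

```python
# Rough pixel-count comparison for the resolutions discussed above.
# Assumption: "900p" is taken to mean 1600x900.
RESOLUTIONS = {
    "900p": (1600, 900),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def pixel_count(width, height):
    """Total pixels rendered per frame at a given resolution."""
    return width * height


def relative_load(name, baseline="1080p"):
    """Pixels per frame relative to a baseline resolution (hypothetical helper)."""
    width, height = RESOLUTIONS[name]
    base_width, base_height = RESOLUTIONS[baseline]
    return pixel_count(width, height) / pixel_count(base_width, base_height)


if __name__ == "__main__":
    for name, (width, height) in RESOLUTIONS.items():
        print(f"{name}: {pixel_count(width, height):,} pixels, "
              f"{relative_load(name):.2f}x the pixels of 1080p")
```

In other words, 3840x2160 pushes four times the pixels of 1920x1080 every frame, which is a big part of why this kind of hardware comes up at all.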
After the release of AMD’s latest flagship graphics card, the Radeon R9 295X2 8GB dual-GPU beast, our mind immediately started to wander to what magic could happen (and what might go wrong) if you combined a pair of them in a single system. Sure, two Hawaii GPUs running in tandem produced the “fastest gaming graphics card you can buy” but surely four GPUs would be even better.
The truth is though, that isn’t always the case. Multi-GPU is hard, just ask AMD or NVIDIA. The software and hardware demands placed on the driver team to coordinate data sharing, timing control, etc. are extremely high even when you are working with just two GPUs in series. Moving to three or four GPUs complicates the story even further and as a result it has been typical for us to note low performance scaling, increased frame time jitter and stutter and sometimes even complete incompatibility.
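For readers who want to put numbers on "performance scaling" and "frame time jitter," here is a minimal sketch of how the two could be computed from your own measurements. The function names and the sample figures are hypothetical placeholders, not results from our testing, and standard deviation is just one simple way to summarize frame time variation.

```python
import statistics


def scaling_efficiency(single_gpu_fps, multi_gpu_fps, gpu_count):
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    if gpu_count < 1 or single_gpu_fps <= 0:
        raise ValueError("need at least one GPU and a positive baseline frame rate")
    speedup = multi_gpu_fps / single_gpu_fps
    return speedup / gpu_count


def frame_time_jitter(frame_times_ms):
    """Population standard deviation of frame times in milliseconds."""
    return statistics.pstdev(frame_times_ms)


if __name__ == "__main__":
    # Hypothetical numbers purely for illustration.
    print(f"efficiency: {scaling_efficiency(50.0, 140.0, 4):.0%}")
    print(f"jitter: {frame_time_jitter([16.7, 16.9, 25.1, 16.6, 24.8]):.2f} ms")
```

A high average frame rate with poor efficiency or high jitter is exactly the kind of result that makes three and four GPU setups hard to recommend.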
During our initial briefing covering the Radeon R9 295X2 with AMD there was a system photo that showed a pair of the cards inside a MAINGEAR box. With MAINGEAR being one of AMD’s biggest system builder partners, the two companies were clearly insinuating that these configurations would be made available for those with the financial resources to pay for them. Even though we are talking about a very small subset of the PC gaming enthusiast base, these kinds of halo products are what bring PC gamers together to look and drool.
As it happens I was able to get a second R9 295X2 sample in our offices for a couple of quick days of testing.
Working with Kyle and Brent over at HardOCP, we decided to do some hardware sharing in order to give both outlets the ability to judge and measure Quad CrossFire independently. The results are impressive and awe inspiring.
Competition is a Great Thing
While doing some testing with the AMD Athlon 5350 Kabini APU to determine its flexibility as a low cost gaming platform, we decided to run a handful of tests to measure something else that is getting a lot of attention right now: AMD Mantle and NVIDIA's 337.50 driver.
Earlier this week I posted a story that looked at performance scaling of NVIDIA's new 337.50 beta driver compared to the previous 335.23 WHQL. The goal was to assess the DX11 efficiency improvements that the company stated it had been working on and implemented into this latest beta driver offering. In the end, we found some instances where games scaled by as much as 35% and 26% but other cases where there was little to no gain with the new driver. We looked at both single GPU and multi-GPU scenarios on mostly high end CPU hardware though.
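If you want to run the same kind of driver-to-driver comparison at home, the scaling figure is simply the percentage change in average frame rate between the two drivers. This is a minimal sketch; the function name and the frame rates in the example are hypothetical and are not numbers from our testing.

```python
def percent_gain(old_fps, new_fps):
    """Percentage change in average frame rate going from the old driver to the new one."""
    if old_fps <= 0:
        raise ValueError("baseline frame rate must be positive")
    return (new_fps - old_fps) / old_fps * 100.0


if __name__ == "__main__":
    # Hypothetical averages: the older driver versus the newer one.
    old_driver_fps = 40.0
    new_driver_fps = 54.0
    print(f"gain: {percent_gain(old_driver_fps, new_driver_fps):.1f}%")
```

Keep the game, settings and hardware identical between runs so the driver is the only variable.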
Earlier in April I posted an article looking at Mantle, AMD's answer to a lower level API that is unique to its ecosystem, and how it scaled on various pieces of hardware on Battlefield 4. This was the first major game to implement Mantle and it remains the biggest name in the field. While we definitely saw some improvements in gaming experiences with Mantle there was work to be done when it comes to multi-GPU scaling and frame pacing.
Both parties in this debate were showing promise but obviously both were far from perfect.
While we were benchmarking the new AMD Athlon 5350 Kabini based APU, an incredibly low cost processor that Josh reviewed in April, it made sense to test out both Mantle and NVIDIA's 337.50 driver in an interesting side by side.
Let's see if I can start this story without sounding too much like a broken record when compared to the news post I wrote late last week on the subject of NVIDIA's new 337.50 driver. In March, while attending the Game Developer's Conference to learn about the upcoming DirectX 12 API, I sat down with NVIDIA to talk about changes coming to its graphics driver that would affect current users with shipping DX9, DX10 and DX11 games.
As I wrote then:
What NVIDIA did want to focus on with us was the significant improvements that have been made on the efficiency and performance of DirectX 11. When NVIDIA is questioned as to why they didn’t create their Mantle-like API if Microsoft was dragging its feet, they point to the vast improvements possible and made with existing APIs like DX11 and OpenGL. The idea is that rather than spend resources on creating a completely new API that needs to be integrated in a totally unique engine port (see Frostbite, CryEngine, etc.) NVIDIA has instead improved the performance, scaling, and predictability of DirectX 11.
NVIDIA claims that these fixes are not game specific and will improve performance and efficiency for a lot of GeForce users. Even if that is the case, we will only really see these improvements surface in titles that have addressable CPU limits or very low end hardware, similar to how Mantle works today.
In truth, this is something that both NVIDIA and AMD have likely been doing all along but NVIDIA has renewed purpose with the pressure that AMD's Mantle has placed on them, at least from a marketing and PR point of view. It turns out that the driver that starts to implement all of these efficiency changes is the recent 337.50 release and on Friday I wrote up a short story that tested a particularly good example of the performance changes, Total War: Rome II, with a promise to follow up this week with additional hardware and games. (As it turns out, results from Rome II are...an interesting story. More on that on the next page.)
Today I will be looking at a seemingly random collection of gaming titles, running on a reconfigured test bed we had in the office, in an attempt to get some idea of the overall robustness of the 337.50 driver and its advantages over the 335.23 release that came before it. Does NVIDIA have solid ground to stand on when it comes to the capabilities of current APIs over what AMD is offering today?