Subject: Graphics Cards | January 28, 2015 - 10:21 AM | Ryan Shrout
Tagged: nvidia, memory issue, maxwell, GTX 970, GM204, geforce
UPDATE 1/29/15: This forum post has since been edited and basically removed, with statements made on Twitter that no driver changes are planned that will specifically target the performance of the GeForce GTX 970.
The story around the GeForce GTX 970 and its confusing and shifting memory architecture continues to update. In a post on the official GeForce.com forums (on page 160 of 184!), moderator and NVIDIA employee PeterS claims that the company is working on a driver to help address performance concerns and will also be willing to "help out" users who honestly want to return the product they already purchased. Here is the quote:
First, I want you to know that I'm not just a mod, I work for NVIDIA in Santa Clara.
I totally get why so many people are upset. We messed up some of the stats on the reviewer kit and we didn't properly explain the memory architecture. I realize a lot of you guys rely on product reviews to make purchase decisions and we let you down.
It sucks because we're really proud of this thing. The GTX970 is an amazing card and I genuinely believe it's the best card for the money that you can buy. We're working on a driver update that will tune what's allocated where in memory to further improve performance.
Having said that, I understand that this whole experience might have turned you off to the card. If you don't want the card anymore you should return it and get a refund or exchange. If you have any problems getting that done, let me know and I'll do my best to help.
This makes things a bit more interesting - based on my conversations with NVIDIA about the GTX 970 since this news broke, it was stated that the operating system had a much stronger role in the allocation of memory from a game's request than the driver. Based on the above statement though, NVIDIA seems to think it can at least improve on the current level of performance and tune things to help alleviate any potential bottlenecks that might exist simply in software.
As far as the return goes, PeterS at least offers to help this one forum user but I would assume the gesture would be available for anyone that has the same level of concern for the product. Again, as I stated in my detailed breakdown of the GTX 970 memory issue on Monday, I don't believe that users need to go that route - the GeForce GTX 970 is still a fantastic performing card in nearly all cases except (maybe) a tiny fraction where that last 500MB of frame buffer might come into play. I am working on another short piece going up today that details my experiences with the GTX 970 running up on those boundaries.
NVIDIA is trying to be proactive now, that much we can say. It seems that the company understands its mistake - not in the memory pooling decision but in the lack of clarity it offered to reviewers and consumers upon the product's launch.
A few secrets about GTX 970
UPDATE 1/28/15 @ 10:25am ET: NVIDIA has posted in its official GeForce.com forums that they are working on a driver update to help alleviate memory performance issues in the GTX 970 and that they will "help out" those users looking to get a refund or exchange.
Yes, that last 0.5GB of memory on your GeForce GTX 970 does run slower than the first 3.5GB. More interesting than that fact is the reason why it does, and why the result is better than you might have otherwise expected. Last night we got a chance to talk with NVIDIA’s Senior VP of GPU Engineering, Jonah Alben, on this specific concern and got a detailed explanation of why gamers are seeing what they are seeing, along with new disclosures on the architecture of the GM204 version of Maxwell.
NVIDIA's Jonah Alben, SVP of GPU Engineering
For those looking for a little background, you should read over my story from this weekend that looks at NVIDIA's first response to the claims that the GeForce GTX 970 cards currently selling were only properly utilizing 3.5GB of the 4GB frame buffer. While it definitely helped answer some questions, it raised plenty more, which is why we requested a talk with Alben, even on a Sunday.
Let’s start with a new diagram drawn by Alben specifically for this discussion.
GTX 970 Memory System
Believe it or not, every issue discussed in any forum about the GTX 970 memory issue is going to be explained by this diagram. Along the top you will see 13 enabled SMMs, each with 128 CUDA cores for the total of 1664 as expected. (Three grayed out SMMs represent those disabled from a full GM204 / GTX 980.) The most important part here is the memory system though, connected to the SMMs through a crossbar interface. That interface has 8 total ports to connect to collections of L2 cache and memory controllers, all of which are utilized in a GTX 980. With a GTX 970 though, only 7 of those ports are enabled, taking one of the combination L2 cache / ROP units along with it. However, the 32-bit memory controller segment remains.
You should take two things away from that simple description. First, despite initial reviews and information from NVIDIA, the GTX 970 actually has fewer ROPs and less L2 cache than the GTX 980. NVIDIA says this was an error in the reviewer’s guide and a misunderstanding between the engineering team and the technical PR team on how the architecture itself functioned. That means the GTX 970 has 56 ROPs and 1792 KB of L2 cache compared to 64 ROPs and 2048 KB of L2 cache for the GTX 980. Before people complain about the ROP count difference as a performance bottleneck, keep in mind that the 13 SMMs in the GTX 970 can only output 52 pixels/clock and the seven segments of 8 ROPs each (56 total) can handle 56 pixels/clock. The SMMs are the bottleneck, not the ROPs.
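The arithmetic behind those numbers is easy to check. Below is a minimal sketch in Python that rebuilds the GTX 980 and GTX 970 figures from the unit counts described above. The per-SMM pixel rate (4 pixels/clock) and the per-segment L2 size (256 KB) are not quoted directly by NVIDIA here; they are inferred from 52 / 13 and 2048 / 8, and the function name is just an illustration.

```python
# Figures from the article text. Two constants are inferred, not quoted:
#   4 pixels/clock per SMM  (52 pixels/clock across 13 SMMs)
#   256 KB of L2 per segment (2048 KB across 8 segments on GTX 980)
CUDA_CORES_PER_SMM = 128
PIXELS_PER_CLOCK_PER_SMM = 4
ROPS_PER_L2_SEGMENT = 8
L2_KB_PER_SEGMENT = 256


def gm204_config(enabled_smms, enabled_l2_segments):
    """Derive headline specs from the number of enabled units (hypothetical helper)."""
    return {
        "cuda_cores": enabled_smms * CUDA_CORES_PER_SMM,
        "smm_pixels_per_clock": enabled_smms * PIXELS_PER_CLOCK_PER_SMM,
        "rops": enabled_l2_segments * ROPS_PER_L2_SEGMENT,
        "l2_kb": enabled_l2_segments * L2_KB_PER_SEGMENT,
    }


gtx_980 = gm204_config(enabled_smms=16, enabled_l2_segments=8)
gtx_970 = gm204_config(enabled_smms=13, enabled_l2_segments=7)

for name, cfg in (("GTX 980", gtx_980), ("GTX 970", gtx_970)):
    # Each ROP handles one pixel per clock, so the ROP count is also the ROP rate.
    limiter = "SMMs" if cfg["smm_pixels_per_clock"] < cfg["rops"] else "ROPs"
    print(name, cfg, "pixel output limited by:", limiter)
```

For the GTX 970 the SMM side comes out at 52 pixels/clock against 56 for the ROPs, which is the point made above: losing eight ROPs does not lower the pixel rate the card can actually reach.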
Subject: Graphics Cards | January 24, 2015 - 11:51 AM | Ryan Shrout
Tagged: nvidia, maxwell, GTX 970, GM204, 3.5gb memory
UPDATE 1/28/15 @ 10:25am ET: NVIDIA has posted in its official GeForce.com forums that they are working on a driver update to help alleviate memory performance issues in the GTX 970 and that they will "help out" those users looking to get a refund or exchange.
UPDATE 1/26/15 @ 1:00pm ET: We have posted a much more detailed analysis and look at the GTX 970 memory system and what is causing the unusual memory divisions. Check it out right here!
UPDATE 1/26/15 @ 12:10am ET: I now have a lot more information on the technical details of the architecture that cause this issue and more information from NVIDIA to explain it. I spoke with SVP of GPU Engineering Jonah Alben on Sunday night to really dive into the questions everyone had. Expect an update here on this page at 10am PT / 1pm ET or so. Bookmark and check back!
UPDATE 1/24/15 @ 11:25pm ET: Apparently there is some concern online that the statement below is not legitimate. I can assure you that the information did come from NVIDIA, though it is not attributable to any specific person - the message was sent through a couple of different PR people and is the result of meetings and multiple NVIDIA employees' input. It is really a message from the company, not any one individual. I have had several 10-20 minute phone calls with NVIDIA about this issue and this statement on Saturday alone, so I know that the information wasn't from a spoofed email, etc. Also, this statement was posted by an employee moderator on the GeForce.com forums about 6 hours ago, further proving that the statement is directly from NVIDIA. I hope this clears up any concerns around the validity of the below information!
Over the past couple of weeks users of GeForce GTX 970 cards have noticed and started researching a problem with memory allocation in memory-heavy gaming. Essentially, gamers noticed that the GTX 970 with its 4GB of graphics memory was only ever accessing 3.5GB of that memory. When it did attempt to access the final 500MB of memory, performance seemed to drop dramatically. What started as simply a forum discussion blew up into news that was being reported at tech and gaming sites across the web.
Image source: Lazygamer.net
NVIDIA has finally responded to the widespread online complaints about GeForce GTX 970 cards only utilizing 3.5GB of their 4GB frame buffer. From the horse's mouth:
The GeForce GTX 970 is equipped with 4GB of dedicated graphics memory. However the 970 has a different configuration of SMs than the 980, and fewer crossbar resources to the memory system. To optimally manage memory traffic in this configuration, we segment graphics memory into a 3.5GB section and a 0.5GB section. The GPU has higher priority access to the 3.5GB section. When a game needs less than 3.5GB of video memory per draw command then it will only access the first partition, and 3rd party applications that measure memory usage will report 3.5GB of memory in use on GTX 970, but may report more for GTX 980 if there is more memory used by other commands. When a game requires more than 3.5GB of memory then we use both segments.
We understand there have been some questions about how the GTX 970 will perform when it accesses the 0.5GB memory segment. The best way to test that is to look at game performance. Compare a GTX 980 to a 970 on a game that uses less than 3.5GB. Then turn up the settings so the game needs more than 3.5GB and compare 980 and 970 performance again.
Here’s an example of some performance data:
| Game and setting | GTX 980 | GTX 970 |
|---|---|---|
| Shadow of Mordor | | |
| <3.5GB setting = 2688x1512 Very High | 72 FPS | 60 FPS |
| >3.5GB setting = 3456x1944 | 55 FPS (-24%) | 45 FPS (-25%) |
| Battlefield 4 | | |
| <3.5GB setting = 3840x2160 2xMSAA | 36 FPS | 30 FPS |
| >3.5GB setting = 3840x2160 135% res | 19 FPS (-47%) | 15 FPS (-50%) |
| Call of Duty: Advanced Warfare | | |
| <3.5GB setting = 3840x2160 FSMAA T2x, Supersampling off | 82 FPS | 71 FPS |
| >3.5GB setting = 3840x2160 FSMAA T2x, Supersampling on | 48 FPS (-41%) | 40 FPS (-44%) |
On GTX 980, Shadows of Mordor drops about 24% on GTX 980 and 25% on GTX 970, a 1% difference. On Battlefield 4, the drop is 47% on GTX 980 and 50% on GTX 970, a 3% difference. On CoD: AW, the drop is 41% on GTX 980 and 44% on GTX 970, a 3% difference. As you can see, there is very little change in the performance of the GTX 970 relative to GTX 980 on these games when it is using the 0.5GB segment.
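For anyone who wants to verify the percentages in that statement, here is a small Python sketch that recomputes each drop from the FPS figures in the table. The FPS values are NVIDIA's; the helper name and the dictionary layout are only for illustration, and the middle pair of rows is labeled Battlefield 4 on the basis of the paragraph above.

```python
def percent_drop(before_fps, after_fps):
    """Relative performance lost when moving from the <3.5GB to the >3.5GB setting."""
    return (before_fps - after_fps) / before_fps * 100


# (FPS at the <3.5GB setting, FPS at the >3.5GB setting), as quoted by NVIDIA.
results = {
    "Shadow of Mordor": {"GTX 980": (72, 55), "GTX 970": (60, 45)},
    "Battlefield 4": {"GTX 980": (36, 19), "GTX 970": (30, 15)},
    "Call of Duty: Advanced Warfare": {"GTX 980": (82, 48), "GTX 970": (71, 40)},
}

drops = {
    game: {card: round(percent_drop(*fps)) for card, fps in cards.items()}
    for game, cards in results.items()
}

for game, by_card in drops.items():
    gap = by_card["GTX 970"] - by_card["GTX 980"]
    print(f"{game}: 980 -{by_card['GTX 980']}%, 970 -{by_card['GTX 970']}%, gap {gap} points")
```

The rounded results match the table: 24/25, 47/50 and 41/44. Note that the "1% difference" and "3% difference" in the statement are differences in percentage points between the two cards, not relative changes.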
So it would appear that the severing of a trio of SMMs to make the GTX 970 different than the GTX 980 was the root cause of the issue. I'm not sure if this is something that we have seen before with NVIDIA GPUs that are cut down in the same way, but I have asked for clarification from NVIDIA on that.
The ratios fit: 500MB is 1/8th of the 4GB total memory capacity and 2 SMMs is 1/8th of the total SMM count. (Edit: The ratios in fact do NOT match up...odd.)
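A quick way to see why the ratios do not line up is to write them as exact fractions. The sketch below uses counts given elsewhere on this page: 13 of 16 SMMs enabled on the GTX 970, and 7 of 8 crossbar ports to the L2 / memory controller blocks enabled. The variable names are illustrative only.

```python
from fractions import Fraction

# Share of the frame buffer that sits in the slower segment: 0.5GB of 4GB.
slow_memory_share = Fraction(1, 2) / 4

# Share of SMMs disabled on a GTX 970: 3 of the 16 on a full GM204.
disabled_smm_share = Fraction(16 - 13, 16)

# Share of crossbar ports (each with an L2 / ROP unit) disabled: 1 of 8.
disabled_port_share = Fraction(8 - 7, 8)

print("slow memory share: ", slow_memory_share)
print("disabled SMM share:", disabled_smm_share)
print("disabled L2 ports: ", disabled_port_share)
print("memory matches SMMs? ", slow_memory_share == disabled_smm_share)
print("memory matches ports?", slow_memory_share == disabled_port_share)
```

With these counts, the 1/8 memory split does not match the 3/16 of SMMs that are disabled, but it does match the one-in-eight L2 / crossbar port that is turned off.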
The full GM204 GPU that is the root cause of this memory issue.
Another theory presented itself as well: is this possibly the reason we do not have a GTX 960 Ti yet? If the patterns were followed from previous generations a GTX 960 Ti would be a GM204 GPU with fewer cores enabled and additional SMs disconnected to enable a lower price point. If this memory issue were to be even more substantial, creating larger differentiated "pools" of memory, then it could be an issue for performance or driver development. To be clear, we are just guessing on this one and that could be something that would not occur at all. Again, I've asked NVIDIA for some technical clarification.
Requests for information aside, we may never know for sure if this is a bug with the GM204 ASIC or predetermined characteristic of design.
The question remains: does NVIDIA's response appease GTX 970 owners? After all, this memory concern is really just a part of a GPU's story and thus performance testing and analysis already incorporates it essentially. Some users will still likely make a claim of a "bait and switch" but do the benchmarks above, as well as our own results at 4K, make it a less significant issue?
Our own Josh Walrath offers this analysis:
A few days ago when we were presented with evidence of the 970 not fully utilizing all 4 GB of memory, I theorized that it had to do with the reduction of SMM units. It makes sense from an efficiency standpoint to perhaps "hard code" memory addresses for each SMM. The thought behind that would be that 4 GB of memory is a huge amount for a video card, and the potential performance gains of a more flexible system would be pretty minimal.
I believe that the memory controller is working as intended and not a bug. When designing a large GPU, there will invariably be compromises made. From all indications NVIDIA decided to save time, die size, and power by simplifying the memory controller and crossbar setup. These things have a direct impact on time to market and power efficiency. NVIDIA probably figured that a couple percentage of performance lost was outweighed by the added complexity, power consumption, and engineering resources that it would have taken to gain those few percentage points back.
Subject: Graphics Cards | January 23, 2015 - 11:09 PM | Sebastian Peak
Tagged: nvidia, gtx 960, graphics drivers, graphics cards, GeForce 347.25, geforce, game ready, dying light
With the release of GTX 960 yesterday NVIDIA also introduced a new version of the GeForce graphics driver, 347.25 - WHQL.
NVIDIA states that the new driver adds "performance optimizations, SLI profiles, expanded Multi-Frame Sampled Anti-Aliasing support, and support for the new GeForce GTX 960".
While support for the newly released GPU goes without saying, the expanded MFAA support will help provide better anti-aliasing performance to many existing games, as “MFAA support is extended to nearly every DX10 and DX11 title”. In the release notes three games are listed that do not benefit from the MFAA support, as “Dead Rising 3, Dragon Age 2, and Max Payne 3 are incompatible with MFAA”.
347.25 also brings additional SLI profiles to add support for five new games, and a DirectX 11 SLI profile for one more:
SLI profiles added
- Black Desert
- Lara Croft and the Temple of Osiris
- Zhu Xian Shi Jie
- The Talos Principle
DirectX 11 SLI profile added
- Final Fantasy XIV: A Realm Reborn
The update is also the Game Ready Driver for Dying Light, a zombie action/survival game set to debut on January 27.
Much more information is available under the release notes on the driver download page, and be sure to check out Ryan’s chat with Tom Petersen from the live stream for a lot more information about this driver and the new GTX 960 graphics card.
Subject: General Tech, Graphics Cards | January 22, 2015 - 06:44 PM | Ryan Shrout
Tagged: video, tom petersen, nvidia, maxwell, live, gtx 960, gtx, GM206, geforce
UPDATE 2: If you missed the live stream you missed the prizes! But you can still watch the replay to get all the information and Q&A that went along with it as we discuss the GTX 960 and many more topics from the NVIDIA universe.
UPDATE (1/22): Well, the secret is out. Today's discussion will be about the new GeForce GTX 960, a $199 graphics card that takes power efficiency to a previously un-seen level! If you haven't read my review of the card yet, you should do so first, but then be sure you are ready for today's live stream and giveaway - details below! And don't forget: if you have questions, please leave them in the comments!
Get yourself ready, it’s time for another GeForce GTX live stream hosted by PC Perspective’s Ryan Shrout and NVIDIA’s Tom Petersen. Though we can’t dive into the exact details of what topics are going to be covered, intelligent readers that keep an eye on the rumors on our site will likely be able to guess what is happening on January 22nd.
On hand to talk about the products, answer questions about technologies in the GeForce family including GPUs, G-Sync, GameWorks, GeForce Experience and more will be Tom Petersen, well known on the LAN party and events circuit. To spice things up as well Tom has worked with graphics card partners to bring along a sizeable swag pack to give away LIVE during the event, including new GTX graphics cards. LOTS of graphics cards.
NVIDIA GeForce GTX 960 Live Stream and Giveaway
10am PT / 1pm ET - January 22nd
Need a reminder? Join our live mailing list!
Here are some of the prizes we have lined up for those of you that join us for the live stream:
- 3 x MSI GeForce GTX 960 Graphics Cards
- 4 x EVGA GeForce GTX 960 Graphics Cards
- 3 x ASUS GeForce GTX 960 Graphics Cards
Thanks to ASUS, EVGA and MSI for supporting the stream!
The event will take place Thursday, January 22nd at 1pm ET / 10am PT at http://www.pcper.com/live. There you’ll be able to catch the live video stream as well as use our chat room to interact with the audience, asking questions for me and Tom to answer live. To win the prizes you will have to be watching the live stream, with exact details of the methodology for handing out the goods coming at the time of the event.
Tom has a history of being both informative and entertaining and these live streaming events are always full of fun and technical information that you can get literally nowhere else. Previous streams have produced news as well – including statements on support for Adaptive Sync, release dates for displays and first-ever demos of triple display G-Sync functionality. You never know what’s going to happen or what will be said!
If you have questions, please leave them in the comments below and we'll look through them just before the start of the live stream. Of course you'll be able to tweet us questions @pcper and we'll be keeping an eye on the IRC chat as well for more inquiries. What do you want to know and hear from Tom or me?
So join us! Set your calendar for this coming Thursday at 1pm ET / 10am PT and be here at PC Perspective to catch it. If you are a forgetful type of person, sign up for the PC Perspective Live mailing list that we use exclusively to notify users of upcoming live streaming events including these types of specials and our regular live podcast. I promise, no spam will be had!
Subject: Graphics Cards | January 22, 2015 - 01:44 PM | Jeremy Hellstrom
Tagged: video, nvidia, msi gaming 2g, maxwell, gtx 960, GM206, geforce
Did Ryan somehow miss a benchmark that is important to you? Perhaps [H]ard|OCP's coverage of the MSI GeForce GTX 960 GAMING 2G will capture that certain something. MSI runs their 960 at a base of 1216MHz with the boost clock hitting 1279MHz, slightly slower than the ASUS STRIX at 1291 MHz and 1317 MHz. At the time this was posted the cards were available on Amazon for $210; that is obviously going to change, so keep an eye out. As [H] states in their conclusions, it is a good value but not the great value which the GTX 970 offered at release. Check out their full review here or one of the many down below.
"NVIDIA is today launching a GPU aimed at the "sweet spot" of the video card market. With an unexpectedly low MSRP, we find out if the new GeForce GTX 960 has what it takes to compete with the competition. The MSI GTX 960 GAMING reviewed here today is a retail card you will be able to purchase. No reference card in this review."
Here are some more Graphics Card articles from around the web:
- Nvidia's GeForce GTX 960 @ The Tech Report
- Zotac GTX 960 AMP!-edition @ Bjorn3d
- NVIDIA GeForce GTX 960: A Great $200 GPU For Linux Gamers @ Phoronix
- Palit GTX 960 Super JetStream 2 GB @ techPowerUp
- Gigabyte GTX 960 G1 Gaming 2GB @ Modders-Inc
- NVIDIA, MSI, EVGA GTX 960 Review @ OCC
- NVIDIA GeForce GTX 960 SLI @ techPowerUp
- EVGA GTX 960 Super Superclocked Video Card Review @ Hardware Asylum
- ASUS STRIX GTX 960 Review @ Neoseeker
- MSI GTX 960 Gaming OC 2 GB @ techPowerUp
- GTX 960 @ HardwareHeaven
- Gigabyte GTX960 G1 Gaming SOC @ Kitguru
- EVGA GTX 960 SSC 2 GB @ techPowerUp
- ASUS GTX 960 STRIX OC 2 GB @ techPowerUp
- Asus GTX960 Strix OC Edition @ Kitguru
- ASUS Strix Edition GeForce GTX 960 Graphics Card Review @ Techgage
- Palit GeForce GTX 960 JetStream @ Legion Hardware
- The NVIDIA GTX 960 Performance Review @ Hardware Canucks
- EVGA GeForce GTX 970 SSC ACX 2.0 @ HardwareOverlock
- NVIDIA GeForce GTX 970/980: Windows vs. Ubuntu Linux Performance @ Phoronix
- 22-Way AMD+NVIDIA Graphics Card Tests With Metro Redux On Steam For Linux @ Phoronix
A new GPU, a familiar problem
Editor's Note: Don't forget to join us today for a live streaming event featuring Ryan Shrout and NVIDIA's Tom Petersen to discuss the new GeForce GTX 960. It will be live at 1pm ET / 10am PT and will include ten (10!) GTX 960 prizes for participants! You can find it all at http://www.pcper.com/live
There are no secrets anymore. Calling today's release of the NVIDIA GeForce GTX 960 a surprise would be like calling another Avengers movie unexpected. If you didn't just assume it was coming, chances are the dozens of leaks of slides and performance would get your attention. So here it is, today's the day, NVIDIA finally upgrades the mainstream segment that was being fed by the GTX 760 for more than a year and a half. But does the brand new GTX 960 based on Maxwell move the needle?
But as you'll soon see, the GeForce GTX 960 is a bit of an odd duck in terms of new GPU releases. As we have seen several times in the last year or two with a stagnant process technology landscape, the new cards aren't going to be wildly better performing than the current cards from either NVIDIA or AMD. In fact, there are some interesting comparisons to make that may surprise fans of both parties.
The good news is that Maxwell and the GM206 GPU will price out starting at $199 including overclocked models at that level. But to understand what makes it different than the GM204 part we first need to dive a bit into the GM206 GPU and how it matches up with NVIDIA's "small" GPU strategy of the past few years.
The GM206 GPU - Generational Complexity
First and foremost, the GTX 960 is based on the exact same Maxwell architecture as the GTX 970 and GTX 980. The power efficiency, the improved memory bus compression and new features all make their way into the smaller version of Maxwell selling for $199 as of today. If you missed the discussion on those new features, including MFAA, Dynamic Super Resolution and VXGI, you should read that page of our original GTX 980 and GTX 970 story from last September for a bit of context; these are important aspects of Maxwell and the new GM206.
NVIDIA's GM206 is essentially half of the full GM204 GPU that you find on the GTX 980. That includes 1024 CUDA cores, 64 texture units and 32 ROPs for processing, a 128-bit memory bus and 2GB of graphics memory. This results in half of the memory bandwidth at 112 GB/s and half of the peak compute capability at 2.30 TFLOPS.
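Those two headline figures can be worked backwards to see what they imply. The Python sketch below derives the per-pin memory data rate from the 112 GB/s and 128-bit numbers, and the shader clock implied by 2.30 TFLOPS across 1024 CUDA cores. The factor of 2 FLOPs per core per clock (one fused multiply-add) is an assumption on my part, as are the function names.

```python
def implied_data_rate_gbps(bandwidth_gb_s, bus_width_bits):
    """Per-pin data rate in Gbps that yields the given bandwidth on the given bus."""
    return bandwidth_gb_s * 8 / bus_width_bits


def implied_clock_ghz(tflops, cuda_cores, flops_per_core_per_clock=2):
    """Clock in GHz implied by a peak TFLOPS figure (assumes 2 FLOPs/core/clock)."""
    return tflops * 1000 / (cuda_cores * flops_per_core_per_clock)


data_rate = implied_data_rate_gbps(bandwidth_gb_s=112, bus_width_bits=128)
clock = implied_clock_ghz(tflops=2.30, cuda_cores=1024)

print(f"Implied memory data rate: {data_rate:.1f} Gbps per pin")
print(f"Implied shader clock:     {clock:.3f} GHz")
```

Under those assumptions the numbers work out to 7 Gbps memory and a clock a little over 1.12 GHz, so the "half of GM204" description applies to the unit counts and bus width while memory speed and clocks stay in the same range.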
Subject: General Tech | January 15, 2015 - 02:09 PM | Ken Addison
Tagged: podcast, video, gtx 960, nvidia, maxwell, amd, r9 380x, corsair, carbide, 300R, CES, ces 2015, ECS, Z97-Machine, Intel, crucial
PC Perspective Podcast #332 - 01/15/2015
Join us this week as we discuss GTX 960 and R9 380X Rumors, Corsair Carbide 300R Titanium, and our CES 2015 wrap up
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, and Allyn Malventano
Program length: 1:11:25
Subject: Graphics Cards | January 13, 2015 - 02:28 PM | Sebastian Peak
Tagged: rumors, nvidia, multi monitor, mini-ITX GPU, leak, HDMI 2.0, gtx 960, gpu, geforce, DisplayPort
The crew at VideoCardz.com have been reporting some GTX 960 sightings lately, and today they've added no less than three new cards from KFA2, the "European premium brand" of Galaxy.
The reported reference design GTX 960 (VideoCardz.com)
Such reports are becoming more common, with the site posting photos that appear to be other vendors' versions of the new GPU here, here, and here. Of note with these new alleged photos on what appears to be a reference design board: no less than three DisplayPort outputs, as well as HDMI 2.0 and DVI:
Reported GTX 960 outputs (VideoCardz.com)
This would be big news for multi-monitor users as it would provide potential support for three high-resolution DisplayPort monitors from a single card in a strictly non-gaming environment (unless you happen to enjoy the frame-rates of an oil painting).
The reported mini-ITX GTX 960 (VideoCardz.com)
The other designs shown in the post include a mini-ITX form-factor design still sporting the triple DisplayPorts, HDMI and DVI, and a larger EXOC edition built on a custom PCB.
Reported EXOC GTX 960 (VideoCardz.com)
The EXOC edition apparently drops the multi-DisplayPort option in favor of a second DVI output, leaving just one DisplayPort along with the lone HDMI 2.0 output.
With the GTX 960 leaks coming in daily now it seems likely that we would be hearing something official soon.
Subject: Graphics Cards | January 7, 2015 - 10:51 AM | Sebastian Peak
Tagged: nvidia, notebook, mobile graphics, mobile gpu, GeForce 965M
With zero fanfare NVIDIA has released a new mobile graphics chip today, the GeForce GTX 965M.
Based on the 28nm Maxwell GM204 core and positioned just below the existing GTX 970M, the new GTX 965M has 1024 CUDA cores (compared to the 970M's 1280) and the new 965M has a lower 128-bit memory interface (vs 192-bit with the 970M). The base clock is slightly faster at 944 MHz (plus unspecified Boost headroom).
Compared with the flagship GTX 980M which boasts 1536 CUDA cores and 256-bit GDDR5 this new GTX 965M will be a significantly lower performer, but NVIDIA is marketing it towards 1080p mobile gaming. At a lower cost to OEMs the 965M should help create some less expensive 1080p gaming notebooks as the new GPU is adopted.
The chip features proprietary NVIDIA Optimus and Battery Boost support, and is GameStream, ShadowPlay, and GameWorks ready.
Specs from NVIDIA:
- CUDA Cores: 1024
- Base Clock: 944 MHz + Boost
- Memory Clock: 2500 MHz
- Memory Interface: GDDR5
- Memory Interface Width: 128-bit
- Memory Bandwidth: 80 GB/sec
- DirectX API: 12
- OpenGL: 4.4
- OpenCL: 1.1
- Display Resolution: Up to 3840x2160
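The bandwidth figure in that list follows from the memory clock and interface width. Here is a minimal Python sketch of the calculation, assuming two data transfers per listed memory clock (the factor that reproduces NVIDIA's 80 GB/sec from the 2500 MHz and 128-bit values). The function name and the `transfers_per_clock` parameter are illustrative, not anything from NVIDIA's spec sheet.

```python
def memory_bandwidth_gb_s(memory_clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Peak bandwidth in GB/s; transfers_per_clock=2 is an assumption, see lead-in."""
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


gtx_965m_bandwidth = memory_bandwidth_gb_s(memory_clock_mhz=2500, bus_width_bits=128)
print(f"GTX 965M peak memory bandwidth: {gtx_965m_bandwidth:.0f} GB/s")
```

The same function shows how much the narrower interface costs: at an identical memory clock, a 192-bit bus would come out 50% higher than the 128-bit result.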
More information on this new mobile GPU can be found via the source link.