
NVIDIA Discloses Full Memory Structure and Limitations of GTX 970

Author: Ryan Shrout
Manufacturer: NVIDIA

A few secrets about GTX 970

UPDATE 1/28/15 @ 10:25am ET: NVIDIA has posted in its official GeForce.com forums that they are working on a driver update to help alleviate memory performance issues in the GTX 970 and that they will "help out" those users looking to get a refund or exchange.

Yes, that last 0.5GB of memory on your GeForce GTX 970 does run slower than the first 3.5GB. More interesting than that fact is the reason why it does, and why the result is better than you might have otherwise expected. Last night we got a chance to talk with NVIDIA’s Senior VP of GPU Engineering, Jonah Alben, about this specific concern and got a detailed explanation of why gamers are seeing what they are seeing, along with new disclosures on the architecture of the GM204 version of Maxwell.


NVIDIA's Jonah Alben, SVP of GPU Engineering

For those looking for a little background, you should read over my story from this weekend that looks at NVIDIA's first response to the claims that the GeForce GTX 970 cards currently selling were only properly utilizing 3.5GB of the 4GB frame buffer. While it definitely helped answer some questions, it raised plenty more, which is why we requested a talk with Alben, even on a Sunday.

Let’s start with a new diagram drawn by Alben specifically for this discussion.


GTX 970 Memory System

Believe it or not, every issue discussed in any forum about the GTX 970 memory issue can be explained by this diagram. Along the top you will see 13 enabled SMMs, each with 128 CUDA cores, for a total of 1664 as expected. (The three grayed-out SMMs represent those disabled relative to a full GM204 / GTX 980.) The most important part here is the memory system, though, connected to the SMMs through a crossbar interface. That interface has 8 total ports to connect to collections of L2 cache and memory controllers, all of which are utilized in a GTX 980. With a GTX 970, though, only 7 of those ports are enabled, taking one of the combined L2 cache / ROP units along with it. However, the associated 32-bit memory controller segment remains active.

You should take two things away from that simple description. First, despite initial reviews and information from NVIDIA, the GTX 970 actually has fewer ROPs and less L2 cache than the GTX 980. NVIDIA says this was an error in the reviewer’s guide and a misunderstanding between the engineering team and the technical PR team on how the architecture itself functioned. That means the GTX 970 has 56 ROPs and 1792 KB of L2 cache compared to 64 ROPs and 2048 KB of L2 cache for the GTX 980. Before people complain about the ROP count difference as a performance bottleneck, keep in mind that the 13 SMMs in the GTX 970 can only output 52 pixels/clock and the seven segments of 8 ROPs each (56 total) can handle 56 pixels/clock. The SMMs are the bottleneck, not the ROPs.
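
(To spell out that arithmetic: 52 pixels/clock works out to 4 pixels per clock from each of the 13 enabled SMMs, while the seven remaining ROP partitions of 8 ROPs each can consume 7 × 8 = 56 pixels/clock, so the shader array runs out of pixel output before the ROPs do.)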


                      GeForce GTX 980     GeForce GTX 970 (Corrected)
GPU Code name         GM204               GM204
GPU Cores             2048                1664
Rated Base Clock      1126 MHz            1050 MHz
Texture Units         128                 104
ROP Units             64                  56
L2 Cache              2048 KB             1792 KB
Memory                4GB                 4GB
Memory Clock          7000 MHz            7000 MHz
Memory Interface      256-bit             256-bit
Memory Bandwidth      224 GB/s            224 GB/s*
TDP                   165 watts           145 watts
Peak Compute          4.61 TFLOPS         3.49 TFLOPS
MSRP                  $549                $329

*To those wondering how peak bandwidth would remain at 224 GB/s despite the division of memory controllers on the GTX 970, Alben stated that it can reach that speed only when memory is being accessed in both pools.
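
For reference, those peaks follow from the usual GDDR5 arithmetic: 256 bits × 7 Gbps ÷ 8 bits/byte = 224 GB/s with both pools being accessed at once, 224 bits × 7 Gbps ÷ 8 = 196 GB/s for the 3.5GB pool on its own (seven 32-bit controllers), and 32 bits × 7 Gbps ÷ 8 = 28 GB/s for the 0.5GB pool on its own (one 32-bit controller).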

Second to that, it turns out the disabled SMMs have nothing to do with the performance issues experienced or the memory system complications.


Full GM204 Block Diagram

In a GTX 980, each block of L2 / ROPs communicates directly through a 32-bit portion of the GM204 memory interface and then to a 512MB section of on-board memory. When designing the GTX 970, NVIDIA used a new capability of Maxwell to implement the system in an improved fashion that would not have been possible with Kepler or previous architectures. Maxwell’s configurability allowed NVIDIA to disable a portion of the L2 cache and ROP units while using a “buddy interface” to continue to light up and use all of the memory controller segments. Now, the SMMs use a single L2 interface to communicate with both banks of DRAM (on the far right), which does create a new concern.

A quick note about the GTX 980 here: it uses a 1KB memory access stride to walk across the memory bus from left to right, and it can hit all 4GB in this fashion. But the GTX 970 and its altered design have to do things differently. If you walked across the memory interface in the exact same way, over the same 4GB capacity, the 7th crossbar port would tend to always get twice as many requests as the other ports (because it has two memories attached). In the short term that could be OK due to queuing in the memory path. But in the long term, if the 7th port is fully busy and is getting twice as many requests as the other ports, then the other six must be only half busy to match the 2:1 ratio. So the overall bandwidth would be roughly half of peak. This would cause dramatic underutilization and would prevent optimal performance and efficiency for the GPU.

Let's be blunt here: access to the 0.5GB of memory, on its own and in a vacuum, would occur at 1/7th of the speed of the 3.5GB pool of memory.

To avert this, NVIDIA divided the memory into two pools: a 3.5GB pool which maps to seven of the DRAMs and a 0.5GB pool which maps to the eighth DRAM. The larger, primary pool is given priority and is accessed in the expected 1-2-3-4-5-6-7-1-2-3-4-5-6-7 pattern, with equal request rates on each crossbar port, so bandwidth is balanced and can be maximized. And since the vast majority of gaming situations stay well under the 3.5GB memory size, this design decision makes perfect sense. It is those instances where memory above 3.5GB needs to be accessed where things get more interesting.
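
To make the request-balancing argument concrete, here is a toy counting model. It is purely illustrative – the port and DRAM numbering is my own labeling, not NVIDIA's actual address hashing – but it captures the two walk patterns described above: eight 512MB DRAMs behind seven crossbar ports, with the eighth DRAM sharing the seventh port.

```cpp
// Toy model of request distribution across the GTX 970's seven crossbar ports.
// DRAMs 0-6 each have their own port; DRAM 7 shares port 6 via the "buddy interface".
#include <cstdio>

int main() {
    long naive[7] = {0};   // walk all 4GB with an 8-way interleave (the GTX 980 pattern)
    long split[7] = {0};   // walk only the 3.5GB primary pool with a 7-way interleave

    const long fourGB  = 4L * 1024 * 1024;   // 4GB expressed as 1KB strides
    const long primary = 3584L * 1024;       // 3.5GB expressed as 1KB strides

    for (long i = 0; i < fourGB; ++i) {
        int dram = i % 8;                    // stride i lands on DRAM i mod 8
        naive[dram == 7 ? 6 : dram]++;       // DRAM 7's traffic piles onto port 6
    }
    for (long i = 0; i < primary; ++i)
        split[i % 7]++;                      // the 1-2-3-4-5-6-7 pattern stays balanced

    for (int p = 0; p < 7; ++p)
        printf("port %d: naive 4GB walk = %ld requests, 3.5GB-pool walk = %ld requests\n",
               p, naive[p], split[p]);
    return 0;
}
```

Running it shows the naive walk handing the shared port twice the traffic of the other six (1,048,576 requests versus 524,288), while the 3.5GB-only walk gives every port an identical 524,288 requests – exactly the balance the two-pool split is meant to preserve.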

Let's be blunt here: access to the 0.5GB of memory, on its own and in a vacuum, would occur at 1/7th of the speed of the 3.5GB pool of memory. If you look at the Nai benchmarks floating around, this is what you are seeing.


Check the result on the left: 22.35 GB/s is almost exactly 1/7th of 150 GB/s
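
For those curious how this class of tool arrives at per-chunk numbers, the sketch below is a minimal, hypothetical version of the technique – it is not Nai's actual code, and the 128MB chunk size, the streamRead kernel and the launch dimensions are placeholders of my own. The idea is to grab VRAM in fixed-size chunks until allocation fails, then time a kernel that streams through each chunk and report the effective read bandwidth. It needs to be run headless; otherwise the desktop's own VRAM usage pushes the top chunks into system memory and skews the results.

```cpp
// bandwidth_probe.cu - compile with nvcc. A minimal sketch of a per-chunk VRAM read test.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void streamRead(const float4* buf, size_t n, float* sink) {
    float acc = 0.0f;
    for (size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x; i < n;
         i += (size_t)gridDim.x * blockDim.x) {
        float4 v = buf[i];                 // stream through the chunk in 16-byte loads
        acc += v.x + v.y + v.z + v.w;
    }
    if (acc == -1.0f) *sink = acc;         // practically never taken; keeps the loads from being optimized out
}

int main() {
    float* sink = nullptr;
    cudaMalloc((void**)&sink, sizeof(float));          // allocate the tiny sink before VRAM is exhausted

    const size_t chunkBytes = 128ull << 20;             // 128 MiB per chunk
    std::vector<float4*> chunks;
    float4* p = nullptr;
    while (cudaMalloc((void**)&p, chunkBytes) == cudaSuccess)   // grab VRAM until allocation fails
        chunks.push_back(p);
    cudaGetLastError();                                  // clear the expected out-of-memory error

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const size_t n = chunkBytes / sizeof(float4);
    for (size_t c = 0; c < chunks.size(); ++c) {
        cudaEventRecord(start);
        streamRead<<<1024, 256>>>(chunks[c], n, sink);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("chunk %3zu: %6.1f GB/s read\n", c, (chunkBytes / 1.0e9) / (ms / 1000.0));
    }

    for (float4* q : chunks) cudaFree(q);
    cudaFree(sink);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}
```

On a card with a uniform memory system, every chunk should report roughly the same figure; on a GTX 970 run headless, the chunks that land above the 3.5GB mark are the ones you would expect to fall off to roughly 1/7th of the rest.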

But the net result for gaming scenarios is much less dramatic than that, so why is that the case? It comes down to the way that memory is allocated by the operating system for applications and games. As memory is requested by a game, the operating system will allocate portions of it depending on many factors, including the exact data the game asked for, what the OS has available, and what its allocation heuristics decide at the time. Not all memory is accessed in the same way, even for PC games.

UPDATE 1/27/15 @ 5:36pm ET: I wanted to clarify a point on the GTX 970's ability to access both the 3.5GB and 0.5GB pools of data at the same time. Despite some other outlets reporting that the GPU cannot do that, Alben confirmed to me that because the L2 has multiple request buses, the 7th L2 can indeed access both memories that are attached to it at the same time.

If a game has allocated 3GB of graphics memory, it might be using only 500MB on a regular basis, with much of the rest only there for periodic, on-demand use. Things like compressed textures that are not as time sensitive as other material require much less bandwidth and can be moved around to other memory locations with less performance penalty. Not all allocated graphics memory is the same, and inevitably there are large sections of this storage that are reserved but rarely used at any given point in time.

All gaming systems today already have multiple pools of graphics memory – what exists on the GPU and what the system memory has to offer via the PCI Express bus. With the GTX 970 and its 3.5GB/0.5GB division, the OS now has three pools of memory to access and to utilize. Yes, the 0.5GB of memory in the second pool on the GTX 970 cards is slower than the 3.5GB of memory but it is at least 4x as fast as the memory speed available through PCI Express and system memory. The goal for NVIDIA then is that the operating system would utilize the 3.5GB of memory capacity first, then access the 0.5GB and then finally move to the system memory if necessary.
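
Conceptually, the preference order described above looks like the sketch below. This is only an illustration of the stated priority (primary VRAM pool, then the secondary pool, then system memory over PCI Express), not how the Windows video memory manager or NVIDIA's driver actually places data – real placement happens at much finer granularity, weighs how frequently the data is touched, and can migrate allocations over time. The request sizes and the system RAM capacity are hypothetical.

```cpp
// Conceptual placement order only: fast VRAM pool first, slow pool second, system RAM last.
#include <cstdio>

struct Pool { const char* name; long long freeBytes; };

const char* place(long long request, Pool* pools, int count) {
    for (int i = 0; i < count; ++i) {               // try each tier in priority order
        if (pools[i].freeBytes >= request) {
            pools[i].freeBytes -= request;
            return pools[i].name;
        }
    }
    return "allocation failed";
}

int main() {
    Pool pools[] = {
        { "3.5GB primary VRAM pool",     3584LL << 20 },
        { "0.5GB secondary VRAM pool",    512LL << 20 },
        { "system RAM over PCI Express", 8192LL << 20 },   // placeholder capacity
    };
    const long long MB = 1LL << 20;
    long long requests[] = { 3000 * MB, 400 * MB, 300 * MB, 600 * MB };  // hypothetical allocations

    for (long long r : requests)
        printf("%lld MB -> %s\n", r / MB, place(r, pools, 3));
    return 0;
}
```

A real game would see far messier placement than this, of course; the point is only the ordering NVIDIA describes: fill the fast pool, spill to the slow pool, and only then fall back to system memory.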

The question then is, what is the real-world performance penalty of the GTX 970’s dual memory pool configuration? Though Alben didn’t have a specific number he wanted to discuss, he encouraged us to continue doing our own testing to find cases where games request less than 3.5GB of memory and cases where they request between 3.5GB and 4.0GB. By comparing the results on the GTX 980 and the GTX 970 in these specific scenarios, you should be able to gauge the impact that the slower pool of memory has on the total memory configuration and gaming experience. The problem and risk is that this performance difference essentially depends on the heuristics of the OS and its ability to balance the pools effectively, putting data that needs to be used less frequently, or in a less latency-dependent fashion, in the 0.5GB portion.

NVIDIA’s performance labs continue to work away at finding examples of this occurring and the consensus seems to be something in the 4-6% range. A GTX 970 without this memory pool division would run 4-6% faster than the GTX 970s selling today in high memory utilization scenarios. Obviously this is something we can’t accurately test though – we don’t have the ability to run a GTX 970 without a disabled L2/ROP cluster like NVIDIA can. All we can do is compare the difference in performance between a reference GTX 980 and a reference GTX 970 and measure the differences as best we can, and that is our goal for this week.

Accessing that 500MB of memory on its own is slower. Accessing that 500MB as part of the 4GB total slows things down by 4-6%, at least according to NVIDIA. So now the difficult question: did NVIDIA lie to us?

At the very least, the company did not fully disclose the missing L2 and ROP partition on the GTX 970, even if it was due to internal miscommunication. The question “should the GTX 970 be called a 3.5GB card?” is more of a philosophical debate. There is 4GB of physical memory on the card and you can definitely access all 4GB of it when the game and operating system determine it is necessary. But 1/8th of that memory can only be accessed in a slower manner than the other 7/8ths, even if that 1/8th is 4x faster than system memory over PCI Express. NVIDIA claims that the architecture is working exactly as intended and that, with competent OS heuristics, the performance difference should be negligible in real-world gaming scenarios.

The performance of the GTX 970 is what the performance is. This information is incredibly interesting and warrants some debate, but at the end of the day, my recommendations for the GTX 970 really won’t change at all.

The configurability of the Maxwell architecture allowed NVIDIA to make this choice. Had the GeForce GTX 970 been built on the Kepler architecture, the company would have had to disable the entire L2/MC block on the right-hand side, resulting in a 192-bit memory bus and a 3GB frame buffer. GM204 allows NVIDIA to expand that to a 256-bit, 3.5GB/0.5GB memory configuration, which obviously offers performance advantages.

As an alternative to calling this a 4GB card, NVIDIA might have branded it as 3.5GB with the addition of 500MB of “cache” or “buffer” – something that designates its difference in implementation and its slower performance, but also its advantages over not having it at all.

Let’s be clear – the performance of the GTX 970 is what the performance is. This information is incredibly interesting and warrants some debate, but at the end of the day, my recommendations for the GTX 970 really won’t change at all. It still offers incredible performance for your dollar and is able to run at 4K in my experience and testing. Yes, there might in fact be specific instances where performance drops are more severe because of this memory hierarchy design, but I don’t think it changes the outlook for the card as a whole.


Some other trailing notes. There should be no difference in performance or memory configuration from one implementation of the GTX 970 to another. If your GTX 970 exhibits an issue (or does not), then your friend's card, and their friend's, should behave the same way. The details about the memory issue also show us that a pending GeForce GTX 960 Ti, if it exists, will not necessarily have this complication. Imagine a GM204 GPU with a 192-bit memory bus, 3GB of GDDR5 and fewer enabled SMMs and you likely have a product you’ll see in 2015. (Interestingly, you have basically just described the GTX 970M mobile variant.)

This is not the first time that NVIDIA has used interesting memory techniques to adjust the performance characteristics of a card. The GTX 550 Ti and the GTX 660 Ti both used unbalanced memory configurations, allowing a GPU with a 192-bit memory bus to access 2GB of memory, for example. This also required some specific balancing on NVIDIA's side to make sure that the 64-bit memory controller carrying double the memory of the other two didn't drag memory throughput down in the 1.5GB to 2.0GB range. NVIDIA succeeded there, and the GTX 660 Ti was one of the company's most successful products of the generation.

UPDATE 1/27/15 @ 8:50pm ET: I also got some more clarification on the relationship between the GTX 660 Ti and the GTX 970 memory implementation. As it turns out, the GTX 660 Ti with the unbalanced memory system (one memory controller having access to more DRAM) also reported separate pools of memory, one at 1.5GB and one at 0.5GB, to the operating system. The difference is that the performance penalty between them was not nearly as severe as the delta we are seeing here on the GTX 970. Still, Alben claims that the software tricks the company learned then were directly applicable to the GTX 970 and thus the integration should be improved over what we saw those years ago.

It would be interesting to see whether future architectures that implement this kind of design use the driver to better handle the heuristics of memory allocation. Surely NVIDIA’s driver should know better than Windows which assets can be placed in the slower pool of memory without affecting gaming performance. I would imagine that this configurable architecture design will continue into the future, and it’s possible it could be improved enough to allow NVIDIA to expand the pool sizes, improving efficiency even more without affecting performance.

For users attempting to measure the impact of this issue: be aware that in some cases the software you are using to report in-use graphics memory could be wrong. Some applications are only aware of the first "pool" of memory and may only ever show up to 3.5GB in use for a game. Other applications, including MSI Afterburner as an example, do properly report total memory usage of up to 4GB. Because of the unique allocation of memory in the system, the OS, driver and monitoring application may not always be on the same page. Many users, like bootski over at NeoGAF, have done a good job of compiling examples where the memory issue occurs, so look around for the right tools to use to test your own GTX 970. (Side note: we are going to try to do some of our own testing this afternoon.)

NVIDIA has come clean; all that remains is to see how consumers respond. For those of you who read this and remain affronted by NVIDIA calling the GeForce GTX 970 a 4GB card without equivocation: I get it. But I also respectfully disagree. Should NVIDIA have been more upfront about the changes this GPU brought compared to the GTX 980? Absolutely and emphatically. But does this change the stance or position of the GTX 970 in the world of discrete PC graphics? I don’t think it does.

Leave me your thoughts in the comments below.


January 26, 2015 | 01:13 PM - Posted by John H (not verified)

Excellent article and analysis as always Ryan.

I'm a little disappointed as I bought this card with 'future proofing' in mind (i.e. higher resolutions, etc).

One thing I'd like to learn more about is how the drivers/OS/games choose which textures/artifacts to keep in the GPU's memory vs. elsewhere.. Is the problem going to be much worse in a year or two when games have better detailed/quality textures that require more textures to be active at the same time? etc..

January 26, 2015 | 01:20 PM - Posted by snowden (not verified)

Yes it looks like the reality is there is real performance degradation here and nvidia misrepresented the hardware configuration on the gtx 970.

If you're in Europe that alone is enough for a refund/replacement with our excellent consumer protection laws. If you're in north america you're likely up the creek and will have to take the deception with no recourse.

As we start to get testing data from sites working on this issue we will learn more. What this article is showing is that nvidia has screwed up/lied and are in full damage control mode now releasing what they can to try and downplay/cover up the issue.

January 26, 2015 | 01:32 PM - Posted by nathanddrews

It's unlikely that this will result in mass refunds in Europe. You still get 4GB of usable VRAM. Can you get a refund for buying a R5 240 with 4GB VRAM even though the GPU is too weak to play anything that would use it all?

January 26, 2015 | 01:37 PM - Posted by snowden (not verified)

It meets the false advertising threshold.

56 ROPS instead of the 64 ROPS claimed and it doesn't achieve the claimed 224 GB/s memory bandwidth when the full VRAM buffer is in use. appears nvidia is aware they are in trouble here with this article trying to diminish the error.

January 26, 2015 | 06:15 PM - Posted by Anonymous (not verified)

But who claimed it had 64 ROPS? NVidia? Nope.

January 26, 2015 | 06:45 PM - Posted by Anonymous (not verified)

yeah, they lied.
are you retarded?

January 26, 2015 | 09:20 PM - Posted by ThorAxe

But it can't use all 64 ROPS even if they were enabled so does it matter?

Only a moron would buy a card based on ROPS and not on actual performance so why would anyone want a refund for a card that works very well in real world scenarios.

Perhaps Nvidia should release an optional BIOS (if possible) to remove the partition so that all 4GB is accessed equally. Maybe then the idiots would be happy despite this resulting in a slower performing card.

I suspect that the vast majority of people complaining don't even own the card.

June 5, 2015 | 05:30 PM - Posted by Fabrythrash (not verified)

Totally agree with ThorAxe.
And also I agree with the NVIDIA never ending testing brain.
Read all the article about this thing, yeah ok not all 4G are at full speed, but. As I've understand, maby it' also better if You are goin over the 4GB. ANYWAY .. :-)

Also I'm goin to buy an MSI 970 Gaming 4G in a couple of days and ..
It's one of the best video card available for around 375€
Also Zotac too. Another fav brand.

But just 'cos 4 the 980 You need at least 130 more bucks.

I'll go for the SLI ... again :-) in a couple of months ..

But this time It works a lot better :

i7 5930K
MSI mainboard
MSI GTX 970
and 64GB GSKill ..

LOL!


January 27, 2015 | 12:32 PM - Posted by Anonymous (not verified)

Or possibly an idiot or an imbecile ; )

January 27, 2015 | 04:07 PM - Posted by Anonymous (not verified)

Agree, most of the whinge fest is from those who don't even buy nvidia. Wasn't that long ago that said GPU maker was in trouble for bs'ing on the clock speed that their cards couldn't maintain due to excessive heat......

January 31, 2015 | 07:06 AM - Posted by One Concerned Ass Citizen (not verified)

I own a Gigabyte G1 GTX 970 and the performance of the card is good; I really enjoy the frame rate I get in games on ultra settings. The issue, with me at least, is when I buy a product and that product has certain specifications, and then it comes out that it doesn't match what they claim; that is an issue

When consumers see or hear an advertisement, whether it’s on the Internet, radio or television, or anywhere else, federal law says that ad must be truthful, not misleading, and, when appropriate, backed by scientific evidence. The Federal Trade Commission enforces these truth-in-advertising laws, and it applies the same standards no matter where an ad appears – in newspapers and magazines, online, in the mail, or on billboards or buses. The FTC looks especially closely at advertising claims that can affect consumers’ health or their pocketbooks – claims about food, over-the-counter drugs, dietary supplements, alcohol, and tobacco and on conduct related to high-tech products and the Internet, such as the dissemination of spyware. The FTC also monitors and writes reports about ad industry practices regarding marketing of food, violent movies, music, and electronic games to children.

When the FTC finds a case of fraud perpetrated on consumers, the agency files actions in federal district court for immediate and permanent orders to stop scams; prevent fraudsters from perpetrating scams in the future; freeze their assets; and get compensation for victims.

http://www.ftc.gov/news-events/media-resources/truth-advertising

That is a pull down from the FTC website, and what do you know, Nvidia as of right now falls into that category. They either had to know and were in on it, or they were too stupid to figure it out. In either case they need to do something about it. If they ignore it thinking everyone will forget, there will be a class action suit started and the FTC is going to get involved, etc. I will be among the people that will be seeking legal advice on this issue within the next week from the time of my post.

I enjoy the products they make but what i don't enjoy is a product that makes certain claims within hardware capabilities and fails to deliver on said capabilities. It is wrong and Nvidia is a company that operates within the US and must abide by the laws set forth.

January 26, 2015 | 08:44 PM - Posted by Anonymous_001 (not verified)

They post specs on the site for reference cards, other companies take that posted info, OC it, add goofy coolers, etc., but yeah, pretty much all the specs are out there for the card. If they don't match the ads there is an issue. Buy a CPU and it's different than the box says - I guess that's okay then?

January 28, 2015 | 07:49 AM - Posted by Anonymous (not verified)

GPU-Z shows the GTX 970 as having 64 ROPs

January 28, 2015 | 08:51 PM - Posted by Austin (not verified)

Get off your high horse. I've lived in both America and Europe. Shoppers in America are not "up the creek".

I returned my 680 a few months ago to a local store after owning it for almost a year for a 100% refund. No questions asked. This kind of thing is not rare. Quit spreading propaganda and ignorance.

February 4, 2015 | 03:48 AM - Posted by X-mass (not verified)

the reason that people in Europe think that people in the US get bad long term support is things like TTIP, and that the US has been seen to repeatedly water down its defence of the consumer compared to Europe. For a long time 90 days was the standard warranty in the States when it was a year in Europe; it's now 2 years in Europe
having said that, class action lawsuits don't really happen in the UK,
and the only real-world possible consequence is that someone who bought a GTX 970 at full whack may now be able to pay only a little more and buy a 980 instead

as ever, we on this side of the pond know little about what happens on the other side of the pond and vice versa.
X
PS are you perchance the Austin - whose shows I watch and enjoy on youtube? If not and you have not seen Austin Evans - go see his show - it's very very good!

January 26, 2015 | 01:47 PM - Posted by ConcernedCitizen (not verified)

Also,
Where is FCAT? Why has it been swept under the rug, particularly when we are discussing the card made by the manufacturer of the software to measure smoothness missed by conventional tools???

Seems like a bit too much of an attempt to downplay the whole ordeal. 0.5GB of memory potentially almost useless or even detrimental to performance!

January 26, 2015 | 02:18 PM - Posted by Dreamburner (not verified)

FCAT will show almost nothing significant. First of all, Nai's so-called "benchmark" is bullshit. If you take a look at the source code, you will notice that this "bench" doesn't really test anything; it just tries to load large pieces of data (an abnormal size for a GPU, because 128MB textures are almost nonsense for now) onto the VRAM through the CUDA interface. Because memory on the 970 is divided into two arrays, after filling the first array the GPU will reallocate part of the data to the second array and, once that is exhausted, offload to system RAM. In real workloads these transfers between the two arrays of VRAM consume 1-2% of maximum performance. Moreover, VRAM will never be in a situation where it has to receive almost 4GB of data at once (i.e. only writes). And, as the final word, Nai's so-called "benchmark" forces the GPU to offload part of the data to system RAM, because the bench runs in an OS with a GUI (which consumes VRAM). It's truly sad that hardware sites do not see such obvious things.

January 26, 2015 | 02:57 PM - Posted by ConcernedCitizen (not verified)

I didn't ask about Nai's tool.

FCAT would tell us a heck of a lot more than average FPS which has been demonstrated to be nearly useless! Get games which aren't just using the last 0.5 GB for a cache and then play and record. I'm sure you recall the explanations from NV (through PCPer etc.) on how standards avg/FPS don't tell the whole story.

January 26, 2015 | 03:58 PM - Posted by Josh Walrath

Why don't you review the PCPer GTX 970 launch FCAT results.  I believe Ryan did 4K tests that would push pretty close to 3.5 GB of space.

January 26, 2015 | 04:44 PM - Posted by Allyn Malventano

Ryan is testing with higher memory usage now, but here was 970 FCAT testing with reasonable settings at 4k.

February 4, 2015 | 03:56 AM - Posted by X-mass (not verified)

so given what you know now about the GTX 960, 970, 980 and presumably the 980Ti which is presumably an SLI 970M 8GB given the reputed bandwidth, power needs, memmory etc
have you come to a conclusion about which way to jump or are you going to wait and see what the Red and Blue team offer in the coming months
Thanks as ever for the great coverage and can tell Josh to stop being so down on himself - a hetrosexual girlfriend of mine says that he is "cute as hell" was the words she used. Me personally no idea, but that's breeders for ya.

January 26, 2015 | 05:03 PM - Posted by Vesku (not verified)

Actually Nai's benchmark appears to be accurate as long as you run it headless so there is no VRAM already in use. From this article:

"Let's be blunt here: access to the 0.5GB of memory, on its own and in a vacuum, would occur at 1/7th of the speed of the 3.5GB pool of memory. If you look at the Nai benchmarks floating around, this is what you are seeing."

January 27, 2015 | 10:02 AM - Posted by Anonymous (not verified)

NVIDIA penalizing you for not getting the 980.

Keep it classy

January 26, 2015 | 01:17 PM - Posted by nathanddrews

*GASP*

You mean, this isn't really a big deal after all?

Seems to me that the situation where it really might make a difference is when game settings are set so high as to drag performance down anyway. I'm looking forward to your non-earth-shattering results. ;-)

EDIT: I thought I read something today posted by Nai himself, stating that his benchmark was not reliable for testing this anyway.

January 26, 2015 | 06:23 PM - Posted by JohnGR

If the fact that a company intentionally hid the real specs of the card for months, because the marketing department thought it would be better for customers to think they are buying a 64 ROP card instead of a 56 ROP card (plus the smaller cache and the way the memory is split in two), is no big deal, then yes, you are absolutely right. In case you are interested, I can sell you a GT620 with a 512-bit data bus, 8GB of memory, 128 ROPs and 4096 CUDA cores.

January 26, 2015 | 01:16 PM - Posted by GV (not verified)

The tl;dr version of this story is Nvidia lied:

"First, despite initial reviews and information from NVIDIA, the GTX 970 actually has fewer ROPs and less L2 cache than the GTX 980. NVIDIA says this was an error in the reviewer’s guide and a misunderstanding between the engineering team and the technical PR team on how the architecture itself functioned"

January 26, 2015 | 01:18 PM - Posted by snowden (not verified)

Yup. Nvidia lied, got caught and is trying another bumpgate downplay with this bullcrap we're reading here.

Things are going to start heating up and this will certainly be another class action suit for them to contend with.

January 26, 2015 | 01:16 PM - Posted by snowden (not verified)

Yikes! Nvidia is in trouble on this. Obviously the claim this was a mistake that went missed all this time in reference to the review guide is bull. European consumer laws are going to crush nvidia on this false advertising incorrectly giving the ROP count for gtx 970 as 64 when it is 56.

Fortunately we have sites that are doing testing on the gtx 970 now, like anandtech and hardware.fr. So we'll be getting real data rather than just nvidia's damage control we see outlined in this article.

going to get interesting very fast with this debacle nvidia has created.

January 26, 2015 | 11:40 PM - Posted by Anonymous (not verified)

You may get real data, you may get cherry picked data. Seems like every major review site bend over backwards to protect nvidia and their products because they're afraid nvidia will stop sending them launch hardware for review. If you've read anandtech's article you will see that the author is convinced nvidia made an honest mistake and tries awfully hard to convince the readers in the comment section.

January 30, 2015 | 03:53 AM - Posted by biblicabeebli

go away, please.

January 26, 2015 | 01:17 PM - Posted by Anonymous (not verified)

Thanks for the article. I'm looking forward to the benchmarks.

They obviously knew that there was a mistake about the ROPs/L2 cache info when the reviews came out. I mean, it's been, what, 4 months now? And Nvidia never bothered to correct any media outlets? How convenient for them. And they never bothered to disclose this memory issue either.

I think it reflects poorly on them.

January 26, 2015 | 01:18 PM - Posted by Christo (not verified)

I will be seeking a refund.

I bought a 4GB card, not a 3.5GB card and 0.5GB which is only 1/8th as fast.

January 27, 2015 | 11:36 PM - Posted by Anonymous (not verified)

What card are you planning on buying after a refund?
I agree about the deception. But the performance and all the reviews and owners up to this point were quite content with the product.
Just like the Patriots how did it affect the end result?

January 28, 2015 | 04:11 AM - Posted by Anonymous (not verified)

and most of the reviews are at 1080 resolution, so what good are they?

January 26, 2015 | 01:21 PM - Posted by Spacebob (not verified)

Thanks for the great explanation Ryan!

I'm curious how this might affect performance once game developers get a hold of DX12's finer grained memory allocation. Since dev's have the option to manage GPU memory themselves in DX12, I'm curious what happens when they hit the 500MB partition.

January 26, 2015 | 01:22 PM - Posted by Anonymous (not verified)

Thanks Ryan. Very interesting stuff. I trust your testing this week will include your usual frame time charts? That's all I want to see, since NVIDIA's 4-6% figure seems to be based on the traditional average fps metric.

January 26, 2015 | 05:33 PM - Posted by Allyn Malventano

Frame time testing is in progress. We have the 970 done, but need to re-run the same sequence on a 980 for proper comparison.

January 26, 2015 | 01:23 PM - Posted by AirCraftMX (not verified)

Thank you Ryan for taking your weekend investigating this for us. Wish Nvidia had said at launch, yes it is a 4GB card (but) can only access 3.5GB at full speed, and the remaining .5GB at reduced speed...

January 26, 2015 | 01:24 PM - Posted by Regor (not verified)

This information from Nvidia brings up more questions!

Many months, 970 users complaining of stuttering and asking about 970 3.5gib memory behavior. Why no straight answer in 2014 from Nvidia?

Why no demand from Pcper and other web tech so that Nvidia give FULL answer about 970 limited design?

Will only big worldwide lawsuit against Nvidia give answer why they wait until Jan 2015 to release more information about 970 limited design and misrepresentation about full 4gib vram bandwidth???

January 30, 2015 | 04:10 AM - Posted by biblicabeebli

It took a long time because it is in large part an unexpected bug, but also all of the following:

The availability of knowledge had to hit a critical mass.
Someone informed on the inexplicable had to stumble across a benchmark that has exactly a memory access pattern that reveals this case where the memory loaded slower.
The particular benchmark that was stumbled across had to be run in command line mode because this is a gpu memory problem.
Someone had to connect the dots.

And All this had to happen on a card that has Excellent performance, surpassing all expectations that such a performance jump and power optimization could actually happen without a transistor shrink.

January 26, 2015 | 01:24 PM - Posted by Anonymous (not verified)

Heh, no future-proofing here.

Nvidia plainly lied from the start, and by design apparently which this article somewhat explains. And everything would have been KEPT silent if not for a few enthusiasts.

Bad nvidia, very bad. This does open the legal door it looks like.

January 26, 2015 | 01:24 PM - Posted by The Fucking GOD of the GTX970s (not verified)

Wow, just wow.

My stance will remain the same on GTX970s. Get refunds while you can. 2015, where desktop products are beaten by laptop products. GTX970M and 980M 6/8GB versions are far superior than this glorified poop.

January 26, 2015 | 04:24 PM - Posted by Allyn Malventano

Those laptops are in a 'megapixel race' with RAM specs. It is very hard to use more than 4GB of GPU memory on a mobile platform.

January 26, 2015 | 01:24 PM - Posted by Anonymous (not verified)

This is still limited to averaged performance metrics, I'd be very interested to learn the effect this has on perceived performance when a game addresses the full 4 GB memory pool the card offers when compared to a situation where the slower 0.5 GB pool isn't utilized.

Frame times, stutter and other effects - how are they influenced by this slower memory pool. Averaged performance metrics are not the main cause of consumer concern here!

January 26, 2015 | 01:28 PM - Posted by cq3mrd (not verified)

"Should NVIDIA have been more upfront about the changes this GPU brought compared to the GTX 980? Absolutely and emphatically. But does this change the stance or position of the GTX 970 in the world of discrete PC graphics? I don’t think it does."
I partly agree with this assessment. For most people using these cards, this should be the case. But how about those that went SLI for 4K gaming and hitting that 0.5GB slowness? Or people who bought this card to do compute on it?
Bottom line: nVidia lost this consumer's trust.

Thanks for the write-up!

January 26, 2015 | 01:30 PM - Posted by nevzim (not verified)

Whatever, I need an 8GB version to match consoles so spoiled developers do not have an excuse for lazy ports.

January 26, 2015 | 11:21 PM - Posted by Anonymous (not verified)

Good god mate! That's 8GB of shared RAM. Imagine if your PC had to share 8GB of system RAM with your GPU.

Please stop parroting an obvious lie.

January 27, 2015 | 06:15 AM - Posted by Anonymous (not verified)

I don't know if it is that obvious. 8GB "shared" is okay as long as it is unified. With the current PC architecture, you essentially have a copy of everything in system RAM, and you copy this to GPU memory. GPU memory is mostly acting as a cache in this case, just allowing faster access to stuff that is already in system RAM.

What do you think is sitting in system ram when you run a game? Mostly graphics assets, I would assume. If you have unified memory, then the CPU can just pass a pointer (address) to the GPU. The GPU can then access it directly without any memory copies. This uses memory much more efficiently. I have been assuming that the consoles use unified memory, but I don't actually know if this is true.

January 27, 2015 | 02:41 PM - Posted by nevzim (not verified)

GPU performance dwarfs CPU performance. GPU memory is what is going to matter exclusively from now on. It does not make sense to store data in the system memory. The CPU will manage fine with the PCIe bandwidth to graphics memory, which is not true the other way around.

January 27, 2015 | 11:40 PM - Posted by Rwells (not verified)

Consoles do not currently use 8GB for games. Just like a computer some of that memory is used by the OS for those systems. The amount can change with updates.

January 26, 2015 | 01:31 PM - Posted by fuwarghh (not verified)

Can you enquire as to how Nvidia's techs are able to disable the ROP unit cluster in question?

If it can be done in firmware, I think that an available firmware update to convert the card to 3GB and net an extra "4-6%" performance would be quite a boon to 970 owners!

(assuming I understood the above info correctly)

January 26, 2015 | 01:39 PM - Posted by fuwarghh (not verified)

nope got it wrong,

thought they were talking about disabling the last ROP/L2 unit, rather than fully enabling it

January 26, 2015 | 01:44 PM - Posted by Allyn Malventano

Disabling portions of the die takes place during production and is coded in on a chip-to-chip basis. The firmware may have something to do with retrieving this data and ensuring those portions are avoided, but in a case where these chips are binned - they are binned this way for a reason. Chances are there is a defect in either the L2 or the SMMs, otherwise they would be selling it as a 980 and not a 970. There have been firmwares for other types of products (AMD CPUs) in the past, but given the configuration of this chip, it's doubtful a simple firmware tweak could completely reconfigure how the memory controllers interleave and are connected to the L2.

January 26, 2015 | 11:44 PM - Posted by Anonymous (not verified)

Hasn't the previous couple amd architectures allowed them to cut down chips without gimping the memory system?

January 27, 2015 | 04:57 AM - Posted by Anonymous (not verified)

There is a wide range of what could be defective. I believe the 980M has the same memory interface as the GTX 980 (although slower), but a lower number of SMMs. The 970M actually goes down to 192-bit memory interface, so it probably is just a full 980 chip with an entire ROP/MC partition disabled (2 L2 + associated ROP), rather than just one L2 (plus crossbar port) like the desktop 970. The 960 seems to be a smaller chip rather than just a full part with disabled units. Wikipedia list all of the current mobile parts as cut down GM204 parts while the 960 is listed as GM206.

The 970 is an improvement. You get a 224-bit interface where in previous generations, this would have been cut down to a 192-bit interface.

http://en.wikipedia.org/wiki/GeForce_900_series

January 26, 2015 | 01:33 PM - Posted by mattmos (not verified)

Applications other than games are affected by this issue. I use a GPU renderer which was benchmarking much slower than the 780 series of cards the 970 was supposed to supersede, until they implemented a hack fix which restricted the program's memory access to the first 3.5GB only. At that point the overall card performance increased dramatically to beat previous generation cards. So we're now stuck with a card which has only 3.5GB of usable memory in an application which would really make use of the extra memory if it ran at the same speed as the rest. Not impressed.

January 27, 2015 | 02:30 AM - Posted by Anonymous (not verified)

Can you provide more info on this gpu renderer benchmark? How did you disable the last 0.5gb memory bank? It might be useful for some of us. Thanks!

January 26, 2015 | 01:41 PM - Posted by Anonymous (not verified)

This is B.S. I hope someone sues them.

January 26, 2015 | 03:45 PM - Posted by Anonymous (not verified)

Completely agree. The issue isn't the slightly reduced performance on a daily basis it is the leaving this key piece of info out at launch as well as 'forgetting' to mention it.

January 26, 2015 | 01:43 PM - Posted by Anonymous (not verified)

IMO they should have left out those 500 MB and made it a 3.5 GB card, since it doesn't seem that this cache can be used reliably as an advantage. If, without such *tricks*, they make a 960 Ti closer in performance to the 970 than to the 960 and price it at ~280 USD, that would really be the sweet spot for me.

January 26, 2015 | 02:16 PM - Posted by Allyn Malventano

I started thinking the same way you are, as I'm in the market for a 970 myself, but after seeing this layout, I get how this is still a better way of doing things. Games are going to need a given amount of memory based on how they are configured (resolution, frame buffer, textures, etc). That doesn't change based on which type of GPU you are using. The OS obfuscates how the memory is allocated, and the game has no clue how much GPU memory there is, or how it is configured - it just asks the OS for memory. What *does* change is the point where you run out of GPU RAM and bleed over into system RAM (which has much higher overhead as it has to pass through CPU, PCIe, etc to get to the GPU).

Take a game that creeps up to 3.8GB based on settings. Your theoretical 3.5GB 970 would slow to a crawl and hit a literal performance brick wall as it now has to start sharing with paged Windows / motherboard-connected DRAM. The real 970, as it is currently configured, would take much less of a performance hit with that same game configuration. While that last 1/8th of memory is slower than the other 7/8th, it is still ~4x faster (for graphics) than system DRAM + overhead in getting that data to the GPU.

January 27, 2015 | 06:27 AM - Posted by Anonymous (not verified)

Leaving off a memory chip to make it 3.5 GB on a 224-bit wide interface may not be doable. It can be any one of the L2 caches with a defect, so you can't count on it being one specific memory controller that would be affected. To make it actually a 3.5 GB card (7 memory chips instead of 8) would require 8 different board versions, each with a different memory chip left off. They could mount all of the memory chips and disable the one connected to the defective L2, but this doesn't make much sense. It can be used under some circumstances.

January 26, 2015 | 01:44 PM - Posted by ConcernedCitizen (not verified)

WHERE IS THE FCAT ANALYSIS???

You'd assume with all of the hype and viral marketing it has been used for, that it would be the first thing brought to the table!

NVIDIA developed it!!

January 26, 2015 | 04:46 PM - Posted by ajoy39

FCAT results were included in their original review of the 970, including 4K results which would utilize all 4GB of the frame buffer

January 26, 2015 | 04:58 PM - Posted by Allyn Malventano

Not necessarily - the 'typical' settings and the games we chose tended to run <3.5GB. We are running additional tests now to check for frame time variance, etc.

January 26, 2015 | 11:47 PM - Posted by Anonymous (not verified)

Thanks for this, hopefully you can shed some light on the situation. I wouldn't be surprised if you don't find anything though, if it were a serious problem I imagine nvidia would come out and say it since they've already had to start damage control.

January 26, 2015 | 04:57 PM - Posted by Allyn Malventano

FCAT testing takes time to perform, especially when we have to tweak game settings to get perfectly within this 3.5-4GB window. A 1-day turnaround on a round of FCAT results would be challenging even on a weekday.

March 26, 2015 | 08:14 PM - Posted by Anonymous (not verified)

Well it's months later. Where are the results? Got a blowjob instead>

January 26, 2015 | 01:47 PM - Posted by Ophelos

When the GTX 970 came out, I went away from Nvidia and moved back to AMD in October when the huge price drop hit AMD graphics cards. That's because I take most of what reviewers say today about graphics cards with a grain of salt, mostly because the crap they say is more marketing BS than anything else.

So let this stuff be a lesson to everyone who is buying a graphics card. Do some real research on a lot of different websites instead of just one or two before buying anything.

January 26, 2015 | 02:12 PM - Posted by Anonymous (not verified)

Dude,
while I do blame reviewers often, this is one thing, where it's not their fault.

They didn't know that and if you don't specifically test for something like this, it isn't something they could have found out. Why do you think it took 4 months until it came out?

So, don't blame the reviewers on that one, but blame Nvidia.

January 26, 2015 | 01:49 PM - Posted by Anonymous (not verified)

"That interface has 8 total ports to connect to collections of L2 cache and memory controllers, all of which are utilized in a GTX 980. With a GTX 970 though, only 7 of those ports are enabled, taking one of the combination L2 cache / ROP units along with it."

How many of those ports are enabled in a GTX 980M mobile variant?

January 27, 2015 | 12:50 AM - Posted by Spanners (not verified)

The GTX 980M has all 8 ports but only 12 SM units.

January 26, 2015 | 01:54 PM - Posted by Anonymous (not verified)

would be an idea to compare R9`s with full ram use - pretty sure they`ll show a uniform slow down vs resolution (ram use)

very shady of vidia - and the `mis-communication` part? they basically LIED and got caught and trying aweful PR to get out of it. again.

bumpgate anyone??

January 26, 2015 | 01:56 PM - Posted by scibuff (not verified)

Why do you even bother posting that Nai bench output? He clearly stated that the last 0.5GB is not from the GPU DRAM but from CUDA total memory, which includes system DDR3 RAM, and his app doesn't seem to be able to access the 0.5GB module at all, so it is always using virtual memory from system resources. Thus showing those numbers is completely useless!

January 26, 2015 | 02:05 PM - Posted by Anonymous (not verified)

another north Korean NVidia apologist. NV got caught and are trying to use every PR trick so they don't come out smelling of s**t.

Nai`s app uses the last 512mb of GPU ram to prove it accessed slower than the rest. and the app worked. Nothing to do with system ram , so its a very worthwhile tool - if your not an NVidia shill!

January 26, 2015 | 02:26 PM - Posted by Allyn Malventano

The tool does work so long as you are testing the GPU headless. Lots of people were misusing the tool to say their 980 had the same issue.

January 26, 2015 | 02:00 PM - Posted by Anonymous (not verified)

From now on cut down/binned parts, need to be marked as such, in big red letters, and if someone is gaming on a system, the GPUs drivers, and gaming engines and such should be able to request block of regular system memory form the OS, and keep that memory held in reserve for graphics/graphics buffer use, with maybe better memory algorithms to utilize the .5 GB in a way that does not affect the game. The OS should be able to give the GPU, and gaming engine, some non pageable regular system memory assigned for the entire game, its not like a gaming rig is doing any other heavy multitasking other than the game/gaming engine that is in play.

Really discrete GPU makers should consider making their discrete products more like gaming console systems, only with MORE graphics resources, as well as on DIE/module CPUs, with plenty of GDDR5, or HBM memory, and run the whole game on the discrete card, with a streamlined gaming OS, and gaming engine all running, and hosted, on the discrete card. There is nothing like having the CPU, and GPU right next to each other, connected by an internal BUS(wide as possible) straight to a large bank of super fast memory, that side steps all the PCI/whatever encoding/decoding overhead and latency. A large block of on DIE/Module ram would be even better, in addition, to Cache all most essential gaming OS/Gaming engine code. Relying on a general purpose OS, running on a bandwidth constrained motherboard CPU, is going to be a big limiter, as gaming moves to 4k and larger resolutions.

January 26, 2015 | 02:06 PM - Posted by Anonymous (not verified)

"Let's be blunt here: access to the 0.5GB of memory, on its own and in a vacuum, would occur at 1/7th of the speed of the 3.5GB pool of memory."
Could you, please, confirm: this is the statement made by Jonah Alben/information received from Jonah Alben?

Excellent article! Thank you very much!

January 26, 2015 | 04:14 PM - Posted by Allyn Malventano

It might not be exactly what was stated, but based on what we know, that is a factual statement. Accessing >3.5GB means you are reading from a single chip, and via a single memory controller. Simple math dictates that the throughput of that section of memory (when being accessed alone) will be 1/7th of the faster section, or 1/8th of the total possible throughput.

Hitting the total *is* possible (i.e. all memory active simultaneously), but only in cases where memory both above and below the 3.5GB mark is being accessed. When usage is limited to the lower 3.5GB, the effective memory bus width is actually 224 bits, with a proportionally slower throughput of 196 GB/sec.

January 27, 2015 | 04:33 AM - Posted by Anonymous (not verified)

"Hitting the total *is* possible"

It doesn't seem like it is.

"*To those wondering how peak bandwidth would remain at 224 GB/s despite the division of memory controllers on the GTX 970, Alben stated that it can reach that speed only when memory is being accessed in both pools."

This seems to conflict with what is stated in the Anandtech article. Since the second partition has a shared connection to the crossbar, it can not be read at the same time; it might be able to be written though. Advertising this as having 224 GB/s max theoretical seems deceptive. It seems to behave as a 224-bit interface or a 32-bit interface, never a 256-bit interface. 224 is better than 192 though.

January 27, 2015 | 05:00 AM - Posted by Anonymous (not verified)

Wikipedia article is listing bandwidth as 192 or 28, but this is linking to the anandtech article as a source.

January 26, 2015 | 02:15 PM - Posted by Johnny Rook (not verified)

Man, I only wish they hadn't cut all those SMMs and L2 in my GTX 970!

Wait... I would have a GTX 980 then.

Yeah, GTX 970 could be 4-6% faster in high memory utilization scenarios but, I have a feeling it wouldn't sell for $329 and would be very close to

At the end of the day, nVIDIA did aim GTX 970 performance to fit a price target.
Tom Petersen said the other day: "nVIDIA's chips are built with a cost target (read retail price) in mind".
Sure Tom, sure! What you didn't say is that nVIDIA aims for a performance target as well.
It's not "build the best chip we can". Don't tell me redesigning the GTX 970 memory interface is "the best you can"; the GTX 980 memory interface is the best you can do. nVIDIA did it to cap GTX 970 performance. Period.

January 26, 2015 | 02:37 PM - Posted by Allyn Malventano

I would imagine that the L2 is one of the more complex and dense portions of the GPU, and it's likely that would be something key to binning (i.e. some 980 dies have a defect in one of the L2s). Point being that if they decided the 970 would keep all L2s, then dies with a defective L2 can't be sold => lower yields => higher cost for the usable GPUs.

January 26, 2015 | 02:53 PM - Posted by Johnny Rook (not verified)

It's a very reasonable argument. I really hope that's the case and, after reading Anandtech's article, it seems to be. But, at the end of the day, only nVIDIA really knows why, and all this does is open the way to speculation about SLI frame pacing and whatnot, which can only hurt nVIDIA's credibility.

To be honest, I'm perfectly happy with the GTX 970's performance and I do not experience the stutter over 3584MB VRAM utilization everybody is talking about; I just don't, no matter how much I try (and from what I've read, it seems a very SoM-localized problem, which I suspect has very little to do with VRAM).

January 26, 2015 | 11:52 PM - Posted by Anonymous (not verified)

So you're saying some 980's have a defect in one of the L2's and it's still able to function correctly or it actually affects performance negatively? I have a 980 so I'm interested.

January 27, 2015 | 01:11 AM - Posted by Anonymous (not verified)

No, a 980 is a GPU that passes all tests. A 970 is a 980 that has an L2 failure. All dies, CPU or GPU, are "binned" this way. Thats (one reason) why there are so many different speed CPU's.

January 26, 2015 | 03:22 PM - Posted by Haunted Abyss (not verified)

Exactly. That, sir, is correct.
Nvidia pulled an AMD

"let's cheat the people and release the same product, just cheap out on the chip and cut it down"
thus when it comes time for the Ti they are ready -_-

January 26, 2015 | 06:52 PM - Posted by Anonymous (not verified)

no, nvidia pulled an nvidia.
This is what they do all the time especially with mobile parts. Just keep sucking the green D mate.

January 26, 2015 | 02:22 PM - Posted by Anonymous (not verified)

http://i.imgur.com/Wcq4OBo.gif

January 26, 2015 | 02:27 PM - Posted by Anonymous (not verified)

http://imgur.com/lF2gKSS

January 26, 2015 | 03:18 PM - Posted by haha (not verified)

Haha, that was great!

January 26, 2015 | 03:43 PM - Posted by Anon (not verified)

http://imgur.com/OpHUcMR

January 27, 2015 | 05:09 PM - Posted by Anonymous (not verified)

http://a.pomf.se/hmtktw.jpg

January 26, 2015 | 02:35 PM - Posted by Anonymous (not verified)

They should have come clean and no one would complain. So, for my 1080p monitor this is a nice card that will hold me over for at least 2 years. That was all I wanted to read. I'll read the in-depth analysis for sure, but now some things are clearer.

January 26, 2015 | 02:36 PM - Posted by Anonymous (not verified)

shows those MORDOR videos on GTX 970 stuttering are very relevant - with the hit the RAM is taking on access when the 512MB is being used a lot, it's awful.

January 26, 2015 | 02:41 PM - Posted by Mert (not verified)

My GTX 980 behaves like a GTX 970 in the Nai benchmark as it shows 3 or 4 slower figures for the L2 and DRAM tests. What gives? Do I have a faulty 980 with disabled SMs or something?

January 26, 2015 | 03:17 PM - Posted by Allyn Malventano

You need to run it headless. Any GUI / frame buffer in use is allocated *in addition to* what that benchmark is allocating.

January 26, 2015 | 02:42 PM - Posted by Anonymous (not verified)

Eh, I'll be sticking with my 970. Its performance hasn't suddenly changed in the last few months, and having 1/8 of the VRAM accessed slightly faster isn't worth the £150+ premium for a 980.

January 26, 2015 | 02:49 PM - Posted by MAXXHEW

There is something about being overcharged that is never right. Anytime you feel you are being overcharged, listen to yourself... and tell the salesmen to go f themselves.

January 26, 2015 | 02:52 PM - Posted by slapdashbr (not verified)

Interesting.

On one hand, this doesn't change the actual performance that people who bought the 970 have been getting; benchmark results are accurate and will not change. However, it is my opinion that nVidia has committed an ethical breach (not only to end customers, but to their board partners) by not honestly declaring the hardware specifications. Customers bought the 970 in no small part because it *appeared* to have identical VRAM and memory-bandwidth performance to the 980; in particular this puts it on more even footing when it comes to performance in high-resolution, multi-GPU configurations; since VRAM is mirrored across multiple cards in SLI, extremely high-end systems with multiple GPUs will be more sensitive to this otherwise subtle difference in performance. It has been shown by synthetic benchmarks that in the right usage scenarios, VRAM performance plummets by roughly a factor of 7 (as expected from this explanation of the architecture). This may be worse than "real-world" performance but it does show a clear disparity between consumer expectations and reality.

January 26, 2015 | 02:58 PM - Posted by Anonymous (not verified)

But are the benchmarks correct? A number of users are seeing stuttering in SoM, for example, which wasn't and isn't mentioned in reviews...

January 26, 2015 | 02:57 PM - Posted by ProMace (not verified)

My GTX970 Phantom will be arriving tomorrow. Will I be RMA'ing the card? Absolutely not. The fact remains that if you're running something that is intensive enough to fill out more than 3.5 GB on a GTX970, the card will be having a hard time coping anyway. Not even a GTX980 will help you in that area. For years and years 60 FPS - or even 120 FPS - has been the holy grail to aim for and all of a sudden people are talking about starting lawsuits... Because of a corner scenario where a card is struggling to get even 30 FPS? Yes, you can run them in SLI. But if you're really in the SLI league, you should be going top end anyway, especially for 4K.

A simple fact: the GTX970 retails for 330 euros here in The Netherlands, the GTX980 costs 60 (S-I-X-T-Y) percent more at 530 euros. For the vast majority of people, including Yours Truly, the benefits would simply not justify the additional cost. I agree that nVidia should have been upfront about the nitty gritty details. But on the other hand, REALISTICALLY speaking, I just don't buy it that a product is worshipped for several months and then from one day to the next people start peeing all over it. Face it: in NORMAL scenarios you're getting around 90 percent of GTX980 performance at around 60 percent of the price.

Just to be clear: I'm by no means an nVidia 'fanboy'. I've been using ATI/AMD for years because... well, because they offered the best bang for the buck. Which I believe is currently the case for the GTX970, regardless of the mass hysteria going on. Yes, nVidia could have branded the card 3.5 GB or even 3 GB and leave it at that. People would STILL have shelled out the 330 euros, knowing that they'd be buying some 'crippled' version of the top end card. Reading this article, I actually agree that nVidia did a smart job of 'limiting the limitations' with the GTX970 being a cut-down version of the GTX980.

So no, I'm not in the "I'm gonna sue their ass for selling me an inferior product and I'll never buy their crap again unless they compensate me by replacing my 970 with a 980" game. Following that logic, I'd be suing Devolo for selling me a DLAN1200 kit not performing beyond 130 Mbps and I'd have sued Microsoft for not being able to use 4 GB with 32 bit Windows XP. Heck, back in 1983 I would have sued Commodore for having to switch Kernel ROM visibility off and on to get to the underlying RAM, costing valuable CPU cycles. Count your blessings guys.

January 26, 2015 | 02:58 PM - Posted by Anonymous (not verified)

Totally agree with you...

January 26, 2015 | 03:02 PM - Posted by ConcernedCitizen (not verified)

Cool story bro.

Some people actually want to use 4 GB, as new 2015 games and some 2014 games can use it even at 1080p. Enjoy the gimped, I mean mistakenly spec'd, 970.

January 26, 2015 | 03:06 PM - Posted by jerrytsao (not verified)

Many valid points there, but the majority of people feel they've been cheated to a certain degree, so this will haunt Nvidia for years to come.

January 26, 2015 | 10:35 PM - Posted by ThorAxe

I've yet to see any indication that 'the majority of people feel they've been cheated'. Based on purely anecdotal evidence, most of the people with a 970 don't care.

As for another comment stating that they use more than 3.5GB at 1080p, I call BS. I've tried Crysis 3 maxed with 8x MSAA and it barely uses more than 2.3GB. I need to go to at least 4K, and even then it uses about 3.4GB maxed with 4x MSAA. The only time even 4K goes over is at 8x MSAA.

This is just butt-hurt fanboys complaining.

January 27, 2015 | 12:17 AM - Posted by Anonymous (not verified)

Crysis 3 is a couple of years old and not indicative of recent and future console ports. Far Cry 4 uses 3GB of video memory at 1080p without MSAA. Shadow of Mordor, Watch Dogs, and Assassin's Creed Unity are all recent games that each use more memory at 1080p than any game released in the past few years. This is a trend that will continue and probably get worse. I don't know about you, but when I buy a GPU I intend to have a good experience with future games and not just recent ones.

January 27, 2015 | 12:58 AM - Posted by ThorAxe

From what I have read FC3 only uses 2.3GB at 1080p Ultra with SMAA.

January 27, 2015 | 01:51 AM - Posted by Anonymous (not verified)

Uses or requires?

A game being able to allocate 3GB of VRAM doesn't mean all 3GB are required.

January 26, 2015 | 03:11 PM - Posted by MAXXHEW

So you are walking down the street, you meet a girl... take her home. You are in bed and playing, come to find out... she lied. You will just say, Ok dude... go ahead and give it to me up the butt, at least I'm getting some.

January 26, 2015 | 03:55 PM - Posted by ProMace (not verified)

The vividness of the scenario you're portraying suggests you're citing from past experience, which is kind of disturbing. At any rate, your analogy (pun intended) falls somewhat short. You're entitled to your opinion, I just don't share that opinion.

January 26, 2015 | 05:40 PM - Posted by MAXXHEW

Hahaha not my past experience, I'm a bit smarter than that... but not much. What's funny is that some of you would even ask for another date... and the real idiots would probably just ignore the reality and try to marry the freak.

January 28, 2015 | 06:17 AM - Posted by Jules (not verified)

They should have called it:
nVidia GTX 970 4GB It's a trap! Edition.

January 30, 2015 | 12:58 PM - Posted by MAXXHEW

that would be classic.

January 26, 2015 | 03:28 PM - Posted by Allyn Malventano

ProMace: Kudos for being one of the few who A. knows what a C64 is, and B. knows about the Kernel ROM switching issue. I still reference the 6522 VIA bug story as an example of significant performance issues introduced when a hardware bug has to be corrected in software.

January 27, 2015 | 05:34 AM - Posted by Anonymous (not verified)

In what universe is the 970 the current best bang for buck?

January 31, 2015 | 02:38 AM - Posted by Mihkel (not verified)

Hear, hear!

I own a 970, and when I found out about this... my first reaction was disappointment... but then I realized, screw that - the card STILL performs like a beast... no issues whatsoever.

Yes, they should have been honest in the first place, but, yeah... not disappointed in the least.

The last card I had was an ATI, which performed admirably until they started messing with the patches, which led from one BSOD to another...

So yeah, still liking this card...

January 26, 2015 | 02:58 PM - Posted by Anonymous (not verified)

"GeForce GTX 750 Ti Whitepaper
GM107 Maxwell Architecture In-Depth
ROPs are still aligned with L2 cache slices and Memory Controllers."

Still? Does this apply to legacy products too? (7xx, 6xx, 5xx, 4xx products with ROPs masked off?)

980   64 ROP, 2048KB L2, 256-bit MC, 224 GB/s
970   56 ROP, 1792KB L2, 256-bit MC, 224 GB/s*
960   32 ROP, 1024KB L2, 128-bit MC, 112 GB/s
750Ti 16 ROP, 2048KB L2, 128-bit MC,  86 GB/s
750   16 ROP, 2048KB L2, 128-bit MC,  80 GB/s

What about the 7xx, 6xx, 5xx, 4xx series???

January 26, 2015 | 03:34 PM - Posted by Allyn Malventano

No Nvidia GPUs prior to the 970 were capable of this new configuration, so the older GPUs did not need to split their RAM into faster / slower segments.

January 26, 2015 | 04:40 PM - Posted by MarkT (not verified)

Good call Allyn, I feel good with my 770 4GB :-P

January 26, 2015 | 10:16 PM - Posted by Rick (not verified)

If they thought it was such a great feature, why didn't nvidia tell us about it before today?

January 27, 2015 | 06:42 AM - Posted by Anonymous (not verified)

A 3.5 GB card with a 224-bit memory interface would be a lot harder to market than a 4 GB card with a 256-bit interface. Regardless of the current PR mess, if they had done it like they did in previous generations, a 970 would probably have been a 3 GB card with a 192-bit interface. With the 970 you get an "extra" 512 MB at full speed and an additional 512 MB that isn't full speed.
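
For anyone who wants the arithmetic behind the two pools, here is a rough back-of-the-envelope sketch in Python using only the figures quoted in the article (7 Gbps effective per pin, 32 bits per memory controller); the roughly 7:1 ratio between the pools is also why synthetic tests show such a sharp falloff once the last 0.5 GB is touched.

```python
# Back-of-the-envelope peak bandwidth for the GTX 970's two memory pools,
# based on the article's figures: 7 Gbps effective per pin, 32-bit controllers.
GBPS_PER_PIN = 7            # 7000 MHz effective GDDR5 data rate
BITS_PER_CONTROLLER = 32

def pool_bandwidth(controllers: int) -> float:
    """Peak bandwidth in GB/s for a pool served by `controllers` memory controllers."""
    return controllers * BITS_PER_CONTROLLER * GBPS_PER_PIN / 8

fast_pool = pool_bandwidth(7)   # 3.5 GB pool behind seven controllers
slow_pool = pool_bandwidth(1)   # 0.5 GB pool behind the remaining controller

print(f"3.5 GB pool: {fast_pool:.0f} GB/s")   # 196 GB/s
print(f"0.5 GB pool: {slow_pool:.0f} GB/s")   #  28 GB/s
print(f"Combined   : {fast_pool + slow_pool:.0f} GB/s "
      "(the advertised 224 GB/s, reachable only when both pools are accessed)")
```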

January 26, 2015 | 05:43 PM - Posted by Anonymous (not verified)

"Before people complain about the ROP count difference as a performance bottleneck, keep in mind that the 13 SMMs in the GTX 970 can only output 52 pixels/clock and the seven segments of 8 ROPs each (56 total) can handle 56 pixels/clock. The SMMs are the bottleneck, not the ROPs"

An almost perfect design coincidence (front-end and back-end match up):

GM204 16 SMM 64px/clock, 64 ROPs 64px/clock - 980
GM204 13 SMM 52px/clock, 56 ROPs 56px/clock - 970
GM206 08 SMM 32px/clock, 32 ROPs 32px/clock - 960
GM107 05 SMM 20px/clock, 16 ROPs 16px/clock - 750Ti (why wasn't GM107 designed with 20 ROPs???)
GM107 04 SMM 16px/clock, 16 ROPs 16px/clock - 750
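
As a rough illustration of that front-end / back-end balance, here is a hypothetical Python sketch using the article's figures (4 pixels/clock per SMM, 1 pixel/clock per ROP); the effective fill rate is limited by whichever side is narrower, which is the SMMs on the 970 and the ROPs on the 750 Ti.

```python
# Rough illustration of the SMM (front-end) vs ROP (back-end) pixel throughput
# balance, assuming 4 px/clock per SMM and 1 px/clock per ROP as in the article.
PX_PER_SMM = 4
PX_PER_ROP = 1

configs = {
    "GTX 980 (GM204)":    {"smm": 16, "rops": 64},
    "GTX 970 (GM204)":    {"smm": 13, "rops": 56},
    "GTX 960 (GM206)":    {"smm": 8,  "rops": 32},
    "GTX 750 Ti (GM107)": {"smm": 5,  "rops": 16},
    "GTX 750 (GM107)":    {"smm": 4,  "rops": 16},
}

for name, c in configs.items():
    front = c["smm"] * PX_PER_SMM    # pixels/clock the SMMs can produce
    back = c["rops"] * PX_PER_ROP    # pixels/clock the ROPs can retire
    if front < back:
        limiter = "SMMs"
    elif back < front:
        limiter = "ROPs"
    else:
        limiter = "balanced"
    print(f"{name}: {front} px/clk from SMMs vs {back} px/clk from ROPs -> {limiter}")
```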

January 26, 2015 | 03:00 PM - Posted by Anonymous (not verified)

Great article/video guys.
But the fact that, as you said, the 970M is built in a way that avoids having that additional, slower slice of memory shows how aware NVIDIA was of the issue. That makes me think the marketing team was made aware of it, if not before the launch of the desktop 970 then surely after the launch of the 970M, which was before this was discovered by users, so they probably had the chance to correct the published specs earlier but chose not to.

January 26, 2015 | 03:05 PM - Posted by P0ci (not verified)

Im 13 and gotz me a new 970 from mommy for chirstmas AMD is such crap!!!!......

Seriously glad I boycotted nvidia years ago.

January 26, 2015 | 03:05 PM - Posted by Mert (not verified)

Please tell me: if I see lower results for my GTX 980, does it mean it has disabled SMs? The last three entries show 7.5 GB/s, which is even lower than a GTX 970! How do I make sure it is not because of Windows allocating VRAM? I have only one GPU and no iGPU, so there's no way to test in headless mode.. HELP PLEASE

January 26, 2015 | 03:31 PM - Posted by Allyn Malventano

If you can't test headless, then the slowdown you are seeing in the last few results is due to swapping with system memory. No need to get creative just to test headless. If you have a 980, you have all parts of the chip enabled.

January 26, 2015 | 03:34 PM - Posted by Mert (not verified)

Thanks I was worried for nothing it seems :)

January 28, 2015 | 11:52 AM - Posted by ProMace (not verified)

The term 'headless' seems to be an appropriate qualification of the parroting mass psychosis victims bashing those who don't believe their d##ks just got trimmed by half an inch, resulting in impotence.

January 26, 2015 | 03:09 PM - Posted by ZoA (not verified)

Why did nVidia even bother sticking on the 4th GB when it can't be used properly? It seems the 970 was originally designed as a 3GB card, and somebody probably decided to “make it 4GB” at the last moment for PR reasons.

Also, PCPer's and nV's claim that issues develop only after 3.5 GB is inaccurate; benchmarks show degradation in memory access speed the moment memory demand passes 3GB.

P.S. Who of you thinks PCPer would be so forgiving and so quick to produce excuses if this had happened with an AMD card?

January 26, 2015 | 03:40 PM - Posted by Allyn Malventano

Based on the configuration of the 970 that has been revealed through our talks with nV, it is completely accurate. There is no reason, based on this configuration, that there would be a falloff below 3.5GB. Differences in what others are seeing can be attributed to the tools they use not accurately reporting how much GPU memory is actually allocated by the game / test. One example is folks running one particular test without the GPU being headless, and therefore having other allocations take some of the available GPU RAM outside of the test being run.

January 26, 2015 | 03:11 PM - Posted by Anonymous (not verified)

Sign up for petition on https://www.change.org/p/nvidia-refund-for-gtx-970

January 26, 2015 | 03:25 PM - Posted by Drenus (not verified)

So

could any1, in plain English, tell me if the 970 is a crappy deal?

I bought a new PC a few days ago, awaiting shipping, so I might still have time to change it.....

I'm seriously gonna be pissed if I spent 1200 dollars on a new PC and my new 970 turns out to be 1% better than my GTX 660 because of this crap

Thanks in advance

January 26, 2015 | 03:30 PM - Posted by Anonymous (not verified)

It's great for 1080p. Nvidia fans say it should be good for 1-2 years at 1080p, but not at higher resolutions with max graphics details.

January 26, 2015 | 03:50 PM - Posted by Allyn Malventano

We have tested the 970 at 4K resolutions. It did well.

January 26, 2015 | 04:14 PM - Posted by Drenus (not verified)

Let me get this straight: the only people this whole thing will have any real effect on are the 0.01 percent who play everything on max on a 4K display?

I play games at 1920x1080 on a 27-inch monitor, so I won't really feel any effect of this?

January 26, 2015 | 04:41 PM - Posted by Allyn Malventano

Correct, and cranking everything up at 4k on a 970 tends to result in unplayable framerates (memory speed divisions aside). SLI may be a different story though.

January 27, 2015 | 03:54 AM - Posted by Anonymous (not verified)

I hope you guys aren't leaving 1440p out like the ones in Nvidia's table: Shadow of Mordor, Battlefield 4 & Call of Duty.

If the games already run too slow at 4K, what is the point? That just confirms the obvious for single-GPU users.

January 26, 2015 | 04:14 PM - Posted by RS84 (not verified)

Why no FCAT?

January 26, 2015 | 04:41 PM - Posted by Allyn Malventano

It was the weekend. Ryan is testing this now.

January 26, 2015 | 04:45 PM - Posted by Drenus (not verified)

Thanks Allyn, I shat my pants when a friend first linked this, but seeing as I won't be touching anything remotely close to 4K anytime soon, I'm more relaxed now

Thanks again for the explanation

January 28, 2015 | 06:28 AM - Posted by Jules (not verified)

Thank you for choosing nVidia GTX 970 4GB It's a trap! Edition.

January 26, 2015 | 08:35 PM - Posted by RS84 (not verified)

Ok..

January 28, 2015 | 12:34 AM - Posted by le_sauveur92 (not verified)

Most first-person shooters do not demand a lot of VRAM. BF4, Crysis 3, Far Cry 4, even CoD: AW. It only shows when you crank up the settings at higher resolutions such as 1440p or 4K. You can also resolve this with an SLI config.

The troubling issue for me: playing Mordor on the 970 on ultra at 1080p uses 3.2GB of VRAM, and Unity constantly uses 3.5GB at that resolution, even on the bloody rooftops. Open-world games will take a hit from these VRAM requirements even at today's 1080p. With GTA 5 and The Witcher 3 on the horizon, this could put some perspective on future game development.

If you want to play at 4K, just buy a 980, 780 Ti, or 290X, or if you still want MASSIVE VRAM, just buy a GTX Titan.

Nvidia still needs to be transparent on these kinds of issues or they will leave a dent in their consumers' trust.

January 26, 2015 | 03:27 PM - Posted by Nvidia_Shill (not verified)

So what? I say this is not a big deal; in fact, you should look at it as an added bonus for you. Now you have make-believe 4GB of memory and 512MB that can be accessed as system memory, making your gaming PC even faster.

And we swear to god all of us here at Nvidia lived under a rock for the past 5 months and really didn't see all those hundreds and hundreds of news stories, reviews and editorials about the GTX 970.

January 26, 2015 | 03:46 PM - Posted by lantian (not verified)

Thank you for being unbiased on this. Even if the 500MB is a bottleneck, it's nowhere near as severe as the 6xxx series' 256-bit bus with 4 gigs of VRAM.

January 26, 2015 | 03:47 PM - Posted by hansmoleman (not verified)

A lesson in Nvidia math:

4 = 3.5

I would have liked an asterisk next to the "4GB" on my GTX 970 retail box explaining that my card performs best if a game stays below 3.5GB.
Big time shady move.

January 26, 2015 | 03:52 PM - Posted by Anonymous (not verified)

Ryan Shrout, just out of curiosity: how much did nVidia pay you to apologize for them, or did you do it entirely of your own volition.. holy ****

A 3.5GB card - a philosophical debate?!?! you wot m8

January 26, 2015 | 04:07 PM - Posted by Ngreedia (not verified)

Falsely advertising a product: the way it's meant to get paid.
