
AMD Radeon HD 7970 3GB Graphics Card Review - Tahiti at 28nm

Author: Ryan Shrout
Manufacturer: AMD

The First 28nm GPU Architecture

It is going to be an exciting 2012. Both AMD and NVIDIA are going to be bringing gamers entirely new GPU architectures, Intel has Ivy Bridge up its sleeve and the CPU side of AMD is looking forward to the introduction of the Piledriver lineup. Today though we end 2011 with the official introduction of the AMD Southern Islands GPU design, a completely new architecture from the ground up that engineers have been working on for more than three years.

This GPU will be the first on several fronts: the first 28nm part, the first cards with support for PCI Express 3.0 and the first to officially support DirectX 11.1 coming with Windows 8. Southern Islands is broken up into three different families starting with Tahiti at the high-end, Pitcairn for sweet spot gaming and Cape Verde for budget discrete options. The Radeon HD 7970 card that is launching today with availability in early January is going to be the top-end single GPU option, based on Tahiti.

Let's see what 4.31 billion transistors buys you in today's market.  I have embedded a very short video review here as well for your perusal but of course, you should continue down a bit further for the entire, in-depth review of the Radeon HD 7970 GPU.

Southern Islands - Starting with Tahiti

Before we get into benchmark results we need to get a better understanding of this completely new GPU design that was first divulged in June at the AMD Fusion Developer Summit. At that time, our own lovely and talented Josh Walrath wrote up a great preview of the architecture that remains accurate and pertinent for today's release. We will include some of Josh's analysis here and interject with anything new that we have learned from AMD about the Southern Islands architecture.

When NVIDIA introduced the G80, they took a pretty radical approach to GPU design. Instead of going with previous VLIW architectures which would support operations such as Vec4+Scalar, they went with a completely scalar architecture. This allowed a combination of flexibility of operation types, ease of scheduling, and a high utilization of compute units. AMD has taken a somewhat similar, but still unique approach to their new architecture.



Instead of going with a purely scalar setup like NVIDIA, they opted for a vector + scalar solution. The new architecture revolves around the Compute Unit, which contains all of the functional units. The CU can almost be viewed as a fully independent processor. The unit features its own L1 cache, branch and MSG unit, control and decode unit, instruction fetch arbitration functionality, and the scalar and vector units.


The vector units are the primary workers in the CU when it comes to crunching numbers. Each unit contains four cores, and allows for four “wavefronts” to be processed at any one time. Because AMD stepped away from the VLIW5/4 architectures and has gone with a vector+scalar setup, we expect to see a higher utilization of each unit as compared to the old design. We also expect scheduling to be much easier and more efficient, which will again improve performance and efficiency. The scalar unit will actually be responsible for all of the pointer ops as well as branching code. This particular setup harkens back to the Cray supercomputers of the 1980s. The combination of scalar and vector processors was very intuitive for the workloads back then, and that carries over to the workloads of today that AMD looks to address.


The combination of these processors and the overall design of each CU gives it the properties of different types of units. It is MIMD (multiple instruction, multiple data) in that it can address four threads per cycle per vector, from different apps. It acts as SIMD (single instruction, multiple data) much like the previous generation of GPUs. Finally it has SMT (simultaneous multi-threading) in that all four vector cores can be working on different instructions, and there are 40 waves active in each CU at any one time. Furthermore, as mentioned in the slide, it supports multiple asynchronous and independent command streams. Essentially the unit is able to work on all kinds of workloads at once, no matter what the source of the data or instructions is.
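The per-CU thread math above can be sanity-checked with a quick sketch. Note that the 64-work-item wavefront width is GCN's standard size and is not stated explicitly in the text, so treat it as an assumption here:

```python
# In-flight work per Compute Unit, per the figures above.
VECTOR_UNITS_PER_CU = 4      # four SIMD vector units per CU
WAVES_PER_VECTOR_UNIT = 10   # 40 active waves per CU / 4 units
WAVEFRONT_WIDTH = 64         # work-items per wavefront (GCN standard; assumed)

waves_per_cu = VECTOR_UNITS_PER_CU * WAVES_PER_VECTOR_UNIT   # 40 waves
items_in_flight = waves_per_cu * WAVEFRONT_WIDTH             # 2560 work-items
print(waves_per_cu, items_in_flight)
```

That is 2,560 work-items that a single CU can juggle at once to hide memory latency.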


A full Tahiti GPU will feature 32 of the Compute Units, each made up of 64 stream processors. That brings the entire GPU to a total of 2048 SPs, a 33% jump over the Cayman architecture that was built around 1536 processors.
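The stream processor totals check out with a little arithmetic (all figures from the paragraph above):

```python
# Stream processor totals: Tahiti vs. Cayman.
TAHITI_CUS = 32
SPS_PER_CU = 64

tahiti_sps = TAHITI_CUS * SPS_PER_CU   # 2048 SPs
cayman_sps = 1536                      # Cayman's shader count

increase = (tahiti_sps - cayman_sps) / cayman_sps
print(f"Tahiti SPs: {tahiti_sps}, increase over Cayman: {increase:.0%}")
```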

Memory and Caches

One area that AMD did detail extensively was the changes in the internal cache, as well as their push for fully virtualized memory. Each CU has its own L1 cache divided into data, instruction, and load/store. The GPU then has a shared L2 cache which is fully coherent. Each L1 cache has a 64 byte per clock interface with the L2, and once this scales in terms of both CU count and GPU clockspeed, we can expect to see multiple terabytes per second of bandwidth between the caches. The L1 caches and texture caches are now read/write, as compared to the read-only units in previous architectures. This is a big nod not only to efficiency and performance, but also to the type of caches needed for some serious compute type workloads.
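To see where "multiple terabytes per second" comes from, here is a rough sketch of the aggregate L1-to-L2 bandwidth, assuming GCN's 64-byte-per-clock L1-L2 path per CU and the HD 7970's 925 MHz reference engine clock (the clock is not stated above):

```python
# Aggregate L1 <-> L2 cache bandwidth across the whole GPU (illustrative).
CUS = 32
BYTES_PER_CLOCK_PER_CU = 64   # 64-byte/clock L1-L2 interface per CU (assumed)
CLOCK_HZ = 925e6              # HD 7970 reference engine clock

bandwidth = CUS * BYTES_PER_CLOCK_PER_CU * CLOCK_HZ
print(f"{bandwidth / 1e12:.2f} TB/s")
```

At those figures the caches move just under 1.9 TB/s in aggregate, which is indeed "multiple terabytes" once clocks or CU counts climb.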


The next level of memory support is that of full virtualization of memory with the CPU. Previous generations of products were limited to what memory and cache were onboard each video card. This posed some limitations on not just content in graphics, but was also problematic in compute type scenarios. Large data sets proved to be troublesome, and required a memory virtualization system which was separate from the CPU's virtual memory. Adopting x86-64 virtual memory support on the GPU gets rid of a lot of the problems in previous cards. The GPU shares the virtual memory space, which improves data handling and locality, as well as gracefully surviving unhappy things like page faults and oversubscription. This again is aimed at helping to improve the programming model. With virtual memory, the GPU's state is not hidden, and it should also allow for fast context switches as well as context switch pre-emption. State changes and context switches can be quite costly, so when working in an environment that features both graphics based and compute workloads, the added features described above should make things go a whole lot smoother, as well as be significantly faster, thereby limiting the amount of downtime per CU.


It also opens up some new advantages to traditional graphics. “Megatextures” which will not fit on a card’s frame buffer can be stored in virtual memory. While not as fast as onboard, it is still far faster than loading up the texture from the hard drive. This should allow for more seamless worlds. I’m sure John Carmack is quite excited about this technology.

Obviously the 32 CUs on Tahiti make up the majority of the architecture but there are several other keys to look at.  Just as we saw with Cayman, Southern Islands will offer dual geometry engines for improved scalability as well as eight render back-ends with 32 ROP engines.


The memory interface gets a nice boost moving from the 256-bit GDDR5 interface on Cayman to a 384-bit interface capable of 264 GB/sec of bandwidth.  There are six individual 64-bit dual-channel memory controllers that create an unbalanced render back-end ratio of 4:6. 
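The 264 GB/sec figure falls straight out of the bus width and the memory data rate; a quick sketch, assuming the HD 7970's reference 5.5 Gbps effective GDDR5 data rate (1375 MHz base clock), which the text does not state:

```python
# GDDR5 memory bandwidth check for the 384-bit interface quoted above.
BUS_WIDTH_BITS = 384
DATA_RATE_GBPS = 5.5   # effective GDDR5 data rate per pin (assumed reference spec)

bandwidth_gb_s = BUS_WIDTH_BITS / 8 * DATA_RATE_GBPS   # bytes per transfer * rate
print(bandwidth_gb_s)
```

A 48-byte-wide bus at 5.5 GT/s gives exactly the 264 GB/sec AMD quotes.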


And of course, the architecture rounds out with the controllers for the PCI Express 3.0 interface capable of 8 GigaTransfers/second (essentially double the bandwidth of PCIe 2.0), Eyefinity display controllers, the UVD engine, the CrossFire compositor, etc.  All of this adds up to an amazing 4.31 billion transistors on a 28nm process technology inside a 365mm² die.
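The "essentially double" claim deserves a quick unpacking: 8 GT/s is only 1.6x PCIe 2.0's 5 GT/s, but PCIe 3.0 also swaps the old 8b/10b encoding for the far leaner 128b/130b scheme, which is what brings usable bandwidth to roughly twice that of PCIe 2.0. A sketch of the per-generation math for an x16 link:

```python
# Effective bandwidth of a PCIe x16 link, per generation.
def lane_gbytes_per_s(gt_per_s, payload_bits, total_bits):
    # transfers/sec * encoding efficiency, divided by 8 bits/byte -> GB/s per lane
    return gt_per_s * payload_bits / total_bits / 8

pcie2 = lane_gbytes_per_s(5, 8, 10) * 16     # 8b/10b encoding: 8.0 GB/s
pcie3 = lane_gbytes_per_s(8, 128, 130) * 16  # 128b/130b encoding: ~15.75 GB/s
print(round(pcie2, 2), round(pcie3, 2))
```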

You might be wondering why we don't have the typical die shot of the lovely new 28nm Tahiti GPU.  In truth, I have no answer for you, other than we asked several times and were told in each instance that they didn't have one.  While this did arouse some suspicion from us as to the design of Tahiti, AMD assured us there were no tricks up its sleeve; the die was not, for example, built with 36 CUs with four of them disabled.

December 21, 2011 | 09:15 PM - Posted by wargames12

If it's faster than the 580, it kind of makes sense that it costs a bit more. It's pretty disappointing to see the prices of current gen cards stay so high for so long though. Hopefully when we see Nvidia's new card we'll see some price drops all around on the current generation.

December 21, 2011 | 09:40 PM - Posted by Mr_Tea (not verified)

My thoughts exactly. I was hoping to see AMD drive the performance/dollar up with this release. At this rate single GPUs will be launching at $1000 in 2 years :(

December 21, 2011 | 09:47 PM - Posted by wargames12

I want to blame the rich guys who will actually pay the extra 2-300 dollars for the extra 10-20 percent performance, but I can't. I would do the same if I had the extra cash haha.

December 26, 2011 | 07:49 AM - Posted by Anonymous (not verified)

That is pretty funny! You obviously don't remember how much the 8800GTX and Ultra cost when they were launched. I paid $950.00 for my first 8800GTX. With the price drops, just go CrossFire; it keeps getting better and better, as well as cheaper.

January 3, 2014 | 08:21 AM - Posted by Anonymous (not verified)

Ha, turns out you were right.

December 21, 2011 | 09:57 PM - Posted by Ryan Shrout

I'm glad that point came across well in my review. I love the performance out of this, but I guess I just expected/wanted AMD to undercut NVIDIA to put pressure on them and start the price wars again.

There is still a chance that NVIDIA cuts the GTX 580 down to $425 or something - they have a lot of room with the GTX 570 priced at $340. If they do that, then AMD will have to drop the 7970 price.

December 21, 2011 | 09:51 PM - Posted by Buyers

Good solid review. I think I was hoping for a little more of a performance increase over the 580. I look forward to seeing Eyefinity benchmarks with this card in a CrossFire setup.

Couple of edits:
Page 3, talking about DDMA: Multi-tasking, under the image of triple screen gaming with soccer on the background TV: "With the Radeon HD 7970 it is not possible to both play a game and..." Should that be "now", given the way the rest of the sentence and paragraph read?

Metro 2033 @ 1920x1080 line graph X-axis labels are the default Series1/2/3/4 instead of the gpu name labels.

December 21, 2011 | 09:56 PM - Posted by Tim Verry

Yes, it is now possible.

December 21, 2011 | 09:58 PM - Posted by Ryan Shrout

Yeah, thanks for that typo - kind of a big difference. :)

And also, yes, we are looking forward to doing both Eyefinity and CrossFire testing very soon!

Let me know if there are particular titles you want to see tested!

December 23, 2011 | 03:01 PM - Posted by Nacelle (not verified)

I'm sure it goes without saying BF3 in Eyefinity is what everyone wants. Not much else brings two 6970's to a crawl.

December 21, 2011 | 09:57 PM - Posted by Slash3 (not verified)

Would it be possible to disable vsync on Skyrim and re-run your tests? There are several methods which successfully disable it, including editing the .cfg file (iPresentInterval=0) or using a utility like Radpro to force vsync disable on the process. As it stands, that particular game's set of benchmarks is totally useless. Nice card, though.

December 21, 2011 | 10:00 PM - Posted by Ryan Shrout

Based on my research and testing, I wasn't able to find a way to disable Vsync with AMD cards without modding the game, which seems less than ideal.

If you have a direct link to a solution, though, I'll gladly try it!

December 21, 2011 | 10:10 PM - Posted by Kennneth (not verified)

Who are you going to believe?

http://nl.hardware.info/reviews/2472/32/amd-radeon-hd-7970-review-dirt-3...

December 21, 2011 | 10:13 PM - Posted by Mr_Tea (not verified)

I saw a slide that showed this card doing separate audio out to each display and intelligently switching if a video was moved to another display. Any truth to that? Testing? That would be pretty awesome. Thanks.

December 21, 2011 | 10:19 PM - Posted by Mr_Tea (not verified)

Whoops, must have skipped that page.


December 21, 2011 | 10:25 PM - Posted by RashyNuke (not verified)

Ryan what is with porn music...Oh found my 7970 wetspot.

December 21, 2011 | 10:31 PM - Posted by jstnomega (not verified)

given the current state of the art re Vid games, isn't it all still a matter of pushing pixels? if that's the case, then clearly something aint right here - look at the 40nm vs 28nm Pixel Fillrate figures in Ryan's video - barely any gain at all

December 21, 2011 | 10:46 PM - Posted by Ryan Shrout

Pixel fill rate is not really the defining factor right now. It is not how many pixels you can push, it's what you calculate on those pixels in real-time. Shading power! Oh, and geometry is picking up again in importance.

December 21, 2011 | 11:25 PM - Posted by Anonymous (not verified)

Great review, it's nice to see someone actually moving the technology along, and innovating. Seamlessly it seems, this time.

December 22, 2011 | 07:26 AM - Posted by Ryan Shrout

Agreed. More than a year with the HD 6970 and GTX 580 is enough for me.

December 22, 2011 | 12:05 AM - Posted by bjv2370

good review

December 22, 2011 | 01:52 AM - Posted by Irishgamer01

This card is way over priced.
I for one will be sticking to my current setups. For now.
The sweet spot for this level of card is $399.
While performance is better its not enough for me.
I want to see Nvidia's offering. If they hold their current pricing structure, match or better performance, then AMD will be punished big time.

Will give me a certain amount of pleasure, as I hate price milking, just because they can.

December 22, 2011 | 07:27 AM - Posted by Ryan Shrout

Yes, what NVIDIA has to say about GPUs in the next few months will be very important here. Curious to see if Kepler will hold up to its promises.

December 22, 2011 | 03:36 AM - Posted by Metwinge (not verified)

I shall be replacing my 2 5870's with one of these as I'm very impressed by these benchmark scores, especially in BF3 as that's the game I'm playing atm. I have 2 1080p monitors sitting under our tree so I can see these cards taking a bashing with the few high end games I play.

Thanks for the excellent review Ryan

December 22, 2011 | 07:27 AM - Posted by Ryan Shrout

Thanks!

And have fun with Eyefinity!

December 22, 2011 | 05:37 AM - Posted by Kevin (not verified)

Great review, but I can't see spending upwards of $600 for it =/

December 22, 2011 | 05:44 AM - Posted by Apostrophe (not verified)

It is a lovely video card - shame the price is a bit high. I do hope Nvidia responds with something equally impressive. 2012 is going to be interesting.

By the way, have you guys considered adding Star Wars: The Old Republic to your battery of tests? It's the biggest MMO to launch in years and I would expect that a large portion of your readers will be interested.

December 22, 2011 | 07:28 AM - Posted by Ryan Shrout

We did consider it, but didn't have time to validate it before this article. We still might, depending on how GPU-bound the game is, if at all.

December 22, 2011 | 05:56 AM - Posted by nabokovfan87

Extremely glad the GCN architecture isn't as terrible as Bulldozer ended up. I went to the Skyrim page first and was waiting for a huge disappointment, but yeah.

I am intrigued to see how my box will handle the new card. I have an MSI 4850 512mb and am upgrading to the next card made by the 48xx series team. Sort of amazing to think the difference in performance is going to be 5-10x when comparing raw specifications alone.

If anyone is interested I will be doing some benchmarks and testing things out. If PCPER wants to use those for a writeup or discuss it, I would be more than happy to provide it and listen to your thoughts.

CANNOT WAIT!

ALSO: There are a lot of people waiting to upgrade, waiting for the "new stuff" from either to decide on what to do; that is why the pricing is so high right away. Like I said above, I am sure many others will be grabbing the 4950. It is more about upgrading than it is price. These cards last 3-4 years, and the price initially is worth it for the upgrade possibilities and expanded feature set compared to what is currently available in terms of power usage, temperature, performance, and so on.
