In our previous article and video, I introduced you to our upcoming testing methodology for evaluating graphics cards based not only on frame rates but on frame smoothness and the efficiency of those frame rates. I showed off some of the new hardware we are using for this process and detailed how direct capture of graphics card output allows us to find interesting frame and animation anomalies using some Photoshop still frames.
Today we are taking that a step further and looking at a couple of captured videos that demonstrate a "stutter" and walking you through, frame by frame, how we can detect, visualize and even start to measure them.
This video takes a couple of examples of stutter in games, DiRT 3 and Dishonored to be exact, and shows what they look like in real time, at 25% speed and then finally in a much more detailed frame-by-frame analysis.
Obviously these are just a couple of instances of what a stutter is, and there are often less apparent in-game stutters that are even harder to see in video playback. Not to worry - this capture method is capable of seeing those issues too, and we plan on diving into the "micro" level shortly.
We aren't going to start talking about whose card and what driver is being used yet and I know that there are still a lot of questions to be answered on this topic. You will be hearing more quite soon from us and I thank you all for your comments, critiques and support.
Let me know below what you thought of this video and any questions that you might have.
A change is coming in 2013
If the new year will bring us anything, it looks like it might be the end of using "FPS" as the primary measuring tool for graphics performance on PCs. A long, long time ago we started with simple "time demos" that recorded rendered frames in a game like Quake and then played them back as quickly as possible on a test system. The lone result was given as time, in seconds, and was then converted to an average frame rate having known the total number of frames recorded to start with.
More recently we saw a transition to frame rates over time and the advent of frame time graphs like the ones we have been using in our graphics reviews on PC Perspective. This expanded the amount of data required to get an accurate picture of graphics and gaming performance but it was indeed more accurate, giving us a clearer image of how GPUs (and CPUs and systems for that matter) performed in games.
And even though the idea of frame times has been around just as long, not many people were interested in getting into that level of detail until this past year. A frame time is the amount of time each frame takes to render, usually listed in milliseconds, and could range from 5ms to 50ms depending on performance. For reference, 120 FPS equates to an average of 8.3ms, 60 FPS is 16.7ms and 30 FPS is 33.3ms. But rather than average those out by each second of time, what if you looked at each frame individually?
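To make the difference between an averaged frame rate and individual frame times concrete, here is a minimal Python sketch. The helper names are hypothetical and the list of frame times is made-up sample data, not a measurement from any card; it only illustrates how a per-second average can look fine while hiding one long frame.

```python
# Hypothetical helpers; the sample frame times below are invented for illustration.

def fps_to_frame_time_ms(fps: float) -> float:
    """Average time per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def average_fps(frame_times_ms: list[float]) -> float:
    """Average frame rate over a run: frames rendered divided by elapsed seconds."""
    total_seconds = sum(frame_times_ms) / 1000.0
    return len(frame_times_ms) / total_seconds


for fps in (120, 60, 30):
    print(f"{fps} FPS -> {fps_to_frame_time_ms(fps):.1f} ms per frame")

# Illustrative only: 59 smooth frames plus one long 50 ms frame.
sample = [16.1] * 59 + [50.0]
print(f"average: {average_fps(sample):.1f} FPS, worst frame: {max(sample):.1f} ms")
```

In this invented sample the average still works out to roughly 60 FPS, even though one frame took about three times as long as its neighbors, which is the kind of event a per-frame view exposes.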
Scott over at Tech Report started doing that this past year and found some interesting results. I encourage all of our readers to follow up on what he has been doing as I think you'll find it incredibly educational and interesting.
Through emails and tweets many PC Perspective readers have been asking for our take on it, why we weren't testing graphics cards in the same fashion yet, etc. I've stayed quiet about it simply because we were working on quite a few different angles on our side and I wasn't ready to share results. I am still not ready to share the glut of our information yet but I am ready to start the discussion and I hope our community finds it compelling and offers some feedback.
At the heart of our unique GPU testing method is this card, a high-end dual-link DVI capture card capable of handling 2560x1600 resolutions at 60 Hz. Essentially this card will act as a monitor to our GPU test bed and allow us to capture the actual display output that reaches the gamer's eyes. This method is the best possible way to measure frame rates, frame times, stutter, runts, smoothness, and any other graphics-related metrics.
Using that recorded footage, sometimes reaching 400 MB/s of consistent writes at high resolutions, we can then analyze the frames one by one, though with the help of some additional software. There are a lot of details that I am glossing over including the need for perfectly synced frame rates, having absolutely zero dropped frames in the recording, analyzing, etc, but trust me when I say we have been spending a lot of time on this.
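For a rough sense of why sustained write speed matters here, the sketch below estimates the raw data rate of a captured video signal. It assumes uncompressed 24-bit color (3 bytes per pixel); that is my assumption for illustration, not a description of the capture card's actual recording format, so these numbers will not necessarily line up with the write speeds we see in practice. The function name is hypothetical.

```python
def uncompressed_rate_mb_s(width: int, height: int, refresh_hz: int,
                           bytes_per_pixel: int = 3) -> float:
    """Raw video data rate in MB/s (1 MB = 10**6 bytes), with no compression.

    bytes_per_pixel=3 assumes 24-bit color, which is an illustrative assumption.
    """
    return width * height * bytes_per_pixel * refresh_hz / 1e6


for w, h in ((1920, 1080), (2560, 1600)):
    print(f"{w}x{h} @ 60 Hz: {uncompressed_rate_mb_s(w, h, 60):.0f} MB/s")
```

Whatever the real on-disk format is, the point stands: the storage has to keep up with every frame, since a single dropped frame would undermine the analysis.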
A ton of technology in here
In the world of graphics cards there are a lot of also-rans, cards that were released but didn't really leave a mark on the industry. Reference cards are a dime a dozen (not really, though $0.10 HD 7970s sounds like a great thing to me) and when the only thing vendors can compete on is price it is very hard to make a compelling argument for one card over another. The ASUS Matrix Platinum HD 7970 that we are looking at today in no way suffers from these problems - it is a custom design with unique features that really give it the ability to stand out from the clogged quarters of the GPU shelf.
As you should expect by now with the ASUS ROG brand, the Matrix HD 7970 not only has a slightly overclocked clock speed on the GPU and memory but also some unique features like VGA Hotwire, TweakIt buttons and more!
ASUS ROG Matrix Design
Before we dive into performance and our experiences in overclocking the HD 7970 with the ASUS Matrix Platinum we wanted to go over some of the design highlights that make this graphics card unique. Available in both Matrix and Matrix Platinum (hand picked chips) revisions, this triple-slot design will include a custom built PCB with 20-phase power and quite a bit more.
This "exploded" view of the Matrix HD 7970 shows a high-level view of these features with details to follow below. Some of the features are really aimed at the extreme overclockers that like to get their hands into some LN2 but there is still a lot to offer users that just want to try their hand at getting additional performance through air overclocking.
ASUS' custom DirectCU II cooler is at work on the ASUS Matrix HD 7970 using all copper heatpipes to help lower temps by 20% compared to the reference HD 7970 while also running quieter thanks to the larger 100mm fans. These fans can be independently controlled and include the ASUS dust proof fan technology we have seen previously.
We go inside the Wii U
Last night after the midnight release of the new Nintendo Wii U gaming console, we did what any self respecting hardware fan would do: we tore it apart. That's right, while live on our PC Perspective Live! page, we opened up a pair of Wii U consoles, played a couple of games on the Deluxe while we took a tri-wing screwdriver to the second. Inside we found some interesting hardware (and a lot more screws) and at the conclusion of the 5+ hour marathon, we had a reassembled system with only a handful of leftover screws!
If you missed the show last night we have archived the entire video on our YouTube channel (embedded below) as well as the photos we took during the event in their full resolution glory. There isn't much to discuss about the teardown other than what we said in the video but I am going to leave a few comments after each set of four images.
OH! And if you missed the live event and want to be a part of another one, we are going to be holding a Hitman: Absolution Game Stream on our Live Page sponsored by AMD with giveaways like Radeon graphics cards and LOTS of game keys! Stop by again and see us on http://pcper.com/live on Tuesday the 20th at 8pm ET.
During the stream we promised photos of everything we did while taking it apart, so here you go! Click to get the full size image!
Getting inside the Wii U was surprisingly easy as the white squares over the screws were simply stickers and we didn't have to worry about any clips breaking, etc. The inside is dominated by the optical drive provided by Panasonic.
A curious new driver from AMD
In case you missed the news, AMD is going to be making a big push with their Radeon brand from now until the end of the year starting with an incredibly strong game bundle that includes as many as three full games and 20% off the new Medal of Honor. The second part of this campaign is a new driver, specifically the 12.11 beta, that will be posted to the public later this week.
AMD is claiming to have made some substantial improvements on quite a few games including the very popular Battlefield 3 and the upcoming Medal of Honor (both of which use the same base engine). But keep in mind that 15% is a LOT and this is the best case scenario in specific maps and you may not see benefits on others.
There are going to be some debates about the validity of these performance boosts from AMD until we can get some more specific details on WHAT has changed. Essentially the company line is that they have finally "caught up" to the GCN GPU architecture introduced with the Radeon HD 7970 in January of 2012. We traditionally see this happen with new GPU architectures from both vendors but for it to have taken this long is troublesome and will surely cause some raised eyebrows from gamers and the competition.
We decided to run the Radeon HD 7870 GHz Edition through this new 12.11 beta driver to compare it to the 12.9 beta driver we had just finished testing a few weeks ago. AMD claims performance advantages for all the GCN cards including the 7700/7800/7900 cards though we only had time to test a single card for our initial article. The results are on the following pages...
Another GK106 Completes the Stack
It has been an interesting year for graphics cards and 2012 still has another solid quarter of releases ahead of it. With the launch of AMD's 7000-series back in January, followed by the start of NVIDIA's Kepler lineup in March, we have had new graphics cards on a very regular basis ever since. And while AMD's Radeon HD 7000 cards seemed to be bunched together a bit better, NVIDIA has staggered the release of the various Kepler cards, either because of capacity at the manufacturing facilities or due to product marketing plans - take your pick.
Today we see the completion of the NVIDIA GeForce GTX 600 stack (if you believe the PR at NVIDIA) with the release of the GeForce GTX 650 Ti, a $150 graphics card that fills in the gap between the somewhat anemic GTX 650 and GT 640 cards and the most recently unveiled card, the GTX 660 2GB that currently sells for $229.
The GTX 650 Ti has more in common with the GTX 660 than it does the GTX 650, both being based on the GK106 GPU, but is missing some of the unique features that NVIDIA has touted for the 600-series cards like GPU Boost. Let's dive into the product and see if this new card will be the best option for those of you with $150 graphics budgets.
PhysX Settings Comparison
Borderlands 2 is a hell of a game; we actually ran a 4+ hour live event on launch day to celebrate its release and played it after our podcast that week as well. When big PC releases occur we usually like to take a look at performance of the game on a few graphics cards as well to see how NVIDIA and AMD cards stack up. Interestingly, for this title, PhysX technology was brought up again and NVIDIA was widely pushing it as a great example of implementation of the GPU-accelerated physics engine.
What you may find unique in Borderlands 2 is that the game actually allows you to enable PhysX features at Low, Medium and High settings, with either NVIDIA or AMD Radeon graphics cards installed in your system. In past titles, like Batman: Arkham City and Mafia II, PhysX could only be enabled (or at least enabled at higher settings) if you had an NVIDIA card. Many gamers that used AMD cards saw this as a slight and we tended to agree. But since we could enable it with a Radeon card installed, we were curious to see what the results would be.
Of course, don't expect the PhysX effects to be able to utilize the Radeon GPU for acceleration...
Borderlands 2 PhysX Settings Comparison
The first thing we wanted to learn was just how much difference you would see by moving from Low (the lowest setting, there is no "off") to Medium and then to High. The effects were identical on both AMD and NVIDIA cards and we made a short video here to demonstrate the changes in settings.
GK106 Completes the Circle
The release of the various Kepler-based graphics cards has been interesting to watch from the outside. Though NVIDIA certainly spiced things up with the release of the GeForce GTX 680 2GB card back in March, and then with the dual-GPU GTX 690 4GB graphics card, for quite some time NVIDIA was content to leave the sub-$400 markets to AMD's Radeon HD 7000 cards, and of course to NVIDIA's own GTX 500-series.
But gamers and enthusiasts are fickle beings - knowing that the GTX 660 was always JUST around the corner, many of you were simply not willing to buy into the GTX 560s floating around Newegg and other online retailers. AMD benefited greatly from this lack of competition and only recently has NVIDIA started to bring their latest generation of cards to the price points MOST gamers are truly interested in.
Today we are going to take a look at the brand new GeForce GTX 660, a graphics card with 2GB of frame buffer that will have a starting MSRP of $229. Coming in $70 under the GTX 660 Ti card released just last month, does the more vanilla GTX 660 have what it takes to replace the success of the GTX 460?
The GK106 GPU and GeForce GTX 660 2GB
NVIDIA's GK104 GPU is used in the GeForce GTX 690, GTX 680, GTX 670 and even the GTX 660 Ti. We saw the much smaller GK107 GPU with the GT 640 card, a release I was not impressed with at all. With the GTX 660 Ti starting at $299 and the GT 640 at $120, there was a WIDE gap in NVIDIA's 600-series lineup that the GTX 660 addresses with an entirely new GPU, the GK106.
First, let's take a quick look at the reference card from NVIDIA for the GeForce GTX 660 2GB - it doesn't differ much from the reference cards for the GTX 660 Ti and even the GTX 670.
The GeForce GTX 660 uses the same half-length PCB that we saw for the first time with the GTX 670 and this will allow retail partners a lot of flexibility with their card designs.
Multiple Contenders - EVGA SC
One of the most anticipated graphics card releases of the year occurred this month in the form of the GeForce GTX 660 Ti from NVIDIA, and as you would expect we were there on day one with an in-depth review of the card at reference speeds.
The GeForce GTX 660 Ti is based on GK104, and what you might find interesting is that it is nearly identical to the specifications of the GTX 670. Both utilize 7 SMX units for a total of 1344 stream processors – or CUDA cores – and both run at a reference clock speed of 915 MHz base and 980 MHz Boost. Both include 112 texture units though the GeForce GTX 660 Ti does see a drop in ROP count from 32 to 24. Also, L2 cache drops from 512KB to 384KB along with a memory bus width drop from 256-bit to 192-bit.
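The spec comparison above can be checked with a few lines of arithmetic. This is only a sketch: the figure of 192 CUDA cores per Kepler SMX is background knowledge rather than something stated in the text, and the dictionary layout is my own.

```python
CORES_PER_SMX = 192  # Kepler SMX width; background assumption, not from the text above

gtx_670 = {"smx": 7, "rops": 32, "l2_kb": 512, "bus_bits": 256}
gtx_660_ti = {"smx": 7, "rops": 24, "l2_kb": 384, "bus_bits": 192}

for name, card in (("GTX 670", gtx_670), ("GTX 660 Ti", gtx_660_ti)):
    print(f"{name}: {card['smx'] * CORES_PER_SMX} CUDA cores")

# Ratio of each back-end resource on the GTX 660 Ti relative to the GTX 670.
ratios = {key: gtx_660_ti[key] / gtx_670[key] for key in ("rops", "l2_kb", "bus_bits")}
print(ratios)
```

All three back-end figures land at exactly three quarters of the GTX 670's, which is what you would expect if one of four equal ROP / L2 / memory controller partitions were disabled while the shader side is left alone.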
We already spent quite a lot of time talking about the GTX 660 Ti compared to the other NVIDIA and AMD GPUs in the market in our review (linked above) as well as on our most recent episode of the PC Perspective Podcast. Today's story is all about the retail cards we received from various vendors including EVGA, Galaxy, MSI and Zotac. We are going to show you each card's design, the higher clocked settings that were implemented, performance differences between them and finally the overclocking comparisons of all four.
Another GK104 Option for $299
If you missed our live stream with PC Perspective's Ryan Shrout and NVIDIA's Tom Petersen discussing the new GeForce GTX 660 Ti you can find the replay at this link!!
While NVIDIA doesn't like us to use the codenames anymore, very few GPUs are as flexible and as stout as the GK104. Originally released with the GTX 680 and then with the dual-GPU beast known as the GTX 690 and THEN with the more modestly priced GTX 670, this single chip has caused AMD quite a few headaches. It appears things will only be worse with the release of the new GeForce GTX 660 Ti today, once again powered by GK104 and the Kepler architecture at the $299 price point.
While many PC gamers lament the lack of games that really push hardware today, NVIDIA has been promoting the GTX 660 Ti as the upgrade option of choice for gamers on a 2-4 year cycle. Back in 2008 the GTX 260 was the mid-range enthusiast option while in 2010 it was the GTX 470 based on Fermi. NVIDIA claims GTX 260 users will see more than 3x the performance with the 660 Ti, all while generating those pixels more efficiently.
I mentioned that the GeForce GTX 660 Ti was based on GK104 and what you might find interesting is that it is nearly identical to the specifications of the GTX 670. Both utilize 7 SMXs for a total of 1344 stream processors or CUDA cores and both run at a reference clock speed of 915 MHz base and 980 MHz Boost. Both include 112 texture units though the GeForce GTX 660 Ti does see a drop in ROP count from 32 to 24 and L2 cache drops from 512KB to 384KB. Why?
Spicing up the GTX 670
The Power Edition graphics card series from MSI is a relatively new addition to its lineup. The Power Edition often mimics the higher-end Lightning series, but at a far lower price (and perhaps with a smaller feature set). This allows MSI to split the difference between the reference class boards and the high end Lightning GPUs.
Doing this gives users a greater variety of products to choose from, and lets them better tailor their purchases to their needs and financial means. Not everyone wants to pay $600 for a GTX 680 Lightning, but what if someone was able to get similar cooling, quality, and overclocking potential for a much lower price? This is what MSI has done with one of its latest Power Edition cards.
The GTX 670 Power Edition
The NVIDIA GTX 670 cards have received accolades throughout the review press. It is a great combination of performance, power consumption, heat production, and price. It certainly caused AMD a great amount of alarm, and it hurriedly cut prices on the HD 7900 series of cards in response. The GTX 670 is a slightly cut-down version of the full GTX 680, and it runs very close to the clock speed of its bigger brother. In fact, other than texture and stream unit count, the cards are nearly identical.
Overclocked and 4GB Strong
Even though the Kepler GK104 GPU is now matured in the market, there is still a ton of life left in this not-so-small chip and Galaxy sent us a new graphics card to demonstrate just that. The Galaxy GeForce GTX 670 GC 4GB card that we are reviewing today takes the GTX 670 GPU (originally released and reviewed on May 10th) and juices it up on two different fronts: clock rates and memory capacity.
The Galaxy GTX 670 GC 4GB graphics card is based on GK104 as mentioned above and meets most of the same specifications as the reference GTX 670. That includes 1344 CUDA cores or stream processors, 112 texture units and 32 ROP units along with a 256-bit GDDR5 memory bus.
The GC title indicates that the Galaxy GTX 670 GC 4GB is overclocked as well - this card runs at 1006 MHz base clock, 1085 MHz Boost clock and 1500 MHz memory clock. Compared to the defaults of 915 MHz, 980 MHz and 1500 MHz (respectively) this Galaxy model gets a 10% increase in clock speed though we'll see how much that translates into gaming performance as we go through our review.
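As a quick check on that percentage, here is a small sketch using the clock speeds quoted above. The function name is hypothetical; only the MHz values come from the text.

```python
def percent_increase(reference_mhz: float, overclocked_mhz: float) -> float:
    """Percentage gain of an overclocked frequency over its reference value."""
    return (overclocked_mhz / reference_mhz - 1.0) * 100.0


# (reference, Galaxy GC) clock pairs in MHz, as quoted in the text.
clocks = {"base": (915, 1006), "boost": (980, 1085), "memory": (1500, 1500)}
for name, (ref, oc) in clocks.items():
    print(f"{name}: {ref} -> {oc} MHz ({percent_increase(ref, oc):+.1f}%)")
```

The GPU clocks come out at just under and just over 10% respectively, while the memory clock is unchanged, so any gains will have to come from the GPU side and the larger frame buffer.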
Of course, also in the title of the review, the Galaxy GTX 670 GC includes 4GB of frame buffer, twice as much as the reference cards. The goal is obviously to attract gamers with high resolution screens (2560x1600 or 2560x1440) as well as users interested in triple panel NVIDIA Surround gaming. We test both of those resolutions in our game collection on the following pages to see just how that works out.
7950 gets a quick refresh
Back in June, AMD released (or at least announced) an update to the Radeon HD 7970 3GB card called the GHz Edition. Besides the higher clock speeds, the card was the first AMD offering to include PowerTune with Boost–a dynamic clock scaling capability that allowed the GPU to increase clock speeds when power and temperature allowed.
While similar in ideology to the GPU Boost that NVIDIA invented with the GTX 680 Kepler launch, AMD's Boost is completely predictable and repeatable. Everyone's HD 7970 GHz Edition performs exactly the same regardless of your system or environment.
Here is some commentary that I had on the technology back in June that remains unchanged:
AMD's PowerTune with Boost technology differs from NVIDIA's GPU Boost in a couple of important ways. First, much to its original premise, AMD can guarantee exactly how all Radeon HD 7970 3GB GHz Edition graphics cards will operate, and at what speeds in any given environment. There should be no variability between the card that I get and the card that you can buy online. Using digital temperature estimation in conjunction with voltage control, the PowerTune implementation of boost is completely deterministic.
As the above diagram illustrates, the "new" part of PowerTune with the GHz Edition is the ability to vary the voltage of the GPU in real-time to address a wider range of qualified clock speeds. On the previous HD 7970s the voltage was a locked static voltage in its performance mode, meaning that it would not increase or decrease during load operations. As AMD stated to us in a conversation just prior to launch, "by having multiple voltages that can be invoked, we can be at a more optimal clock/voltage combination more of the time, and deliver higher average performance."
The problem I have with AMD's boost technology is that they are obviously implementing this as a reaction to NVIDIA's technology. That isn't necessarily a bad thing, but the tech feels a little premature because of it. We were provided no tools prior to launch to actually monitor the exact clock speed of the GPU in real-time. The ability to monitor these very small changes in clock speed is paramount to our ability to verify the company's claims, and without it we will have questions about the validity of results. GPU-Z and other applications we usually use to monitor clock speeds (including AMD's driver) only report 1050 MHz as the clock speed–no real-time dynamic changes are being reported.
(As a side note, AMD has promised to showcase their internal tool to show real-time clock speed changes in our Live Review at http://pcper.com/live on Friday the 22nd, 11am PDT / 2pm EDT.) [It has since been archived for your viewing pleasure.]
A couple of points to make here: AMD still has not released that tool to show us internal steps of clock speeds, and instead told me today that they were waiting for an updated API to allow other software (including their own CCC) to be able to report the precise results.
Today AMD is letting us preview the new HD 7950 3GB card that will be shipping soon with updated clock speeds and Boost support. The new base clock speed of the HD 7950 will be 850 MHz, compared to the 800 MHz of the original reference HD 7950. The GPU will be able to boost as high as 925 MHz. That should give the new 7950s a solid performance gain over the original with a clock speed increase of as much as 15%.
The HAWK Returns
The $300 to $400 range of video cards has become quite crowded as of late. Think way back to March, when AMD introduced their HD 7800 series of cards and, later that month, NVIDIA released their GTX 680 card. Even though NVIDIA held the price/performance crown, AMD continued to offer their products at prices many considered to be grossly inflated given the competition. Part of this was justified because NVIDIA simply could not meet demand for their latest card, which was often unavailable for purchase at MSRP. Eventually AMD started cutting back prices, but this led to another issue. The HD 7950 was approaching the price of the HD 7870 GHz Edition. The difference in prices between these products was around $20, but the 7950 was around 20% faster than the 7870. This made the HD 7870 (and the slightly higher priced overclocked models) a very unattractive option for users.
It seems as though AMD and their partners have finally rectified this situation, and just in time. With NVIDIA finally being able to adequately provide stock for both the GTX 680 and GTX 670, the prices on the upper-midrange cards have taken a nice drop to where we feel they should be. We are now starting to see some very interesting products based on the HD 7850 and HD 7870 cards, one of which we are looking at today.
The MSI R7870 HAWK
The R7870 Hawk utilizes the AMD HD 7870 GPU. This chip has a reference speed of 1 GHz, but with the Hawk it is increased to a full 1100 MHz. The GPU has all 20 compute units enabled, featuring 1280 stream processors. It has a 256-bit memory bus running 2GB of GDDR5 memory at 1200 MHz, which gives a total bandwidth of 153.6 GB/sec. I am somewhat disappointed that MSI did not give the memory speed a boost, but at least the user can enable that for themselves through the Afterburner software.
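For anyone who wants to work through the memory and shader numbers, here is a minimal sketch. It assumes GDDR5 moves four bits per pin per memory clock and that each GCN compute unit contains 64 stream processors; both are background assumptions rather than figures from this article, and the function name is hypothetical.

```python
STREAM_PROCESSORS_PER_CU = 64  # GCN compute unit width; background assumption


def gddr5_bandwidth_gb_s(bus_width_bits: int, memory_clock_mhz: float,
                         transfers_per_clock: int = 4) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock=4 reflects the assumed GDDR5 data rate per pin.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


print(f"stream processors: {20 * STREAM_PROCESSORS_PER_CU}")
print(f"peak bandwidth: {gddr5_bandwidth_gb_s(256, 1200):.1f} GB/s")
```

The same function makes it easy to see what a memory overclock through Afterburner would buy you: plug in a higher memory clock and the bandwidth scales linearly with it.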
A new SKU for a new battle
On launch day we hosted AMD's Evan Groenke for an in-studio live interview and discussion about the Radeon HD 7970 GHz Edition. For the on-demand version of that event, check it out right here. Enjoy!
AMD has had a good run in the discrete graphics market for quite some time. With the Radeon HD 5000 series, the company was able to take a commanding mindshare (if not marketshare) lead from NVIDIA. While that diminished some with the HD 6000 series going up against NVIDIA's GTX 500 family, the release of the HD 7970 and HD 7950 just before the end of 2011 stepped it up again. AMD was the first to market with a 28nm GPU, the first to support DX11.1, the first with a 3GB frame buffer and the new products were simply much faster than what NVIDIA had at the time.
AMD enjoyed that crowned location on the GPU front all the way until the NVIDIA GeForce GTX 680 launched in March. In a display of technology that most reviewers never thought possible, NVIDIA had a product that was faster, more power efficient and matched or exceeded just about every feature of the AMD Radeon HD 7000 cards. Availability problems plagued NVIDIA for several months (and we are just now seeing the end of the shortage) and even caused us to do nearly-weekly "stock checks" to update readers. Prices on the HD 7900 cards have slowly crept down to find a place where they are relevant in the market, but AMD appears to not really want to take a back seat to NVIDIA again.
While visiting with AMD in Seattle for the Fusion Developer Summit a couple of weeks ago, we were briefed on a new secret: Tahiti 2, or Tahiti XT2 internally, an updated Radeon HD 7970 GPU that was going to be shipping soon with higher clock speeds and a new "boost" technology in order to combat the GTX 680. Even better, this card was going to have a $499 price tag.
The GK107 GPU
What does $399 buy these days?
I think it is pretty safe to say that MSI makes some pretty nice stuff when it comes to video cards. Their previous generation of the HD 6000 and GTX 500 series of cards were quite popular, and we reviewed more than a handful here. That generation of cards really seemed to stake MSI’s reputation as one of the top video card vendors in the industry in terms of quality, features, and cooling innovation. Now we are moving onto a new generation of cards from both AMD and NVIDIA, and the challenges of keeping up MSI’s reputation seem to have increased.
The competition has become much more aggressive as of late. Asus has some unique solutions, and companies such as XFX have stepped up their designs to challenge the best of the industry. MSI has found themselves to be in a much more crowded space with upgraded cooler designs, robust feature sets, and pricing that reflects the larger selection of products that fit such niches. The question here is if MSI’s design methodology for non-reference cards is up to the challenge.
Previously I was able to review the R7970 Lightning from MSI, and it was an impressive card. I had some initial teething problems with that particular model, but a BIOS flash later and some elbow grease allowed it to work as advertised. Today I am looking at the R7950 TwinFrozr3GD5/OC. This card looks to feature a reference PCB combined with a Twin Frozr III cooling solution. I was not entirely sure what to expect with this card, since the Lightning was such a challenge at first.
XFX Throws into the Midrange Ring
Who is this XFX? This is a brand that I have not dealt with in a long time. In fact, the last time I had an XFX card was some five years ago, and it was in the form of the GeForce 8800 GTX XXX Edition. This was a pretty awesome card for the time, and it seemed to last forever in terms of performance and features in the new DX 10 world that was 2007/2008. This was a heavily overclocked card, and it would get really loud during gaming sessions. I can honestly say though that this particular card was trouble-free and well built.
XFX has not always had a great reputation though, and the company has gone through some very interesting twists and turns over the years. XFX is a subsidiary of Pine Technologies. Initially XFX dealt strictly with NVIDIA based products, but a few years back when the graphics market became really tight, NVIDIA dropped several manufacturers and focused their attention on the bigger partners. Among the victims of this tightening were BFG Technologies and XFX. Unlike BFG, XFX was able to negotiate successfully with AMD to transition their product lineup to Radeon products. Since then XFX has been very aggressive in pursuing unique designs based on these AMD products. While previous generation designs did not step far from the reference products, this latest generation is a big step forward for XFX.
When the Fermi architecture was first discussed in September of 2009 at the NVIDIA GPU Technology Conference it marked an interesting turn for the company. Not only was NVIDIA releasing details about a GPU that wasn’t going to be available to consumers for another six months, but it was also building GPUs that were no longer strictly for gaming – HPC and GPGPU were a defining target of all the company’s resources going forward.
Kepler on the other hand seemed to go back in the other direction with a consumer graphics release in March of this year without discussion of the Tesla / Quadro side of the picture. While the company liked to tout that Kepler was built for gamers I think you’ll find that with the information NVIDIA released today, Kepler was still very much designed to be an HPC powerhouse. More than likely NVIDIA’s release schedules were altered by the very successful launch of AMD’s Tahiti graphics cards under the HD 7900 brand. As a result, gamers got access to GK104 before NVIDIA’s flagship professional conference and the announcement of GK110 – a 7.1 billion transistor GPU aimed squarely at parallel computing workloads.
With the Fermi design NVIDIA took a gamble and changed directions with its GPU design, betting that it could develop a microprocessor that was primarily intended for the professional markets while still appealing to the gaming markets that have sustained it for the majority of the company’s existence. While the GTX 480 flagship consumer card, and to some degree the GTX 580, had overheating and efficiency drawbacks for gaming workloads compared to AMD GPUs, the GTX 680 based on Kepler GK104 has improved on them greatly. NVIDIA has still designed Kepler for high-performance computing, this time with a focus on power efficiency as well as performance, but we haven’t seen the true king of this product line until today.
GK110 Die Shot
Built on the 28nm process technology from TSMC, GK110 is an absolutely MASSIVE chip made up of 7.1 billion transistors, and though NVIDIA hasn’t given us a die size, it is likely coming close to the reticle limit of 550 square millimeters. NVIDIA is proud to call this chip the most ‘architecturally complex’ microprocessor ever built and while impressive, it means there is potential for some issues when it comes to producing a chip of this size. This GPU will be able to offer more than 1 TFlop of double precision computing power with greater than 80% efficiency and 3x the performance per watt of Fermi designs.
NVIDIA puts its head in the clouds
Today at the 2012 NVIDIA GPU Technology Conference (GTC), NVIDIA took the wraps off a new cloud gaming technology that promises to reduce latency and improve the quality of streaming gaming using the power of NVIDIA GPUs. Dubbed GeForce GRID, NVIDIA is offering the technology to online services like Gaikai and OTOY.
The goal of GRID is to bring the promise of "console quality" gaming to every device a user has. The term "console quality" is kind of important here as NVIDIA is trying desperately to not upset all the PC gamers that purchase high-margin GeForce products. The goal of GRID is pretty simple though and should be seen as an evolution of the online streaming gaming that we have covered in the past–like OnLive. Being able to play high quality games on your TV, your computer, your tablet or even your phone without the need for high-performance and power hungry graphics processors through streaming services is what many believe the future of gaming is all about.
GRID starts with the Kepler GPU - what NVIDIA is now dubbing the first "cloud GPU" - that has the capability to virtualize graphics processing while being power efficient. The inclusion of a hardware fixed-function video encoder is important as well as it will aid in the process of compressing images that are delivered over the Internet by the streaming gaming service.
This diagram shows us how the Kepler GPU handles and accelerates the processing required for online gaming services. On the server side, the necessary process for an image to find its way to the user is more than just a simple render to a frame buffer. In current cloud gaming scenarios the frame buffer would have to be copied to the main system memory, compressed on the CPU and then sent via the network connection. With NVIDIA's GRID technology that capture and compression happens in GPU memory and thus the image can be on its way to the gamer faster.
The results are H.264 streams that are compressed quickly and efficiently to be sent out over the network and returned to the end user on whatever device they are using.