A year of GeForce drivers reviewed

Subject: Graphics Cards | March 5, 2013 - 02:28 PM |
Tagged: nvidia, geforce, graphics drivers

After evaluating the evolution of AMD's drivers over 2012, [H]ard|OCP has now finalized their look at NVIDIA's offerings over the past year.  They chose a half dozen drivers spanning March to December, tested on both the GTX 680 and GTX 670.  As you can see throughout the review, NVIDIA's performance was mostly stable apart from the final driver of 2012, which provided noticeably improved performance in several games.  [H] also compared the frame rates from both companies on the same charts, which makes the steady improvement of AMD's drivers over the year even more obvious.  Of course, that steady improvement implies that AMD's drivers at the start of 2012 left plenty of room for it, and the driver team at AMD has its work cut out for it in 2013 if it wants to deliver a consistently high level of performance across the board, with game-specific improvements as the only deviations.

H_Geforce.jpg

"We have evaluated AMD and NVIDIA's 2012 video card driver performances separately. Today we will be combining these two evaluations to show each companies full body of work in 2012. We will also be looking at some unique graphs that show how each video cards driver improved or worsened performance in each game throughout the year."


Source: [H]ard|OCP

NVIDIA Details Tegra 4 and Tegra 4i Graphics

Subject: Graphics Cards | February 25, 2013 - 08:01 PM |
Tagged: nvidia, tegra, tegra 4, Tegra 4i, pixel, vertex, PowerVR, mali, adreno, geforce

 

When Tegra 4 was introduced at CES there was precious little information about the setup of the integrated GPU.  We all knew that it would be a much more powerful GPU, but we were not entirely sure how it was set up.  Now NVIDIA has finally released a slew of whitepapers that deal with not only the GPU portion of Tegra 4, but also some of the low-level features of the Cortex-A15 processor.  For this little number I am just going over the graphics portion.

layout.jpg

This robust-looking fellow is the Tegra 4.  Note the four pixel "pipelines" that can output four pixels per clock.

The graphics units in the Tegra 4 and Tegra 4i share the same overall architecture; the 4i simply has fewer units, arranged slightly differently.  Tegra 4 is made up of 72 units, 48 of which are pixel shaders.  These pixel shaders are VLIW-based VEC4 units.  The other 24 units are vertex shaders.  The Tegra 4i is made up of 60 units: 48 pixel shaders and 12 vertex shaders.  We knew at CES that it was not a unified shader design, but we were still unsure of the overall makeup of the part.  There are some very good reasons why NVIDIA went this route, as we will soon explore.

If NVIDIA were to transition to unified shaders, it would increase the overall complexity and power consumption of the part.  Each shader unit would have to be able to handle both vertex and pixel workloads, which requires more transistors.  Simpler shaders focused on either pixel or vertex operations are more efficient at what they do, both in terms of transistors used and power consumption.  This is the same reasoning behind using fixed-function units rather than fully programmable ones.  Yes, programmability gives more flexibility, but the fixed-function unit is again smaller, faster, and more efficient at its workload.

layout_4i.jpg

The Tegra 4i, on the other hand, gives up half of the pixel pipelines and half of the vertex shaders, but keeps all 48 pixel shaders.

If there was one surprise here, it would be that the part is not completely OpenGL ES 3.0 compliant.  It is lacking one major function required for certification: this particular part cannot render at FP32 precision.  It has been quite a few years since we have heard of anything in the PC market that could not do FP32, but skipping it is quite common in the power- and transistor-conscious mobile market.  NVIDIA decided to go with an FP20 partial-precision setup.  They claim that, for all intents and purposes, it will not be noticeable to the human eye; colors will still be rendered properly and artifacts will be few and far between.  Remember back in the day when NVIDIA supported FP16 and FP32 while chastising ATI for choosing FP24 with the Radeon 9700 Pro?  Times have changed a bit.  Going with FP20 is again a power- and transistor-saving decision.  The part still supports DX9.3 and OpenGL ES 2.0, but it is not fully OpenGL ES 3.0 compliant.  This is not to say that it does not support any 3.0 features; it in fact supports quite a bit of the functionality required by 3.0, but it is still not fully compliant.

This will be an interesting decision to watch over the next few years.  The latest Mali 600 series, PowerVR Series6, and Adreno 300 series solutions all support OpenGL ES 3.0; Tegra 4 is the odd man out.  While most developers have no plans to move to ES 3.0 in the near future, software will eventually adopt it.  When that point comes, Tegra 4 based devices will be left a bit behind.  By then NVIDIA will have a fully compliant solution, but that is little comfort for those buying phones and tablets in the near term that will be saddled with non-compliant hardware once ES 3.0 applications arrive.

ogles_feat.jpg

The list of OpenGL ES 3.0 features that are actually present in Tegra 4; the lack of FP32 relegates it to ES 2.0 compliant status.

The core speed is increased to 672 MHz, well up from the 520 MHz in Tegra 3 (which had 8 pixel and 4 vertex shaders).  The GPU can output four pixels per clock, double that of Tegra 3.  Once we consider the extra clock speed and pixel pipelines, Tegra 4 increases pixel fillrate by roughly 2.6x.  Pixel and vertex shading will get a huge boost in performance due to the dramatic increase in units and clock speed.  Overall this is a very significant improvement over the previous generation of parts.
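That 2.6x figure checks out with some quick back-of-the-envelope math using only the clock speeds and pixel rates quoted above (a rough peak-fillrate sketch, not an NVIDIA-provided calculation):

```python
# Rough peak pixel fillrate comparison using only the figures quoted above:
# Tegra 4 pushes 4 pixels/clock at 672 MHz, Tegra 3 pushed 2 pixels/clock at 520 MHz.

def fillrate_mpix(pixels_per_clock: int, clock_mhz: float) -> float:
    """Peak pixel fillrate in megapixels per second."""
    return pixels_per_clock * clock_mhz

tegra3 = fillrate_mpix(2, 520)  # 1040 Mpix/s
tegra4 = fillrate_mpix(4, 672)  # 2688 Mpix/s

print(f"Tegra 3: {tegra3:.0f} Mpix/s, Tegra 4: {tegra4:.0f} Mpix/s")
print(f"Improvement: {tegra4 / tegra3:.2f}x")  # ~2.58x, i.e. roughly the 2.6x quoted
```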

The Tegra 4 can output to a 4K display natively, and that is not the only new feature for this part.  Here is a quick list:

2x/4x Multisample Antialiasing (MSAA)

24-bit Z (versus 20-bit Z in the Tegra 3 processor) and 8-bit Stencil

4K x 4K texture sizes, including non-power-of-two textures (versus 2K x 2K in the Tegra 3 processor) – allows higher quality textures and makes it easier to port full-resolution textures from console and PC games to the Tegra 4 processor.  Good for high-resolution displays.

16:1 Depth (Z) Compression and 4:1 Color Compression (versus none in the Tegra 3 processor) – this is lossless compression and is useful for reducing bandwidth to/from the frame buffer, and especially effective during antialiasing when processing multiple samples per pixel

Depth Textures

Percentage Closer Filtering for Shadow Texture Mapping and Soft Shadows

Texture border color to eliminate coarse MIP-level bleeding

sRGB for Texture Filtering, Render Surfaces and MSAA down-filter

Note: CSAA is no longer supported in Tegra 4 processors

This is a big generational jump, and now we just have to see how it performs against the other top-end parts from Qualcomm, Samsung, and others utilizing IP from Imagination and ARM.

Source: NVIDIA
Manufacturer: PC Perspective

In case you missed it...

UPDATE: We have now published full details on our Frame Rating capture and analysis system as well as an entire host of benchmark results.  Please check it out!!

In one of the last pages of our recent NVIDIA GeForce GTX TITAN graphics card review we included an update on our Frame Rating graphics performance metric that described the testing method in more detail and showed results for the first time.  Because it was buried so far into the article, I thought it was worth posting this information here as a separate article to solicit feedback from readers and help guide the discussion forward without getting lost in the TITAN shuffle.  If you already read that page of our TITAN review, nothing new is included below.

I am still planning a full article based on these results sooner rather than later; for now, please leave me your thoughts, comments, ideas and criticisms in the comments below!


Why are you not testing CrossFire??

If you haven't been following our sequence of stories that investigates a completely new testing methodology we are calling "frame rating", then you are really missing out.  (Part 1 is here, part 2 is here.)  The basic premise of Frame Rating is that the performance metrics that the industry is gathering using FRAPS are inaccurate in many cases and do not properly reflect the real-world gaming experience the user has.

Because of that, we are working on another method that uses high-end dual-link DVI capture equipment to directly record the raw output from the graphics card with an overlay technology that allows us to measure frame rates as they are presented on the screen, not as they are presented to the FRAPS software sub-system.  With these tools we can measure average frame rates, frame times and stutter, all in a way that reflects exactly what the viewer sees from the game.

We aren't ready to show our full sets of results yet (soon!), but the problem is that AMD's CrossFire technology shows severe performance degradation when viewed under the Frame Rating microscope that does not show up nearly as dramatically under FRAPS.  As such, I decided that it was simply irresponsible of me to present data to readers that I would then immediately refute on the final pages of this review (Editor: referencing the GTX TITAN article linked above.) - it would be a waste of time for the reader, and people who skip straight to the performance graphs wouldn't see our theory on why the results displayed were invalid.

Many other sites will use FRAPS, will use CrossFire, and there is nothing wrong with that at all.  They are simply presenting data that they believe to be true based on the tools at their disposal.  More data is always better. 

Here are those results and our discussion.  I decided to use the most popular game out today, Battlefield 3, and please keep in mind this is NOT the worst-case scenario for AMD CrossFire in any way.  I tested the Radeon HD 7970 GHz Edition in single-card and CrossFire configurations as well as the GeForce GTX 680 in single-card and SLI configurations.  To gather results I used two processes:

  1. Run FRAPS while running through a repeatable section and record frame rates and frame times for 60 seconds
  2. Run our Frame Rating capture system with a special overlay that allows us to measure frame rates and frame times in post-processing.

Here is an example of what the overlay looks like in Battlefield 3.

fr_sli_1.jpg

Frame Rating capture on GeForce GTX 680s in SLI - Click to Enlarge

The column on the left is the visual result of an overlay that is applied to each and every frame of the game early in the rendering process.  A solid color is added to each individual frame at the PRESENT call (more details to come later).  As you know, when you are playing a game, multiple frames can make it to the screen during any single 60 Hz refresh cycle of your monitor, and because of that you get a succession of colors on the left-hand side.

By measuring the pixel height of those colored columns, and knowing beforehand the order in which the colors should appear, we can gather the same data that FRAPS does, but our results are seen AFTER any driver optimizations and DX changes the game might make.
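To make that concrete, here is a minimal, hypothetical sketch of the idea (this is not PCPer's actual Frame Rating tool): scan the overlay column of each captured refresh, collapse it into colored runs, and convert each run's height in scanlines into on-screen time using the refresh interval.  The 1080-line capture height, the color names, and the input format are all assumptions for illustration.

```python
# Hypothetical sketch of the analysis step: estimate how long each game frame was
# actually on screen from the colored overlay bands in a captured 60 Hz stream.
# This is an illustration of the idea, not PCPer's actual Frame Rating tool.

REFRESH_INTERVAL_MS = 1000.0 / 60.0   # one 60 Hz scanout lasts ~16.67 ms
SCREEN_HEIGHT = 1080                  # assumed capture height in scanlines

def frame_times_ms(captured_refreshes):
    """captured_refreshes: one list per captured refresh, giving the overlay color
    on every scanline from top to bottom.  Consecutive scanlines with the same
    color (even across refresh boundaries) belong to the same game frame."""
    scanlines = [color for refresh in captured_refreshes for color in refresh]
    runs = []                         # [color, height_in_scanlines] in display order
    for color in scanlines:
        if runs and runs[-1][0] == color:
            runs[-1][1] += 1
        else:
            runs.append([color, 1])
    return [(color, height / SCREEN_HEIGHT * REFRESH_INTERVAL_MS)
            for color, height in runs]

# Example: one refresh split between three frames, one of them a thin sliver.
refresh = ["maroon"] * 1000 + ["silver"] * 8 + ["purple"] * 72
print(frame_times_ms([refresh]))
# [('maroon', ~15.4 ms), ('silver', ~0.12 ms), ('purple', ~1.1 ms)]
```

A band only a few scanlines tall works out to a tiny fraction of a millisecond of visible screen time, which is exactly the sort of "frame" the CrossFire screenshot below illustrates.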

fr_cf_1.jpg

Frame Rating capture on Radeon HD 7970 CrossFire - Click to Enlarge

Here you see a very similar screenshot running on CrossFire.  Notice the thin silver band between the maroon and purple?  That is a complete frame according to FRAPS and most reviews.  Not to us - we think that rendered frame is almost useless. 

Continue reading the third part in our Frame Rating series to see our first performance results!!

Join PCPer and NVIDIA for a GeForce GTX TITAN Live Review!

Subject: Graphics Cards | February 21, 2013 - 01:12 PM |
Tagged: video, titan, nvidia, live review, live, kepler, geforce titan, geforce

Missed the live event?  Here is the full replay featuring me and Tom Petersen!

Hopefully by now you have read our review of the NVIDIA GeForce GTX TITAN 6GB graphics card that was just released.  This is definitely a product release that highlights a generation of GPUs and I would really encourage you to read the article and offer your feedback.

However, we have another event to promote right now: NVIDIA's Tom Petersen will be joining me on PCPer Live! at 11am PT / 2pm ET to talk about the GeForce GTX TITAN and its performance, features, pricing and more! 

pcperlive2.png

GeForce GTX TITAN Live Review Stream

11am PT / 2pm ET - February 21st

PC Perspective Live! Page

If you have questions for Tom or me, you can leave them in the comments below (no registration required)!

nvidia1.jpg

TITAN up your ... you know

Subject: Graphics Cards | February 21, 2013 - 12:57 PM |
Tagged: titan, nvidia, kepler, gtx titan, gk110, geforce

Before getting into the performance of the $1000 NVIDIA TITAN it is worth looking at the improvements NVIDIA has added to this GK110 beast.  At 10.5" long it is a half inch longer than a 680 and a full 1.5" shorter than a 690, which allows it to fit in a wider variety of cases, and the vastly improved thermals allow the use of much smaller cases than other high-end GPUs can manage without exotic cooling solutions.  There is also a reduction in noise, to the point where SLI'd TITANs run quieter than some single-card solutions, not to mention much faster.  To see just how much faster, check out [H]ard|OCP's results, which you can compare to Ryan's results.

H_TITAN.jpg

"NVIDIA is launching a TITAN today, literally, the new GeForce GTX TITAN video card is here, and we have a lot to talk about. We test single-GPU and 2-way SLI today, with more to follow later. We will find out if this TITAN of a video card really is worth it, and just who this video card is designed for. Be prepared to face the fastest single-GPU video card."


Source: [H]ard|OCP
Manufacturer: NVIDIA

TITAN is back for more!

Our NVIDIA GeForce GTX TITAN Coverage Schedule:

If you are reading this today, chances are you were here on Tuesday when we first launched our NVIDIA GeForce GTX TITAN features and preview story (accessible from the link above) and were hoping to find benchmarks then.  You didn't find them then, but you will now.  I am here to show you that the TITAN is indeed the single fastest GPU on the market and MAY be the best graphics card (single or dual GPU) on the market depending on what usage model you have.  Some will argue, some will disagree, but we have an interesting argument to make about this $999 gaming beast.

A brief history of time...er, TITAN

In our previous article we talked all about TITAN's GK110-based GPU, the form factor, card design, GPU Boost 2.0 features and much more, and I would strongly encourage you to read it before going forward.  If you just want the cliff notes, I am going to copy and paste some of the most important details below.

IMG_9502.JPG

From a pure specifications standpoint the GeForce GTX TITAN based on GK110 is a powerhouse.  While the full GPU sports a total of 15 SMX units, TITAN will have 14 of them enabled for a total of 2688 shaders and 224 texture units.  Clock speeds on TITAN are a bit lower than on GK104 with a base clock rate of 836 MHz and a Boost Clock of 876 MHz.  As we will show you later in this article though the GPU Boost technology has been updated and changed quite a bit from what we first saw with the GTX 680.

The bump in memory bus width is also key; feeding that many CUDA cores definitely required a boost from 256-bit to 384-bit, a 50% increase.  Even better, the memory is still running at an effective 6.0 GHz, resulting in total memory bandwidth of 288.4 GB/s.
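As a rough sanity check on that figure (a back-of-the-envelope sketch, not NVIDIA's published math), peak bandwidth is simply bus width times effective data rate:

```python
# Peak memory bandwidth: bus width (bits) / 8 bits-per-byte * effective data rate (GT/s).

def bandwidth_gbs(bus_width_bits: int, data_rate_gtps: float) -> float:
    return bus_width_bits / 8 * data_rate_gtps

print(bandwidth_gbs(256, 6.0))  # GTX 680 (GK104): 192.0 GB/s
print(bandwidth_gbs(384, 6.0))  # TITAN  (GK110): 288.0 GB/s; the quoted 288.4 GB/s
                                # implies an effective rate just over 6.0 GT/s
```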

blockdiagram2.jpg

Speaking of memory - this card will ship with 6GB on-board.  Yes, 6 GeeBees!!  That is twice as much as AMD's Radeon HD 7970 and three times as much as NVIDIA's own GeForce GTX 680 card.  This is without a doubt a nod to the super-computing capabilities of the GPU and the GPGPU functionality that NVIDIA is enabling with the double precision aspects of GK110.

Continue reading our full review of the NVIDIA GeForce GTX TITAN graphics card with benchmarks and an update on our Frame Rating process!!

GeForce 314.07 WHQL Drivers: Optimized For Crysis 3, Assassin's Creed 3 & Far Cry 3

Subject: Graphics Cards | February 19, 2013 - 01:50 PM |
Tagged: nvidia, graphics drivers, geforce, 314.07

Just in time for the arrival of the TITAN previews comes the new WHQL 314.07 GeForce driver from NVIDIA.  Instead of offering a list of blanket improvements and average frame rate increases, NVIDIA has assembled a set of charts showing performance differences between this driver and the previous one for their four top GPUs in both SLI and single-card setups.  They also attempt to answer the question "Will it play Crysis 3?" with the chart below, showing the performance you can expect at Very High settings, 1080p resolution and 4x AA.  In addition, they provide a link to their GeForce Experience tool, which will optimize your Crysis 3 settings for whatever NVIDIA card(s) you happen to be using.  Upgrade now, as the new driver seems to offer improvements across the board.

nvidia-geforce-314-07-whql-drivers-crysis-3-performance-chart-650.png

 

The new GeForce 314.07 WHQL driver is now available to download. An essential update for gamers jumping into Crysis 3 this week, 314.07 WHQL improves single-GPU and multi-GPU performance in Crytek’s sci-fi shooter by up to 65%.

Other highlights include sizeable SLI and single-GPU performance gains of up to 27% in Assassin’s Creed III, 19% in Civilization V, 14% in Call of Duty: Black Ops 2, 14% in DiRT 3, 11% in Just Cause 2, 10% in Deus Ex: Human Revolution, 10% in F1 2012, and 10% in Far Cry 3.

Rounding out the release is an ‘Excellent’ 3D Vision profile for Crysis 3, an SLI profile for Ninja Theory’s DmC: Devil May Cry, and an updated SLI profile for the free-to-play, third-person co-op shooter, Warframe.

You can download the GeForce 314.07 WHQL drivers with one click from the GeForce.com homepage; Windows XP, Windows 7 and Windows 8 packages are available for desktop systems, and for notebooks there are Windows 7 and Windows 8 downloads that cover all non-legacy products.

Source: NVIDIA
Manufacturer: NVIDIA

GK110 Makes Its Way to Gamers

Our NVIDIA GeForce GTX TITAN Coverage Schedule:

Back in May of 2012 NVIDIA released information on GK110, a new GPU the company was targeting at HPC (high performance computing) and GPGPU markets eager for more processing power.  Almost immediately the questions began about when we might see the GK110 part make its way to consumers and gamers in addition to finding a home in supercomputers like Cray's Titan system, capable of 17.59 petaflops. 

 


Watch this same video on our YouTube channel

02.jpg

Nine months later we finally have an answer - the GeForce GTX TITAN is a consumer graphics card built around the GK110 GPU.  With 2,688 CUDA cores, 7.1 billion transistors and a die size of 551 mm^2, the GTX TITAN is a big step forward (both in performance and physical size).

specs3.jpg

From a pure specifications standpoint the GeForce GTX TITAN based on GK110 is a powerhouse.  While the full GPU sports a total of 15 SMX units, TITAN will have 14 of them enabled for a total of 2688 shaders and 224 texture units.  Clock speeds on TITAN are a bit lower than on GK104 with a base clock rate of 836 MHz and a Boost Clock of 876 MHz.  As we will show you later in this article though the GPU Boost technology has been updated and changed quite a bit from what we first saw with the GTX 680.

The bump in memory bus width is also key; feeding that many CUDA cores definitely required a boost from 256-bit to 384-bit, a 50% increase.  Even better, the memory is still running at an effective 6.0 GHz, resulting in total memory bandwidth of 288.4 GB/s.

Continue reading our preview of the brand new NVIDIA GeForce GTX TITAN graphics card!!

NVIDIA Joins the Bundle Game: Up to $150 in credit on Free-to-Play games for all GTX buyers

Subject: Graphics Cards | February 11, 2013 - 12:33 PM |
Tagged: world of tanks, planetside 2, nvidia, Hawken, gtx, geforce, bundle

AMD has definitely been winning the "game" of game bundles and bonus content with graphics card purchases, as is evident from the recent Never Settle Reloaded campaign that includes titles like Crysis 3, BioShock Infinite and Tomb Raider.  I have commented that NVIDIA was falling behind and might even start to look like it had moved away from a focus on PC gamers, since it hadn't made any reply over the last year...

After losing a bidding war with AMD over Crysis 3, today NVIDIA is unveiling a bundle campaign that attacks from a different angle; rather than bundling retail games, NVIDIA is working with free-to-play titles.  How do you give gamers bonuses with free-to-play games?  Credits!  Cold, hard cash!

bundle1.png

Starting today, if you pick up any GeForce GTX graphics card you'll be eligible for free in-game credit in each of the three free-to-play titles partnering with NVIDIA.  A GTX 650 or GTX 650 Ti will net you $25 in each game for a total bonus of $75, while buying a GTX 660 or higher, all the way up to the GTX 690, results in $50 per game for a total of $150.

Also, after asking NVIDIA about it, I can confirm this is a PER CARD bundle, so if you get an SLI pair of anything you'll get double the credit.  A pair of GeForce GTX 660s for an SLI rig results in $100 per game, or $300 total!
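The math is simple enough; here is a tiny illustrative tally under the terms described above (the credit amounts and game list come from the announcement, the function itself is just a sketch):

```python
# Quick tally of the announced in-game credit: $25 per game for a GTX 650 / 650 Ti,
# $50 per game for a GTX 660 or higher, across three games, applied per card.

GAMES = ["PlanetSide 2", "Hawken", "World of Tanks"]

def total_credit(per_game_credit: int, num_cards: int = 1) -> int:
    return per_game_credit * len(GAMES) * num_cards

print(total_credit(25))     # GTX 650 / 650 Ti:   $75
print(total_credit(50))     # GTX 660 and up:     $150
print(total_credit(50, 2))  # GTX 660 SLI pair:   $300
```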

bundle3.png

This is a very interesting approach that NVIDIA has decided to take and I am eager to get feedback from our readers on the differences between AMD's and NVIDIA's bundles.  I have played quite a bit of PlanetSide 2 and definitely enjoyed it; it is a graphics showcase as well, with huge, expansive levels and hundreds of players per server.  World of Tanks and Hawken I am less familiar with, but they are also extremely popular.

bundle2.png

Leave us your comments below!  Do you think NVIDIA's new GeForce GTX gaming bundle of free-to-play game credits can be successful?

If you are looking for a new GeForce GTX card today and this bundle convinced you to buy, feel free to use the links below. 

Rumor: NVIDIA GK110 based GeForce GPU 'Titan' to be released late February

Subject: Graphics Cards | January 22, 2013 - 02:44 PM |
Tagged: nvidia, geforce, gk110, titan, rumor

A combination of rumors and news pieces found online, plus some recent conversations with partners, indicates that February will see the release of a new super-high-end graphics card from NVIDIA based on the GK110 GPU.  Apparently carrying the name "Titan", based on a report from Sweclockers.com, this new single-GPU card will feature 2688 CUDA cores, compared to the 1536 in the GeForce GTX 680. 

gk110.jpg

If true, the name Titan likely refers to the Cray supercomputer of the same name, built using GK110-based Kepler Tesla cards.  Sweclockers.com's sources also quote clock speeds for this new super-GPU: a 732 MHz core clock and a 5.2 GHz GDDR5 memory clock.  While those numbers are low compared to the 1000+ MHz speeds of the GK104 parts out today, this GPU would have 75% more compute units and presumably additional memory capacity as well.  The 384-bit memory bus is likewise a 50% increase, which would indicate another big jump in performance over current cards.  The CUDA core count of 2688 is actually indicative of a GK110 GPU with a single SMX disabled.
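That last point follows from simple Kepler arithmetic, assuming GK110 keeps the same 192 CUDA cores per SMX as GK104 (a quick illustrative check, not a confirmed spec):

```python
# Kepler SMX arithmetic: 192 CUDA cores per SMX, as on GK104.
CORES_PER_SMX = 192

full_gk110    = 15 * CORES_PER_SMX  # 2880 cores in a fully enabled GK110
rumored_titan = 14 * CORES_PER_SMX  # 2688 cores -> exactly one SMX disabled
gtx_680       = 8 * CORES_PER_SMX   # 1536 cores in GK104 (GTX 680)

print(rumored_titan)                # 2688
print(rumored_titan / gtx_680 - 1)  # 0.75 -> the "75% more compute units"
```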

teslagpu2.jpg

The NVIDIA Titan card will apparently be the replacement for the GeForce GTX 690, a dual-GK104 card launched in May of last year.  The performance estimate for the Titan is approximately 85% of that of the GTX 690, and if the rumors are right it would carry an $899 price tag.

Based on other conversations I have had recently, you should only expect those same partners that were able to sell the GTX 690 to stock this new GK110-based part.  There won't be any modified designs, and you will see very little differentiation in vendor branding.  If the dates are to be believed, we are hearing that a February 25th (or at least that week) launch is the current target.