NVIDIA Details Tegra 4 and Tegra 4i Graphics

Subject: Graphics Cards | February 25, 2013 - 08:01 PM |
Tagged: nvidia, tegra, tegra 4, Tegra 4i, pixel, vertex, PowerVR, mali, adreno, geforce

 

When Tegra 4 was introduced at CES there was precious little information about the setup of the integrated GPU.  We all knew that it would be a much more powerful GPU, but we were not entirely sure how it was set up.  Now NVIDIA has finally released a slew of whitepapers that cover not only the GPU portion of Tegra 4, but also some of the low-level features of the Cortex A15 processor.  For this little number I am just going over the graphics portion.

layout.jpg

This robust looking fellow is the Tegra 4.  Note the four pixel "pipelines" that can output 4 pixels per clock.

The graphics units on the Tegra 4 and Tegra 4i are identical in overall architecture; the 4i simply has fewer units, arranged slightly differently.  Tegra 4 is comprised of 72 units, 48 of which are pixel shaders.  These pixel shaders are VLIW-based VEC4 units.  The other 24 units are vertex shaders.  The Tegra 4i is comprised of 60 units: 48 pixel shaders and 12 vertex shaders.  We knew at CES that it was not a unified shader design, but we were still unsure of the overall makeup of the part.  There are some very good reasons why NVIDIA went this route, as we will soon explore.

If NVIDIA were to transition to unified shaders, it would increase the overall complexity and power consumption of the part.  Each shader unit would have to be able to handle both vertex and pixel workloads, which means more transistors are needed.  Simpler shaders focused on either pixel or vertex operations are more efficient at what they do, both in terms of transistors used and power consumption.  This is the same train of thought as with fixed function units vs. fully programmable ones.  Yes, programmability gives more flexibility, but the fixed function unit is again smaller, faster, and more efficient at its workload.

layout_4i.jpg

On the other hand, here we have the Tegra 4i, which gives up half of the pixel pipelines and vertex shaders but keeps all 48 pixel shaders.

If there was one surprise here, it would be that the part is not completely OpenGL ES 3.0 compliant.  It is lacking one major function that is required for certification: this particular part cannot render at FP32 precision.  It has been quite a few years since we have heard of anything in the PC market not being able to do FP32, but it is quite common to skip it in the power and transistor conscious mobile market.  NVIDIA decided to go with an FP20 partial precision setup.  They claim that, for all intents and purposes, it will not be noticeable to the human eye; colors will still be rendered properly and artifacts will be few and far between.

Remember back in the day when NVIDIA supported FP16 and FP32 while chastising ATI for choosing FP24 with the Radeon 9700 Pro?  Times have changed a bit.  Going with FP20 is again a power and transistor saving decision.  The part still supports Direct3D feature level 9_3 and OpenGL ES 2.0, but it is not fully OpenGL ES 3.0 compliant.  That is not to say it does not support any 3.0 features; it in fact supports quite a bit of the functionality required by 3.0, but it is still not fully compliant.
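To put the precision question in practical terms, an application can ask the driver how many bits of floating point precision the fragment shader actually provides.  The following is a minimal sketch (not from NVIDIA's whitepaper) that assumes an OpenGL ES 2.0 context is already current; on a true FP32 part the highp query reports a 23-bit mantissa, while a partial precision design reports fewer bits.

```c
/* Minimal sketch: query the fragment shader's floating point precision.
 * Assumes an OpenGL ES 2.0 (or later) context is already current. */
#include <stdio.h>
#include <GLES2/gl2.h>

static void report(const char *label, GLenum type)
{
    GLint range[2] = {0, 0};  /* log2 of the min/max representable magnitudes */
    GLint bits = 0;           /* bits of mantissa precision */
    glGetShaderPrecisionFormat(GL_FRAGMENT_SHADER, type, range, &bits);
    printf("%-15s range 2^%d..2^%d, %d bits of precision\n",
           label, range[0], range[1], bits);
}

void print_fragment_precision(void)
{
    report("highp float:", GL_HIGH_FLOAT);     /* 23 bits on a full FP32 design */
    report("mediump float:", GL_MEDIUM_FLOAT);
    report("lowp float:", GL_LOW_FLOAT);
}
```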

This will be an interesting decision to watch over the next few years.  The latest Mali 600 series, PowerVR 6 series, and Adreno 300 series solutions all support OpenGL ES 3.0; Tegra 4 is the odd man out.  While most developers have no plans to go to 3.0 anytime in the near future, it will eventually be adopted in software.  When that point comes, Tegra 4 based devices will be left a bit behind.  By then NVIDIA will have a fully compliant solution, but that is little comfort for those buying phones and tablets in the near future that will be saddled with non-compliance once those applications hit.
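Until applications do make that jump, the typical pattern is for an engine to request an OpenGL ES 3.0 context and quietly fall back to 2.0 when the driver cannot provide one.  Here is a hedged sketch of that negotiation; the function name is illustrative and the EGL display and config are assumed to have been set up elsewhere.

```c
/* Sketch: prefer an OpenGL ES 3.0 context, fall back to ES 2.0 otherwise.
 * dpy and cfg are assumed to come from the usual eglGetDisplay /
 * eglChooseConfig setup elsewhere in the application. */
#include <EGL/egl.h>

EGLContext create_best_es_context(EGLDisplay dpy, EGLConfig cfg)
{
    const EGLint es3[] = { EGL_CONTEXT_CLIENT_VERSION, 3, EGL_NONE };
    const EGLint es2[] = { EGL_CONTEXT_CLIENT_VERSION, 2, EGL_NONE };

    /* Ask for ES 3.0 first; a driver that cannot comply returns EGL_NO_CONTEXT. */
    EGLContext ctx = eglCreateContext(dpy, cfg, EGL_NO_CONTEXT, es3);
    if (ctx != EGL_NO_CONTEXT)
        return ctx;

    /* Fall back to the ES 2.0 path, which Tegra 4 class parts fully support. */
    return eglCreateContext(dpy, cfg, EGL_NO_CONTEXT, es2);
}
```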

ogles_feat.jpg

The list of OpenGL ES 3.0 features that are actually present in Tegra 4; the lack of FP32 relegates it to ES 2.0 compliant status.

The core speed is increased to 672 MHz, well up from the 520 MHz in Tegra 3 (8 pixel and 4 vertex shaders).  The GPU can output four pixels per clock, double that of Tegra 3.  Once we consider the extra clock speed and pixel pipelines, the Tegra 4 increases pixel fillrate by 2.6x.  Pixel and vertex shading will get a huge boost in performance due to the dramatic increase of units and clockspeed.  Overall this is a very significant improvement over the previous generation of parts.
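The 2.6x figure follows directly from those numbers; here is a quick back-of-the-envelope check using only the clock speeds and pixels per clock quoted above.

```c
/* Pixel fillrate sanity check using the figures quoted in the text. */
#include <stdio.h>

int main(void)
{
    double tegra3 = 520e6 * 2;  /* 520 MHz, 2 pixels per clock -> ~1.04 Gpixels/s */
    double tegra4 = 672e6 * 4;  /* 672 MHz, 4 pixels per clock -> ~2.69 Gpixels/s */
    printf("Tegra 4 vs. Tegra 3 pixel fillrate: %.2fx\n", tegra4 / tegra3);  /* ~2.58x */
    return 0;
}
```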

The Tegra 4 can output to a 4K display natively, and that is not the only new feature for this part.  Here is a quick list:

2x/4x Multisample Antialiasing (MSAA)

24-bit Z (versus 20-bit Z in the Tegra 3 processor) and 8-bit Stencil

4K x 4K texture size, including Non-Power-of-Two textures (versus 2K x 2K in the Tegra 3 processor) – allows higher quality textures and makes it easier to port full resolution textures from console and PC games to the Tegra 4 processor.  Good for high resolution displays.

16:1 Depth (Z) Compression and 4:1 Color Compression (versus none in the Tegra 3 processor) – this is lossless compression and is useful for reducing bandwidth to/from the frame buffer, and it is especially effective in antialiasing, where multiple samples are processed per pixel

Depth Textures

Percentage Closer Filtering for Shadow Texture Mapping and Soft Shadows

Texture border color to eliminate coarse MIP-level bleeding

sRGB for Texture Filtering, Render Surfaces and MSAA down-filter

Note: CSAA is no longer supported in Tegra 4 processors

This is a big generational jump, and now we only have to see how it performs against the other top end parts from Qualcomm, Samsung, and others utilizing IP from Imagination and ARM.

Source: NVIDIA

Triangles beat voxels when you are constructing a building

Subject: General Tech | February 22, 2013 - 12:23 PM |
Tagged: nvidia, jen-hsun huang

NVIDIA will have a new nerve center across the street from their existing headquarters; from what Jen-Hsun told The Register, they are almost at the point where they need bunk-desks in their current HQ.  The triangle pattern shown in the artist's concepts not only embodies a key part of NVIDIA's technology but is also a well recognized technique in architecture for providing very sturdy construction.  Hao Ko was the architect chosen for the design; his resume includes a terminal at JFK airport as well as a rather tall building in China.  For NVIDIA's overlord to plan such an expensive undertaking shows great confidence in his company's success, even with the shrinking discrete GPU market.

nvidia_new_hq_aerial_view.jpg

"Move over Apple. Nvidia cofounder and CEO Jen-Hsun Huang wants to build his own futuristic space-station campus – and as you might expect, the Nvidia design is black and green and built from triangles, the basic building block of the mathematics around graphics processing. And, as it turns out, the strongest shape in architecture."

Here is some more Tech News from around the web:

Tech Talk

Source: The Register
Author:
Manufacturer: PC Perspective

In case you missed it...

UPDATE: We have now published full details on our Frame Rating capture and analysis system as well as an entire host of benchmark results.  Please check it out!!

In one of the last pages of our recent NVIDIA GeForce GTX TITAN graphics card review we included an update to our Frame Rating graphics performance metric that detailed the testing method further and showed results for the first time.  Because it was buried so far into the article, I thought it was worth posting this information here as a separate article to solicit feedback from readers and help guide the discussion forward without getting lost in the TITAN shuffle.  If you already read that page of our TITAN review, nothing new is included below.

I am still planning a full article based on these results sooner rather than later; for now, please leave me your thoughts, comments, ideas and criticisms in the comments below!


Why are you not testing CrossFire??

If you haven't been following our sequence of stories that investigates a completely new testing methodology we are calling "frame rating", then you are really missing out.  (Part 1 is here, part 2 is here.)  The basic premise of Frame Rating is that the performance metrics that the industry is gathering using FRAPS are inaccurate in many cases and do not properly reflect the real-world gaming experience the user has.

Because of that, we are working on another method that uses high-end dual-link DVI capture equipment to directly record the raw output from the graphics card with an overlay technology that allows us to measure frame rates as they are presented on the screen, not as they are presented to the FRAPS software sub-system.  With these tools we can measure average frame rates, frame times and stutter, all in a way that reflects exactly what the viewer sees from the game.

We aren't ready to show our full sets of results yet (soon!), but the problem is that AMD's CrossFire technology shows severe performance degradation under the Frame Rating microscope that does not show up nearly as dramatically under FRAPS.  As such, I decided that it was simply irresponsible of me to present data to readers that I would then immediately refute on the final pages of this review (Editor: referencing the GTX TITAN article linked above.) - it would be a waste of time for the reader, and people who skip straight to the performance graphs wouldn't know our theory on why the results displayed were invalid.

Many other sites will use FRAPS, will use CrossFire, and there is nothing wrong with that at all.  They are simply presenting data that they believe to be true based on the tools at their disposal.  More data is always better. 

Here are those results and our discussion.  I decided to use the most popular game out today, Battlefield 3, and please keep in mind this is NOT the worst case scenario for AMD CrossFire in any way.  I tested the Radeon HD 7970 GHz Edition in single and CrossFire configurations as well as the GeForce GTX 680 in single and SLI configurations.  To gather results I used two processes:

  1. Run FRAPS while running through a repeatable section and record frame rates and frame times for 60 seconds
  2. Run our Frame Rating capture system with a special overlay that allows us to measure frame rates and frame times with post processing.

Here is an example of what the overlay looks like in Battlefield 3.

fr_sli_1.jpg

Frame Rating capture on GeForce GTX 680s in SLI - Click to Enlarge

The column on the left is actually the visual result of an overlay that is applied to each and every frame of the game early in the rendering process.  A solid color is added at the PRESENT call (more details to come later) for each individual frame.  As you know, when you are playing a game, multiple frames can make it to the screen during any single 60 Hz refresh of your monitor, and because of that you get a succession of colors on the left hand side.

By measuring the pixel height of those colored columns, and knowing the order in which they should appear beforehand, we can gather the same data that FRAPS does but our results are seen AFTER any driver optimizations and DX changes the game might make.
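To make that concrete, here is a rough sketch of the post-processing step, under assumed values for the capture resolution, pixel packing, and color tolerance; it simply walks down the leftmost pixel column of one captured frame and reports how many scanlines each solid overlay color occupies.  This is an illustration of the idea, not the exact code we use.

```c
/* Illustrative sketch of the overlay analysis: walk down the left-hand
 * overlay column of a single captured frame and count how many scanlines
 * each solid overlay color occupies.  Frame size, RGB24 packing, and the
 * color tolerance are assumptions for the sake of the example. */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

#define WIDTH  1920
#define HEIGHT 1080

/* Returns nonzero when two RGB pixels count as the same overlay color. */
static int same_color(const uint8_t *a, const uint8_t *b)
{
    return abs(a[0] - b[0]) < 8 && abs(a[1] - b[1]) < 8 && abs(a[2] - b[2]) < 8;
}

/* frame: packed RGB24, row-major.  Each printed band corresponds to one game
 * frame (or partial frame) shown during this single 60 Hz refresh. */
void measure_overlay_bands(const uint8_t *frame)
{
    int band_start = 0;
    for (int y = 1; y <= HEIGHT; y++) {
        const uint8_t *prev = frame + (size_t)(y - 1) * WIDTH * 3;  /* pixel at x = 0 */
        if (y == HEIGHT || !same_color(prev, frame + (size_t)y * WIDTH * 3)) {
            printf("band of %d scanlines (%.1f%% of the refresh)\n",
                   y - band_start, 100.0 * (y - band_start) / HEIGHT);
            band_start = y;
        }
    }
}
```

Dividing a band's scanline count by the total scanlines in the refresh gives the fraction of that 60 Hz interval occupied by the frame, which is the raw material for frame time and stutter analysis taken from the capture rather than from FRAPS.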

fr_cf_1.jpg

Frame Rating capture on Radeon HD 7970 CrossFire - Click to Enlarge

Here you see a very similar screenshot running on CrossFire.  Notice the thin silver band between the maroon and purple?  That is a complete frame according to FRAPS and most reviews.  Not to us - we think that rendered frame is almost useless.

Continue reading the 3rd part in our Frame Rating series to see our first performance results!!

Join PCPer and NVIDIA for a GeForce GTX TITAN Live Review!

Subject: Graphics Cards | February 21, 2013 - 01:12 PM |
Tagged: video, titan, nvidia, live review, live, kepler, geforce titan, geforce

Missed the live event?  Here is the full replay featuring Tom Petersen and me!

Hopefully by now you have read our review of the NVIDIA GeForce GTX TITAN 6GB graphics card that was just released.  This is definitely a product release that highlights a generation of GPUs, and I would really encourage you to read the article and offer your feedback.

However, we have another event to promote right now: NVIDIA's Tom Petersen will be joining me on PCPer Live! at 11am PT / 2pm ET to talk about the GeForce GTX TITAN and its performance, features, pricing and more! 

pcperlive2.png

GeForce GTX TITAN Live Review Stream

11am PT / 2pm ET - February 21st

PC Perspective Live! Page

If you have questions for Tom or me, you can leave them in the comments below (no registration required)!

nvidia1.jpg

TITAN up your ... you know

Subject: Graphics Cards | February 21, 2013 - 12:57 PM |
Tagged: titan, nvidia, kepler, gtx titan, gk110, geforce

Before getting into the performance of the $1000 NVIDIA TITAN, it is worth looking at the improvements NVIDIA has added to this GK110 beast.  At 10.5" long it is a half inch longer than a 680 and a full 1.5" shorter than a 690, which allows it to fit in a wider variety of cases, and the vastly improved thermals allow the use of much smaller cases than other high end GPUs can manage without exotic cooling solutions.  There is also a reduction in noise, to the point where SLI'd TITANs run quieter than some single card solutions, not to mention much faster.  To see just how much faster, take a look at [H]ard|OCP's results, which you can compare to Ryan's.

H_TITAN.jpg

"NVIDIA is launching a TITAN today, literally, the new GeForce GTX TITAN video card is here, and we have a lot to talk about. We test single-GPU and 2-way SLI today, with more to follow later. We will find out if this TITAN of a video card really is worth it, and just who this video card is designed for. Be prepared to face the fastest single-GPU video card."

Here are some more Graphics Card articles from around the web:

Graphics Cards

Source: [H]ard|OCP
Author:
Manufacturer: NVIDIA

TITAN is back for more!

Our NVIDIA GeForce GTX TITAN Coverage Schedule:

If you are reading this today, chances are you were here on Tuesday when we first launched our NVIDIA GeForce GTX TITAN features and preview story (accessible from the link above) and were hoping to find benchmarks then.  You didn't, but you will now.  I am here to show you that the TITAN is indeed the single fastest GPU on the market and MAY be the best graphics card (single or dual GPU) on the market, depending on what usage model you have.  Some will argue, some will disagree, but we have an interesting argument to make about this $999 gaming beast.

A brief history of time...er, TITAN

In our previous article we talked all about TITAN's GK110-based GPU, the form factor, card design, GPU Boost 2.0 features and much more, and I would strongly encourage you all to read it before going forward.  If you just want the Cliffs Notes version, I am going to copy and paste some of the most important details below.

IMG_9502.JPG

From a pure specifications standpoint the GeForce GTX TITAN based on GK110 is a powerhouse.  While the full GPU sports a total of 15 SMX units, TITAN will have 14 of them enabled for a total of 2688 shaders and 224 texture units.  Clock speeds on TITAN are a bit lower than on GK104, with a base clock rate of 836 MHz and a Boost Clock of 876 MHz.  As we will show you later in this article, though, the GPU Boost technology has been updated and changed quite a bit from what we first saw with the GTX 680.

The bump in the memory bus width is also key; being able to feed that many CUDA cores definitely required a boost from 256-bit to 384-bit, a 50% increase.  Even better, the memory is still running at 6.0 GHz, resulting in total memory bandwidth of 288.4 GB/s.
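That bandwidth figure is easy to sanity check from the bus width and data rate.  The short sketch below assumes the exact effective data rate behind the rounded 6.0 GHz figure is 6.008 GT/s, which is what gets you to 288.4 GB/s rather than an even 288.0.

```c
/* Memory bandwidth check: bus width in bytes times effective data rate.
 * The 6.008 GT/s value is an assumption about the exact rate behind the
 * rounded "6.0 GHz" quoted in the text. */
#include <stdio.h>

int main(void)
{
    double bus_bytes = 384.0 / 8.0;            /* 384-bit bus = 48 bytes per transfer */
    printf("%.1f GB/s\n", bus_bytes * 6.008);  /* ~288.4 GB/s */
    return 0;
}
```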

blockdiagram2.jpg

Speaking of memory - this card will ship with 6GB on-board.  Yes, 6 GeeBees!!  That is twice as much as AMD's Radeon HD 7970 and three times as much as NVIDIA's own GeForce GTX 680 card.  This is without a doubt a nod to the super-computing capabilities of the GPU and the GPGPU functionality that NVIDIA is enabling with the double precision aspects of GK110.

Continue reading our full review of the NVIDIA GeForce GTX TITAN graphics card with benchmarks and an update on our Frame Rating process!!

Podcast #239 - NVIDIA GTX TITAN, PlayStation 4 Hardware, SSD Endurance and more!

Subject: General Tech | February 21, 2013 - 02:58 AM |
Tagged: titan, Tegra 4i, tegra 4, ssd, ps4, podcast, nvidia, Intel

PC Perspective Podcast #239 - 02/21/2013

Join us this week as we discuss NVIDIA GTX TITAN, PlayStation 4 Hardware, SSD Endurance and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath and Allyn Malventano

This Podcast is brought to you by MSI!

Program length: 0:59:53

Podcast topics of discussion:

  1. Week in Reviews:
    1. 0:01:20 Crysis 3 Live Game Stream - Win Free Stuff!!
    2. 0:03:20 Gigabyte GA-F2A85X-UP4 FM2 Motherboard Review: Overkill for Trinity?
    3. 0:09:25 NVIDIA GeForce GTX TITAN Preview - GK110, GPU Boost 2.0, Overclocking and GPGPU
    4. 0:24:15 Taking an Accurate Look at SSD Write Endurance
  2. 0:27:20 This Podcast is brought to you by MSI!
  3. News items of interest:
    1. 0:28:50 AMD wants to set the record straight on its future GPU strategy
    2. 0:36:05 A Crowd Funded Mini-ITX Case, the NCASE M1
    3. 0:38:30 Sony's Fourth Playstation (PS4) Specs Revealed
      1. Next Generation Consoles Likely Not Compatible
    4. 0:45:15 NVIDIA Releases Tegra 4i: Mini-Me!
  4. Closing:
    1. 0:49:00 Hardware / Software Pick of the Week
      1. Ryan: Excel 2013.. or not.
      2. Jeremy: WobbleWorks $75 Pen-Sized 3D Printer on Kickstarter
      3. Josh: Sweet lookin Monitor... not without quirks
      4. Allyn: Gunnar
    2. 1-888-38-PCPER or podcast@pcper.com
    3. http://pcper.com/podcast
    4. http://twitter.com/ryanshrout and http://twitter.com/pcper
    5. Closing/outro

Be sure to subscribe to the PC Perspective YouTube channel!!

NVIDIA Releases Tegra 4i: I Shall Name It... Mini-Me!

Subject: Processors | February 20, 2013 - 09:35 PM |
Tagged: Tegra 4i, tegra 4, tegra 3, Tegra 2, tegra, phoenix, nvidia, icera, i500

 

The NVIDIA Tegra 4 and Shield project were announced at this year’s CES, but there were other products in the pipeline that were just not quite ready to see the light of day at that time.  While Tegra 4 is an impressive looking part for mobile applications, it is not entirely appropriate for the majority of smartphones out there.  Sure, the nebulous “Superphone” category will utilize Tegra 4, but that is not a large part of the smartphone market.  The two basic issues with Tegra 4 are that it pulls a bit more power at its rated clockspeeds than some manufacturers like, and that it does not contain a built-in modem for communication needs.

Tegra 4i_die_shot.png

The die shot of the Tegra 4i.  A lot going on in this little guy.

NVIDIA bought up UK modem designer Icera to help create true all-in-one SOCs.  Icera has a unique method of building their modems that they say is not only more flexible than what others are offering, but also much more powerful.  These modems skip a lot of the fixed function units that most modems are made of and instead rely on high speed general purpose compute units and an interesting software stack to create smaller modems with greater flexibility when it comes to wireless standards.  At CES NVIDIA showed off the first product of this acquisition, the i500.  This is a standalone chip and is set to be offered alongside the Tegra 4 SOC.

Yesterday NVIDIA introduced the Tegra 4i, formerly codenamed “Grey”.  This is a Tegra SOC combined with the Icera i500 modem.  This is not exactly what we were expecting, but the results are actually quite exciting.  Before I get too out of hand about the possibilities of the chip, I must make one thing perfectly clear: the chip itself will not be available until Q4 2013.  It will be released in limited products at first, with greater availability in Q1 2014.  While NVIDIA is announcing this chip now, end users will not get to use it until much later this year.  I believe this is not so much because NVIDIA cannot produce the chips, but rather because the design cycles of new and complex cell phones do not allow for rapid product development.

NV_T4i_Feat.png

Tegra 4i really should not be confused with the slightly earlier Tegra 4.  The 4i actually uses the 4th revision of the Cortex A9 processor rather than the Cortex A15 found in Tegra 4.  The A9 has been a mainstay of modern cell phone processors for some years now and offers a great deal of performance considering its die size and power consumption.  The 4th revision improves the IPC of the A9 in a variety of ways (memory management, prefetch, buffers, etc.), so it will perform better than previous Cortex A9 solutions.  Performance will not approach that of the much larger and more complex A15 cores, but it is a nice little boost over what we have previously seen.

The Tegra 4 features a 72-core GPU (though NVIDIA has still declined to detail the specifics of their new mobile graphics technology – these ain’t Kepler), while the 4i features a nearly identical unit with 60 cores.  There is no word so far as to what speed these will run at or how performance really compares to the latest graphics products from ARM, Imagination, or Qualcomm.

The chip is made on TSMC’s 28 nm HPM process and features core speeds up to 2.3 GHz.  We again have no information on whether that will be all four cores at that speed or turbo functionality on a single core.  The design adopts the previous 4+1 core setup, with four high speed cores and one power saving core.  Considering how small each core is (Cortex A9 or A15), it is not a waste of silicon compared to the potential power savings.  The HPM process is the high performance mobile variant, rather than the lower power process used for Tegra 4.  My guess here is that the A9 cores are not going to pull all that much power anyway due to their simpler design compared to the A15.  Hitting 2.3 GHz is also a factor in the process decision.  Also consider that the +1 core is fabricated slightly differently than the other four to allow for slower transistor switching speed with much lower leakage.

NV_T4_Comp.png

The die size looks to be in the 60 to 65 mm² range.  This is not a whole lot larger than the original Tegra 2, which was around 50 mm².  Consider that the Tegra 4i has three more cores, a larger and more capable GPU portion, and the integrated Icera i500 modem.  The modem is a full Category 3 LTE capable unit (100 Mbps), so bandwidth should not be an issue for phones built around this chip.  The chip has all of the features of the larger Tegra 4, such as the Computational Photography Architecture, Image Signal Processor, video engine, and the “optimized memory interface”.  All of those neat things that NVIDIA showed off at CES will be included.  The only other major feature that is not present is the ability to output at 3200x2000 resolution; this particular chip is limited to 1920x1200.  Not a horrific tradeoff considering this will be a smartphone SOC with a max of 1080p resolution for the near future.

We expect to see Tegra 4 out in late Q2 in some devices, but not a lot.  While Tegra 4 is certainly impressive, I would argue that Tegra 4i is the more marketable product with a larger chance of success.  If it were available today, I would expect its market impact to be similar to what we saw with the original 28 nm Krait SOCs from Qualcomm last year.  There is simply a lot of good technology in this core.  It is small, it has a built-in modem, and performance per mm² looks to be pretty tremendous.  Power consumption will be appropriate for handhelds, and it might turn out to be better than most current solutions built on 28 nm and 32 nm processes.

NV_Phoenix.png

NVIDIA also developed the Phoenix Reference Phone, which features the Tegra 4i.  This is a rather robust looking unit with a 5” screen at 1080p resolution.  It has front and rear facing cameras, USB and HDMI ports, and is only 8 mm thin.  Just as with the original Tegra 3, it features the DirectTouch functionality, which uses the +1 core to handle all touch inputs.  This makes it more accurate and sensitive compared to other solutions on the market.

Overall I am impressed with this product.  It is a very nice balance of performance, features, and power consumption.  As mentioned before, it will not be out until Q4 2013.  This will obviously give the competition some time to hone their own products and perhaps release something that will not only compete well with Tegra 4i in its price range, but exceed it in most ways.  I am not entirely certain of this, but it is a potential danger.  The potential is low though, as the design cycles for complex and feature packed cell phones are longer than 6 to 7 months.  While NVIDIA has had some success in the SOC market, they have not had a true home run yet.  Tegra 2 and Tegra 3 had their fair share of design wins, but did not ship in numbers anywhere near those of Qualcomm or Samsung.  Perhaps Tegra 4i will be that breakthrough part for NVIDIA?  It is hard to say, but when we consider how aggressive this company is, how deep their developer relations go, and how feature packed these products seem to be, I think NVIDIA will continue to gain traction and market share in the SOC market.

Source: NVIDIA

GeForce 314.07 WHQL Drivers: Optimized For Crysis 3, Assassin's Creed 3 & Far Cry 3

Subject: Graphics Cards | February 19, 2013 - 01:50 PM |
Tagged: nvidia, graphics drivers, geforce, 314.07

Just in time for the arrival of the TITAN previews comes the new WHQL 314.07 GeForce driver from NVIDIA.  Instead of offering a list of blanket improvements and average frame rate increases, NVIDIA has assembled a set of charts showing performance differences between this driver and the previous one for their four top GPUs in both SLI and single card setups.  They also attempt to answer the question "Will it play Crysis 3?" with the chart below, showing the performance you can expect with Very High settings at 1080p resolution and 4x AA.  In addition, they provide a link to their GeForce Experience tool, which will optimize your Crysis 3 settings for whatever NVIDIA card(s) you happen to be using.  Upgrade now, as the new driver seems to offer improvements across the board.

nvidia-geforce-314-07-whql-drivers-crysis-3-performance-chart-650.png

 

The new GeForce 314.07 WHQL driver is now available to download. An essential update for gamers jumping into Crysis 3 this week, 314.07 WHQL improves single-GPU and multi-GPU performance in Crytek’s sci-fi shooter by up to 65%.

Other highlights include sizeable SLI and single-GPU performance gains of up to 27% in Assassin’s Creed III, 19% in Civilization V, 14% in Call of Duty: Black Ops 2, 14% in DiRT 3, 11% in Just Cause 2, 10% in Deus Ex: Human Revolution, 10% in F1 2012, and 10% in Far Cry 3.

Rounding out the release is an ‘Excellent’ 3D Vision profile for Crysis 3, an SLI profile for Ninja Theory’s DmC: Devil May Cry, and an updated SLI profile for the free-to-play, third-person co-op shooter, Warframe.

You can download the GeForce 314.07 WHQL drivers with one click from the GeForce.com homepage; Windows XP, Windows 7 and Windows 8 packages are available for desktop systems, and for notebooks there are Windows 7 and Windows 8 downloads that cover all non-legacy products.

Source: NVIDIA
Author:
Manufacturer: NVIDIA

GK110 Makes Its Way to Gamers

Our NVIDIA GeForce GTX TITAN Coverage Schedule:

Back in May of 2012 NVIDIA released information on GK110, a new GPU that the company was targeting towards the HPC (high performance computing) and GPGPU markets that are eager for more processing power.  Almost immediately the questions began about when we might see the GK110 part make its way to consumers and gamers in addition to finding a home in supercomputers like Cray's Titan system, capable of 17.59 petaflops.

 


Watch this same video on our YouTube channel

02.jpg

Nine months later we finally have an answer: the GeForce GTX TITAN is a consumer graphics card built around the GK110 GPU.  Comprised of 2,688 CUDA cores and 7.1 billion transistors, with a die size of 551 mm^2, the GTX TITAN is a big step forward (both in performance and physical size).

specs3.jpg

From a pure specifications standpoint the GeForce GTX TITAN based on GK110 is a powerhouse.  While the full GPU sports a total of 15 SMX units, TITAN will have 14 of them enabled for a total of 2688 shaders and 224 texture units.  Clock speeds on TITAN are a bit lower than on GK104, with a base clock rate of 836 MHz and a Boost Clock of 876 MHz.  As we will show you later in this article, though, the GPU Boost technology has been updated and changed quite a bit from what we first saw with the GTX 680.

The bump in the memory bus width is also key; being able to feed that many CUDA cores definitely required a boost from 256-bit to 384-bit, a 50% increase.  Even better, the memory is still running at 6.0 GHz, resulting in total memory bandwidth of 288.4 GB/s.

Continue reading our preview of the brand new NVIDIA GeForce GTX TITAN graphics card!!