NVIDIA Details Tegra 4 and Tegra 4i Graphics
Subject: Graphics Cards | February 25, 2013 - 08:01 PM | Josh Walrath
Tagged: nvidia, tegra, tegra 4, Tegra 4i, pixel, vertex, PowerVR, mali, adreno, geforce
When Tegra 4 was introduced at CES there was precious little information about the setup of the integrated GPU. We all knew that it would be a much more powerful GPU, but we were not entirely sure how it was set up. Now NVIDIA has finally released a slew of whitepapers that deal with not only the GPU portion of Tegra 4, but also some of the low level features of the Cortex A15 processor. For this little number I am just going over the graphics portion.
This robust looking fellow is the Tegra 4. Note the four pixel "pipelines" that can output 4 pixels per clock.
The graphics units on the Tegra 4 and Tegra 4i are identical in overall architecture; the 4i simply has fewer units and they are arranged slightly differently. Tegra 4 is comprised of 72 units, 48 of which are pixel shaders. These pixel shaders are VLIW based VEC4 units. The other 24 units are vertex shaders. The Tegra 4i is comprised of 60 units, 48 of which are pixel shaders and 12 are vertex shaders. We knew at CES that it was not a unified shader design, but we were still unsure of the overall makeup of the part. There are some very good reasons why NVIDIA went this route, as we will soon explore.
If NVIDIA were to transition to unified shaders, it would increase the overall complexity and power consumption of the part. Each shader unit would have to be able to handle both vertex and pixel workloads, which means more transistors are needed to handle it. Simpler shaders focused on either pixel or vertex operations are more efficient at what they do, both in terms of transistors used and power consumption. This is the same train of thought when using fixed function units vs. fully programmable. Yes, the programmability will give more flexibility, but the fixed function unit is again smaller, faster, and more efficient at its workload.
On the other hand here we have the Tegra 4i, which gives up half the pixel pipelines and vertex shaders, but keeps all 48 pixel shaders.
If there was one surprise here, it would be that the part is not completely OpenGL ES 3.0 compliant. It is lacking in one major function that is required for certification: this particular part cannot render at FP32 levels. It has been quite a few years since we have heard of anything not being able to do FP32 in the PC market, but it is quite common to not support it in the power and transistor conscious mobile market. NVIDIA decided to go with an FP20 partial precision setup. They claim that for all intents and purposes, it will not be noticeable to the human eye. Colors will still be rendered properly and artifacts will be few and far between. Remember back in the day when NVIDIA supported FP16 and FP32 while they chastised ATI for choosing FP24 with the Radeon 9700 Pro? Times have changed a bit. Going with FP20 is again a power and transistor saving decision. It still supports DX9.3 and OpenGL ES 2.0, but it is not fully OpenGL ES 3.0 compliant. This is not to say that it does not support any 3.0 features. It in fact does support quite a bit of the functionality required by 3.0, but it is still not fully compliant.
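To get a rough feel for what the precision gap means, here is a small Python sketch that rounds a value to a given number of mantissa bits and reports the relative error. The bit widths are assumptions for illustration only: 10 and 23 bits are the standard FP16 and FP32 layouts, 16 bits is the commonly cited FP24 layout, and the 13 bits used for FP20 is an assumed figure that does not come from the whitepaper. The `quantize` helper is hypothetical and ignores exponent range and special values.

```python
import math

# Assumed explicit mantissa widths; the FP20 figure in particular is a guess
# for illustration and is not taken from NVIDIA's whitepaper.
MANTISSA_BITS = {"FP16": 10, "FP20": 13, "FP24": 16, "FP32": 23}


def quantize(x: float, mantissa_bits: int) -> float:
    """Round x to a float with the given number of explicit mantissa bits.

    Exponent range, denormals and special values are ignored on purpose.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)              # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (mantissa_bits + 1)  # +1 for the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)


def relative_error(x: float, mantissa_bits: int) -> float:
    """Relative rounding error of x at the given mantissa width."""
    return abs(quantize(x, mantissa_bits) - x) / abs(x)


if __name__ == "__main__":
    value = 0.1  # arbitrary value that is not exactly representable in binary
    for name, bits in MANTISSA_BITS.items():
        print(f"{name}: relative error {relative_error(value, bits):.2e}")
```

Under these assumptions each extra mantissa bit roughly halves the worst-case rounding error, which is why a 20-bit format can sit visibly closer to FP32 than FP16 does while still falling short of the 3.0 requirement.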
This will be an interesting decision to watch over the next few years. The latest Mali 600 series, PowerVR 6 series, and Adreno 300 series solutions all support OpenGL ES 3.0. Tegra 4 is the odd man out. While most developers have no plans to go to 3.0 anytime in the near future, it will eventually be implemented in software. When that point comes, then the Tegra 4 based devices will be left a bit behind. By then NVIDIA will have a fully compliant solution, but that is little comfort for those buying phones and tablets in the near future that will be saddled with non-compliance once applications hit.
This is the list of OpenGL ES 3.0 features that are actually present in Tegra 4, but the lack of FP32 relegates it to 2.0 compliant status.
The core speed is increased to 672 MHz, well up from the 520 MHz in Tegra 3 (8 pixel and 4 vertex shaders). The GPU can output four pixels per clock, double that of Tegra 3. Once we consider the extra clock speed and pixel pipelines, the Tegra 4 increases pixel fillrate by 2.6x. Pixel and vertex shading will get a huge boost in performance due to the dramatic increase of units and clockspeed. Overall this is a very significant improvement over the previous generation of parts.
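For anyone who wants to check the 2.6x figure, the arithmetic is simply pixels per clock multiplied by core clock. The snippet below is a minimal sketch using only the numbers quoted above; the function name is made up for illustration, and the Tegra 3 value of two pixels per clock follows from Tegra 4's four being "double that of Tegra 3".

```python
def pixel_fillrate_mpixels(pixels_per_clock: int, core_mhz: float) -> float:
    """Peak pixel fillrate in megapixels per second."""
    return pixels_per_clock * core_mhz


# Tegra 3: two pixels per clock at 520 MHz (half of Tegra 4's four per clock).
tegra3 = pixel_fillrate_mpixels(2, 520)
# Tegra 4: four pixels per clock at 672 MHz.
tegra4 = pixel_fillrate_mpixels(4, 672)

print(f"Tegra 3: {tegra3:.0f} MP/s")
print(f"Tegra 4: {tegra4:.0f} MP/s")
print(f"Increase: {tegra4 / tegra3:.2f}x")
```

That works out to 1040 MP/s against 2688 MP/s, or roughly 2.58x, which rounds to the 2.6x quoted above. These are peak theoretical numbers and say nothing about memory bandwidth limits.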
The Tegra 4 can output to a 4K display natively, and that is not the only new feature for this part. Here is a quick list:
2x/4x Multisample Antialiasing (MSAA)
24-bit Z (versus 20-bit Z in the Tegra 3 processor) and 8-bit Stencil
4K x 4K texture size incl. Non-Power of Two textures (versus 2K x 2K in the Tegra 3 processor) – for higher quality textures, and easier to port full resolution textures from console and PC games to Tegra 4 processor. Good for high resolution displays.
16:1 Depth (Z) Compression and 4:1 Color Compression (versus none in Tegra 3 processor) – this is lossless compression and is useful for reducing bandwidth to/from the frame buffer, and especially effective in antialiasing processing when processing multiple samples per pixel
Depth Textures
Percentage Closer Filtering for Shadow Texture Mapping and Soft Shadows
Texture border color eliminates coarse MIP-level bleeding
sRGB for Texture Filtering, Render Surfaces and MSAA down-filter
1 - CSAA is no longer supported in Tegra 4 processors
This is a big generational jump, and now we only have to see how it performs against the other top end parts from Qualcomm, Samsung, and others utilizing IP from Imagination and ARM.
A graphical description of market woes from Jon Peddie
Subject: General Tech, Graphics Cards | February 25, 2013 - 01:32 PM | Jeremy Hellstrom
Tagged: jon peddie, graphics, market share
If last week's report from Jon Peddie Research on sales for all add-in and integrated graphics had you worried, the news this week is not going to help boost your confidence. This week the report focuses solely on add-in boards and the drop is dramatic; Q4 2012 shipments plummeted 17.3% compared to Q3 2012. When you look at the entire year, sales dropped 10% overall as AMD's APUs are making serious inroads into the mobile market, as are Intel's, with many notebooks being sold without a discrete GPU. The losses are coming from the mainstream market; enthusiast level GPUs actually saw a slight increase in sales but the small volume is utterly drowned by the mainstream market. You can check out the full press release here.
"JPR found that AIB shipments during Q4 2012 behaved according to past years with regard to seasonality, but the drop was considerably more dramatic. AIB shipments decreased 17.3% from the last quarter (the 10 year average is just -0.68%). On a year-to-year comparison, shipments were down 10%."
Here is some more Tech News from around the web:
- 3DMark Review @ OCC
- Trendnet N300 Easy-N-Range Extender @ Rbmods
- NETGEAR ProSafe GS110T Gigabit SmartSwitch @ Benchmark Reviews
- Quantum computer one step closer after ‘true’ quantum calculation @ The Register
- Microsoft brings Azure back online @ The Register
- Understanding Camera Optics & Smartphone Camera Trends, A Presentation by Brian Klug @ AnandTech
- MWC Sunday roundup: HP Slate, Ascend P2 and Firefox phones @ The Inquirer
- AMD releases Firepro R5000 with remote display technology @ The Inquirer
- The TR Podcast 129: PlayStation 4, Titan, and more
AMD wants to wash your hair, with graphics. What??
Subject: Graphics Cards | February 22, 2013 - 05:29 PM | Ryan Shrout
Tagged: tressfx, amd
I got an odd email just now that I thought I would share with you. From AMD's Gaming Evolved account I got this:
You're at the top of your game. Why isn't your hair? TressFX is specially formulated with dynamic compounds like PPLL to re-energize your tired locks with vitality and luster.
WAT?
An odd campaign for sure, but it appears that on Tuesday AMD is going to discuss a technology that will bring realistic hair to gaming. Finally some use for all that GPGPU horsepower on the Southern Islands graphics cards?
You can see the landing page for yourself right here.
In case you missed it...
UPDATE: We have now published full details on our Frame Rating capture and analysis system as well as an entire host of benchmark results. Please check it out!!
In one of the last pages of our recent NVIDIA GeForce GTX TITAN graphics card review we included an update to our Frame Rating graphics performance metric that explains the testing method in more detail and shows results for the first time. Because it was buried so far into the article, I thought it was worth posting this information here as a separate article to solicit feedback from readers and help guide the discussion forward without getting lost in the TITAN shuffle. If you already read that page of our TITAN review, nothing new is included below.
I am still planning a full article based on these results sooner rather than later; for now, please leave me your thoughts, comments, ideas and criticisms in the comments below!
Why are you not testing CrossFire??
If you haven't been following our sequence of stories that investigates a completely new testing methodology we are calling "frame rating", then you are really missing out. (Part 1 is here, part 2 is here.) The basic premise of Frame Rating is that the performance metrics that the industry is gathering using FRAPS are inaccurate in many cases and do not properly reflect the real-world gaming experience the user has.
Because of that, we are working on another method that uses high-end dual-link DVI capture equipment to directly record the raw output from the graphics card with an overlay technology that allows us to measure frame rates as they are presented on the screen, not as they are presented to the FRAPS software sub-system. With these tools we can measure average frame rates, frame times and stutter, all in a way that reflects exactly what the viewer sees from the game.
We aren't ready to show our full sets of results yet (soon!) but the problem is that AMD's CrossFire technology shows severe performance degradations when viewed under the Frame Rating microscope that do not show up nearly as dramatically under FRAPS. As such, I decided that it was simply irresponsible of me to present data to readers that I would then immediately refute on the final pages of this review (Editor: referencing the GTX TITAN article linked above.) - it would be a waste of time for the reader, and people who skip straight to the performance graphs wouldn't know our theory on why the results displayed were invalid.
Many other sites will use FRAPS, will use CrossFire, and there is nothing wrong with that at all. They are simply presenting data that they believe to be true based on the tools at their disposal. More data is always better.
Here are these results and our discussion. I decided to use the most popular game out today, Battlefield 3, and please keep in mind this is NOT the worst case scenario for AMD CrossFire in any way. I tested the Radeon HD 7970 GHz Edition in single and CrossFire configurations as well as the GeForce GTX 680 and SLI. To gather results I used two processes:
- Run FRAPS while running through a repeatable section and record frame rates and frame times for 60 seconds
- Run our Frame Rating capture system with a special overlay that allows us to measure frame rates and frame times with post processing.
Here is an example of what the overlay looks like in Battlefield 3.
Frame Rating capture on GeForce GTX 680s in SLI
The column on the left is actually the visuals of an overlay that is applied to each and every frame of the game early in the rendering process. A solid color is added to the PRESENT call (more details to come later) for each individual frame. As you know, when you are playing a game, multiple frames can make it onto any single 60 Hz refresh cycle of your monitor, and because of that you get a succession of colors on the left hand side.
By measuring the pixel height of those colored columns, and knowing the order in which they should appear beforehand, we can gather the same data that FRAPS does but our results are seen AFTER any driver optimizations and DX changes the game might make.
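The measurement described above can be sketched in a few lines of Python. This is only an illustration of the idea, not our actual analysis tool: it assumes the left-edge overlay column of one captured refresh has already been reduced to a list of colour names, one per scanline, and both helper names are made up.

```python
from itertools import groupby

REFRESH_HZ = 60.0  # capture and display rate mentioned in the text


def bar_heights(column):
    """Collapse a top-to-bottom list of per-scanline colours into (colour, height) runs."""
    return [(colour, sum(1 for _ in run)) for colour, run in groupby(column)]


def on_screen_ms(height, total_scanlines):
    """Time a bar of the given height occupies within one refresh, in milliseconds."""
    return height / total_scanlines * 1000.0 / REFRESH_HZ


if __name__ == "__main__":
    # Hypothetical 1080-line capture in which three game frames share one refresh.
    column = ["maroon"] * 500 + ["silver"] * 8 + ["purple"] * 572
    for colour, height in bar_heights(column):
        print(f"{colour}: {height} lines, {on_screen_ms(height, len(column)):.2f} ms")
```

A real pipeline would also have to stitch together bars that continue across consecutive captured refreshes and check the colours against the expected sequence to catch dropped frames; both are left out here to keep the sketch short.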
Frame Rating capture on Radeon HD 7970 CrossFire
Here you see a very similar screenshot running on CrossFire. Notice the thin silver band between the maroon and purple? That is a complete frame according to FRAPS and most reviews. Not to us - we think that rendered frame is almost useless.
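To show why that matters for the numbers, here is a hedged Python sketch that counts frames two ways: every bar counts (the FRAPS-style view), or only bars tall enough to contribute something visible. The 20-scanline cutoff and the bar heights are hypothetical values picked for illustration, not figures from our testing.

```python
RUNT_THRESHOLD_LINES = 20  # hypothetical cutoff, chosen only for illustration


def observed_frames(heights, threshold=RUNT_THRESHOLD_LINES):
    """Return the bar heights that are tall enough to count as visible frames."""
    return [h for h in heights if h >= threshold]


if __name__ == "__main__":
    # Hypothetical bar heights (in scanlines) from consecutive captured refreshes.
    heights = [500, 8, 572, 530, 6, 544]
    kept = observed_frames(heights)
    print(f"Counting every frame: {len(heights)} frames")
    print(f"Counting visible frames only: {len(kept)} frames")
```

With these made-up heights a third of the "frames" disappear once the slivers are excluded, which is the kind of gap between reported and perceived frame rate the screenshot above illustrates.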
Continue reading the third part in our Frame Rating series to see our first performance results!!
Join PCPer and NVIDIA for a GeForce GTX TITAN Live Review!
Subject: Graphics Cards | February 21, 2013 - 01:12 PM | Ryan Shrout
Tagged: video, titan, nvidia, live review, live, kepler, geforce titan, geforce
Missed the live event? Here is the full replay featuring me and Tom Petersen!
Hopefully by now you have read our review of the NVIDIA GeForce GTX TITAN 6GB graphics card that was just released. This is definitely a product release that highlights a generation of GPUs and I would really encourage you to read the article and offer your feedback.
However, we have another event to promote right now: NVIDIA's Tom Petersen will be joining me on PCPer Live! at 11am PT / 2pm ET to talk about the GeForce GTX TITAN and its performance, features, pricing and more!
GeForce GTX TITAN Live Review Stream
11am PT / 2pm ET - February 21st
PC Perspective Live! Page
If you have questions for Tom or me, you can leave them in the comments below (no registration required)!
TITAN up your ... you know
Subject: Graphics Cards | February 21, 2013 - 12:57 PM | Jeremy Hellstrom
Tagged: titan, nvidia, kepler, gtx titan, gk110, geforce
Before getting into the performance of the $1000 NVIDIA TITAN it is worth looking at the improvements NVIDIA has added to this GK110 beast. At 10.5" long it is a half inch longer than a 680 and a full 1.5" shorter than a 690, which allows it to fit in a wider variety of cases and the vastly improved thermals allow the usage of much smaller cases than other high end GPUs can manage without exotic cooling solutions. There is also a reduction in noise generated, to the point where SLI'd TITANs run quieter than some single card solutions, not to mention much faster. To take a look at just how much faster you can see [H]ard|OCP's results which you can compare to Ryan's results.
"NVIDIA is launching a TITAN today, literally, the new GeForce GTX TITAN video card is here, and we have a lot to talk about. We test single-GPU and 2-way SLI today, with more to follow later. We will find out if this TITAN of a video card really is worth it, and just who this video card is designed for. Be prepared to face the fastest single-GPU video card."
Here are some more Graphics Card articles from around the web:
- Nvidia's GeForce GTX Titan @ The Tech Report
- NVIDIA GTX TITAN @ Overclockers.com
- NVIDIA’s GeForce GTX Titan Review, Part 2: Titan's Performance Unveiled @ AnandTech
- NVIDIA GeForce GTX Titan Gaming Review @ OCC
- NVIDIA GeForce GTX TITAN 6GB Performance Review @ Hardware Canucks
- NVIDIA GeForce GTX TITAN 6 GB @ techPowerUp
- NVIDIA GeForce GTX TITAN SLI & Tri-SLI @ techPowerUp
- MSI GTX 670 Twin Frozr Power Edition OC 2GB @ Tweaktown
- Desktop Graphics Card Comparison Guide @ TechARP
- HIS Radeon HD 7850 iPower IceQ Turbo 4GB Crossfire @ Legion Hardware
TITAN is back for more!
Our NVIDIA GeForce GTX TITAN Coverage Schedule:
- Tuesday, February 19 @ 9am ET: GeForce GTX TITAN Features Preview
- Thursday, February 21 @ 9am ET: GeForce GTX TITAN Benchmarks and Review
- Thursday, February 21 @ 2pm ET: PC Perspective Live! GTX TITAN Stream
If you are reading this today, chances are you were here on Tuesday when we first launched our NVIDIA GeForce GTX TITAN features and preview story (accessible from the link above) and were hoping to find benchmarks then. You didn't, but you will now. I am here to show you that the TITAN is indeed the single fastest GPU on the market and MAY be the best graphics card (single or dual GPU) on the market depending on what usage models you have. Some will argue, some will disagree, but we have an interesting argument to make about this $999 gaming beast.
A brief history of time...er, TITAN
In our previous article we talked all about TITAN's GK110-based GPU, the form factor, card design, GPU Boost 2.0 features and much more, and I would strongly encourage you all to read it before going forward. If you just want the cliff notes, I am going to copy and paste some of the most important details below.
From a pure specifications standpoint the GeForce GTX TITAN based on GK110 is a powerhouse. While the full GPU sports a total of 15 SMX units, TITAN will have 14 of them enabled for a total of 2688 shaders and 224 texture units. Clock speeds on TITAN are a bit lower than on GK104 with a base clock rate of 836 MHz and a Boost Clock of 876 MHz. As we will show you later in this article though the GPU Boost technology has been updated and changed quite a bit from what we first saw with the GTX 680.
The bump in the memory bus width is also key; being able to feed that many CUDA cores definitely required a boost from 256-bit to 384-bit, a 50% increase. Even better, the memory bus is still running at 6.0 GHz, resulting in total memory bandwidth of 288.4 GB/s.
Speaking of memory - this card will ship with 6GB on-board. Yes, 6 GeeBees!! That is twice as much as AMD's Radeon HD 7970 and three times as much as NVIDIA's own GeForce GTX 680 card. This is without a doubt a nod to the super-computing capabilities of the GPU and the GPGPU functionality that NVIDIA is enabling with the double precision aspects of GK110.
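The headline numbers above fall out of some simple multiplication, sketched below in Python. Two things here are not stated in the text and should be read as assumptions: the per-SMX figures of 192 CUDA cores and 16 texture units come from the Kepler SMX layout, and the 6.008 GT/s effective memory rate is assumed because it reproduces the quoted 288.4 GB/s (a flat 6.0 would give 288.0). The function name is made up for illustration.

```python
CORES_PER_SMX = 192          # Kepler SMX layout; not stated in the text above
TEXTURE_UNITS_PER_SMX = 16   # likewise an architecture figure, not from the text


def memory_bandwidth_gbs(bus_width_bits: int, effective_gtps: float) -> float:
    """Peak bandwidth in GB/s: bytes per transfer times billions of transfers per second."""
    return bus_width_bits / 8 * effective_gtps


enabled_smx = 14
shaders = enabled_smx * CORES_PER_SMX
texture_units = enabled_smx * TEXTURE_UNITS_PER_SMX

# 6.008 GT/s effective is an assumption; the text rounds it to 6.0 GHz.
titan_bw = memory_bandwidth_gbs(384, 6.008)
gtx680_bw = memory_bandwidth_gbs(256, 6.008)

print(f"Shaders: {shaders}, texture units: {texture_units}")
print(f"TITAN: {titan_bw:.1f} GB/s, GTX 680: {gtx680_bw:.1f} GB/s")
print(f"Bandwidth increase: {titan_bw / gtx680_bw - 1:.0%}")
```

Because the memory clock is unchanged, the 50% wider bus translates directly into 50% more peak bandwidth.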
LucidLogix Virtu MVP 2.0 Software Suite Now Available
Subject: General Tech, Graphics Cards | February 20, 2013 - 12:49 PM | Jeremy Hellstrom
Tagged: lucid, virtu MVP, virtu, hyperformance
As promised at CES, Lucidlogix has released their Virtu MVP 2.0 for purchase to anyone who wants to buy it. Their GPU virtualization software for Sandy Bridge and Ivy Bridge based systems with a discrete card allows you to jump back and forth between the embedded GPU on your processor and the graphics card without needing to move monitor cables or reboot. That allows you to save your laptop's battery life when the discrete GPU is not needed but to instantly enable it the second you fire up a compatible game, the list of which has grown since the release of their original Virtu MVP. They have also improved their Virtual VSync and Hyperformance features which we reviewed last summer on an Origin laptop.
The move to selling the product directly to consumers is beneficial as previously you could only get the software and updates from the manufacturer of your motherboard or your laptop. As anyone who has dealt with the infrequent graphics driver updates from manufacturers is well aware, the updates are few and far between. It is much better to be able to acquire the software from the vendor who creates it in the first place. Head over to Lucidlogix to read more and perhaps buy one of the three versions available.
"The optimal system specifications Virtu MVP 2.0 include an Intel® Core™ i5 (Sandy Bridge) on an Intel Sandy Bridge or Ivy Bridge motherboard with an NVIDIA® Geforce 460GTX or similar or better AIB and 2GB or more memory running Windows® 7 or Windows 8 in either 32-bit or 64-bit modes.
With special launch prices, Virtu MVP 2.0 is now available in three models: Basic with GPU virtualization for $34.99 (USD), Standard with Virtual Vsync for $44.99 and Pro with Hyperformance and Virtual Vsync for $54.99."
Here is some more Tech News from around the web:
- Tilera etches '*ss-kicking' 72-core system-on-chip for network gear @ The Register
- Samsung develops a programmable mobile GPU @ The Inquirer
- Canon PIXMA MG6320 Review @ TechReviewSource
PCPer Live! Crysis 3 Game Stream - Win Games and Graphics Cards from AMD!
Subject: Graphics Cards | February 19, 2013 - 08:00 PM | Ryan Shrout
Tagged: video, tahiti, radeon, never settle reloaded, live, Crysis 3, crysis, amd
UPDATE: If you missed the live stream you can still catch the YouTube replay right here!!
On February 19th on the PC Perspective Live! page we will be streaming some single player game action of the new Crysis 3. If there has ever been a game that defined the world of PC gaming graphics and technology, it is the Crysis series.
"Sure, but can it play Crysis?"
There is probably no more famous line of dialogue that pigeonholes new hardware releases.
With the release of the latest version of Crysis 3 on February 19th, we will be teaming up with AMD once again to provide a fun and exciting PCPer Game Stream that includes game demonstrations and of course, prizes and game keys for those that watch the event LIVE!
Crysis 3 Game Stream
5pm PT / 8pm ET - February 19th
PC Perspective Live! Page
Warning: this one will DEFINITELY have mature language and content!!
The stream will be sponsored by AMD and its Never Settle Reloaded game bundles which we previously told you about. Depending on the AMD Radeon HD 7000 series GPU that you buy, you could get some amazing free games including:
- Radeon HD 7900 Series
  - FREE Crysis 3
  - FREE Bioshock Infinite
- Radeon HD 7800 Series
  - FREE Bioshock Infinite
  - FREE Tomb Raider
- Radeon HD 7900 CrossFire Set
  - FREE Crysis 3
  - FREE Bioshock Infinite
  - FREE Tomb Raider
  - FREE Far Cry 3
  - FREE Hitman: Absolution
  - FREE Sleeping Dogs
AMD's Robert Hallock (@Thracks on Twitter) will be joining us via Skype to talk about the game's technology and performance considerations, as well as to help me with some co-op gaming!
Of course, just to sweeten the deal a bit we have some prizes lined up for those of you that participate in our Crysis 3 Game Stream:
- 2 x Radeon HD 7970 3GB graphics cards
- 4 x Combo codes for both Crysis 3 AND Bioshock Infinite
Pretty nice, huh? All you have to do to win is be present on the PC Perspective Live! Page during the event as we will announce both the contest/sweepstakes method AND the winners!
Stop in on February 19th for some PC gaming fun!!
We interrupt your Titan previews for a look at comparative Catalyst version performance
Subject: Graphics Cards | February 19, 2013 - 06:09 PM | Jeremy Hellstrom
Tagged: amd, catalyst, 2012
Today might be Titan Preview Day, as you can see from the links below as well as Ryan's article here, but [H]ard|OCP would like to offer you solid performance numbers instead. They took a look back at the Catalyst 12.x series of drivers that AMD GPU owners have been using over the past year. With the HD 7970 and HD 7950 they tested 7 of AMD's past drivers for performance on four popular games. The findings are fairly clear: after a poor start to the year, AMD's drivers showed improved performance as the year went on, with leaps after games were released and the driver could be optimized for speed. The HD 7970 did improve over the year but it was the HD 7950 that received the biggest gains.
"We continuing our look at driver performance improvements over time by evaluating AMD’s 2012 driver performances on both the AMD Radeon HD 7970 and HD 7950 video cards. We will see how drivers from the beginning of the year to the end of year have impacted real world gameplay performance . "
Here are some more Graphics Card articles from around the web:
- NVIDIA's GeForce GTX Titan, Part 1: Titan For Gaming, Titan For Compute @ AnandTech
- Nvidia GeForce GTX Titan Video Card Preview @ Ninjalane
- NVIDIA GeForce GTX Titan Video Card Preview @ Legit Reviews
- NVIDIA GeForce GTX TITAN; GK110’s Opening Act @ Hardware Canucks
- Gaming and Supercomputing Collide: NVIDIA Announces GeForce Titan @ Techgage
- NVIDIA GeForce GTX Titan Review @ OCC
- Asus GTX 660Ti DirectCU II TOP @ eTeknix
- EVGA GTX 650 Ti SSC 2 GB @ techPowerUp















