Subject: Graphics Cards | June 21, 2016 - 09:22 PM | Scott Michaud
Tagged: nvidia, fermi, kepler, maxwell, pascal, gf100, gf110, GK104, gk110, GM204, gm200, GP104
Techspot published an article that compares eight GPUs across six high-end dies from NVIDIA's last four architectures, Fermi to Pascal. Average frame rates were listed across nine games, each measured at three resolutions: 1366x768 (~720p HD), 1920x1080 (1080p FHD), and 2560x1600 (~1440p QHD).
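To put those three resolutions in perspective, here is a quick sketch that counts the pixels each one pushes per frame, relative to 1080p. The dictionary and variable names are just for illustration; only the resolutions themselves come from the article.

```python
# Pixel counts for the three test resolutions named in the article.
resolutions = {
    "1366x768": (1366, 768),
    "1920x1080": (1920, 1080),
    "2560x1600": (2560, 1600),
}

# Total pixels rendered per frame at each resolution.
pixels = {name: width * height for name, (width, height) in resolutions.items()}
baseline = pixels["1920x1080"]

for name, count in pixels.items():
    print(f"{name}: {count:,} pixels ({count / baseline:.2f}x 1080p)")
```

The top resolution is roughly twice the per-frame workload of 1080p, and the bottom one roughly half, so the three settings span about a 4:1 range in pixel count.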
The results are interesting. Comparing GP104 to GF100, mainstream Pascal is typically on the order of four times faster than big Fermi. Over that time, we've had three full generational leaps in fabrication technology, leading to over twice the number of transistors packed into a die that is almost half the size. The article also shows that prices have remained relatively constant, except that the GTX 1080 is sort-of priced in the x80 Ti category despite its die size placing it in the non-Ti class. (They list the 1080 at $600, but you can't really find anything outside the $650-700 USD range.)
It would be interesting to see this data set compared against AMD. It's informative for an NVIDIA-only article, though.
NVIDIA's Ansel Technology
“In-game photography” is an interesting concept. Not too long ago, it was difficult to just capture the user's direct experience with a title. Print screen could only hold a single screenshot at a time, which allowed Steam and FRAPS to provide a better user experience. FRAPS also made video more accessible to the end-user, but it output huge files and, while it wasn't too expensive, it needed to be purchased online, which was a big issue ten-or-so years ago.
Seeing that their audience would enjoy video captures, NVIDIA introduced ShadowPlay a couple of years ago. The feature allowed users not only to record video, but also to capture the last few minutes. It did this with hardware acceleration, and it did this for free (for compatible GPUs). While I don't use ShadowPlay, preferring the control of OBS, it's a good example of how NVIDIA wants to support their users. They see these features as a value-add that draws people to their hardware.
Subject: Graphics Cards | January 20, 2016 - 08:26 PM | Scott Michaud
Tagged: nvidia, linux, tesla, fermi, kepler, maxwell
It's nice to see long-term roundups every once in a while. They do not really provide useful information for someone looking to make a purchase, but they show how our industry is changing (or not). In this case, Phoronix tested twenty-seven NVIDIA GeForce cards across four architectures: Tesla, Fermi, Kepler, and Maxwell. In other words, from the GeForce 8 series all the way up to the GTX 980 Ti.
Image Credit: Phoronix
Nine years of advancements in ASIC design, with a doubling time-step of 18 months, should yield a 64-fold improvement. The number of transistors falls short, showing about a 12-fold improvement between the Titan X and the largest first-wave Tesla, although that means nothing for a fabless semiconductor designer. The main reason why I include this figure is to show the actual Moore's Law trend over this time span, but it also highlights the slowdown in process technology.
Performance per watt does depend on NVIDIA though, and the ratio between the GTX 980 Ti and the 8500 GT is about 72:1. While this is slightly better than the target 64:1 ratio, these parts are from very different locations in their respective product stacks. Swap the 8500 GT for the following year's 9800 GTX, which makes it a comparison between top-of-the-line GPUs of their respective times, and you see a 6.2x improvement in performance per watt for the GTX 980 Ti. On the other hand, that part was outstanding for its era.
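For anyone who wants to check the arithmetic, here is a rough sketch of the doubling math behind those ratios. The function names are my own, and the eight-year span for the 9800 GTX comparison is an assumption based on it being "the following year's" part.

```python
import math

def expected_gain(years, doubling_months=18):
    """Improvement factor expected if a metric doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)

def implied_doubling_months(ratio, years):
    """Doubling period (in months) that would produce `ratio` over `years`."""
    return years * 12 / math.log2(ratio)

YEARS = 9  # span quoted in the article

target = expected_gain(YEARS)
print(target)  # 64.0

# Doubling periods implied by the ratios quoted above.
transistors = implied_doubling_months(12, YEARS)   # Titan X vs. first-wave Tesla
perf_watt_low = implied_doubling_months(72, YEARS) # GTX 980 Ti vs. 8500 GT
# Assumption: the 9800 GTX launched a year later, so the span is eight years.
perf_watt_top = implied_doubling_months(6.2, YEARS - 1)

print(f"transistors: {transistors:.1f} months")
print(f"perf/W vs 8500 GT: {perf_watt_low:.1f} months")
print(f"perf/W vs 9800 GTX: {perf_watt_top:.1f} months")
```

Under those assumptions, the 72:1 figure works out to a doubling period a little under 18 months, while the transistor count and the top-of-stack comparison both imply doubling periods of roughly two and a half to three years.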
I should note that each of these tests takes place on Linux. It might not perfectly reflect the landscape on Windows, but again, it's interesting in its own right.
Quiet, Efficient Gaming
The last few weeks have been dominated by talk about the memory controller of the Maxwell-based GTX 970. There are some very strong opinions about that particular issue, and certainly NVIDIA was remiss in actually informing consumers about how it handles the memory functionality of that particular product. While that debate rages, we have somewhat lost track of other products in the Maxwell range. The GTX 960 was released during this particular firestorm and, while it also shares the outstanding power/performance qualities of the Maxwell architecture, it is considered a little overpriced when compared to other cards in its price class in terms of performance.
It is easy to forget that the original Maxwell-based product to hit shelves was the GTX 750 series of cards. They were released a year ago to some very interesting reviews. The board is one of the first mainstream cards in recent memory to have a power draw that is under 75 watts, but can still play games with good quality settings at 1080p resolutions. Ryan covered this very well and it turned out to be a perfect gaming card for many pre-built systems that do not have extra power connectors (or a power supply that can support 125+ watt graphics cards). These are relatively inexpensive cards and very easy to install, producing a big jump in performance as compared to the integrated graphics components of modern CPUs and APUs.
The GTX 750 and GTX 750 Ti have proven to be popular cards due to their overall price, performance, and extremely low power consumption. They also tend to produce a relatively low amount of heat, due to solid cooling combined with that low power consumption. The Maxwell architecture has also introduced some new features, but the major changes are to the overall design of the architecture as compared to Kepler. Instead of 192 cores per SMX, there are now 128 cores per SMM. NVIDIA has done a lot of work to improve performance per core as well as lower power in a fairly dramatic way. An interesting side effect is that the CPU hit with Maxwell is a couple of percentage points higher than Kepler. NVIDIA does lean a bit more on the CPU to improve overall GPU power, but most of this performance hit is covered up by some really good realtime compiler work in the driver.
Asus has taken the GTX 750 Ti and applied their STRIX design and branding to it. While there are certainly faster GPUs on the market, there are none that exhibit the power characteristics of the GTX 750 Ti. The combination of this GPU and the STRIX design should result in an extremely efficient, cool, and silent card.
NVIDIA's Tegra X1
NVIDIA seems to like being on a one-year cycle with their latest Tegra products. Many years ago we were introduced to the Tegra 2, and the year after that the Tegra 3, and the year after that the Tegra 4. Well, NVIDIA did spice up their naming scheme to get away from the numbers (not to mention the potential stigma of how many of those products actually made an impact in the industry). Last year's entry was the Tegra K1 based on the Kepler graphics technology. These products were interesting due to the use of the very latest, cutting edge graphics technology in a mobile/low power format. The Tegra K1 64-bit variant used two “Denver” cores that were actually designed by NVIDIA.
While technically interesting, the Tegra K1 series has made about the same impact as the previous versions. The Nexus 9 was the biggest win for NVIDIA with these parts, and we have heard of a smattering of automotive companies using Tegra K1 in those applications. NVIDIA uses the Tegra K1 in their latest Shield tablet, but they do not typically release data regarding the number of products sold. The Tegra K1 looks to be the most successful product since the original Tegra 2, but the question of how well they actually sold looms over the entire brand.
So why the history lesson? Well, we have to see where NVIDIA has been to get a good idea of where they are heading next. Today, NVIDIA is introducing the latest Tegra product, and it is going in a slightly different direction than what many had expected.
The reference board with 4 GB of LPDDR4.
Experience with Silent Design
In the time periods between major GPU releases, companies like ASUS have the ability to really dig down and engineer truly unique products. With the expanded time between major GPU releases, from either NVIDIA or AMD, these products have continued evolving to offer better features and experiences than any graphics card before them. The ASUS Strix GTX 780 is exactly one of those solutions – taking a GTX 780 GPU that was originally released in May of last year and twisting it into a new design that offers better cooling, better power and lower noise levels.
ASUS intended, with the Strix GTX 780, to create a card that is perfect for high end PC gamers, without crossing into the realm of bank-breaking prices. They chose to go with the GeForce GTX 780 GPU from NVIDIA at a significant price drop from the GTX 780 Ti, with only a modest performance drop. They double the reference memory capacity from 3GB to 6GB of GDDR5, to assuage any buyer’s thoughts that 3GB wasn’t enough for multi-screen Surround gaming or 4K gaming. And they change the cooling solution to offer a near silent operation mode when used in “low impact” gaming titles.
The ASUS Strix GTX 780 Graphics Card
The ASUS Strix GTX 780 card is a pretty large beast, both in physical size and in performance. The cooler is a slightly modified version of the very popular DirectCU II thermal design used in many of the custom built ASUS graphics cards. It has a heat dissipation area more than twice that of the reference NVIDIA cooler and uses larger fans that allow them to spin slower (and quieter) at the improved cooling capacity.
Out of the box, the ASUS Strix GTX 780 will run at 889 MHz base clock and 941 MHz Boost clock, a fairly modest increase over the 863/900 MHz rates of the reference card. Obviously with much better cooling and a lot of work being done on the PCB of this custom design, users will have a lot of headroom to overclock on their own, but I continue to implore companies like ASUS and MSI to up the ante out of the box! One area where ASUS does impress is with the memory: the Strix card features a full 6GB of GDDR5 running at 6.0 GHz, twice the capacity of the reference GTX 780 (and even GTX 780 Ti) cards. If you had any concerns about Surround or 4K gaming, know that memory capacity will not be a problem. (Though raw compute power may still be.)
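To show just how modest that factory overclock is, here is a small sketch that turns the quoted clocks into percentages. The helper name is mine; the four clock values are the ones listed above.

```python
def uplift_percent(custom_mhz, reference_mhz):
    """Percentage increase of a custom clock over the reference clock."""
    return (custom_mhz / reference_mhz - 1) * 100

# Strix GTX 780 clocks versus the reference GTX 780 clocks quoted above.
base = uplift_percent(889, 863)
boost = uplift_percent(941, 900)

print(f"base: +{base:.1f}%  boost: +{boost:.1f}%")
```

That works out to about a 3% bump on the base clock and under 5% on the Boost clock, which leaves the rest of the headroom for users to find on their own.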
A Tablet and Controller Worth Using
An interesting thing happened a couple of weeks back, while I was standing on stage at our annual PC Perspective Hardware Workshop during Quakecon in Dallas, TX. When NVIDIA offered up a SHIELD (now called the SHIELD Portable) for raffle, the audience cheered. And not just a little bit, but more than they did for nearly any other hardware offered up during the show. That included motherboards, graphics cards, monitors, even complete systems. It kind of took me aback - NVIDIA SHIELD was a popular brand, a name that was recognized, and apparently, a product that people wanted to own. You might not have guessed that based on the sales numbers that SHIELD has put forward though. Even though it appeared to have a significant mind share, market share was something that was lacking.
Today though, NVIDIA prepares the second product in the SHIELD lineup, the SHIELD Tablet, a device the company hopes improves on the idea of SHIELD to encourage other users to sign on. It's a tablet (not a tablet with a controller attached), it has a more powerful SoC that can utilize different APIs for unique games, it can be more easily used in a 10-ft console mode and the SHIELD specific features like Game Stream are included and enhanced.
The obvious question, of course, is: should you buy one? Let's explore.
The NVIDIA SHIELD Tablet
At first glance, the NVIDIA SHIELD Tablet looks like a tablet. That actually isn't a negative selling point though, as the SHIELD Tablet can and does act like a high end tablet in nearly every way: performance, function, looks. We originally went over the entirety of the tablet's specifications in our first preview last week but much of it bears repeating for this review.
The SHIELD Tablet is built around the NVIDIA Tegra K1 SoC, the first mobile silicon to implement the Kepler graphics architecture. That feature alone makes this tablet impressive because it offers graphics performance not seen in a form factor like this before. CPU performance is also improved over the Tegra 4 processor, but the graphics portion of the die sees the largest performance jump easily.
A 1920x1200 resolution 7.9-in IPS screen faces the user and brings the option of full 1080p content that was lacking with the first SHIELD Portable. The screen is bright and crisp, easily viewable in bright lighting for gaming or use in lots of environments. Though the Xiaomi Mi Pad 7.9 had a 2048x1536 resolution screen, the form factor of the SHIELD Tablet is much more in line with what NVIDIA built with the Tegra Note 7.
A powerful architecture
In March of this year, NVIDIA announced the GeForce GTX Titan Z at its GPU Technology Conference. It was touted as the world's fastest graphics card with its pair of full GK110 GPUs but it came with an equally stunning price of $2999. NVIDIA claimed it would be available by the end of April for gamers and CUDA developers to purchase but it was pushed back slightly and released at the very end of May, going on sale for the promised price of $2999.
The specifications of GTX Titan Z are damned impressive - 5,760 CUDA cores, 12GB of total graphics memory, 8.1 TFLOPs of peak compute performance. But something happened between the announcement and product release that perhaps NVIDIA hadn't accounted for. AMD's Radeon R9 295X2, a dual-GPU card with full-speed Hawaii chips on-board, was released at $1499. I think it's fair to say that AMD took some chances that NVIDIA was surprised to see them take, including going the route of a self-contained water cooler and blowing past the PCI Express recommended power limits to offer a ~500 watt graphics card. The R9 295X2 was damned fast and I think it caught NVIDIA a bit off-guard.
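As a sanity check on those numbers, the sketch below backs out the clock speed implied by the quoted core count and peak compute figure. It assumes the 8.1 TFLOPs figure is single precision with each fused multiply-add counted as two operations per core per clock, which is the usual convention but is not stated in the text; the function name is mine.

```python
# Assumption: one fused multiply-add per core per clock, counted as two operations.
FLOPS_PER_CORE_PER_CLOCK = 2

def implied_clock_mhz(peak_flops, cuda_cores):
    """Clock speed (MHz) needed for `cuda_cores` to reach `peak_flops`."""
    return peak_flops / (cuda_cores * FLOPS_PER_CORE_PER_CLOCK) / 1e6

# Quoted Titan Z specifications: 5,760 CUDA cores and 8.1 TFLOPs peak.
titan_z_clock = implied_clock_mhz(8.1e12, 5760)
print(f"Implied clock: {titan_z_clock:.0f} MHz")
```

Under that assumption, the quoted peak corresponds to a clock of roughly 700 MHz across both GPUs.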
As a result, the GeForce GTX Titan Z release was a bit quieter than most of us expected. Yes, the Titan Black card was released without sampling the gaming media but that was nearly a mirror of the GeForce GTX 780 Ti, just with a larger frame buffer and the performance of that GPU was well known. For NVIDIA to release a flagship dual-GPU graphics card, admittedly the most expensive one I have ever seen with the GeForce brand on it, and NOT send out samples, was telling.
NVIDIA is adamant though that the primary target of the Titan Z is not just gamers but the CUDA developer that needs the most performance possible in as small of a space as possible. For that specific user, one that doesn't quite have the income to invest in a lot of Tesla hardware but wants to be able to develop and use CUDA applications with a significant amount of horsepower, the Titan Z fits the bill perfectly.
Still, the company was touting the Titan Z as "offering supercomputer class performance to enthusiast gamers" and telling gamers in launch videos that the Titan Z is the "fastest graphics card ever built" and that it was "built for gamers." So, interest piqued, we decided to review the GeForce GTX Titan Z.
The GeForce GTX TITAN Z Graphics Card
Cost and performance notwithstanding, the GeForce GTX Titan Z is an absolutely stunning looking graphics card. The industrial design that started with the GeForce GTX 690 (the last dual-GPU card NVIDIA released) and continued with the GTX 780 and Titan family lives on with the Titan Z.
The all metal finish looks good and stands up to abuse, keeping that PCB straight even with the heft of the heatsink. There is only a single fan on the Titan Z, center mounted, with a large heatsink covering both GPUs on opposite sides. The GeForce logo up top illuminates, as we have seen on all similar designs, which adds a nice touch.
Subject: General Tech | April 25, 2014 - 05:43 PM | Jeremy Hellstrom
Tagged: nvidia, contest, jetson tk1, kepler
Attention enthusiasts, developers and creators. Are you working on a new embedded computing application?
Meet the Jetson TK1 Developer Kit. It’s the world’s first mobile supercomputer for embedded systems, putting unprecedented computing performance in a low-power, portable and fully programmable package.
Power, ports, and portability: the Jetson TK1 development kit.
It’s the ultimate platform for developing next-generation computer vision solutions for robotics, medical devices, and automotive applications.
And we’re giving away 50 of them as part of our Tegra K1 CUDA Vision Challenge.
In addition to the Tegra K1 processor, the Jetson TK1 DevKit is equipped with 2 GB of RAM, 16 GB of storage and a host of ports and connectivity options.
And, because it offers full support for CUDA, the most pervasive, easy-to-use parallel computing platform and programming model, it’s much easier to program than the FPGA, custom ASIC and DSP processors that are typically used in today’s embedded systems.
Jetson TK1 is based on the Kepler computing architecture, the same technology powering today’s supercomputers, professional workstations and high-end gaming rigs. It has 192 CUDA cores, delivering over 300 GFLOPs of performance, and also provides full support for OpenGL 4.4 and CUDA 6.0, as well as GPU-accelerated OpenCV.
Our Tegra K1 system-on-a-chip offers unprecedented power and portability.
Entering the Tegra K1 CUDA Vision Challenge is easy. Just tell us about your embedded application idea. All proposals must be submitted by April 30, 2014. Entries will be judged for innovation, impact on research or industry, public availability, and quality of work.
By the end of May, the top 50 submissions will be awarded one of the first Jetson TK1 DevKits to roll off the production line, as well as access to technical support documents and assets.
The five most noteworthy Jetson TK1 breakthroughs may get a chance to share their work at the NVIDIA GPU Technology Conference in 2015.
Subject: General Tech, Mobile | March 26, 2014 - 01:34 AM | Tim Verry
Tagged: GTC 2014, tegra k1, nvidia, CUDA, kepler, jetson tk1, development
NVIDIA recently unified its desktop and mobile GPU lineups by moving to a Kepler-based GPU in its latest Tegra K1 mobile SoC. The move to the Kepler architecture has simplified development and enabled the CUDA programming model to run on mobile devices. One of the main points of the opening keynote earlier today was ‘CUDA everywhere,’ and NVIDIA has officially accomplished that goal by having CUDA compatible hardware from servers to desktops to tablets and embedded devices.
Speaking of embedded devices, NVIDIA showed off a new development board called the Jetson TK1. This tiny new board features an NVIDIA Tegra K1 SoC at its heart along with 2GB RAM and 16GB eMMC storage. The Jetson TK1 supports a plethora of IO options including an internal expansion port (GPIO compatible), SATA, one half-mini PCI-e slot, serial, USB 3.0, micro USB, Gigabit Ethernet, analog audio, and HDMI video outputs.
Of course the Tegra K1 part is a quad core (4+1) ARM CPU and a Kepler-based GPU with 192 CUDA cores. The SoC is rated at 326 GFLOPS which enables some interesting compute workloads including machine vision.
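For a rough idea of where that GFLOPS rating comes from, the sketch below models peak throughput as cores times operations per clock times clock speed, then solves for the clock that matches the quoted figures. It assumes each CUDA core retires one fused multiply-add (two operations) per clock; that convention and the function names are assumptions, not something stated in the announcement.

```python
# Assumption: each CUDA core completes one fused multiply-add (2 ops) per clock.
OPS_PER_CORE_PER_CLOCK = 2

def peak_gflops(cuda_cores, clock_mhz):
    """Theoretical peak throughput in GFLOPS for a given core count and clock."""
    return cuda_cores * OPS_PER_CORE_PER_CLOCK * clock_mhz / 1000

def clock_for_gflops(cuda_cores, gflops):
    """Clock speed (MHz) at which `cuda_cores` would reach `gflops`."""
    return gflops * 1000 / (cuda_cores * OPS_PER_CORE_PER_CLOCK)

TK1_CORES = 192  # Tegra K1 CUDA core count quoted in the article

rated_clock = clock_for_gflops(TK1_CORES, 326)    # the 326 GFLOPS rating
minimum_clock = clock_for_gflops(TK1_CORES, 300)  # the "over 300 GFLOPs" claim

print(f"326 GFLOPS implies about {rated_clock:.0f} MHz")
print(f"300 GFLOPS needs at least {minimum_clock:.2f} MHz")
```

Under that model, the rating corresponds to a GPU clock somewhere around 850 MHz, and any clock above about 780 MHz clears the 300 GFLOPS mark.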
In fact, Audi has been utilizing the Jetson TK1 development board to power its self-driving prototype car (more on that soon). Other intended uses for the new development board include robotics, medical devices, security systems, and perhaps low power compute clusters (such as an improved Pedraforca system). It can also be used as a simple desktop platform for testing and developing mobile applications for other Tegra K1 powered devices, of course.
Beyond the hardware, the Jetson TK1 comes with the CUDA toolkit, OpenGL 4.4 driver, and NVIDIA VisionWorks SDK which includes programming libraries and sample code for getting machine vision applications running on the Tegra K1 SoC.
The Jetson TK1 is available for pre-order now at $192 and is slated to begin shipping in April. Interested developers can find more information on the NVIDIA developer website.