GK110 in all its glory
I bet you didn't realize that October and November were going to become the onslaught of graphics cards it has been. I know I did not and I tend to have a better background on these things than most of our readers. Starting with the release of the AMD Radeon R9 280X, 270X and R7 260X in the first week of October, it has pretty much been a non-stop battle between NVIDIA and AMD for the hearts, minds, and wallets of PC gamers.
Shortly after the Tahiti refresh came NVIDIA's move into display technology with G-Sync, a variable refresh rate feature that will work with upcoming monitors from ASUS and others as long as you have a GeForce Kepler GPU. The technology was damned impressive, but I am still waiting for NVIDIA to send over some panels for extended testing.
Later in October we were hit with the R9 290X, the Hawaii GPU that brought AMD back in the world of ultra-class single GPU card performance. It has produced stellar benchmarks and undercut the prices (then at least) of the GTX 780 and GTX TITAN. We tested it in both single and multi-GPU configurations and found that AMD had made some impressive progress in fixing its frame pacing issues, even with Eyefinity and 4K tiled displays.
NVIDIA dropped a driver release with ShadowPlay that allows gamers to record playback locally without a hit on performance. I posted a roundup of R9 280X cards which showed alternative coolers and performance ranges. We investigated the R9 290X Hawaii GPU and the claims that performance is variable and configurable based on fan speeds. Finally, the R9 290 (non-X model) was released this week to more fanfare than the 290X thanks to its nearly identical performance and $399 price tag.
And today, yet another release. NVIDIA's GeForce GTX 780 Ti takes the performance of the GK110 and fully unlocks it. The GTX TITAN uses one fewer SMX and the GTX 780 has three fewer SMX units so you can expect the GTX 780 Ti to, at the very least, become the fastest NVIDIA GPU available. But can it hold its lead over the R9 290X and validate its $699 price tag?
Subject: General Tech, Graphics Cards, Systems | November 5, 2013 - 09:33 PM | Scott Michaud
Tagged: nvidia, grid, AWS, amazon
Amazon Web Services allows customers (individuals, organizations, or companies) to rent servers of certain qualities to match their needs. Many websites are hosted at their data centers, mostly because you can purchase different (or multiple) servers if you have big variations in traffic.
I, personally, sometimes use it as a game server for scheduled multiplayer events. The traditional method is spending $50-80 USD per month on a... decent... server running all-day every-day and using it a couple of hours per week. With Amazon EC2, we hosted a 200 player event (100 vs 100) by purchasing a dual-Xeon (ironically the fastest single-threaded instance) server connected to Amazon's internet backbone by 10 Gigabit Ethernet. This server cost just under $5 per hour all expenses considered. It was not much of a discount but it ran like butter.
This leads me to today's story: NVIDIA GRID GPUs are now available at Amazon Web Services. Both companies hope their customers will use (or create services based on) these instances. Applications they expect to see are streamed games, CAD and media creation, and other server-side graphics processing. These Kepler-based instances, named "g2.2xlarge", will be available along side the older Fermi-based Cluster Compute Instances ("cg1.4xlarge").
It is also noteworthy that the older Fermi-based Tesla servers are about 4x as expensive. GRID GPUs are based on GK104 (or GK107, but those are not available on Amazon EC2) and not the more compute-intensive GK110. It would probably be a step backwards for customers intending to perform GPGPU workloads for computational science or "big data" analysis. The newer GRID systems do not have 10 Gigabit Ethernet, either.
So what does it have? Well, I created an AWS instance to find out.
Its CPU is advertised as an Intel E5-2670 with 8 threads and 26 Compute Units (CUs). This is particularly odd as that particular CPU is eight-core with 16 threads; it is also usually rated by Amazon at 22 CUs per 8 threads. This made me wonder whether the CPU is split between two clients or if Amazon disabled Hyper-Threading to push the clock rates higher (and ultimately led me to just log in to an instance and see). As it turns out, HT is still enabled and the processor registers as having 4 physical cores.
The GPU was slightly more... complicated.
NVIDIA control panel apparently does not work over remote desktop and the GPU registers as a "Standard VGA Graphics Adapter". Actually, two are available in Device Manager although one has the yellow exclamation mark of driver woe (random integrated graphics that wasn't disabled in BIOS?). GPU-Z was not able to pick much up from it but it was of some help.
Keep in mind: I did this without contacting either Amazon or NVIDIA. It is entirely possible that the OS I used (Windows Server 2008 R2) was a poor choice. OTOY, as a part of this announcement, offers Amazon Machine Image (AMI)s for Linux and Windows installations integrated with their ORBX middleware.
I spot three key pieces of information: The base clock is 797 MHz, the memory size is 2990 MB, and the default drivers are Forceware 276.52 (??). The core and default clock rate, GK104 and 797 MHz respectively, are characteristic of the GRID K520 GPU with its 2 GK104 GPUs clocked at 800 MHz. However, since the K520 gives each GPU 4GB and this instance only has 3GB of vRAM, I can tell that the product is slightly different.
I was unable to query the device's shader count. The K520 (similar to a GeForce 680) has 1536 per GPU which sounds about right (but, again, pure speculation).
I also tested the server with TCPing to measure its networking performance versus the cluster compute instances. I did not do anything like Speedtest or Netalyzr. With a normal cluster instance I achieve about 20-25ms pings; with this instance I was more in the 45-50ms range. Of course, your mileage may vary and this should not be used as any official benchmark. If you are considering using the instance for your product, launch an instance and run your own tests. It is not expensive. Still, it seems to be less responsive than Cluster Compute instances which is odd considering its intended gaming usage.
Regardless, now that Amazon picked up GRID, we might see more services (be it consumer or enterprise) which utilizes this technology. The new GPU instances start at $0.65/hr for Linux and $0.767/hr for Windows (excluding extra charges like network bandwidth) on demand. Like always with EC2, if you will use these instances a lot, you can get reduced rates if you pay a fee upfront.
Subject: Graphics Cards | October 28, 2013 - 09:29 AM | Ryan Shrout
Tagged: nvidia, kepler, gtx 780 ti, gtx 780, gtx 770, geforce
A lot of news coming from the NVIDIA camp today, including some price drops and price announcements.
First up, the high-powered GeForce GTX 780 is getting dropped from $649 to $499, a $150 savings that will bring the GTX 780 into line with the competition of AMD's new Radeon R9 290X launched last week.
Next, the GeForce GTX 770 2GB is going to drop from $399 to $329 to help it compete more closely with the R9 280X.
Even you weren't excited about the R9 290X, you have to be excited by competition.
In a surprising turn of events, NVIDIA is now the company with the great bundle deal with GPUs as well! Starting today you'll be able to get a free copy of Batman: Arkham Origins, Splinter Cell: Blacklist and Assassin's Creed IV: Black Flag with the GeForce GTX 780 Ti, GTX 780 and GTX 770. If you step down to the GTX 760 or 660 you'll lose out on the Batman title.
SHIELD discounts are available as well: $100 off you buy the upper tier GPUs and $50 off if you but the lower tier.
UPDATE: NVIDIA just released a new version of GeForce Experience that enabled ShadowPlay, the ability to use Kepler GPUs to record game play in the background with almost no CPU/system ovehead. You can see Scott's initial impressions of the software right here; it seems like its going to be a pretty awesome feature.
Need more news? The yet-to-be-released GeForce GTX 780 Ti is also getting a price - $699 based on the email we just received. And it will be available starting November 7th!!
With all of this news, how does it change our stance on the graphics market? Quite a bit in fact. The huge price drop on the GTX 780, coupled with the 3-game bundle means that NVIDIA is likely offering the better hardware/software combo for gamers this fall. Yes, the R9 290X is likely still a step faster, but now you can get the GTX 780, three great games and spend $50 less.
The GTX 770 is now poised to make a case for itself against the R9 280X as well with its $70 drop. The R9 280X / HD 7970 GHz Edition was definitely a better option with its $100 price delta but with only $30 separating the two competing cards, and the three free games, again the advantage will likely fall to NVIDIA.
Finally, the price point of the GTX 780 Ti is interesting - if NVIDIA is smart they are pricing it based on comparable performance to the R9 290X from AMD. If that is the case, then we can guess the GTX 780 Ti will be a bit faster than the Hawaii card, while likely being quieter and using less power too. Oh, and again, the three game bundle.
NVIDIA did NOT announce a GTX TITAN price drop which might surprise some people. I think the answer as to why will be addressed with the launch of the GTX 780 Ti next month but from what I was hearing over the last couple of weeks NVIDIA can't make the cards fast enough to satisfy demand so reducing margin there just didn't make sense.
NVIDIA has taken a surprisingly aggressive stance here in the discrete GPU market. The need to address and silence critics that think the GeForce brand is being damaged by the AMD console wins is obviously potent inside the company. The good news for us though, and the gaming community as a whole, is that just means better products and better value for graphics card purchases this holiday.
NVIDIA says these price drops will be live by tomorrow. Enjoy!
ShadowPlay is NVIDIA's latest addition to their GeForce Experience platform. This feature allows their GPUs, starting with Kepler, to record game footage either locally or stream it online through Twitch.tv (in a later update). It requires Kepler GPUs because it is accelerated by that hardware. The goal is to constantly record game footage without any noticeable impact to performance; that way, the player can keep it running forever and have the opportunity to save moments after they happen.
Also, it is free.
I know that I have several gaming memories which come unannounced and leave undocumented. A solution like this is very exciting to me. Of course a feature on paper not the same as functional software in the real world. Thankfully, at least in my limited usage, ShadowPlay mostly lives up to its claims. I do not feel its impact on gaming performance. I am comfortable leaving it on at all times. There are issues, however, that I will get to soon.
This first impression is based on my main system running the 331.65 (Beta) GeForce drivers recommended for ShadowPlay.
- Intel Core i7-3770, 3.4 GHz
- NVIDIA GeForce GTX 670
- 16 GB DDR3 RAM
- Windows 7 Professional
- 1920 x 1080 @ 120Hz.
- 3 TB USB3.0 HDD (~50MB/s file clone).
The two games tested are Starcraft II: Heart of the Swarm and Battlefield 3.
Subject: General Tech, Graphics Cards | October 23, 2013 - 12:21 AM | Scott Michaud
Tagged: nvidia, graphics drivers, geforce
Mid-June kicked up a storm of poop across the internet when IGN broke the AMD optimizations for Frostbite 3. It was reported that NVIDIA would not receive sample code for those games until after they launched. The article was later updated with a statement from AMD: "... the AMD Gaming Evolved program undertakes no efforts to prevent our competition from optimizing for games before their release."
Now, I assume, the confusion was caused by then-not-announced Mantle.
And, as it turns out, NVIDIA did receive the code for Battlefield 4 prior to launch. Monday, the company launched their 331.58 WHQL-certified drivers which are optimized for Batman: Arkham Origins and Battlefield 4. According to the release notes, you should even be able to use SLi out of the gate. If, on the other hand, you are a Civilization V player: HBAO+ should enhance your shadowing.
They also added a DX11 SLi profile for Watch Dogs... awkwarrrrrd.
To check out the blog at GeForce.com for a bit more information, check out the release notes, or just head over to the drivers page. If you have GeForce Experience installed, it probably already asked you to update.
The Really Good Times are Over
We really do not realize how good we had it. Sure, we could apply that to budget surpluses and the time before the rise of global terrorism, but in this case I am talking about the predictable advancement of graphics due to both design expertise and improvements in process technology. Moore’s law has been exceptionally kind to graphics. We can look back and when we plot the course of these graphics companies, they have actually outstripped Moore in terms of transistor density from generation to generation. Most of this is due to better tools and the expertise gained in what is still a fairly new endeavor as compared to CPUs (the first true 3D accelerators were released in the 1993/94 timeframe).
The complexity of a modern 3D chip is truly mind-boggling. To get a good idea of where we came from, we must look back at the first generations of products that we could actually purchase. The original 3Dfx Voodoo Graphics was comprised of a raster chip and a texture chip, each contained approximately 1 million transistors (give or take) and were made on a then available .5 micron process (we shall call it 500 nm from here on out to give a sense of perspective with modern process technology). The chips were clocked between 47 and 50 MHz (though often could be clocked up to 57 MHz by going into the init file and putting in “SET SST_GRXCLK=57”… btw, SST stood for Sellers/Smith/Tarolli, the founders of 3Dfx). This revolutionary graphics card at the time could push out 47 to 50 megapixels and had 4 MB of VRAM and was released in the beginning of 1996.
My first 3D graphics card was the Orchid Righteous 3D. Voodoo Graphics was really the first successful consumer 3D graphics card. Yes, there were others before it, but Voodoo Graphics had the largest impact of them all.
In 1998 3Dfx released the Voodoo 2, and it was a significant jump in complexity from the original. These chips were fabricated on a 350 nm process. There were three chips to each card, one of which was the raster chip and the other two were texture chips. At the top end of the product stack was the 12 MB cards. The raster chip had 4 MB of VRAM available to it while each texture chip had 4 MB of VRAM for texture storage. Not only did this product double performance from the Voodoo Graphics, it was able to run in single card configurations at 800x600 (as compared to the max 640x480 of the Voodoo Graphics). This is the same time as when NVIDIA started to become a very aggressive competitor with the Riva TnT and ATI was about to ship the Rage 128.
Subject: Graphics Cards, Displays | October 20, 2013 - 02:50 PM | Ryan Shrout
Tagged: video, tom petersen, nvidia, livestream, live, g-sync
UPDATE: If you missed our live stream today that covered NVIDIA G-Sync technology, you can watch the replay embedded below. NVIDIA's Tom Petersen stops by to talk about G-Sync in both high level and granular detail while showing off some demonstrations of why G-Sync is so important. Enjoy!!
Last week NVIDIA hosted press and developers in Montreal to discuss a couple of new technologies, the most impressive of which was NVIDIA G-Sync, a new monitor solution that looks to solve the eternal debate of smoothness against latency. If you haven't read about G-Sync and how impressive it was when first tested on Friday, you should check out my initial write up, NVIDIA G-Sync: Death of the Refresh Rate, that not only does that, but dives into the reason the technology shift was necessary in the first place.
G-Sync essentially functions by altering and controlling the vBlank signal sent to the monitor. In a normal configuration, vBlank is a combination of the combination of the vertical front and back porch and the necessary sync time. That timing is set a fixed stepping that determines the effective refresh rate of the monitor; 60 Hz, 120 Hz, etc. What NVIDIA will now do in the driver and firmware is lengthen or shorten the vBlank signal as desired and will send it when one of two criteria is met.
- A new frame has completed rendering and has been copied to the front buffer. Sending vBlank at this time will tell the screen grab data from the card and display it immediately.
- A substantial amount of time has passed and the currently displayed image needs to be refreshed to avoid brightness variation.
In current display timing setups, the submission of the vBlank signal has been completely independent from the rendering pipeline. The result was varying frame latency and either horizontal tearing or fixed refresh frame rates. With NVIDIA G-Sync creating an intelligent connection between rendering and frame updating, the display of PC games is fundamentally changed.
Every person that saw the technology, including other media members and even developers like John Carmack, Johan Andersson and Tim Sweeney, came away knowing that this was the future of PC gaming. (If you didn't see the panel that featured those three developers on stage, you are missing out.)
But it is definitely a complicated technology and I have already seen a lot of confusion about it in our comment threads on PC Perspective. To help the community get a better grasp and to offer them an opportunity to ask some questions, NVIDIA's Tom Petersen is stopping by our offices on Monday afternoon where he will run through some demonstrations and take questions from the live streaming audience.
Be sure to stop back at PC Perspective on Monday, October 21st at 2pm ET / 11am PT as to discuss G-Sync, how it was developed and the various ramifications the technology will have in PC gaming. You'll find it all on our PC Perspective Live! page on Monday but you can sign up for our "live stream mailing list" as well to get notified in advance!
NVIDIA G-Sync Live Stream
11am PT / 2pm ET - October 21st
We also want your questions!! The easiest way to get them answered is to leave them for us here in the comments of this post. That will give us time to filter through the questions and get the answers you need from Tom. We'll take questions via the live chat and via Twitter (follow me @ryanshrout) during the event but often time there is a lot of noise to deal with.
So be sure to join us on Monday afternoon!
Subject: Graphics Cards | October 18, 2013 - 07:55 PM | Ryan Shrout
Tagged: video, tim sweeney, nvidia, Mantle, john carmack, johan andersson, g-sync, amd
If you weren't on our live stream from the NVIDIA "The Way It's Meant to be Played" tech day this afternoon, you missed a hell of an event. After the announcement of NVIDIA G-Sync variable refresh rate monitor technology, NVIDIA's Tony Tomasi brough one of the most intriguing panels of developers on stage to talk.
John Carmack, Tim Sweeney and Johan Andersson talk for over an hour, taking questions from the audience and even getting into debates amongst themselves in some instances. Topics included NVIDIA G-Sync of course, AMD's Mantle low-level API, the hurdles facing PC gaming and what direction each luminary is currently on for future development.
If you are a PC enthusiast or gamer you are definitely going to want to listen and watch the video below!
Subject: General Tech, Graphics Cards | October 18, 2013 - 01:21 PM | Scott Michaud
Tagged: nvidia, GeForce GTX 780 Ti
So the really interesting news today was G-Sync but that did not stop NVIDIA from sneaking in a new high-end graphics card. The GeForce GTX 780 Ti follows the company's old method of releasing successful products:
- Attach a seemingly arbitrary suffix to a number
In all seriousness, we know basically nothing about this card. It is entirely possible that its architecture might not even be based on GK110. We do know it will be faster than a GeForce 780 but we have no frame of reference in regards to the GeForce Titan. The two cards were already so close in performance that Ryan struggled to validate the 780's existence. Imagine how difficult it would be for NVIDIA to wedge yet another product in that gap.
And if it does outperform the Titan, what is its purpose? Sure, Titan is a GPGPU powerhouse if you want double-precision performance without purchasing a Tesla or a Quadro, but that is not really relevant for gamers yet.
We shall see, soon, when we get review samples in. You, on the other hand, will likely see more when the card launches mid-November. No word on pricing.
Our Legacys Influence
We are often creatures of habit. Change is hard. And often times legacy systems that have been in place for a very long time can shift and determine the angle at which we attack new problems. This happens in the world of computer technology but also outside the walls of silicon and the results can be dangerous inefficiencies that threaten to limit our advancement in those areas. Often our need to adapt new technologies to existing infrastructure can be blamed for stagnant development.
Take the development of the phone as an example. The pulse based phone system and the rotary dial slowed the implementation of touch dial phones and forced manufacturers to include switches to select between pulse and tone based dialing options on phones for decades.
Perhaps a more substantial example is that of the railroad system that has based the track gauge (width between the rails) on the transportation methods that existed before the birth of Christ. Horse drawn carriages pulled by two horses had an axle gap of 4 feet 8 inches in the 1800s and thus the first railroads in the US were built with a track gauge of 4 feet 8 inches. Today, the standard rail track gauge remains 4 feet 8 inches despite the fact that a wider gauge would allow for more stability of larger cargo loads and allow for higher speed vehicles. But the cost of updating the existing infrastructure around the world would be so cost prohibitive that it is likely we will remain with that outdated standard.
What does this have to do with PC hardware and why am I giving you an abbreviated history lesson? There are clearly some examples of legacy infrastructure limiting our advancement in hardware development. Solid state drives are held back by the current SATA based storage interface though we are seeing movements to faster interconnects like PCI Express to alleviate this. Some compute tasks are limited by the “infrastructure” of standard x86 processor cores and the move to GPU compute has changed the direction of these workloads dramatically.
There is another area of technology that could be improved if we could just move past an existing way of doing things. Displays.