Our first DX12 Performance Results
Late last week, Microsoft approached me to see if I would be interested in working with them and with Futuremark on the release of the new 3DMark API Overhead Feature Test. Of course I jumped at the chance, with DirectX 12 being one of the hottest discussion topics among gamers, PC enthusiasts and developers in recent history. Microsoft set us up with the latest iteration of 3DMark and the latest DX12-ready drivers from AMD, NVIDIA and Intel. From there, off we went.
First we need to discuss exactly what the 3DMark API Overhead Feature Test is (and also what it is not). The feature test will be a part of the next revision of 3DMark, which will likely ship in time with the full Windows 10 release. Futuremark claims that it is the "world's first independent" test that allows you to compare the performance of three different APIs: DX12, DX11 and even Mantle.
It was almost one year ago that Microsoft officially unveiled the plans for DirectX 12: a move to a more efficient API that can better utilize the CPU and platform capabilities of future, and most importantly current, systems. Josh wrote up a solid editorial on what we believe DX12 means for the future of gaming, and in particular for PC gaming, that you should check out if you want more background on the direction DX12 has set.
One of DX12 keys for becoming more efficient is the ability for developers to get closer to the metal, which is a phrase to indicate that game and engine coders can access more power of the system (CPU and GPU) without having to have its hand held by the API itself. The most direct benefit of this, as we saw with AMD's Mantle implementation over the past couple of years, is improved quantity of draw calls that a given hardware system can utilize in a game engine.
Subject: Graphics Cards | March 23, 2015 - 07:30 AM | Scott Michaud
Tagged: quadro, nvidia, m6000, gm200
Alongside the Titan X, NVIDIA has announced the Quadro M6000. In terms of hardware, they are basically the same component: 12 GB of GDDR5 on a 384-bit memory bus, 3072 CUDA cores, and a reduction in double precision performance to 1/32nd of its single precision. The memory, but not the cache, is capable of ECC (error-correction) for enterprises who do not want a stray photon to mess up their computation. That might be the only hardware difference between it and the Titan X.
Compared to other Quadro cards, it loses some double precision performance as mentioned earlier, but it will be an upgrade in single precision (FP32). The add-in board connects to the power supply with just a single eight-pin plug. Technically, with its 250W TDP, it is slightly over the rating for one eight-pin PCIe connector, but NVIDIA told Anandtech that they're confident that it won't matter for the card's intended systems.
That is probably true, but I wouldn't put it past someone to do something spiteful given recent events.
The lack of double precision performance (IEEE 754 FP64) could be disappointing for some. While NVIDIA would definitely know their own market better than I do, I was under the impression that a common workstation system for GPU compute was a Quadro driving a few Teslas (such as two of these). It would seem weird for a company to have such a high-end GPU be paired with Teslas that have such a significant difference in FP64 compute. I wonder what this means for the Tesla line, and whether we will see a variant of Maxwell with a large boost in 64-bit performance, or if that line will be in an awkward place until Pascal.
Or maybe not? Maybe NVIDIA is planning to launch products based on an unannounced, FP64-focused architecture? The aim could be to let the Quadro deal with the heavy FP32 calculations, while the customer could opt to load co-processors according to their double precision needs? It's an interesting thought as I sit here at my computer musing to myself, but then I immediately wonder why did they not announce it at GTC if that is the case? If that is the case, and honestly I doubt it because I'm just typing unfiltered thoughts here, you would think they would kind-of need to be sold together. Or maybe not. I don't know.
Pricing and availability is not currently known, except that it is “soon”.
Subject: Graphics Cards | March 19, 2015 - 03:20 PM | Jeremy Hellstrom
Tagged: titan x, nvidia, gtx titan x, gm200, geforce, 4k
You have read Ryan's review of the $999 behemoth from NVIDIA and now you can take the opportunity to see what other reviewers think of the card. [H]ard|OCP tested it against the GTX 980 which shares the same cooler and is every bit as long as the TITAN X. Along the way they found a use for the 12GB of VRAM as both Watch_Dogs and Far Cry 4 used over 7GB of memory when tested at 4k resolution though the frame rates were not really playable, you will need at least two TITAN X's to pull that off. They will be revisiting this card in the future, providing more tests for a card with incredible performance and an even more incredible price.
"The TITAN X video card has 12GB of VRAM, not 11.5GB, 50% more streaming units, 50% more texture units, and 50% more CUDA cores than the current GTX 980 flagship NVIDIA GPU. While this is not our full TITAN X review, this preview focuses on what the TITAN X delivers when directly compared to the GTX 980."
Here are some more Graphics Card articles from around the web:
- Nvidia's GeForce GTX Titan X @ The Tech Report
- NVIDIA GeForce Titan X 12GB GPU Review @HiTech Legion
- NVIDIA GeForce GTX Titan X Review @ OCC
- NVIDIA GTX TITAN X Performance Review @ Hardware Canucks
- he New Single GPU King Of The Hill: A Look At NVIDIA’s GeForce GTX TITAN X @ Techgage
- NVIDIA GeForce GTX Titan X Review @ Neoseeker
- NVIDIA GeForce Titan X 12 GB @ techPowerUp
- ASUS ROG Poseidon GTX 980 Platinum vs. AMD R9 295X2 @ [H]ard|OCP
- Inno3D GeForce GTX 960 iChill X3 Air Boss Ultra @ Kitguru
- EVGA GTX 960 SuperSC SweetSpot Plus Style @ Bjorn3d
Subject: General Tech, Graphics Cards, Shows and Expos | March 17, 2015 - 03:44 PM | Allyn Malventano
Tagged: nvidia, DIGITS
At GTC, NVIDIA announced a new device called the DIGITS DevBox:
The DIGITS DevBox is a device that data scientists can purchase and install locally. Plugged into a single electrical outlet, this modified Corsair Air 540 case equipped with quad TITAN X (reviewed here) GPUs can crank out 28 TeraFLOPS of compute power. The installed CPU is a Haswell-E 5930K, and the system is rated to draw 1300W of power. NVIDIA is building these in-house as the expected volume is low, with these units likely going to universities and small compute research firms.
Why would you want such compute power?
DIGITS is a software package available from NVIDIA. Its purpose is to act as a tool for data scientists to manipulate deep learning environments (neural networks). This package, running on a DIGITS DevBox, will give much more compute power capability to scientists who need it for their work. Getting this tech in the hands of more scientists will accelerate this technology and lead to what NVIDIA hopes will be a ‘Big Bang’ in this emerging GPU-compute-heavy field.
Ryan interviewed the lead developer of DIGITS in the video below. This offers a great explanation (and example) of what this deep learning stuff is all about:
With the release of the GeForce GTX 980 back in September of 2014, NVIDIA took the lead in performance with single GPU graphics cards. The GTX 980 and GTX 970 were both impressive options. The GTX 970 offered better performance than the R9 290 as did the GTX 980 compared to the R9 290X; on top of that, both did so while running at lower power consumption and while including new features like DX12 feature level support, HDMI 2.0 and MFAA (multi-frame antialiasing). Because of those factors, the GTX 980 and GTX 970 were fantastic sellers, helping to push NVIDIA’s market share over 75% as of the 4th quarter of 2014.
But in the back of our mind, and in the minds of many NVIDIA fans, we knew that the company had another GPU it was holding on to: the bigger, badder version of Maxwell. The only question was going to be WHEN the company would release it and sell us a new flagship GeForce card. In most instances, this decision is based on the competitive landscape, such as when AMD might be finally updating its Radeon R9 290X Hawaii family of products with the rumored R9 390X. Perhaps NVIDIA is tired of waiting or maybe the strategy is to launch soon before Fiji GPUs make their debut. Either way, NVIDIA officially took the wraps off of the new GeForce GTX TITAN X at the Game Developers Conference two weeks ago.
At the session hosted by Epic Games’ Tim Sweeney, NVIDIA CEO Jen-Hsun Huang arrived when Tim lamented about needing more GPU horsepower for their UE4 content. In his hands he had the first TITAN X GPU and talked about only a couple of specifications: the card would have 12GB of memory and it would be based on a GPU with 8 billion transistors.
Since that day, you have likely seen picture after picture, rumor after rumor, about specifications, pricing and performance. Wait no longer: the GeForce GTX TITAN X is here. With a $999 price tag and a GPU with 3072 CUDA cores, we clearly have a new king of the court.
Subject: Graphics Cards | March 17, 2015 - 01:47 PM | Ryan Shrout
Tagged: pascal, nvidia, gtc 2015, GTC, geforce
At the keynote of the GPU Technology Conference (GTC) today, NVIDIA CEO Jen-Hsun Huang disclosed some more updates on the roadmap for future GPU technologies.
Most of the detail was around Pascal, due in 2016, that will introduce three new features including mixed compute precision, 3D (stacked) memory, and NVLink. Mixed precision is a method of computing in FP16, allowing calculations to run much faster at lower accuracy than full single or double precision when they are not necessary. Keeping in mind that Maxwell doesn't have an implementation with full speed DP compute (today), it would seem that NVIDIA is targeting different compute tasks moving forward. Though details are short, mixed precision would likely indicate processing cores than can handle both data types.
3D memory is the ability to put memory on-die with the GPU directly to improve overall memory banwidth. The visual diagram that NVIDIA showed on stage indicated that Pascal would have 750 GB/s of bandwidth, compared to 300-350 GB/s on Maxwell today.
NVLink is a new way of connecting GPUs, improving on bandwidth by more than 5x over current implementations of PCI Express. They claim this will allow for connecting as many as 8 GPUs for deep learning performance improvements (up to 10x). What that means for gaming has yet to be discussed.
NVIDIA made some other interesting claims as well. Pascal will be more than 2x more performance per watt efficient than Maxwell, even without the three new features listed above. It will also ship (in a compute targeted product) with a 32GB memory system compared to the 12GB of memory announced on the Titan X today. Pascal will also have 4x the performance in mixed precision compute.
Subject: Graphics Cards, Shows and Expos | March 17, 2015 - 10:31 AM | Ryan Shrout
Tagged: nvidia, video, GTC, gtc 2015
NVIDIA is streaming today's keynote from the GPU Technology Conference (GTC) on Ustream, and we have the embed below for you to take part. NVIDIA CEO Jen-Hsun Huang will reveal the details about the new GeForce GTX TITAN X but there are going to be other announcements as well, including one featuring Tesla CEO Elon Musk.
Should be interesting!
Subject: Graphics Cards | March 16, 2015 - 07:13 PM | Ryan Shrout
Tagged: video, tom petersen, titan x, nvidia, maxwell, live, gtx titan x, gtx, gm200, geforce
UPDATE 2: If you missed the live stream, we now have the replay available below!
UPDATE: The winner has been announced: congrats to Ethan M. for being selected as the random winner of the GeForce GTX TITAN X graphics card!!
Get yourself ready, it’s time for another GeForce GTX live stream hosted by PC Perspective’s Ryan Shrout! This time the focus is going to be NVIDIA's brand-new GeForce GTX TITAN X graphics card, first teased a couple of weeks back at GDC. NVIDIA's Tom Petersen will be joining us live from the GPU Technology Conference show floor to discuss the GM200 GPU, it's performance and to show off some demos of the hardware in action.
And what's a live stream without a prize? One lucky live viewer will win a GeForce GTX TITAN X 12GB graphics card of their very own! That's right - all you have to do is tune in for the live stream tomorrow afternoon and you could win a Titan X!!
NVIDIA GeForce GTX TITAN X Live Stream and Giveaway
1pm PT / 4pm ET - March 17th
Need a reminder? Join our live mailing list!
The event will take place Tuesday, March 17th at 1pm PT / 4pm ET at http://www.pcper.com/live. There you’ll be able to catch the live video stream as well as use our chat room to interact with the audience. To win the prize you will have to be watching the live stream, with exact details of the methodology for handing out the goods coming at the time of the event.
Tom has a history of being both informative and entertaining and these live streaming events are always full of fun and technical information that you can get literally nowhere else.
If you have questions, please leave them in the comments below and we'll look through them just before the start of the live stream. Of course you'll be able to tweet us questions @pcper and we'll be keeping an eye on the IRC chat as well for more inquiries. What do you want to know and hear from Tom or I?
So join us! Set your calendar for this coming Tuesday at 1pm PT / 4pm ET and be here at PC Perspective to catch it. If you are a forgetful type of person, sign up for the PC Perspective Live mailing list that we use exclusively to notify users of upcoming live streaming events including these types of specials and our regular live podcast. I promise, no spam will be had!
Huge thanks to ASUS for supplying a new G751JY notebook, featuring an Intel Core i7-4710HQ and a GeForce GTX 980M 4GB GPU to power our live stream from GTC!!
Subject: Graphics Cards | March 14, 2015 - 02:12 PM | Ryan Shrout
Tagged: nvidia, quadro, m6000, deadmau5, gtc 2015
Sometimes information comes from the least likely of sources. Deadmau5, one of the world's biggest names in house music, posted an interesting picture to his Instagram feed a couple of days ago.
Well, that's interesting. A quick hunt on Google for the NVIDIA M6000 reveals rumors of it being a GM200-based 12GB graphics card. Sound familiar? NVIDIA recently announced the GeForce GTX TITAN X based on an identical configuration at GDC last week.
A backup of the Instagram image...in case it gets removed.
With NVIDIA's GPU Technology Conference coming up starting this Tuesday, it would appear we have more than one version of GM200 incoming.
Subject: Graphics Cards | March 12, 2015 - 11:13 PM | Sebastian Peak
Tagged: nvidia, maxwell, GTX 960M, GTX 950M, gtx 860m, gtx 850m, gm107, geforce
NVIDIA has announced new GPUs to round out their 900-series mobile lineup, and the new GTX 960M and GTX 950M are based on the same GM107 core as the previous 860M/850M parts.
Both GPUs feature 640 CUDA Cores and are separated by Base clock speed, with the GTX 960M operating at 1096 MHz and GTX 950M at 914 MHz. Both have unlisted maximum Boost frequencies that will likely vary based on thermal constraints. The memory interface is the other differentiator between the GPUs, with the GTX 960M sporting dedicated GDDR5 memory, and the GTX 950M can be implemented with either DDR3 or GDDR5 memory. Both GTX 960M and 950M use the same 128-bit memory interface and support up to 4GB of memory.
As reported by multiple sources the core powering the 960M/950M is a GM107 Maxwell GPU, which means that we are essentially talking about rebadged 860M/850M products, though the unlisted Boost frequencies could potentially be higher with these parts with improved silicon on a mature 28nm process. In contrast the previously announced GTX 965M is based on a cut down Maxwell GM204 GPU, with its 1024 CUDA Cores representing half of the GPU core introduced with the GTX 980.
New notebooks featuring the GTX 960M have already been announced by NVIDIA's partners, so we will soon see if there is any performance improvement to these refreshed GM107 parts.