Through the looking glass
Futuremark has long been the most consistent and most widely used PC benchmark developer. While other companies have faltered and faded, Futuremark continues to push forward with new benchmarks and capabilities, maintaining a modern, standardized way to compare performance across platforms.
Back in March of 2015, 3DMark added support for an API Overhead test to help gamers and editors understand the performance advantages of Mantle and DirectX 12 compared to existing APIs. Though the results were purely “peak theoretical” numbers, the data helped showcase to consumers and developers what low-level APIs brought to the table.
Today Futuremark is releasing a new benchmark that focuses on DX12 gaming. No longer just a feature test, Time Spy is a fully baked benchmark with its own rendering engine and scenarios for evaluating the performance of graphics cards and platforms. It requires Windows 10 and a DX12-capable graphics card, and includes two different graphics tests and a CPU test. Oh, and of course, there is a stunningly gorgeous demo mode to go along with it.
I’m not going to spend much time here dissecting the benchmark itself, but it does make sense to have an idea of what kind of technologies are built into the game engine and tests. The engine is based purely on DX12, and integrates technologies like asynchronous compute, explicit multi-adapter and multi-threaded workloads. These are highly topical ideas and will be the focus of my testing today.
Futuremark provides an interesting diagram to demonstrate the advantages DX12 has over DX11. Below you will find a listing of the average number of vertices, triangles, patches and shader calls in 3DMark Fire Strike compared with 3DMark Time Spy.
It’s not even close here – the new Time Spy engine issues more than ten times as many processing calls for some of these items. As Futuremark states, however, this kind of capability isn’t free.
With DirectX 12, developers can significantly improve the multi-thread scaling and hardware utilization of their titles. But it requires a considerable amount of graphics expertise and memory-level programming skill. The programming investment is significant and must be considered from the start of a project.
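For readers curious what that multi-threaded scaling actually looks like, here is a minimal, hypothetical sketch of the DX12 pattern Futuremark is describing. To be clear, this is not Time Spy's engine code, just an illustration of parallel command-list recording: each worker thread records into its own command list and allocator, and everything is submitted to the GPU queue in one call.

```cpp
// Minimal sketch of DX12-style multi-threaded command list recording.
// Illustrative only -- not Futuremark's code. Error handling, pipeline
// state, and real draw calls are omitted. Link against d3d12.lib.
#include <d3d12.h>
#include <wrl/client.h>
#include <thread>
#include <vector>

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<ID3D12Device> device;
    D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device));

    D3D12_COMMAND_QUEUE_DESC queueDesc = {};
    queueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    ComPtr<ID3D12CommandQueue> queue;
    device->CreateCommandQueue(&queueDesc, IID_PPV_ARGS(&queue));

    const int kThreads = 4;
    std::vector<ComPtr<ID3D12CommandAllocator>> allocators(kThreads);
    std::vector<ComPtr<ID3D12GraphicsCommandList>> lists(kThreads);
    std::vector<std::thread> workers;

    // Each thread owns its allocator and command list, so recording needs
    // no locks -- this is the core of DX12's multi-thread scaling story.
    for (int i = 0; i < kThreads; ++i) {
        device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
                                       IID_PPV_ARGS(&allocators[i]));
        device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                                  allocators[i].Get(), nullptr,
                                  IID_PPV_ARGS(&lists[i]));
        workers.emplace_back([&, i] {
            // ... record draw calls for this thread's slice of the scene ...
            lists[i]->Close();
        });
    }
    for (auto& w : workers) w.join();

    // Submit all recorded work to the GPU in a single call.
    std::vector<ID3D12CommandList*> raw;
    for (auto& l : lists) raw.push_back(l.Get());
    queue->ExecuteCommandLists(static_cast<UINT>(raw.size()), raw.data());
    return 0;
}
```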
Subject: Graphics Cards | July 13, 2016 - 09:20 PM | Scott Michaud
Tagged: vulkan, R9 Fury X, nvidia, Mantle, gtx 1070, fury x, doom, amd
We haven't yet benchmarked DOOM on Vulkan

Update (immediately after posting): Ryan has just informed me that, apparently, we did benchmark Vulkan on our YouTube page (embed below). I knew we were working on it; I just didn't realize we had published the content yet. The original post continues below.
As far as I know, we're still getting our frame time analysis software running on the new API, but other sites have posted framerate-based results. Those results show that AMD's cards benefit greatly from the new, Mantle-derived interface (versus the OpenGL one). NVIDIA's cards, on the other hand, never really see a decrease (no more than about 1%, at least), but they don't get much of a boost, either.
Image Credit: ComputerBase.de
I tweeted at id's lead renderer programmer, Tiago Sousa, to ask whether they take advantage of NVIDIA-specific extensions on the OpenGL path (like command submission queues). I haven't gotten a response yet, so it's difficult to tell whether this speaks more to NVIDIA's OpenGL performance or to AMD's Vulkan performance. In the end, it doesn't really matter, though. AMD's Fury X (which can be found for as low as $399 with a mail-in rebate) is beating the GTX 1070 (which is in stock for the low $400s) by a fair margin. The Fury X also beats its own OpenGL performance by up to 66% (at 1080p) with the new API.
The API should also make it easier for games to pace their frames, which should allow smoother animation at these higher framerates. That said, we can't know for sure from FPS numbers alone. The gains from AMD are impressive, though.
Subject: Graphics Cards | July 12, 2016 - 12:01 AM | Tim Verry
Tagged: strix, rx 480, Radeon RX 480, polaris 10, asus, amd
Alongside the launch of AMD’s reference design Radeon RX 480, the company’s various AIB (Add-In Board) partners began announcing their own custom versions pairing AMD’s Polaris 10 GPU with custom PCBs and coolers. Asus took the launch to heart and teased its Radeon RX 480 STRIX under its ROG lineup. The press release was rather scant on details, but it does look like a promising card that will let users really push Polaris 10 to its limits.
Thanks to forum user Eroticus over at VideoCardz, we can see that the RX 480 STRIX uses a custom PCB and power delivery design that feeds the GPU via two PCI-E power connectors in addition to the PCI-E slot. Asus is not talking GPU clock speeds, but it did reveal that it is going with 8GB of GDDR5 memory at 8 GHz. The DirectCU III cooler pairs heatpipes and an aluminum fin stack with three shrouded fans. There is also a backplate (with an LED-backlit logo, of course) which should help support the card and provide a bit more cooling.
I would not expect too much of a factory (out of the box) overclock from this card. However, I do expect that users will be able to seriously overclock the Polaris 10 GPU thanks to the extra power connector (allegedly one 6-pin and one 8-pin, which seems a bit much, but we’ll see!) and the beefy air cooler.
For reference, the, well, reference design RX 480 has base and boost clock speeds of 1120 MHz and 1266 MHz, respectively. The Polaris 10 GPU has 2,304 cores, 144 texture units, and 32 raster operators. If buyers get a good chip in their RX 480 STRIX, it may be possible to hit a 1400 MHz boost, as some rumors around the Internet claim, though it’s hard to say for sure since that may require quite a bit more voltage (and heat) to reach. I wouldn’t put it out of the realm of possibility, though!
Of course, it would not be Republic of Gamers material without LEDs, and ASUS delivers with Aura RGB LEDs on the cooler shroud and backplate, which I believe are user-configurable through Asus’ software utility.
Beyond that, not much is known about the upcoming RX 480 STRIX graphics card. Stay tuned to PC Perspective for more information as it gets closer to availability!
- The AMD Radeon RX 480 Review - The Polaris Promise
- PowerColor Radeon RX 480 Red Devil Leak
- PCPer Live! Radeon RX 480 Live Stream with Raja Koduri!
- Meet ASUS' DirectCU III on the Radeon Fury
Subject: Graphics Cards | July 11, 2016 - 01:59 PM | Jeremy Hellstrom
Tagged: GTX 1080, GameRock Premium, palit, factory overclocked
Palit's card is certainly unique looking in the GTX 1080 market; that blue, white, and silver is not a colour palette used by other manufacturers. Nor is that the only difference between this card and a stock GTX 1080: it is also overclocked, with a 1746 MHz core and VRAM at 1315 MHz, along with a cooler that covers the entire card and takes up three slots. That extra cooling ability translates into a card that runs at 30 dBA under load, and TechPowerUp did not see temperatures exceed 72°C. It is a little on the expensive side, but if you have the space in your case, this is a worthy contender for your hard-earned cash.
"Palit's GTX 1080 GameRock uses a mighty triple-slot dual-fan design, which provides excellent temperatures and noise levels better than any GTX 1080 we tested so far. The fans also turn off in idle, and thanks to the large overclock out the box, the card is the fastest GTX 1080 we ever tested, too."
Here are some more Graphics Card articles from around the web:
- EVGA GeForce GTX 1070 SuperClocked 8 GB @ techPowerUp
- ASUS STRIX GAMING GTX 1080 @ eTeknix
- ASUS GTX 950-2G “Unplugged” @ Kitguru
- PNY GTX 950 2GB and GTX 960 4GB XLR8 OC Gaming @ Kitguru
- Radeon Software 16.7.1 Performance Comparison @ Tech ARP
Subject: Graphics Cards | July 7, 2016 - 10:13 PM | Scott Michaud
Tagged: nvidia, GTX 1080, ea, dice, battlefield, battlefield 1
Battlefield 1 looks pretty good. To compare how it scales between its settings, DigitalFoundry took a short amount of video at 4K across all four omnibus graphics settings: Low, Medium, High, and Ultra. These are, as should be expected for a high-end PC game, broken down into more specific categories, like lighting quality and texture filtering, but I can't blame them for not adding that many permutations to a single video. It would just be a mess.
To my eye, the rendering itself doesn't change too much between settings. Higher quality settings draw more distant objects than lower ones and increase the range at which level of detail falls off, too. About a third of the way into the video, they show a house from a moderate distance. The lowest quality version was almost completely devoid of shadowing, and its windows would not even draw. The lighting then scaled up from there as the settings moved progressively toward Ultra.
Image Credit: DigitalFoundry
While it's still Alpha-level code, a single GTX 1080 was getting between 50 and 60 FPS at 4K. This is a good range to be in for a G-Sync monitor, as the single-card machine doesn't need to deal with multi-GPU issues, like pacing and driver support.
Subject: Graphics Cards | July 7, 2016 - 04:37 PM | Scott Michaud
Tagged: amd, rx 480, powercolor
According to Videocardz, a custom RX 480 from PowerColor has been caught on camera. The most interesting part of this variant is that it connects to the power supply with a single eight-pin PCIe connector. With AMD's latest driver, and hopefully a modified vBIOS and PCB as well, this should provide plenty of power for the GPU, even with overclocking.
Image Credit: Videocardz
The card itself is a three-fan design with three DisplayPorts, one HDMI, and a single DVI. This retains the reference design's three DisplayPorts, but also adds the option to use DVI without an adapter. I'm not sure whether all five connectors can be used simultaneously, which isn't too bad -- apparently the GTX 1080 also cannot use all five connectors at the same time, so I wouldn't plan on connecting five monitors to a single-GPU system, just in case.
No pricing and availability yet... this is just a picture. We don't even know clock rates.
Radeon Software 16.7.1 Adjustments
Last week we posted a story that looked at a problem with the new AMD Radeon RX 480 graphics card’s power consumption. The short version of the issue was that AMD’s new Polaris 10-based reference card was drawing more power than its stated 150 watt TDP, and that it was drawing more power through the motherboard PCI Express slot than the connection was rated for. Sometimes that added power draw was significant, both at stock settings and overclocked. Seeing current draw peak over 7A at stock settings on a connection rated for just 5.5A (validly) raised an alarm, and our initial report detailed the problem very specifically.
AMD responded initially that “everything was fine here,” but the company eventually saw the writing on the wall and started to work on potential solutions. The Radeon RX 480 is a very important product for the future of Radeon graphics, and this was a launch that needed to be as perfect as it could be. Though the risk to users’ hardware from the higher than expected current draw is muted somewhat by motherboard-based over-current protection, it’s crazy to think that AMD actually believed that was an acceptable scenario. Depending on the “circuit breaker” in any system to save you, when standards exist for exactly that purpose, is nuts.
Today AMD has released a new driver, version 16.7.1, that actually introduces a pair of fixes for the problem. One of them is hard coded into the software and adjusts power draw from the different +12V sources (PCI Express slot and 6-pin connector) while the other is an optional flag in the software that is disabled by default.
Reconfiguring the power phase controller
The Radeon RX 480 uses a very common power controller (IR3567B) on its PCB to cycle through the 6 power phases providing electricity to the GPU itself. Allyn did some simple multimeter trace work to tell us which phases were connected to which sources and the result is seen below.
The power controller is responsible for pacing the power coming in from the PCI Express slot and the 6-pin power connection to the GPU, in phases. Phases 1-3 come in from the power supply via the 6-pin connection, while phases 4-6 source power from the motherboard directly. At launch, the RX 480 drew nearly identical amounts of power from both the PEG slot and the 6-pin connection, essentially giving each of the 6 phases at work equal time.
That might seem okay, but it’s far from the standard of what we have seen in the past. In no other case have we measured a graphics card drawing as much power from the PEG slot as from an external power connector on the card. (Obviously for cards without external power connections, that’s a different discussion.) In general, with other AMD and NVIDIA based graphics cards, the motherboard slot would provide no more than 50-60 watts of power, while anything above that would come from the 6/8-pin connections on the card. In many cases I saw power draw through the PEG slot as low as 20-30 watts when the external power connections provided a lot of overage for the target TDP of the product.
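To put rough numbers to that, here is a back-of-the-envelope sketch of how the phase split translates to slot current. The total board draw and the rebalanced split below are assumed figures for illustration, not AMD's or Allyn's measurements; the 5.5A figure is the slot's 12V rating discussed above. With all six phases weighted equally, half the GPU's 12V power comes through the slot, which is exactly what the 16.7.1 driver rebalances away.

```cpp
// Back-of-the-envelope sketch of RX 480 PEG-slot current draw.
// The wattage and split values are assumptions for illustration only.
#include <cstdio>

int main() {
    const double rail_voltage = 12.0;  // +12V rail
    const double gpu_power    = 160.0; // assumed total 12V board draw, watts
    const double slot_rating  = 5.5;   // amps allowed on the slot's 12V pins

    // Six phases total: three fed by the 6-pin plug, three by the slot.
    // "share" is the fraction of power routed through the slot's phases.
    for (double share : {0.50, 0.33}) { // even split vs. a rebalanced split
        double slot_watts = gpu_power * share;
        double slot_amps  = slot_watts / rail_voltage;
        printf("slot share %.0f%% -> %.1f W, %.2f A (%s spec)\n",
               share * 100.0, slot_watts, slot_amps,
               slot_amps > slot_rating ? "over" : "within");
    }
    return 0;
}
```

At an even split, an assumed 160 W of 12V draw puts about 6.7 A through the slot, comfortably past the 5.5 A rating; shifting the weighting toward the 6-pin phases pulls it back within spec.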
Subject: Graphics Cards | July 7, 2016 - 02:50 PM | Sebastian Peak
Tagged: rx480, rx 480, Radeon RX 480, radeon, power draw, PCIe power, graphics drivers, driver, Crimson Edition 16.7.1, amd
As promised, AMD has released an updated driver for the RX 480 graphics card, and the Radeon Software Crimson Edition 16.7.1 promises a fix for the power consumption concerns we have been covering in-depth.
Note: We have published our full analysis of the new 16.7.1 driver, available here.
AMD lists these highlights for the new Crimson Edition 16.7.1 software:
"The Radeon RX 480’s power distribution has been improved for AMD reference boards, lowering the current drawn from the PCIe bus.
A new 'compatibility mode' UI toggle has been made available in the Global Settings menu of Radeon Settings. This option is designed to reduce total power with minimal performance impact if end users experience any further issues. This toggle is 'off' by default.
Performance improvements for the Polaris architecture that yield performance uplifts in popular game titles of up to 3%. These optimizations are designed to improve the performance of the Radeon RX 480, and should substantially offset the performance impact for users who choose to activate the 'compatibility' toggle."
You can go directly to AMD's page for this updated driver from this direct link: http://support.amd.com/en-us/download/desktop?os=Windows%2010%20-%2064
It’s probably not going to come as a surprise to anyone who reads the internet, but NVIDIA is officially taking the covers off the latest GeForce card in the Pascal family today, the GeForce GTX 1060. As the numbering scheme would suggest, this is a more budget-friendly version of NVIDIA’s latest architecture, with performance scaled down in line with expectations. The GP106-based GPU will still offer impressive specifications and capabilities, and will probably push AMD’s new Radeon RX 480 to its limits.
Let’s take a quick look at the card’s details.
| | GTX 1060 | RX 480 | R9 390 | R9 380 | GTX 980 | GTX 970 | GTX 960 | R9 Nano | GTX 1070 |
|---|---|---|---|---|---|---|---|---|---|
| GPU | GP106 | Polaris 10 | Grenada | Tonga | GM204 | GM204 | GM206 | Fiji XT | GP104 |
| Rated Clock | 1506 MHz | 1120 MHz | 1000 MHz | 970 MHz | 1126 MHz | 1050 MHz | 1126 MHz | up to 1000 MHz | 1506 MHz |
| Texture Units | 80 (?) | 144 | 160 | 112 | 128 | 104 | 64 | 256 | 120 |
| ROP Units | 48 (?) | 32 | 64 | 32 | 64 | 56 | 32 | 64 | 64 |
| Memory Clock | 8000 MHz | 7000 MHz | 6000 MHz | 5700 MHz | 7000 MHz | 7000 MHz | 7000 MHz | 500 MHz | 8000 MHz |
| Memory Interface | 192-bit | 256-bit | 512-bit | 256-bit | 256-bit | 256-bit | 128-bit | 4096-bit (HBM) | 256-bit |
| Memory Bandwidth | 192 GB/s | 224 GB/s | 384 GB/s | 182.4 GB/s | 224 GB/s | 196 GB/s | 112 GB/s | 512 GB/s | 256 GB/s |
| TDP | 120 watts | 150 watts | 275 watts | 190 watts | 165 watts | 145 watts | 120 watts | 275 watts | 150 watts |
| Peak Compute | 3.85 TFLOPS | 5.1 TFLOPS | 5.1 TFLOPS | 3.48 TFLOPS | 4.61 TFLOPS | 3.4 TFLOPS | 2.3 TFLOPS | 8.19 TFLOPS | 5.7 TFLOPS |
The GeForce GTX 1060 will sport 1280 CUDA cores with a GPU Boost clock speed rated at 1.7 GHz. The card will be available only in 6GB varieties, and the reference / Founders Edition will ship with 6GB of GDDR5 memory running at 8.0 GHz / 8 Gbps. With 1280 CUDA cores, the GP106 GPU is essentially one half of a GP104 in terms of compute capability. NVIDIA decided not to cut the memory interface in half, though, instead going with a 192-bit design compared to the GP104’s 256-bit option.
The rated GPU clock speeds paint an interesting picture for peak performance of the new card. At its rated boost clock, the GeForce GTX 1070 produces 6.46 TFLOPS; the GTX 1060, by comparison, will hit 4.35 TFLOPS, a 48% difference. The GTX 1080 offers nearly the same performance delta above the GTX 1070; clearly NVIDIA has settled on this scaling for differentiating Pascal products.
NVIDIA wants us to compare the new GeForce GTX 1060 to the GeForce GTX 980 in gaming performance, but the peak theoretical numbers don’t really match up. The GeForce GTX 980 is rated at 4.61 TFLOPS at its BASE clock speed, while the GTX 1060 doesn’t hit that number even at its boost clock. Pascal does improve performance with memory compression advancements, but the 192-bit memory bus only provides 192 GB/s, compared to the 224 GB/s of the GTX 980. Obviously we’ll have to wait for performance results from our own testing to be sure, but it seems possible that NVIDIA’s performance claims might depend on technologies like Simultaneous Multi-Projection and VR gaming to be validated.
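The peak figures above are straight spec-sheet math: FP32 throughput is 2 FLOPs per CUDA core per clock (one fused multiply-add), and bandwidth is bus width times data rate. A quick sketch of that arithmetic, using NVIDIA's rated clocks (1.70 GHz for the GTX 1060, the 1070's official 1683 MHz boost, and the 980's 1126 MHz base):

```cpp
// Spec-sheet math behind the TFLOPS and bandwidth figures quoted above.
#include <cstdio>

// Peak FP32 TFLOPS: 2 FLOPs per core per clock (one FMA per cycle).
double tflops(int cores, double clock_ghz) {
    return 2.0 * cores * clock_ghz / 1000.0;
}

// Bandwidth in GB/s: bus width in bytes times data rate in Gbps per pin.
double bandwidth(int bus_bits, double gbps) {
    return bus_bits / 8.0 * gbps;
}

int main() {
    printf("GTX 1060: %.2f TFLOPS (boost), %.0f GB/s\n",
           tflops(1280, 1.700), bandwidth(192, 8.0)); // ~4.35, 192
    printf("GTX 1070: %.2f TFLOPS (boost), %.0f GB/s\n",
           tflops(1920, 1.683), bandwidth(256, 8.0)); // ~6.46, 256
    printf("GTX 980:  %.2f TFLOPS (base),  %.0f GB/s\n",
           tflops(2048, 1.126), bandwidth(256, 7.0)); // ~4.61, 224
    return 0;
}
```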
Subject: Graphics Cards | July 6, 2016 - 11:56 PM | Scott Michaud
Tagged: titan, pascal, nvidia, gtx 1080 ti, gp102, GP100
Normally, I pose these sorts of rumors as “Well, here you go, and here's a grain of salt.” This one I'm fairly sure is bogus, at least to some extent. I could be wrong, but the GP100 aspects of it in particular just don't make sense.
Before I get to that, the rumor is that NVIDIA will announce a GeForce GTX Titan P at Gamescom in Germany. The event occurs mid-August (17th - 21st) and it has been basically Europe's E3 in terms of gaming announcements. It also overlaps with Europe's Game Developers Conference (GDC), which occurs in March for us. The rumor says that it will use GP100 (!?!) with either 12GB of VRAM, 16GB of VRAM, or two variants as we've seen with the Tesla P100 accelerator.
The rumor also acknowledges the previously rumored GP102 die, claims that it will be for the GTX 1080 Ti, and suggests that it will have up to 3840 CUDA cores. This is the same number of CUDA cores as the GP100, which is where I get confused. This would mean that NVIDIA made a special die, which other rumors claim is ~450mm2, for just the GeForce GTX 1080 Ti.
I mean, it's possible that NVIDIA would split the GTX 1080 Ti and the next Titan by similar gaming performance, just with better half- and double-precision performance and faster memory for GPGPU developers. That would be very weird to me, though: developing two different GPU dies for the consumer market with probably the same gaming performance.
And they would be announcing the Titan P first???
The harder to yield one???
When the Tesla version isn't even expected until Q4???
I can see it happening, but I seriously doubt it. Something may be announced, but I'd have to believe it will be at least slightly different from the rumors that we are hearing now.