Subject: Graphics Cards | August 8, 2016 - 01:08 PM | Sebastian Peak
Tagged: amd, radeon, RX460, rx 460, graphics, gpu, gaming, benchmark, 1080p, 1920x1080, gtx 950, gtx 750 ti
HEXUS has posted their review of Sapphire's AMD Radeon RX 460 Nitro 4GB graphics card, pitting it against the NVIDIA GTX 950 and GTX 750 Ti in a 1920x1080 benchmarking battle.
Image credit: HEXUS
"Unlike the two previous AMD GPUs released under the Polaris branding recently, RX 460 is very much a mainstream part that's aimed at buyers who are taking their first real steps into PC gaming. RX 460 uses a distinct, smaller die and is to be priced from £99. As usual, let's fire up the comparison specification table and dissect the latest offering from AMD."
Image credit: HEXUS
The results might surprise you, and vary somewhat based on the game selected. Check out the source link for the full review over at HEXUS.
Subject: Graphics Cards | August 6, 2016 - 02:24 PM | Tim Verry
Tagged: sapphire, rx 470, polaris 10, dual x, amd
Following the official launch of AMD's Radeon RX 470 GPU, Sapphire has unleashed its own custom graphics card with the Nitro+ RX 470 in 4GB and 8GB factory overclocked versions. Surprisingly, the new cards are up for purchase now at various retailers at $210 for the 4GB model and $240 for the 8GB model (more on that in a bit).
The new Nitro+ RX 470 uses the same board and cooler design as the previously announced Nitro+ RX 480, which is a good thing both for Sapphire (less R&D cost) and for consumers, as they get a rather beefy cooler that should allow them to push the RX 470 clocks quite a bit. The card uses the same Dual X cooler with two 95mm quick connect fans, three nickel plated copper heatpipes, and an aluminum fin stack. The card features the same black fan shroud and black and silver colored backplate. Out of the box this cooler should keep the RX 470 GPU running cooler and quieter than the RX 480, but it should also enable users to get higher clocks out of the smaller GPU (fewer cores mean less heat and more overclocking headroom, assuming you get a good chip from the silicon lottery).
Sapphire is using Black Diamond 4 chokes and a 4+1 power phase design that is driven by a single 8-pin PCI-E power connector (and up to 75W from the motherboard slot). This mirrors the design of its RX 480 sibling.
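As a rough sanity check on that power delivery, the in-spec ceiling is simply the slot plus the auxiliary connectors. The sketch below assumes the commonly cited PCI Express limits of 75 W for the slot, 75 W for a 6-pin connector, and 150 W for an 8-pin connector. Only the 75 W slot figure appears in the post; the connector figures and the function name are illustrative assumptions.

```python
# Commonly cited PCI Express power limits in watts.
# The connector figures are assumptions, not numbers from Sapphire.
SLOT_W = 75
CONNECTOR_W = {"6-pin": 75, "8-pin": 150}


def max_board_power(connectors):
    """Return the in-spec power ceiling for a card with the given aux connectors."""
    return SLOT_W + sum(CONNECTOR_W[c] for c in connectors)


if __name__ == "__main__":
    # Nitro+ RX 470: one 8-pin connector plus the motherboard slot.
    print(max_board_power(["8-pin"]), "W")
```

This is only an upper bound on what the connectors are rated to deliver; it says nothing about what the card actually draws.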
Display outputs include a single DVI, two HDMI 2.0b, and two DisplayPort 1.4 ports.
The chart below outlines the comparison between the Nitro+ RX 470 cards, RX 470 reference specifications, and the RX 480.
| ||Nitro+ RX 470 4GB||Nitro+ RX 470 8GB||RX 470 Reference||RX 480|
|GPU Clock (Base)||1143 MHz||1121 MHz||926 MHz||1120 MHz|
|GPU Clock (Boost)||1260 MHz||1260 MHz||1206 MHz||1266 MHz|
|Memory||4GB GDDR5 @ 7 GHz||8GB GDDR5 @ 8 GHz||4 or 8 GB GDDR5 @ 6.6 GHz||4 or 8 GB GDDR5 @ up to 8 GHz|
|Memory Bandwidth||224 GB/s||256 GB/s||211 GB/s||256 GB/s|
|GPU||Polaris 10||Polaris 10||Polaris 10||Polaris 10|
|Price||$210||$240||$180+||$200+ ($240+ for 8GB)|
The RX 470 GPU is only slightly cut down from RX 480 in that it features four fewer CUs, though the processor maintains the same number of ROP units and the same 256-bit memory bus. Reference clocks are 926 MHz base and 1206 MHz boost. Memory can be up to 8GB of GDDR5 with reference memory clocks of 6.6 GHz (effective). Sapphire has overclocked both the GPU and memory with the Nitro+ series. The Nitro+ RX 470 with 4GB of GDDR5 is clocked at 1143 MHz base, 1260 MHz boost, and 7 GHz memory while the 8GB version has a lower base clock of 1121 MHz but a higher memory clock of 8 GHz.
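The memory bandwidth figures in the chart follow directly from the bus width and the effective memory clock. A minimal sketch, assuming the effective "GHz" figure is the per-pin data rate in Gbps (the function name is just for illustration):

```python
def memory_bandwidth_gbs(bus_width_bits, effective_rate_gbps):
    """Peak bandwidth in GB/s: bus width in bytes times the per-pin data rate."""
    return bus_width_bits / 8 * effective_rate_gbps


if __name__ == "__main__":
    BUS_WIDTH = 256  # bits, shared by the RX 470 and RX 480
    cards = [
        ("RX 470 reference", 6.6),
        ("Nitro+ RX 470 4GB", 7.0),
        ("Nitro+ RX 470 8GB", 8.0),
    ]
    for name, rate in cards:
        print(f"{name}: {memory_bandwidth_gbs(BUS_WIDTH, rate):.1f} GB/s")
```

These are theoretical peaks; real-world throughput will be lower.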
The 8GB model having a lower base overclock is a bit strange to me, but at least they are rated at the same boost clock. These specifications are very close to the RX 480 actually and with a bit of user overclocking beyond the factory overclock you could get even closer to the performance of it.
The problem with this RX 470 that gets so close to the RX 480 though is that the price is also very close to reference RX 480s! The Sapphire Nitro+ RX 470 4GB is priced at $209.99 while the Nitro+ RX 470 8GB is $239.99.
These prices put the card well into RX 480 territory though not quite up to the MSRPs of factory overclocked RX 480s (e.g. Sapphire's own Nitro+ RX 480 is $219 and $269 for 4GB and 8GB respectively). The company has a nice looking (and hopefully performing) RX 470, but it is going to be tough to choose this card over an RX 480 that has more shaders and TMUs. One advantage though is that this is a card that will just work without having to manually overclock (though where is the fun in that? heh) and it is actually available right now unlike the slew of RX 480 cards that have been launched but are consistently out of stock everywhere! If you simply can't wait for an RX 480, this might not be a bad option.
EDIT: Of course the 8GB model goes out of stock at Newegg as I write this and Amazon's prices are higher than MSRP! hah.
A Beautiful Graphics Card
As a surprise to nearly everyone, on July 21st NVIDIA announced the existence of the new Titan X graphics cards, which are based on the brand new GP102 Pascal GPU. Though it shares a name, for some unexplained reason, with the Maxwell-based Titan X graphics card launched in March of 2015, this card is a significant performance upgrade. Using the largest consumer-facing Pascal GPU to date (with only the GP100 used in the Tesla P100 exceeding it), the new Titan X is going to be a very expensive, and very fast gaming card.
As has been the case since the introduction of the Titan brand, NVIDIA claims that this card is for gamers who want the very best in graphics hardware as well as for developers who need an ultra-powerful GPGPU device. GP102 does not integrate improved FP64 / double precision compute cores, so we are basically looking at an upgraded and improved GP104 Pascal chip. That’s nothing to sneeze at, of course, and you can see in the specifications below that we expect (and can now show you) Titan X (Pascal) is a gaming monster.
| ||Titan X (Pascal)||GTX 1080||GTX 980 Ti||TITAN X||GTX 980||R9 Fury X||R9 Fury||R9 Nano||R9 390X|
|GPU||GP102||GP104||GM200||GM200||GM204||Fiji XT||Fiji Pro||Fiji XT||Hawaii XT|
|Rated Clock||1417 MHz||1607 MHz||1000 MHz||1000 MHz||1126 MHz||1050 MHz||1000 MHz||up to 1000 MHz||1050 MHz|
|Memory Clock||10000 MHz||10000 MHz||7000 MHz||7000 MHz||7000 MHz||500 MHz||500 MHz||500 MHz||6000 MHz|
|Memory Interface||384-bit G5X||256-bit G5X||384-bit||384-bit||256-bit||4096-bit (HBM)||4096-bit (HBM)||4096-bit (HBM)||512-bit|
|Memory Bandwidth||480 GB/s||320 GB/s||336 GB/s||336 GB/s||224 GB/s||512 GB/s||512 GB/s||512 GB/s||320 GB/s|
|TDP||250 watts||180 watts||250 watts||250 watts||165 watts||275 watts||275 watts||175 watts||275 watts|
|Peak Compute||11.0 TFLOPS||8.2 TFLOPS||5.63 TFLOPS||6.14 TFLOPS||4.61 TFLOPS||8.60 TFLOPS||7.20 TFLOPS||8.19 TFLOPS||5.63 TFLOPS|
GP102 features 40% more CUDA cores than the GP104 at slightly lower clock speeds. The rated 11 TFLOPS of single precision compute of the new Titan X is 34% higher than that of the GeForce GTX 1080 and I would expect gaming performance to scale in line with that difference.
Titan X (Pascal) does not utilize the full GP102 GPU; the recently announced Pascal P6000 does, however, which gives it a CUDA core count of 3,840 (256 more than Titan X).
A full GP102 GPU
The complete GPU effectively loses 7% of its compute capability with the new Titan X, although that is likely to help increase available clock headroom and yield.
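The percentages quoted above are easy to reproduce. The sketch below uses the core counts from the text (3,840 for the full GP102, 256 fewer for Titan X) and the TFLOPS figures from the table. The GTX 1080's 2,560 CUDA cores are not stated in this article and are included here as an outside assumption.

```python
FULL_GP102_CORES = 3840                  # full chip, per the text (P6000)
TITAN_X_CORES = FULL_GP102_CORES - 256   # 3584, per the text
GTX_1080_CORES = 2560                    # assumption: not stated in the article


def percent_more(a, b):
    """How much larger a is than b, in percent."""
    return (a / b - 1) * 100


if __name__ == "__main__":
    print(f"Cores vs GTX 1080: +{percent_more(TITAN_X_CORES, GTX_1080_CORES):.0f}%")
    print(f"Rated TFLOPS vs GTX 1080: +{percent_more(11.0, 8.2):.0f}%")
    disabled = (FULL_GP102_CORES - TITAN_X_CORES) / FULL_GP102_CORES * 100
    print(f"Share of full GP102 disabled: {disabled:.1f}%")
```

The compute advantage is smaller than the core count advantage because Titan X is rated at lower clocks than the GTX 1080.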
The new Titan X will feature 12GB of GDDR5X memory, not HBM as the GP100 chip has, so this is clearly a unique chip with a new memory interface. NVIDIA claims it has 480 GB/s of bandwidth on a 384-bit memory controller interface running at the same 10 Gbps as the GTX 1080.
Realworldtech with Compelling Evidence
Yesterday David Kanter of Realworldtech posted a pretty fascinating article and video that explored the two latest NVIDIA architectures and how they have branched away from the traditional immediate mode rasterization units. His testing revealed that with Maxwell and Pascal, NVIDIA has gone to a tiling method with rasterization. This is a somewhat significant departure for the company considering they have utilized the same basic immediate mode rasterization model since the 90s.
The Videologic Apocalypse 3Dx based on the PowerVR PCX2.
(photo courtesy of Wikipedia)
Tiling is an interesting subject and we can harken back to the PowerVR days to see where it was first implemented. There are many advantages to tiling and deferred rendering when it comes to overall efficiency in power and memory bandwidth. These first TBDR (Tile Based Deferred Renderers) offered great performance per clock and could utilize slower memory as compared to other offerings of the day (namely Voodoo Graphics). There were some significant drawbacks to the technology. Essentially a lot of work had to be done by the CPU and driver in scene setup and geometry sorting. On fast CPU systems the PowerVR boards could provide very good performance, but they suffered on lower end parts as compared to the competition. This is a very simple explanation of what is going on, but the long and short of it is that TBDR did not take over the world due to limitations in its initial implementations. Traditional immediate mode rasterizers would improve in efficiency and performance with aggressive Z checks and other optimizations that borrow from the TBDR playbook.
Tiling is also present in a lot of mobile parts. Imagination’s PowerVR graphics technologies have been implemented by Intel, Apple, MediaTek, and others. Qualcomm (Adreno) and ARM (Mali) both implement tiler technologies to improve power consumption and performance while increasing bandwidth efficiency. Perhaps most interestingly we can remember back to the Gigapixel days with the GP-1 chip that implemented a tiling method that seemed to work very well without the CPU hit and driver overhead that had plagued the PowerVR chips up to that point. 3dfx bought Gigapixel for some $150 million at the time. 3dfx then went on to file for bankruptcy a year later and its IP was acquired by NVIDIA.
Screenshot of the program used to uncover the tiling behavior of the rasterizer.
It now appears as though NVIDIA has evolved their raster units to embrace tiling. This is not a full TBDR implementation, but rather an immediate mode tiler that will still break up the scene in tiles but does not implement deferred rendering. This change should improve bandwidth efficiency when it comes to rasterization, but it does not affect the rest of the graphics pipeline by forcing it to be deferred (tessellation, geometry setup and shaders, etc. are not impacted). NVIDIA has not done a deep dive on this change for editors, so we do not know the exact implementation and what advantages we can expect. We can look at the evidence we have and speculate where those advantages exist.
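To make the idea of breaking a scene into tiles a bit more concrete, here is a toy sketch of the binning step: each triangle is assigned to every screen tile its bounding box touches, and the tiles can then be processed one at a time while keeping submission order. This is a generic illustration only. NVIDIA has not disclosed its implementation, and the 16-pixel tile size, the bounding-box test, and the function name are all assumptions made for the example.

```python
from collections import defaultdict

TILE = 16  # tile edge in pixels; an arbitrary choice for illustration


def bin_triangles(triangles, width, height, tile=TILE):
    """Map each tile (tx, ty) to the indices of triangles whose bounding box touches it."""
    bins = defaultdict(list)
    max_tx = (width - 1) // tile
    max_ty = (height - 1) // tile
    for idx, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Clamp the bounding box to the screen, then convert to tile coordinates.
        tx0 = max(0, int(min(xs)) // tile)
        tx1 = min(max_tx, int(max(xs)) // tile)
        ty0 = max(0, int(min(ys)) // tile)
        ty1 = min(max_ty, int(max(ys)) // tile)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                # Appending in submission order keeps draw order intact per tile.
                bins[(tx, ty)].append(idx)
    return dict(bins)


if __name__ == "__main__":
    tris = [
        [(2, 2), (10, 3), (5, 12)],       # small triangle inside one tile
        [(10, 10), (40, 12), (20, 30)],   # larger triangle spanning several tiles
    ]
    for tile_xy, members in sorted(bin_triangles(tris, 64, 32).items()):
        print(tile_xy, members)
```

A bounding-box test is conservative, so a tile may list a triangle that never actually covers a pixel in it. A real rasterizer would do finer coverage tests, but the bandwidth argument is the same: once a tile's triangles are known, its pixels can be worked on in on-chip storage.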
The video where David Kanter explains his findings
Bandwidth and Power
Tilers have typically taken the tiled regions and buffered them on the chip. This is a big improvement in both performance and power efficiency as the raster data does not have to be cached and written out to the frame buffer and then swapped back. This makes quite a bit of sense considering the overall lack of big jumps in memory technologies over the past five years. We have had GDDR-5 since 2007/2008. The speeds have increased over time, but the basic technology is still much the same. We have seen HBM introduced with AMD’s Fury series, but large scale production of HBM 2 is still to come. Samsung has released small amounts of HBM 2 to the market, but not nearly enough to handle the needs of a mass produced card. GDDR-5X is an extension of GDDR-5 that does offer more bandwidth, but it is still not a next generation memory technology like HBM 2.
By utilizing a tiler NVIDIA is able to lower memory bandwidth needs for the rasterization stage. Considering that both Maxwell and Pascal architectures are based on GDDR-5 and GDDR-5X technologies, it makes sense to save as much bandwidth as possible where they can. This is probably one of the many reasons that we saw a much larger L2 cache in Maxwell vs. Kepler (2048 KB vs. 256 KB respectively). Every little bit helps when we are looking at hard, real world bandwidth limits for a modern GPU.
The area of power efficiency has also come up in discussion when going to a tiler. Tilers have traditionally been more power efficient as well due to how the raster data is tiled and cached, requiring fewer reads and writes to main memory. The first impulse is to say, “Hey, this is the reason why NVIDIA’s Maxwell was so much more power efficient than Kepler and AMD’s latest parts!” Sadly, this is not exactly true. The tiler is more power efficient, but it is a small part of the power savings on a GPU.
The second fastest Pascal based card...
A modern GPU is very complex. There are some 7.2 billion transistors on the latest Pascal GP104 that powers the GTX 1080. The vast majority of those transistors are implemented in the shader units of the chip. While the raster units are very important, they are but a fraction of that transistor budget. The rest is taken up by power regulation, PCI-E controllers, and memory controllers. In the big scheme of things the raster portion is going to be dwarfed in power consumption by the shader units. This does not mean that they are not important though. Going back to the hated car analogy, one does not achieve weight savings by focusing on one aspect alone. It is going over every single part of the car and shaving ounces here and there, and in the end achieving significant savings by addressing every single piece of a complex product.
This does appear to be the long and short of it. This is one piece of a very complex ASIC that improves upon memory bandwidth utilization and power efficiency. It is not the whole story, but it is an important part. I find it interesting that NVIDIA did not disclose this change to editors with the introduction of Maxwell and Pascal, but if it is transparent to users and developers alike then there is no need. There is a lot of “secret sauce” that goes into each architecture, and this is merely one aspect. The one question that I do have is how much of the technology is based upon the Gigapixel IP that 3dfx bought at such a premium? I believe that particular tiler was an immediate mode renderer as well, since it did not have as many driver and overhead issues as PowerVR exhibited back in the day. Obviously it would not be a copy/paste of the technology that was developed back in the 90s, but it would be interesting to see if it was the basis for this current implementation.
Subject: Graphics Cards | August 2, 2016 - 07:37 AM | Scott Michaud
Tagged: windows 10, vulkan, microsoft, DirectX 12
Update (August 3rd @ 4:30pm): Turns out Khronos Group announced at SIGGRAPH that Subgroup Instructions have been recently added to SPIR-V (skip video to 21:30), and are a "top priority" for "Vulkan Next". Some (like WaveBallot) are already ARB (multi-vendor) OpenGL extensions, too.
Original post below:
DirectX 12's shading language will receive some new functionality with the new Shader Model 6.0. According to their GDC talks, it is looking like it will be structured similarly to SPIR-V in how it's compiled and ingested. Code will be compiled and optimized as an LLVM-style bytecode, which the driver will accept and execute on the GPU. This could make it easy to write DX12-compatible shader code in other languages, like C++, which is a direction that Vulkan is heading, but Microsoft hasn't seemed to announce that yet.
This news shows a bit more of the nitty gritty details. It looks like they added 16-bit signed (short) and unsigned (ushort) integers, which might provide a performance improvement on certain architectures (although I'm not sure that it's new and/or that GPUs exist that natively operate upon them) because they operate on half as much data as a standard, 32-bit integer. They have also added more functionality, to both the pixel and compute shaders, to operate in multiple threads, called lanes, similar to OpenCL. This should allow algorithms to work more efficiently in blocks of pixels, rather than needing to use one of a handful of fixed function calls (ex: partial derivatives ddx and ddy) to see outside their thread.
When will this land? No idea, but it is conspicuously close to the Anniversary Update. It has been added to Feature Level 12.0, so its GPU support should be pretty good. Also, Vulkan exists, doing its thing. Not sure how these functions overlap with SPIR-V's feature set, but, since SPIR was originally for OpenCL, it could be just sitting there for all I know.
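For anyone who has not run into cross-lane operations before, the toy model below shows the general idea: a group of lanes executes together, and each lane can get an answer that depends on what the other lanes hold. This is plain Python rather than shader code, and the function names are only loosely modeled on the ballot-style operations mentioned in the update above. They are illustrative stand-ins, not the actual Shader Model 6.0 API.

```python
def wave_ballot(predicates):
    """Pack one boolean per lane into a bitmask; bit i is set if lane i voted True."""
    mask = 0
    for lane, vote in enumerate(predicates):
        if vote:
            mask |= 1 << lane
    return mask


def wave_active_sum(values):
    """Every lane receives the sum over all lanes in the group."""
    total = sum(values)
    return [total] * len(values)


if __name__ == "__main__":
    lane_values = [3, 1, 4, 1]  # one value per lane in a hypothetical 4-lane group
    print(bin(wave_ballot([v > 2 for v in lane_values])))
    print(wave_active_sum(lane_values))
```

On real hardware these operations happen across the lanes in lockstep without a round trip through memory, which is where the efficiency comes from.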
Subject: Graphics Cards | August 2, 2016 - 03:50 AM | Tim Verry
Tagged: sapphire, rx 460, polaris 11, nitro, amd
AMD and its board partners will officially launch the first Polaris 11 GPU and the Radeon RX 460 graphics cards based around that processor on August 8th. Fortunately Videocardz.com got a hold of an image that shows off Sapphire's take on the RX 460 in the form of a factory overclocked and custom cooled RX460 Nitro OC. This gives us a hint at the kinds of cards we can expect and it appears to be good news for budget gamers as it suggests that there will be several options around this firm $100 price point that are a bit more than the bare necessities.
In the case of Sapphire's RX 460 Nitro OC, it uses a custom dual fan cooler with two copper heatpipes, an aluminum fin stack (that is much larger than reference), and two 90mm fans. Display IO includes one DVI, one HDMI, and one DisplayPort. The card itself uses a physical PCI-E x16 connector that is electrically PCI-E 3.0 x8. The x8 connection will be more than enough for this GPU though it also enables partners to cut costs.
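For a sense of what the x8 link gives up, the theoretical numbers can be worked out from the PCI-E 3.0 signaling rate. The sketch below assumes the standard 8 GT/s per lane and 128b/130b encoding, neither of which is stated in the post, and it ignores packet and protocol overhead.

```python
GT_PER_S = 8.0        # PCI-E 3.0 raw transfer rate per lane, in GT/s (assumed)
ENCODING = 128 / 130  # PCI-E 3.0 uses 128b/130b line encoding (assumed)


def pcie3_bandwidth_gbs(lanes):
    """Approximate one-direction bandwidth in GB/s, ignoring protocol overhead."""
    return lanes * GT_PER_S * ENCODING / 8


if __name__ == "__main__":
    for lanes in (8, 16):
        print(f"x{lanes}: {pcie3_bandwidth_gbs(lanes):.2f} GB/s per direction")
```

So the x8 link has half the theoretical bandwidth of a full x16 link, which is the trade-off the post argues a GPU of this class can comfortably absorb.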
Clockspeeds are not yet known, but the Polaris 11 GPU (896 cores, 56 TMUs, 16 ROPs) will be paired with 4GB GDDR5 memory.
It is encouraging to me to see custom cards at this price point out of the gate with the full 4GB of memory (AMD allows 2GB or 4GB versions). Gamers that simply can't justify spending much more than a hundred dollars on a GPU should have ample options to choose from and I am looking forward to seeing what all the partners have to offer.
Are you looking at Polaris 11 and the RX 460 for a super budget gaming build? What do you think about Sapphire's card with the company's custom cooler?
Subject: Graphics Cards | August 1, 2016 - 06:52 PM | Scott Michaud
Tagged: nvidia, Lawsuit, GTX 980, gtx 960
Update @ 9:45pm: I heard that some AMD users were notified about their R9 purchase as well, calling it simply "R9". Since I didn't see concrete proof, I omitted it from the post in case it was a hoax (as the story is still developing). I have since been notified of a tweet with an email screenshot.
Original post below:
Apparently, Newegg is informing customers that NVIDIA has settled a class action lawsuit with customers of the GeForce GTX 960 and GTX 980 cards, along with the GTX 970. It's currently unclear whether this is an error, or whether this is one of the sibling class action lawsuits that were apparently bundled together with the GTX 970 one. Users on the NVIDIA Reddit are claiming that it has to do with DirectX 12 feature level support, although that seems like knee-jerk confirmation bias to me.
Regardless, if you purchased a GeForce 900-series graphics card from Newegg, maybe even including the 980 Ti, then you should check your email. You might have a settlement en-route.
That's all we know at this point, though. Thanks to our readers for pointing this out.
Subject: Graphics Cards | August 1, 2016 - 03:39 PM | Sebastian Peak
Tagged: pascal, nvidia, notebooks, mobile gpu, mobile gaming, laptops, GTX 1080M, GTX 1070M, GTX 1060M, discrete gpu
VideoCardz is reporting that an official announcement of the rumored mobile GPUs might be coming at Gamescom later this month.
"Mobile Pascal may arrive at Gamescom in Europe. According to DigiTimes, NVIDIA would allow its notebook partners to unveil mobile Pascal between August 17th to 21st, so just when Gamescom is hosted in Germany."
We had previously reported on the rumors of a mobile GTX 1070 and 1060, and we can only assume a 1080 will also be available (though VideoCardz is not speculating on the specs of this high-end mobile card just yet).
Rumored NVIDIA Mobile Pascal GPU specs (Image credit: VideoCardz)
Gamescom runs from August 17 - 21 in Germany, so we only have to wait about three weeks to know for sure.
Subject: Graphics Cards | August 1, 2016 - 10:16 AM | Sebastian Peak
Tagged: amd, radeon, radeon software, Crimson Edition 16.7.3, driver, graphics, update, rx480, rise of the tomb raider
AMD has released the Radeon Software Crimson Edition 16.7.3 driver, with improved performance in Rise of the Tomb Raider for Radeon RX 480 owners, as well as various bug fixes.
Radeon Software Crimson Edition is AMD's revolutionary new graphics software that delivers redesigned functionality, supercharged graphics performance, remarkable new features, and innovation that redefines the overall user experience. Every Radeon Software release strives to deliver new features, better performance and stability improvements.
Radeon Software Crimson Edition 16.7.3 Highlights
Rise of the Tomb Raider performance increase up to 10% versus Radeon Software Crimson Edition 16.7.2 on Radeon RX 480 graphics
Subject: Graphics Cards | July 30, 2016 - 11:35 PM | Tim Verry
Tagged: xfx, rx 470, polaris 10, Double Dissipation Edition, amd
AMD's budget (under $200) Polaris-based graphics cards are coming next week, and the leaks are starting to appear online. In the case of the Radeon RX 470, AMD is expecting that most (if not all) of its board partners will be using their own custom coolers. Thanks to Chinese technology site EXPReview, we finally have an idea of what an RX 470 will look like – or at least what an XFX-branded RX 470 will look like!
The website posted several photos of the alleged (but likely legitimate) XFX RX 470 "Black Wolf" graphics card which will probably be branded as the XFX RX 470 Double Dissipation in North America. This is a dual slot card with dual fan cooler that measures 9.45 inches long. Three copper heat pipes pull heat into an aluminum heatsink that is cooled by two 80mm fans that can reportedly be removed by the user for cleaning (and maybe user RMA replacement like Sapphire is planning). The card also features a full backplate and LED-backlit XFX logo along the side of the card. The design is all black with a white XFX logo.
Video outputs include three DisplayPort 1.4, one HDMI 2.0b, and one DL-DVI which seems about right for this price point.
The card is powered by a single 6-pin PCI-E power connector and the card will use AMD's RX 470 GPU and 4GB of GDDR5 memory. The RX 470 features 2048 cores, 128 texture units, and 32 raster operators. This is essentially an RX 480 GPU with four fewer Compute Units, though it maintains the same number of ROPs and the same 256-bit memory bus. We do not know clockspeeds on this custom cooled XFX card yet, but overclockers may well be able to push clocks further than they could on RX 480 (there are fewer cores so the chips may be able to be pushed further on clocks), but it is hard to say right now. I would expect out of the box clocks to be a bit above the reference RX 470 clocks of 926 MHz base and 1206 MHz boost.
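The core and texture unit counts fall out of the Compute Unit count. A small sketch, assuming the usual GCN layout of 64 stream processors and 4 texture units per CU (those per-CU ratios are not stated in this post), with the RX 470's 32 CUs inferred from its 2048 cores and the RX 480 taken as four CUs more:

```python
SHADERS_PER_CU = 64  # stream processors per GCN Compute Unit (assumed)
TMUS_PER_CU = 4      # texture units per GCN Compute Unit (assumed)


def gcn_units(compute_units):
    """Return (stream processors, texture units) for a given CU count."""
    return compute_units * SHADERS_PER_CU, compute_units * TMUS_PER_CU


if __name__ == "__main__":
    rx470_cus = 2048 // SHADERS_PER_CU  # 32 CUs, inferred from the core count
    for name, cus in [("RX 470", rx470_cus), ("RX 480", rx470_cus + 4)]:
        cores, tmus = gcn_units(cus)
        print(f"{name}: {cus} CUs, {cores} cores, {tmus} TMUs")
```

The same ratios line up with the RX 460 figures quoted earlier (896 cores and 56 TMUs work out to 14 CUs).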
You can check out all of the photos of this card here.
Stay tuned to PC Perspective for more RX 470 and RX 460 news as we near the official launch dates!
- AMD Details the RX 470 and RX 460 Graphics Cards, Coming in August
- The AMD Radeon RX 480 Review - The Polaris Promise