Subject: Graphics Cards | January 25, 2016 - 11:51 AM | Ryan Shrout
Tagged: fury x2, Fiji, dual fiji, amd
Lo and behold! The dual-Fiji card that we have previously dubbed the AMD Radeon Fury X2 still lives! Based on a tweet from AMD PR dude Antal Tungler, a PC from Falcon Northwest at the VRLA convention was using a dual-GPU Fiji graphics card to power some demos.
— Antal Tungler (@coloredrocks) January 23, 2016
This prototype Falcon Northwest Tiki system was housing the GPU beast but no images were shown of the interior of the system. Still, it's good to see AMD at least recognize that this piece of hardware still exists at all, since it was initially promised to the enthusiast market by "fall of 2015." Even in October we had hints that the card might be coming soon after seeing some shipping manifests leak out to the web.
Better late than never, right? One theory floating around inside the offices here is that AMD is going to release the Fury X2 along with the VR headsets coming out this spring, with hopes of making it THE VR graphics card of choice. The value of using multi-GPU for VR is interesting, with one GPU dedicated to each eye, though the pitfalls that could haunt both AMD and NVIDIA in this regard (latency, frame time consistency) leave the practical benefit of the technology open to debate.
Subject: Graphics Cards, Memory | January 22, 2016 - 11:08 AM | Ryan Shrout
Tagged: Polaris, pascal, nvidia, jedec, gddr5x, GDDR5, amd
Though information about the technology has been making rounds over the last several weeks, GDDR5X technology finally gets official with an announcement from JEDEC this morning. The JEDEC Solid State Technology Association is, as Wikipedia tells us, an "independent semiconductor engineering trade organization and standardization body" that is responsible for creating memory standards. Getting the official nod from the org means we are likely to see implementations of GDDR5X in the near future.
The press release is short and sweet. Take a look.
ARLINGTON, Va., USA – JANUARY 21, 2016 –JEDEC Solid State Technology Association, the global leader in the development of standards for the microelectronics industry, today announced the publication of JESD232 Graphics Double Data Rate (GDDR5X) SGRAM. Available for free download from the JEDEC website, the new memory standard is designed to satisfy the increasing need for more memory bandwidth in graphics, gaming, compute, and networking applications.
Derived from the widely adopted GDDR5 SGRAM JEDEC standard, GDDR5X specifies key elements related to the design and operability of memory chips for applications requiring very high memory bandwidth. With the intent to address the needs of high-performance applications demanding ever higher data rates, GDDR5X is targeting data rates of 10 to 14 Gb/s, a 2X increase over GDDR5. In order to allow a smooth transition from GDDR5, GDDR5X utilizes the same, proven pseudo open drain (POD) signaling as GDDR5.
“GDDR5X represents a significant leap forward for high end GPU design,” said Mian Quddus, JEDEC Board of Directors Chairman. “Its performance improvements over the prior standard will help enable the next generation of graphics and other high-performance applications.”
JEDEC claims that GDDR5X uses the same signaling type as GDDR5 but is able to double the per-pin data rate to 10-14 Gb/s. In fact, based on leaked slides about GDDR5X from October, JEDEC actually calls GDDR5X an extension to GDDR5, not a new standard. How does GDDR5X reach these new speeds? By doubling the prefetch from 32 bytes to 64 bytes. This will require a redesign of the memory controller for any processor that wants to integrate it.
Image source: VR-Zone.com
As for usable bandwidth, though no figures are quoted directly, we would likely see a much smaller increase than the per-pin statements in the press release suggest. Because the memory bus width would remain unchanged, and GDDR5X just grabs twice the chunk size in prefetch, we should expect an incremental change. The press release makes no mention of power efficiency either, and that was one of the driving factors in the development of HBM.
Performance efficiency graph from AMD's HBM presentation
I am excited about any improvement in memory technology that will increase GPU performance, but I can tell you that from my conversations with both AMD and NVIDIA, no one appears to be jumping at the chance to integrate GDDR5X into upcoming graphics cards. That doesn't mean it won't happen with some version of Polaris or Pascal, but it seems that there may be concerns other than bandwidth that keep it from taking hold.
Subject: Graphics Cards | January 20, 2016 - 03:26 PM | Scott Michaud
Tagged: nvidia, linux, tesla, fermi, kepler, maxwell
It's nice to see long-term roundups every once in a while. They do not really provide useful information for someone looking to make a purchase, but they show how our industry is changing (or not). In this case, Phoronix tested twenty-seven NVIDIA GeForce cards across four architectures: Tesla, Fermi, Kepler, and Maxwell. In other words, from the GeForce 8 series all the way up to the GTX 980 Ti.
Image Credit: Phoronix
Nine years of advancements in ASIC design, with a doubling time-step of 18 months, should yield a 64-fold improvement. The number of transistors falls short, showing about a 12-fold improvement between the Titan X and the largest first-wave Tesla, although that means nothing for a fabless semiconductor designer. The main reason why I include this figure is to show the actual Moore's Law trend over this time span, but it also highlights the slowdown in process technology.
Performance per watt does depend on NVIDIA though, and the ratio between the GTX 980 Ti and the 8500 GT is about 72:1. While this is slightly better than the target 64:1 ratio, these parts are from very different locations in their respective product stacks. Swap the 8500 GT for the following year's 9800 GTX, which gives a comparison between top-of-the-line GPUs of their respective times, and you see a 6.2x improvement in performance per watt for the GTX 980 Ti. On the other hand, that part was outstanding for its era.
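For anyone who wants to check the 64:1 target, here is a minimal sketch of the arithmetic. It assumes a clean doubling every 18 months over nine years; the function name is mine, not something from the Phoronix article, and the measured ratios are simply the ones quoted in this post.

```python
def expected_improvement(years: float, doubling_months: float = 18.0) -> float:
    """Improvement factor if a quantity doubles every `doubling_months`."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings

# Nine years at an 18-month doubling period is six doublings.
target = expected_improvement(9)
print(f"target ratio: {target:.0f}:1")

# Ratios quoted in the post, expressed as a fraction of that target.
quoted = {
    "transistor count (Titan X vs. first-wave Tesla)": 12.0,
    "performance per watt (GTX 980 Ti vs. 8500 GT)": 72.0,
}
for label, ratio in quoted.items():
    print(f"{label}: {ratio / target:.2f}x of target")
```

Changing `doubling_months` to 24 shows how sensitive the target is to which version of Moore's Law you pick.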
I should note that each of these tests takes place on Linux. It might not perfectly reflect the landscape on Windows, but again, it's interesting in its own right.
Subject: Graphics Cards, Processors | January 19, 2016 - 11:38 PM | Scott Michaud
Digitimes is reporting on statements that were allegedly made by TSMC co-CEO, Mark Liu. We are currently seeing 16nm parts come out of the foundry, which are expected to be used in the next generation of GPUs, replacing the long-running 28nm node that launched with the GeForce GTX 680. (It's still unannounced whether AMD and NVIDIA will use 14nm FinFET from Samsung or GlobalFoundries, or 16nm FinFET from TSMC.)
Update (Jan 20th, @4pm EST): Couple minor corrections. Radeon HD 7970 launched at 28nm first by a couple of months. I just remember NVIDIA getting swamped in delays because it was a new node, so that's probably why I thought of the GTX 680. Also, AMD announced during CES that they will use GlobalFoundries to fab their upcoming GPUs, which I apparently missed. We suspect that NVIDIA will use TSMC, and have assumed that for a while, but it hasn't been officially announced yet (if ever).
According to their projections, which (again) are filtered through Digitimes, the foundry expects to have 7nm in the first half of 2018. They also expect to introduce extreme ultraviolet (EUV) lithography methods with 5nm in 2020. Given that silicon in a solid has a lattice spacing of ~0.54nm at room temperature, a 7nm feature would be about 13 lattice spacings across, and a 5nm feature about 9.
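The estimate above is just a division, sketched here for reference. It takes the node name at face value as a physical feature length, which is an assumption on my part, and uses the ~0.54nm spacing quoted in the post; the constant and function names are illustrative.

```python
SILICON_LATTICE_NM = 0.54  # approximate lattice spacing quoted in the post

def lattice_spacings(feature_nm: float, spacing_nm: float = SILICON_LATTICE_NM) -> float:
    """How many lattice spacings fit across a feature of the given length."""
    return feature_nm / spacing_nm

for node_nm in (16, 10, 7, 5):
    print(f"{node_nm}nm is roughly {lattice_spacings(node_nm):.1f} lattice spacings")
```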
We continue the march toward the end of silicon lithography.
Even if the statement is correct, much can happen between then and now. It wouldn't be the first time that I've seen a major foundry believe that a node would be available, but end up having it delayed. I wouldn't hold my breath, but I might cross my fingers if my hands were free.
At the very least, we can assume that TSMC's roadmap is 16nm, 10nm, 7nm, and then 5nm.
Subject: Graphics Cards, Memory | January 19, 2016 - 11:01 PM | Scott Michaud
Tagged: Samsung, HBM2, hbm
Samsung has just announced that they have begun mass production of 4GB HBM2 memory modules. When used on GPUs, four packages can provide 16GB of Video RAM with very high performance. HBM does this with a very wide data bus, which trades off frequency for transferring huge chunks. Samsung's offering is rated at 256 GB/s per package, which is twice what each HBM1 package on the Fury X could do.
They also expect to mass produce 8GB HBM2 packages within this calendar year. I'm guessing that this means we'll see 32GB GPUs in the late-2016 or early-2017 time frame unless "within this year" means very, very soon (versus Q3/Q4). They will likely be for workstation or professional cards, but, in NVIDIA's case, those are usually based on architectures that are marketed to high-end gaming enthusiasts through some Titan offering. There are a lot of ways this could go, but a 32GB Titan seems like a bit much; I wouldn't expect that this affects the enthusiast gamer segment. It might mean that professionals looking to upgrade from the Kepler-based Tesla K-series might be waiting a little longer, maybe even GTC 2017. Alternatively, they might get new cards, just with a 16GB maximum until a refresh next year. There's not enough information to know one way or the other, but it's something to think about when more of it starts rolling in.
Samsung's HBM2 packages are compatible with ECC, although I believe that was also true for at least some HBM1 modules from SK Hynix.
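The capacity and bandwidth figures in this post scale linearly with the number of packages, as this rough sketch shows. It assumes a four-package configuration like the one described above and that the 8GB packages keep the same 256 GB/s rating, which Samsung has not confirmed here; the function name is mine.

```python
def hbm_totals(packages: int, gb_per_package: int, gbs_per_package: int) -> tuple[int, int]:
    """Return (total capacity in GB, total peak bandwidth in GB/s)."""
    return packages * gb_per_package, packages * gbs_per_package

# Four 4GB HBM2 packages at 256 GB/s each (figures from the post).
print(hbm_totals(4, 4, 256))

# Four 8GB packages, assuming the per-package bandwidth is unchanged.
print(hbm_totals(4, 8, 256))
```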
Subject: Graphics Cards | January 19, 2016 - 10:31 AM | Sebastian Peak
Tagged: rumor, report, nvidia, GTX 980MX, GTX 980M, GTX 970MX, GTX 970M, geforce
NVIDIA is reportedly preparing faster mobile GPUs based on Maxwell, with a GTX 980MX and 970MX on the way.
The new GTX 980MX would sit between the GTX 980M and the laptop version of the full GTX 980, with 1664 CUDA cores (compared to 1536 with the 980M), 104 Texture Units (up from the 980M's 96), a 1048 MHz core clock, and up to 8 GB of GDDR5. Memory speed and bandwidth will reportedly be identical to the GTX 980M at 5000 MHz and 160 GB/s respectively, with both GPUs using a 256-bit memory bus.
The GTX 970MX represents a similar upgrade over the existing GTX 970M, with CUDA Core count increased from 1280 to 1408, Texture Units up from 80 to 88, and 8 additional raster devices available (56 vs. 48). Both the 970M and 970MX use 192-bit GDDR5 clocked at 5000 MHz, and both are available with the same 3 GB or 6 GB of frame buffer.
WCCFtech prepared a chart to demonstrate the differences between NVIDIA's mobile offerings:
|Model||GeForce GTX 980 Laptop Version||GeForce GTX 980MX||GeForce GTX 980M||GeForce GTX 970MX||GeForce GTX 970M||GeForce GTX 965M||GeForce GTX 960M|
|Clock Speed||1218 MHz||1048 MHz||1038 MHz||941 MHz||924 MHz||950 MHz||1097 MHz|
|Frame Buffer||8 GB GDDR5||8/4 GB GDDR5||8/4 GB GDDR5||6/3 GB GDDR5||6/3 GB GDDR5||4 GB GDDR5||4 GB GDDR5|
|Memory Frequency||7008 MHz||5000 MHz||5000 MHz||5000 MHz||5000 MHz||5000 MHz||5000 MHz|
|Memory Bandwidth||224 GB/s||160 GB/s||160 GB/s||120 GB/s||120 GB/s||80 GB/s||80 GB/s|
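The bandwidth row follows from the memory frequency and bus width. Here is a minimal sketch of that calculation, treating the quoted 5000 MHz as the effective transfer rate per pin (my reading of the figures, since that is what makes the numbers line up) and using only the bus widths stated in this post; the function name is illustrative.

```python
def gddr5_bandwidth_gbs(effective_mhz: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: transfers per second times bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_mhz * bytes_per_transfer / 1000  # MB/s -> GB/s

# GTX 980MX / 980M: 5000 MHz effective on a 256-bit bus.
print(gddr5_bandwidth_gbs(5000, 256))

# GTX 970MX / 970M: 5000 MHz effective on a 192-bit bus.
print(gddr5_bandwidth_gbs(5000, 192))
```

With identical memory clocks across most of the stack, the bus width is what separates the bandwidth tiers in the chart.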
These new GPUs will reportedly be based on the same Maxwell GM204 core, and TDPs are apparently unchanged at 125W for the GTX 980MX, and 100W for the 970MX.
We will await any official announcement.
Subject: Graphics Cards | January 18, 2016 - 09:44 PM | Scott Michaud
Tagged: Polaris, amd
When AMD announced their Polaris architecture at CES, it was focused on mid-range applications. Their example was an add-in board that could compete against an NVIDIA GeForce GTX 950, running Battlefront at 1080p60 on medium settings, but do so at 39% less wattage than that 28nm Maxwell chip. These Polaris chips are planned for a “mid 2016” launch.
Raja Koduri, Chief Architect for the Radeon Technologies Group, spoke with VentureBeat at the show. In his conversation, he mentioned two architectures, Polaris 10 and Polaris 11, in the context of a question about their 2016 product generation. In the “high level” space, they are seeing “the most revolutionary jump in performance so far.” This doesn't explicitly state that the high-end Polaris video card will launch in 2016. That said, when combined with the November announcement, covered by us as “AMD Plans Two GPUs in 2016,” it further supports this interpretation.
We still don't know much about what the actual performance of this high-end GPU will be, though. AMD was able to push 8 TeraFLOPs of compute throughput by creating a giant 28nm die and converting the memory subsystem to HBM, which supposedly requires less die complexity than a GDDR5 memory controller (according to a conference call last year that preceded Fury X). The two-generation jump will give them more complexity to work with, but that could be partially offset by a smaller die because of the potential differences in yields (and so forth).
Also, while the performance of the 8 TeraFLOP Fury X was roughly equivalent to NVIDIA's 5.6 TeraFLOP GeForce GTX 980 Ti, we still don't know why. AMD has redesigned a lot of their IP blocks with Polaris; you would expect that, if something unexpected was bottlenecking Fury X, the graphics manufacturer wouldn't overlook it the next chance that they are able to tweak it. This could have been graphics processing or something much more mundane. Either way, upcoming benchmarks will be interesting.
And it seems like that may be this year.
Subject: Graphics Cards | January 13, 2016 - 07:42 PM | Scott Michaud
Tagged: graphics drivers, amd
AMD's recent “Hotfix” drivers don't seem to mean what NVIDIA's do. In the Green Team's case, they usually fix one or two issues that slipped past QA. While they likely won't break anything, they are probably a bad idea to install if you're not experiencing the listed problems. The changelog on AMD's drivers is significantly longer, with a list of known issues that is roughly the same size.
So should you install it? That depends. It's a little less cut-and-dried than NVIDIA's hotfixes, which are only useful for a handful of people. It sounds like the worst known issues are “Game stuttering may be experienced when running two Radeon R9 295X2 graphics cards in CrossFire mode” and “Display corruption may occur on multiple display systems when it has been running idle for some time.” The latter would affect me greatly, because I run four displays and basically never sleep or shut down (except for updates). On the other hand, it fixes a variety of crash, hang, and flicker issues.
Check it out. If it sounds good, then pick it up. Otherwise, wait for the next Beta or WHQL driver.
Subject: Graphics Cards | January 12, 2016 - 08:11 PM | Scott Michaud
Tagged: graphics drivers, graphics driver, nvidia
NVIDIA has been pushing for WHQL certification for their drivers, but sometimes issues slip through QA, both at Microsoft and their own, internal team(s). Sometimes these issues will be fixed in a future release, but sometimes they push out a “HotFix” driver immediately. This is often great for people who experience the problems, but they should not be installed otherwise.
In this case, GeForce Hotfix driver 361.60 fixes two issues. One is listed as “install & clocking related issues,” which refers to the GPU memory clock. According to Manuel Guzman of NVIDIA, some games and software were not causing the driver to fully wake the memory clock to a high-performance state. The other issue is “Crashes in Photoshop & Illustrator,” which fixes blue screen issues in both applications, and possibly other programs that use the GPU in similar ways. I've never seen GeForce Driver 361.43 cause a BSOD in Photoshop, but I am a few versions behind with CS5.5.
Download links are available at NVIDIA Support, but unaffected users should just wait for an official driver in case the patch causes other issues, due to its minimal QA.
Subject: Graphics Cards | January 11, 2016 - 06:05 PM | Sebastian Peak
Tagged: rumor, report, pascal, nvidia, HBM2, hbm, GP104
A delivery of GPUs and related test equipment from Taiwan to Bangalore has led to speculation about NVIDIA's upcoming GP104 Pascal GPU.
Image via Zauba.com
How much information can be gleaned from an import shipping manifest (linked here)? The data indicates a chip with a 37.5 x 37.5 mm package and 2152 pins, which is being attributed to the GP104 based on knowledge of “earlier, similar deliveries” (or possible inside information). This has prompted members of the 3dcenter.org forums (German language) to speculate that the chip uses GDDR5 or GDDR5X memory, given how unlikely it is that HBM could be implemented on a package of this size.
Of course, NVIDIA has stated that Pascal will implement 3D memory, and the upcoming GP100 will reportedly be on a 55 x 55 mm package using HBM2. Could this be a new, lower-cost part using the existing GDDR5 standard or the faster GDDR5X instead? VideoCardz and WCCFtech have posted stories based on the 3DCenter report, and to quote directly from the VideoCardz post on the subject:
"3DCenter has a theory that GP104 could actually not use HBM, but GDDR5(X) instead. This would rather be a very strange decision, but could NVIDIA possibly make smaller GPU (than GM204) and still accommodate 4 HBM modules? This theory is not taken from the thin air. The GP100 aka the Big Pascal, would supposedly come in 55x55mm BGA package. That’s 10mm more than GM200, which were probably required for additional HBM modules. Of course those numbers are for the whole package (with interposer), not just the GPU."
All of this is a lot to take from a shipping record that might not even be related to an NVIDIA product, but the report has made the rounds at this point so now we’ll just have to wait for new information.