Subject: General Tech, Graphics Cards | December 4, 2017 - 05:47 PM | Tim Verry
Tagged: navi, HBM2, hbm, gddr6, amd
WCCFTech reports that AMD is working on a GDDR6 memory controller for its upcoming graphics cards. Starting with an AMD Technical Engineer listing GDDR6 on his portfolio, the site claims to have verified through sources familiar with the matter that AMD is, in fact, supporting the new graphics memory standard and will be using their own controller to support it (rather than licensing one).
AMD is not abandoning HBM2 memory, though. The company is sticking to its previously released roadmaps, and Navi will still utilize HBM2 memory, at least on the high-end SKUs. While AMD has so far only released RX Vega 64 and RX Vega 56 graphics cards, the company may well release lower-end Vega-based cards with GDDR5 at some point, although for now the Polaris architecture is handling the lower end. AMD supporting GDDR6 is a good thing and should enable cheaper mid-range cards that are not limited by the supply shortages of the more expensive (albeit much higher bandwidth) High Bandwidth Memory that have seemingly plagued both NVIDIA and AMD at various points in time.
GDDR6 further offers several advantages over GDDR5, with almost twice the speed (16 Gbps versus 9 Gbps) at lower power (1.35V versus 1.5V), plus more density and underlying technology optimizations than even GDDR5X. While G5X memory is capable of hitting the same 16 Gbps launch speeds as GDDR6, the newer memory technology offers up to 32Gb dies* versus 16Gb and a two-channel design (which ends up being a bit more efficient and easier to produce, and easier for GPU manufacturers to wire up).
GDDR6 will represent a nice speed bump for mid-range cards (the very low end may well stick with GDDR5, save for mobile parts which could benefit from the lower power GDDR6) while giving AMD somewhat better profit margins on these lower-margin SKUs and letting it produce more cards to satisfy demand. HBM2 is nice to have, but right now it is better suited to compute-oriented cards for workstation and data center usage than to gaming, and GDDR6 can offer better price-to-performance for consumer gaming cards.
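To put the per-pin speeds in context, here is a minimal sketch of how a per-pin data rate translates into peak card bandwidth. The 256-bit bus width is an assumption chosen purely for illustration (no bus width for any upcoming AMD card is given here), and the function name is hypothetical.

```python
def peak_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbps) times bus width (bits), over 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


# Assumption for illustration only: a 256-bit memory bus on a mid-range card.
BUS_WIDTH_BITS = 256

# Per-pin rates quoted in the article for GDDR5 and launch GDDR6.
for name, rate_gbps in [("GDDR5", 9.0), ("GDDR6", 16.0)]:
    bandwidth = peak_bandwidth_gbs(rate_gbps, BUS_WIDTH_BITS)
    print(f"{name} at {rate_gbps} Gbps on a {BUS_WIDTH_BITS}-bit bus: {bandwidth:.0f} GB/s")
```

On the same bus width, the bandwidth ratio is simply the ratio of the per-pin rates, which is where the "almost twice the speed" figure comes from.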
As for the question of why AMD would want to design their own GDDR6 memory controller rather than license one, I think that comes down to AMD thinking long-term. It will be more expensive up front to design their own controller, but AMD will be able to more fully integrate it and tune it to work with their graphics cards such that it can be more power efficient. Also, having their own GDDR6 memory controller means they can use it in other areas such as their APUs and SoCs offered through their Semi Custom Business Unit (e.g. the SoCs used in gaming consoles). Being able to offer that controller to other companies in their semi-custom SoCs free of third party licensing fees is a good thing for AMD.
With GDDR6 becoming readily available early next year, there is a good chance AMD will be ready to use the new memory technology as soon as Navi but likely not until closer to the end of 2018 or early 2019 when AMD launches new lower and mid-range gaming cards (consumer-level) based on Navi and/or Vega.
*At launch it appears that GDDR6 from the big three (Micron, Samsung, and SK Hynix) will use 16Gb dies, but the standard allows for up to 32Gb dies. The G5X standard allows for up to 16Gb dies.
- (Leak) AMD Vega 10 and Vega 20 Information Leaked
- Micron Pushes GDDR5X To 16Gbps, Expects To Launch GDDR6 In Early 2018
- Micron Planning To Launch GDDR6 Graphics Memory In 2017
- Podcast #436 - ECS Mini-STX, NVIDIA Quadro, AMD Zen Arch, Optane, GDDR6 and more!
- AMD Q3 2017 Earnings: A Pleasant Surprise
Subject: Processors | November 6, 2017 - 02:00 PM | Josh Walrath
Tagged: radeon, Polaris, mobile, kaby lake, interposer, Intel, HBM2, gaming, EMIB, apple, amd, 8th generation core
In what is probably considered one of the worst kept secrets in the industry, Intel has announced a new CPU line for the mobile market that integrates AMD’s Radeon graphics. For the past year or so rumors of such a partnership were freely flowing, but now we finally get confirmation as to how this will be implemented and marketed.
Intel’s record on designing GPUs has been rather pedestrian. While they have kept up with the competition, a slew of small issues and incompatibilities has plagued each generation. Performance is also an issue when trying to compete with AMD’s APUs as well as discrete mobile graphics offerings from both AMD and NVIDIA. Software and driver support is another area where Intel has been unable to compete, due largely to economics and the competition’s decades of experience in this area.
There are many significant issues that have been solved in one fell swoop. Intel has partnered with AMD’s Semi-Custom Group to develop a modern and competent GPU that can be closely connected to the Intel CPU all the while utilizing HBM2 memory to improve overall performance. The packaging of this product utilizes Intel’s EMIB (Embedded Multi-die Interconnect Bridge) tech.
EMIB is an interposer-like technology that integrates silicon bridges into the PCB instead of relying upon a large interposer. This allows a bit more flexibility in layout of the chips as well as lowers the Z height of the package as there is not a large interposer sitting between the chips and the PCB. Just as interposer technology allows the use of chips from different process technologies to work seamlessly together, EMIB provides that same flexibility.
The GPU looks to be based on the Polaris architecture which is a slight step back from AMD’s cutting edge Vega architecture. Polaris does not implement the Infinity Fabric component that Vega does. It is more conventional in terms of data communication. It is a step beyond what AMD has provided for Sony and Microsoft, who each utilize a semi-custom design for the latest console chips. AMD is able to integrate the HBM2 controller that is featured in Vega. Using HBM2 provides a tremendous amount of bandwidth along with power savings as compared to traditional GDDR-5 memory modules. It also saves dramatically on PCB space allowing for smaller form factors.
EMIB provides nearly all of the advantages of the interposer while keeping the optimal z-height of the standard PCB substrate.
Intel did have to do quite a bit of extra work on the power side of the equation. AMD utilizes their latest Infinity Fabric for fine grained power control in their upcoming Raven Ridge based Ryzen APUs. Intel had to modify their current hardware to be able to do much the same work with 3rd party silicon. This is no easy task as the CPU needs to monitor and continually adjust for GPU usage in a variety of scenarios. This type of work takes time and a lot of testing to fine tune, as well as the inevitable hardware revisions to get things to work correctly. This then needs to be balanced by the GPU driver stack, which also tends to take control of power usage in mobile scenarios.
This combination of EMIB, an Intel Kaby Lake CPU, HBM2, and a current AMD GPU makes for a very interesting product for the mobile and small form factor markets. The EMIB packaging provides very fast interconnect speeds and a smaller footprint due to the integration of HBM2 memory. The mature AMD Radeon software stack for both Windows and macOS environments gives Intel another feature with which to sell their parts in areas where previously they were not considered. The 8th Gen Kaby Lake CPU provides the very latest CPU design on the new 14nm++ process for greater performance and better power efficiency.
This is one of those rare instances where such cooperation between intense rivals actually improves the situation for both. AMD gets a financial shot in the arm by signing a large and important customer for their Semi-Custom division. The royalty income from this partnership should be more consistent as compared to the console manufacturers due to the seasonality of the console product. This will have a very material effect on AMD’s bottom line for years to come. Intel gets a solid silicon solution with higher performance than they can offer, as well as aforementioned mature software stack for multiple OS. Finally throw in the HBM2 memory support for better power efficiency and a smaller form factor, and it is a clear win for all parties involved.
The PCB savings plus faster interconnects will allow these chips to power smaller form factors with better performance and battery life.
One of the unknowns here is what process node the GPU portion will be manufactured on. We do not know which foundry Intel will use, or if they will stay in-house. Currently TSMC manufactures the latest console SoCs while GLOBALFOUNDRIES handles the latest GPUs from AMD. Initially one would expect Intel to build the GPU in house, but the current rumor is that AMD will work to produce the chips with one of their traditional foundry partners. Once the chip is manufactured, it is sent to Intel to be integrated into their product.
Apple is one of the obvious candidates for this particular form factor and combination of parts. Apple has a long history with Intel on the CPU side and AMD on the GPU side. This product provides all of the solutions Apple needs to manufacture high performance products in smaller form factors. Gaming laptops also get a boost from such a combination that will offer relatively high performance with minimal power increases as well as the smaller form factor.
The potential (leaked) performance of the 8th Gen Intel CPU with Radeon Graphics.
The data above could very well be wrong about the potential performance of this combination. What we see is pretty compelling though. The Intel/AMD product performs like a higher end CPU with discrete GPU combo. It is faster than a NVIDIA GTX 1050 Ti and trails the GTX 1060. It also is significantly faster than a desktop AMD RX 560 part. We can also see that it is going to be much faster than the flagship 15 watt TDP AMD Ryzen 7 2700U. We do not yet know how it compares to the rumored 65 watt TDP Raven Ridge based APUs from AMD that will likely be released next year. What will be fascinating here is how much power the new Intel combination will draw as compared to the discrete solutions utilizing NVIDIA graphics.
To reiterate, this is Intel as a customer for AMD’s Semi-Custom group rather than a licensing agreement between the two companies. They are working hand in hand in developing this solution and then both profiting from it. AMD getting royalties from every Intel package sold that features this technology will have a very positive effect on earnings. Intel gets a cutting edge and competent graphics solution along with the improved software and driver support such a package includes.
Update: We have been informed that AMD is producing the chips and selling them directly to Intel for integration into these new SKUs. There are no royalties or licensing, but the Semi-Custom division should still receive the revenue for these specialized products made only for Intel.
Subject: General Tech | March 28, 2017 - 01:04 PM | Jeremy Hellstrom
Tagged: amd, Vega, rumour, HBM2
The Inquirer has posted a tiny bit of information about AMD's upcoming Vega, and as rumours about the new GPU are hard to find, it is the best we have at the moment. AMD's claim is that the second generation HBM present on the 4GB and 8GB models could offer memory bandwidth equivalent to a GTX 1080 Ti, which makes perfect sense: the GTX 1080 Ti offers 484 GB/s of memory bandwidth, while the first generation HBM on AMD's R9 series offers 512 GB/s. The real trick is filling that pipeline to give AMD's HBM2 based cards a chance to shine, which depends on software developers as much as it does on the hardware. As well, The Inquirer discusses the possible efficiency advantages that Vega will have, which could result in smaller cards as well as an effective mobile product. Pop over to take a look at the current rumours; here is hoping we can provide more detailed information in the near future.
"AMD HAS TEASED more information about its forthcoming Vega-based graphics cards, revealing that they will come with either 4GB or 8GB memory and hinting that a launch is imminent."
Here is some more Tech News from around the web:
- iPhone-havers think they're safe. But they're not @ The Register
- FYI Docs.com users: You may have leaked passwords, personal info – thousands have @ The Register
- LastPass scrambles to fix another major flaw – once again spotted by Google's bugfinders @ The Register
- Johnny Depp signs on to play John McAfee in a film of his life @ The Inquirer
- Samsung 4K Blu-ray Player @ Hardware Secrets
- Futuremark Ends Support for 3DMark Vantage and PCMark Vantage @ [H]ard|OCP
- Konica Minolta Unveils the Future of Work, Or At Least Its Version @ Kitguru
- Win a PC hardware bundle with Gigabyte AORUS, HyperX and KitGuru
NVIDIA P100 comes to Quadro
At the start of the SOLIDWORKS World conference this week, NVIDIA took the cover off of a handful of new Quadro cards targeting professional graphics workloads. Though the bulk of NVIDIA’s discussion covered lower cost options like the Quadro P4000, P2000, and below, the most interesting product sits at the high end, the Quadro GP100.
As you might guess from the name alone, the Quadro GP100 is based on the GP100 GPU, the same silicon used on the Tesla P100 announced back in April of 2016. At the time, the GP100 GPU was specifically billed as an HPC accelerator for servers. It had a unique form factor with a passive cooler that required additional chassis fans. Just a couple of months later, a PCIe version of the GP100 was released under the Tesla P100 brand with the same specifications.
Today that GPU hardware gets a third iteration as the Quadro GP100. Let’s take a look at the Quadro GP100 specifications and how it compares to some recent Quadro offerings.
|Specification||Quadro GP100||Quadro P6000||Quadro M6000||Full GP100|
|FP32 CUDA Cores / SM||64||64||64||64|
|FP32 CUDA Cores / GPU||3584||3840||3072||3840|
|FP64 CUDA Cores / SM||32||2||2||32|
|FP64 CUDA Cores / GPU||1792||120||96||1920|
|Base Clock||1303 MHz||1417 MHz||1026 MHz||TBD|
|GPU Boost Clock||1442 MHz||1530 MHz||1152 MHz||TBD|
|FP32 TFLOPS (SP)||10.3||12.0||7.0||TBD|
|FP64 TFLOPS (DP)||5.15||0.375||0.221||TBD|
|Memory Interface||1.4 Gbps 4096-bit HBM2||9.0 Gbps 384-bit GDDR5X||6.6 Gbps 384-bit GDDR5||4096-bit HBM2|
|Memory Bandwidth||716 GB/s||432 GB/s||316.8 GB/s||?|
|Memory Size||16GB||24 GB||12GB||16GB|
|TDP||235 W||250 W||250 W||TBD|
|Transistors||15.3 billion||12 billion||8 billion||15.3 billion|
|GPU Die Size||610mm2||471 mm2||601 mm2||610mm2|
There are some interesting stats here that may not be obvious at first glance. Most interesting is that despite the pricing and segmentation, the GP100 is not the de facto fastest Quadro card from NVIDIA depending on your workload. With 3584 CUDA cores running at somewhere around 1400 MHz at Boost speeds, the single precision (32-bit) rating for GP100 is 10.3 TFLOPS, less than the recently released P6000 card. Based on GP102, the P6000 has 3840 CUDA cores running at something around 1500 MHz for a total of 12 TFLOPS.
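The TFLOPS ratings discussed above follow from core count and clock speed. The sketch below assumes the usual convention of two floating-point operations per core per clock (one fused multiply-add) and uses the boost clocks listed in the table; the function name is hypothetical, and actual sustained clocks will vary by workload.

```python
def peak_tflops(cuda_cores: int, clock_mhz: float, flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak throughput in TFLOPS.

    Assumes each core retires one fused multiply-add (2 FLOPs) per clock.
    """
    return cuda_cores * flops_per_core_per_clock * clock_mhz * 1e6 / 1e12


# (name, FP32 cores, FP64 cores, boost clock in MHz) taken from the table above.
cards = [
    ("Quadro GP100", 3584, 1792, 1442),
    ("Quadro P6000", 3840, 120, 1530),
    ("Quadro M6000", 3072, 96, 1152),
]

for name, fp32_cores, fp64_cores, boost_mhz in cards:
    sp = peak_tflops(fp32_cores, boost_mhz)
    dp = peak_tflops(fp64_cores, boost_mhz)
    print(f"{name}: {sp:.2f} TFLOPS FP32, {dp:.3f} TFLOPS FP64")
```

The results land close to the table's ratings. Small differences are expected because vendor ratings are not necessarily computed at exactly the listed boost clock.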
GP100 (full) Block Diagram
Clearly the placement for Quadro GP100 is based around its 64-bit, double precision performance, and its ability to offer real-time simulations on more complex workloads than other Pascal-based Quadro cards can offer. The Quadro GP100 offers a 1/2 DP compute rate, totaling 5.2 TFLOPS. The P6000, on the other hand, is only capable of 0.375 TFLOPS with the standard, consumer level 1/32 DP rate. Inclusion of ECC memory support on GP100 is also something no other recent Quadro card has.
Raw graphics performance and throughput is going to be questionable until someone does some testing, but it seems likely that the Quadro P6000 will still be the best solution for that by at least a slim margin. With a higher CUDA core count, higher clock speeds and equivalent architecture, the P6000 should run games, graphics rendering and design applications very well.
There are other important differences offered by the GP100. The memory system is built around a 16GB HBM2 implementation which means more total memory bandwidth but at a lower capacity than the 24GB Quadro P6000. Offering 66% more memory bandwidth does mean that the GP100 offers applications that are pixel throughput bound an advantage, as long as the compute capability keeps up on the backend.
93% of a GP100 at least...
NVIDIA has announced the Tesla P100, the company's newest (and most powerful) accelerator for HPC. Based on the Pascal GP100 GPU, the Tesla P100 is built on 16nm FinFET and uses HBM2.
NVIDIA provided a comparison table, to which we added what we know about a full GP100:
|Specification||Tesla K40||Tesla M40||Tesla P100||Full GP100|
|GPU||GK110 (Kepler)||GM200 (Maxwell)||GP100 (Pascal)||GP100 (Pascal)|
|FP32 CUDA Cores / SM||192||128||64||64|
|FP32 CUDA Cores / GPU||2880||3072||3584||3840|
|FP64 CUDA Cores / SM||64||4||32||32|
|FP64 CUDA Cores / GPU||960||96||1792||1920|
|Base Clock||745 MHz||948 MHz||1328 MHz||TBD|
|GPU Boost Clock||810/875 MHz||1114 MHz||1480 MHz||TBD|
|Memory Interface||384-bit GDDR5||384-bit GDDR5||4096-bit HBM2||4096-bit HBM2|
|Memory Size||Up to 12 GB||Up to 24 GB||16 GB||TBD|
|L2 Cache Size||1536 KB||3072 KB||4096 KB||TBD|
|Register File Size / SM||256 KB||256 KB||256 KB||256 KB|
|Register File Size / GPU||3840 KB||6144 KB||14336 KB||15360 KB|
|TDP||235 W||250 W||300 W||TBD|
|Transistors||7.1 billion||8 billion||15.3 billion||15.3 billion|
|GPU Die Size||551 mm2||601 mm2||610 mm2||610mm2|
|Manufacturing Process||28 nm||28 nm||16 nm||16nm|
This table is designed for developers that are interested in GPU compute, so a few variables (like ROPs) are still unknown, but it still gives us a huge insight into the “big Pascal” architecture. The jump to 16nm allows for about twice the number of transistors, 15.3 billion, up from 8 billion with GM200, with roughly the same die area, 610 mm2, up from 601 mm2.
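The "about twice the transistors in roughly the same area" observation can be checked with a quick density calculation. This is a rough sketch using only the transistor counts and die sizes from the table; the function name is hypothetical.

```python
def density_mtr_per_mm2(transistors_billion: float, die_area_mm2: float) -> float:
    """Average transistor density in millions of transistors per square millimetre."""
    return transistors_billion * 1000 / die_area_mm2


gm200 = density_mtr_per_mm2(8.0, 601)    # 28 nm Maxwell
gp100 = density_mtr_per_mm2(15.3, 610)   # 16 nm Pascal

print(f"GM200: {gm200:.1f} MTr/mm2")
print(f"GP100: {gp100:.1f} MTr/mm2")
print(f"Density gain: {gp100 / gm200:.2f}x")
```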
A full GP100 processor will have 60 shader modules, compared to GM200's 24, although Pascal stores half of the shaders per SM. The GP100 part that is listed in the table above is actually partially disabled, cutting off four of the sixty total. This leads to 3584 single-precision (32-bit) CUDA cores, which is up from 3072 in GM200. (The full GP100 architecture will have 3840 of these FP32 CUDA cores -- but we don't know when or where we'll see that.) The base clock is also significantly higher than Maxwell, 1328 MHz versus ~1000 MHz for the Titan X and 980 Ti, although Ryan has overclocked those GPUs to ~1390 MHz with relative ease. This is interesting, because even though 10.6 TeraFLOPs is amazing, it's only about 20% more than what GM200 could pull off with an overclock.
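The "93%" in the title and several rows of the table fall out of the SM count. The sketch below derives them from the figures quoted above (60 SMs in the full chip, four disabled on the Tesla P100, 64 FP32 cores and 256 KB of register file per SM); the constant and function names are hypothetical.

```python
FP32_CORES_PER_SM = 64
REGISTER_FILE_KB_PER_SM = 256
FULL_GP100_SMS = 60
TESLA_P100_SMS = FULL_GP100_SMS - 4  # four SMs are disabled on the Tesla P100


def fp32_cores(sm_count: int) -> int:
    """Total FP32 CUDA cores for a given number of enabled SMs."""
    return sm_count * FP32_CORES_PER_SM


def register_file_kb(sm_count: int) -> int:
    """Total register file size in KB for a given number of enabled SMs."""
    return sm_count * REGISTER_FILE_KB_PER_SM


enabled_fraction = TESLA_P100_SMS / FULL_GP100_SMS

print(f"Tesla P100: {fp32_cores(TESLA_P100_SMS)} cores, {register_file_kb(TESLA_P100_SMS)} KB registers")
print(f"Full GP100: {fp32_cores(FULL_GP100_SMS)} cores, {register_file_kb(FULL_GP100_SMS)} KB registers")
print(f"Enabled fraction: {enabled_fraction:.0%}")
```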
Some Hints as to What Comes Next
On March 14 at the Capsaicin event at GDC AMD disclosed their roadmap for GPU architectures through 2018. There were two new names in attendance as well as some hints at what technology will be implemented in these products. It was only one slide, but some interesting information can be inferred from what we have seen and what was said in the event and afterwards during interviews.
Polaris is the next generation of GCN products from AMD that have been shown off for the past few months. Previously in December and at CES we saw the Polaris 11 GPU on display. Very little is known about this product except that it is small and extremely power efficient. Last night we saw the Polaris 10 being run and we only know that it is competitive with current mainstream performance and is larger than the Polaris 11. These products are purportedly based on Samsung/GLOBALFOUNDRIES 14nm LPP.
The source of near endless speculation online.
In the slide AMD showed it listed Polaris as having 2.5X the performance per watt over the previous 28 nm products in AMD’s lineup. This is impressive, but not terribly surprising. AMD and NVIDIA both skipped the 20 nm planar node because it just did not offer up the type of performance and scaling to make sense economically. Simply put, the expense was not worth the results in terms of die size improvements and more importantly power scaling. 20 nm planar just could not offer the type of performance overall that GPU manufacturers could achieve with 2nd and 3rd generation 28nm processes.
What was missing from the slide is any mention of whether Polaris will integrate either HBM1 or HBM2. Vega, the architecture after Polaris, does in fact list HBM2 as the memory technology it will be packaged with. It promises another tick up in terms of performance per watt, but that is going to come more from aggressive design optimizations and likely improvements on FinFET process technologies. Vega will be a 2017 product.
Beyond that we see Navi. It again boasts an improvement in perf per watt as well as the inclusion of a new memory technology beyond HBM. Current conjecture is that this could be HMC (hybrid memory cube). I am not entirely certain of that particular conjecture as it does not necessarily improve upon the advantages of current generation HBM and upcoming HBM2 implementations. Navi will not show up until 2018 at the earliest. This *could* be a 10 nm part, but considering the struggle that the industry has had getting to 14/16nm FinFET I am not holding my breath.
AMD provided few details about these products other than what we see here. From here on out is conjecture based upon industry trends, analysis of known roadmaps, and the limitations of the process and memory technologies that are already well known.
Subject: Graphics Cards | March 15, 2016 - 02:02 AM | Ryan Shrout
Tagged: vulkan, raja koduri, Polaris, HBM2, hbm, dx12, crossfire, amd
After hosting the AMD Capsaicin event at GDC tonight, the SVP and Chief Architect of the Radeon Technologies Group Raja Koduri sat down with me to talk about the event and offered up some additional details on the Radeon Pro Duo, upcoming Polaris GPUs and more. The video below has the full interview but there are several highlights that stand out as noteworthy.
- Raja claimed that one of the reasons to launch the dual-Fiji card as the Radeon Pro Duo for developers rather than pure Radeon, aimed at gamers, was to “get past CrossFire.” He believes we are at an inflection point with APIs. Where previously you would abstract two GPUs to appear as a single to the game engine, with DX12 and Vulkan the problem is more complex than that as we have seen in testing with early titles like Ashes of the Singularity.
But with the dual-Fiji product mostly developed and prepared, AMD was able to find a market between the enthusiast and the creator to target, and thus the Radeon Pro branding was born.
Raja further expands on it, telling me that in order to make multi-GPU useful and productive for the next generation of APIs, getting multi-GPU hardware solutions in the hands of developers is crucial. He admitted that CrossFire in the past has had performance scaling concerns and compatibility issues, and that getting multi-GPU correct from the ground floor here is crucial.
- With changes in Moore’s Law and the realities of process technology and processor construction, multi-GPU is going to be more important for the entire product stack, not just the extreme enthusiast crowd. Why? Because realities are dictating that GPU vendors build smaller, more power efficient GPUs, and to scale performance overall, multi-GPU solutions need to be efficient and plentiful. The “economics of the smaller die” are much better for AMD (and we assume NVIDIA) and by 2017-2019, this is the reality and will be how graphics performance will scale.
Getting the software ecosystem going now is going to be crucial to ease into that standard.
- The naming scheme of Polaris (10, 11…) has no equation, it’s just “a sequence of numbers” and we should only expect it to increase going forward. The next Polaris chip will be bigger than 11, that’s the secret he gave us.
There have been concerns that AMD was only going to go for the mainstream gaming market with Polaris but Raja promised me and our readers that we “would be really really pleased.” We expect to see Polaris-based GPUs across the entire performance stack.
- AMD’s primary goal here is to get many millions of gamers VR-ready, though getting the enthusiasts “that last millisecond” is still a goal and it will happen from Radeon.
- No solid date on Polaris parts at all – I tried! (Other than the launches start in June.) Though Raja did promise that after tonight, he will not have his next alcoholic beverage until the launch of Polaris. Serious commitment!
- Curious about the HBM2 inclusion in Vega on the roadmap and what that means for Polaris? Though he didn’t say it outright, it appears that Polaris will be using HBM1, leaving me to wonder about the memory capacity limitations inherent in that. Has AMD found a way to get past the 4GB barrier? We are trying to figure that out for sure.
Why is Polaris going to use HBM1? Raja pointed towards the extreme cost and expense of building the HBM ecosystem prepping the pipeline for the new memory technology as the culprit and AMD obviously wants to recoup some of that cost with another generation of GPU usage.
Speaking with Raja is always interesting, and the confidence and knowledge he showcases are still what give me assurance that the Radeon Technologies Group is headed in the correct direction. This is going to be a very interesting year for graphics, PC gaming and for GPU technologies, as showcased throughout the Capsaicin event, and I think everyone should be looking forward to it.
Subject: Graphics Cards | March 11, 2016 - 05:03 PM | Sebastian Peak
Tagged: rumor, report, pascal, nvidia, HBM2, gtx1080, GTX 1080, gtx, GP104, geforce, gddr5x
We are expecting news of the next NVIDIA graphics card this spring, and as usual whenever an announcement is imminent we have started seeing some rumors about the next GeForce card.
(Image credit: NVIDIA)
Pascal is the name we've all been hearing about, and along with this next-gen core we've been expecting HBM2 (second-gen High Bandwidth Memory). This makes today's rumor all the more interesting, as VideoCardz is reporting (via BenchLife) that a card called either the GTX 1080 or GTX 1800 will be announced, using the GP104 GPU core with 8GB of GDDR5X - and not HBM2.
The report also claims that NVIDIA CEO Jen-Hsun Huang will have an announcement for Pascal in April, which leads us to believe a shipping product based on Pascal is finally in the works. Taking in all of the information from the BenchLife report, VideoCardz has created this list to summarize the rumors (taken directly from the source link):
- Pascal launch in April
- GTX 1080/1800 launch in May 27th
- GTX 1080/1800 has GP104 Pascal GPU
- GTX 1080/1800 has 8GB GDDR5X memory
- GTX 1080/1800 has one 8pin power connector
- GTX 1080/1800 has 1x DVI, 1x HDMI, 2x DisplayPort
- First Pascal board with HBM would be GP100 (Big Pascal)
Rumored GTX 1080 Specs (Credit: VideoCardz)
The alleged single 8-pin power connector with this GTX 1080 would place the power limit at 225W, though it could very well require less power. The GTX 980 is only a 165W part, with the GTX 980 Ti rated at 250W.
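The 225W figure comes from adding the connector's rating to what the PCIe slot itself can deliver. A minimal sketch, assuming the standard PCI Express limits of 75W from the slot, 75W per 6-pin connector and 150W per 8-pin connector; the names below are hypothetical.

```python
PCIE_SLOT_W = 75                              # power available from the x16 slot itself
CONNECTOR_W = {"6-pin": 75, "8-pin": 150}     # auxiliary PCIe power connector ratings


def board_power_limit_w(connectors: list[str]) -> int:
    """In-spec power ceiling for a card: slot power plus all auxiliary connectors."""
    return PCIE_SLOT_W + sum(CONNECTOR_W[c] for c in connectors)


rumored_gtx_1080 = board_power_limit_w(["8-pin"])
print(f"Single 8-pin card: up to {rumored_gtx_1080} W")
print(f"Headroom over a 165 W GTX 980: {rumored_gtx_1080 - 165} W")
```

This is only a ceiling set by the connectors; as noted above, the card's actual TDP could sit well below it.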
As always, only time will tell how accurate these rumors are. VideoCardz points out that "BenchLife stories are usually correct", but they are skeptical of the report based on the name GTX 1080 (even though this would follow the current naming scheme of GeForce cards).
Subject: Memory | February 15, 2016 - 05:59 PM | Jeremy Hellstrom
Tagged: Samsung, HBM2, Data Memory Systems
Samsung is ready to roll out the next generation of High Bandwidth Memory, aka HBM2, for your desktop and not just your next generation of GPU. They have already begun production on 4GB HBM2 DRAM and promise 8GB DIMMs by the end of this year. The modules will provide double the bandwidth of HBM1, up to 256GB/s of bandwidth, which is very impressive compared to the up to 70GB/s that DDR4-3200 theoretically offers.
Not only is this technology going to appear in the next generation of NVIDIA and AMD GPUs, but it could also work its way into main system memory. Of course these DIMMs are not going to work with any desktop or mobile processor currently on the market, but we will hopefully see new processors with compatible memory controllers in the near future. You can also expect this to come with a cost, not just in expensive DIMMs at launch but also a comparable increase in CPU prices, as they will cost more to manufacture initially.
It will be very interesting to see how this affects the overall market; will we see a split similar to what is currently seen in mainstream GPUs, a lower cost DDR version and a standard GDDR version? The new market could see DDRx and HBMx models of CPUs and motherboards and could do the same for the GPU market, with the end of DDR on graphics cards. If so, will it spell the end of DDR5 development? Interesting times to be living in; we should be hearing more from Samsung in the near future.
Subject: Graphics Cards, Memory | January 19, 2016 - 11:01 PM | Scott Michaud
Tagged: Samsung, HBM2, hbm
Samsung has just announced that they have begun mass production of 4GB HBM2 memory modules. When used on GPUs, four packages can provide 16GB of Video RAM with very high performance. They do this with a very wide data bus, which trades off frequency for transferring huge chunks. Samsung's offering is rated at 256 GB/s per package, which is twice what the Fury X could do with HBM1.
They also expect to mass produce 8GB HBM2 packages within this calendar year. I'm guessing that this means we'll see 32GB GPUs in the late-2016 or early-2017 time frame unless "within this year" means very, very soon (versus Q3/Q4). They will likely be for workstation or professional cards, but, in NVIDIA's case, those are usually based on architectures that are marketed to high-end gaming enthusiasts through some Titan offering. There's a lot of ways this could go, but a 32GB Titan seems like a bit much; I wouldn't expect that this affects the enthusiast gamer segment. It might mean that professionals looking to upgrade from the Kepler-based Tesla K-series might be waiting a little longer, maybe even GTC 2017. Alternatively, they might get new cards, just with a 16GB maximum until a refresh next year. There's not enough information to know one way or the other, but it's something to think about when more of it starts rolling in.
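The capacity and bandwidth figures above scale linearly with the number of packages on the GPU. A small sketch of that arithmetic, using the per-package numbers from the announcement; the four-package layout for the 8GB case mirrors the 16GB example, the Fury X row assumes four 1GB HBM1 packages at half the HBM2 per-package rate, and the names are hypothetical.

```python
from typing import NamedTuple


class MemoryConfig(NamedTuple):
    capacity_gb: int
    bandwidth_gbs: int


def gpu_memory(packages: int, gb_per_package: int, gbs_per_package: int) -> MemoryConfig:
    """Aggregate capacity (GB) and peak bandwidth (GB/s) across identical HBM packages."""
    return MemoryConfig(packages * gb_per_package, packages * gbs_per_package)


configs = {
    "4 x 4GB HBM2": gpu_memory(4, 4, 256),
    "4 x 8GB HBM2": gpu_memory(4, 8, 256),
    # Assumption: Fury X layout of four 1GB HBM1 packages at 128 GB/s each.
    "4 x 1GB HBM1 (Fury X)": gpu_memory(4, 1, 128),
}

for name, cfg in configs.items():
    print(f"{name}: {cfg.capacity_gb} GB, {cfg.bandwidth_gbs} GB/s")
```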
Samsung's HBM2 modules are compatible with ECC, although I believe that was also true for at least some HBM1 modules from SK Hynix.