
High Bandwidth Memory (HBM) Architecture - AMD Plans for the Future of GPUs

Manufacturer: AMD

High Bandwidth Memory

UPDATE: I have embedded an excerpt from our PC Perspective Podcast that discusses the HBM technology that you might want to check out in addition to the story below.

The chances are good that if you have been reading PC Perspective, or almost any other website that focuses on GPU technologies, over the past year, you have come across the acronym HBM. You might even have seen its full name: high bandwidth memory. HBM is a new technology that aims to turn how a processor (GPU, CPU, APU, etc.) accesses memory upside down, almost literally. AMD has already publicly stated that its next generation flagship Radeon GPU will use HBM as part of its design, but it wasn’t until today that we could talk about what HBM actually offers a high performance processor like Fiji. At its core, HBM drastically changes how the memory interface works, how much power it requires and what metrics we will use to compare competing memory architectures. AMD and its partners started working on HBM more than 7 years ago, and with the first retail product nearly ready to ship, it’s time to learn about HBM.

We got some time with AMD’s Joe Macri, Corporate Vice President and Product CTO, to talk about AMD’s move to HBM and how it will shift the direction of AMD products going forward.

The first step in understanding HBM is to understand why it’s needed in the first place. Current GPUs, including the AMD Radeon R9 290X and the NVIDIA GeForce GTX 980, use a memory technology known as GDDR5. This architecture has scaled well over the past several GPU generations, but we are entering the world of diminishing returns. Balancing memory performance and power consumption is always a tough battle; just ask ARM about it. Desktop components have much larger power envelopes to work inside, but plot the power curve GDDR5 is on far enough into the future and it hits a wall. The result would be either drastically higher power consumption on graphics cards or stalling performance improvements in the graphics market – something we have not really seen in its history.


While current and maybe even next generation GPU designs could still depend on GDDR5 as the memory interface, a move to a different solution is needed for the future; AMD is just making the jump earlier than the rest of the industry.


But GDDR5 also limits GPU and graphics card designs in another way: form factor. Implementing a high performance GDDR5 memory interface requires a large number of chips to reach the required bandwidth levels. Because of that, PCB real estate becomes a concern, and routing those traces and chips on a board becomes complicated. The wider the GPU memory interface (256-bit, 384-bit), the more board space the memory implementation takes up. And as frequencies increase and power draw goes up on GDDR5, the need for larger voltage regulators becomes a concern as well.


This diagram provided by AMD shows the layout of the GPU and memory chips required to reach the rated bandwidth for the graphics card. Even though the GPU die is a small portion of that total area, the need to surround the GPU with 16 DRAM chips, all equidistant from their matching PHY locations on the GPU, takes time, engineering and space.

Another concern is that scaling GDDR5 performance beyond where it sits today will cause issues with power. More bandwidth requires more power, and DRAM power consumption is not linear; you see a disproportionate increase in power consumption as the bandwidth level rises. As GPUs increase compute rates and games demand more pixels for larger screens and higher refresh rates, the demand for more memory bandwidth is not stabilizing and certainly isn’t regressing. Thus a move to HBM makes sense today.

Historically, when a technology hits an inflection point like this, we have seen integration onto the same piece of silicon. In 1989 Intel moved the cache and floating point unit onto the processor die; in 2003 AMD was the first to integrate the memory controller, traditionally part of the north bridge, into the CPU; graphics, the south bridge and even voltage regulation all followed suit.


But on-chip integration of DRAM is problematic. The process technology used for GPUs and high performance processors traditionally differs greatly from that used for DRAM chips. The transistor density of a GPU is nowhere near the density of DRAM, so putting both on the same piece of silicon would degrade the quality, performance or power consumption of both. It might be possible to develop a process technology that serves both as well as today’s separate implementations do, but that would drive up production cost – something all parties would like to avoid.

The answer for HBM is an interposer. The interposer is a piece of silicon that both the memory and the processor sit on, allowing the DRAM to be in very close proximity to the GPU/CPU/APU without being on the same physical die. This close proximity enables several very important characteristics that give HBM its advantages over GDDR5. First, it allows for extremely wide communication buses: rather than 32 bits per GDDR5 DRAM, we are looking at 1024 bits for a stacked array of DRAM (more on that in a minute). Being closer to the GPU also means the clocks that regulate data transfer between the memory and the processor can be simpler, and slower, saving power and design complexity. As a result, the proximity of the memory means the overall memory design and architecture can improve performance per watt to an impressive degree.


Integration of the interposer also means that the GPU and the memory chips can be made on different process technologies. If AMD wants to use the 28nm process for its GPU but 19nm DRAM for the memory, it can do that. The interposer itself, also made of silicon, can be built on a much larger, more cost efficient process node as well. AMD’s first interposer has no active transistors and essentially acts like a highway for data to move from one logic location to another: memory to GPU and back. At only 100 microns thick, the interposer will not add much to the z-height of the product, and with tricks like double exposures you can build an interposer big enough for any GPU and memory requirement. As an interesting side note, AMD’s Joe Macri did tell me that the interposer is so thin that holding it in your fingers results in a sheet-of-paper-like flopping.

AMD’s partners ASE, Amkor and UMC are responsible for manufacturing this first interposer – the first time I have heard UMC’s name in many years!

So now that we know what an interposer is and how it allows the HBM solution to exist today, what does the high bandwidth memory itself bring to the table? HBM is DRAM-based but was built with low power consumption and ultra wide bus widths in mind. The idea was to target a “wide and slow” architecture, one that scales up to high amounts of bandwidth and where latency wasn’t as big of a concern. (Interestingly, latency improved in the design anyway, without that being a goal.) The DRAM chips are stacked vertically, four high, with a logic die at the base. The DRAM dies and the logic die are connected to each other with through silicon vias (TSVs), tiny holes etched through the silicon that permit die-to-die communication at incredible speeds. Allyn taught us all about TSVs back in September of 2014 after a talk at IDF, and if you are curious about how this magic happens, that story is worth reading.


Note: In reality the GPU die and the HBM stack are approximately the same height

Where the HBM stack’s logic die meets the interposer, micro-bumps are used for more traditional communication, power delivery and mounting. The same style of pads is also used to connect the GPU/APU/CPU to the interposer and the interposer to the package substrate.

Moving the DRAM control logic to the bottom of the stack makes better use of die space and places the memory PHYs (the physical interface layer) in closer proximity to the matching PHYs on the GPU itself. This helps save power and simplifies the design.


Each memory stack of HBM 1 (more on that designation later) is composed of four 256MB DRAM dies, for a total of 1GB of memory per stack. Compared to a single GDDR5 DRAM (essentially a stack of one), the HBM stack changes the specifications in nearly every way. The bus width of the HBM stack is now 1024 bits, though the clock speed drops substantially to 500 MHz. Even with GDDR5 hitting clock speeds as high as 1750 MHz, the bus width tips the balance in HBM’s favor, resulting in total memory bandwidth of 128 GB/s per stack compared to 28 GB/s per GDDR5 chip. And because of the changes to clocking styles and rates, the HBM stacks can operate at 1.3V rather than 1.5V.
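For readers who want to check that arithmetic, here is a minimal sketch in Python (the helper function name is ours, not AMD's): peak bandwidth is simply bus width times interface clock times the number of data transfers per clock, with GDDR5 quad-pumped and HBM running at double data rate.

```python
# Peak bandwidth = bus width (bytes) x interface clock x data transfers per clock.

def peak_bandwidth_gb_s(bus_width_bits, clock_mhz, transfers_per_clock):
    """Return peak memory bandwidth in GB/s."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9

hbm_stack  = peak_bandwidth_gb_s(1024, 500, 2)   # 1024-bit, 500 MHz, DDR  -> 128 GB/s
gddr5_chip = peak_bandwidth_gb_s(32, 1750, 4)    # 32-bit, 1750 MHz, QDR   ->  28 GB/s

print(f"HBM stack:  {hbm_stack:.0f} GB/s")
print(f"GDDR5 chip: {gddr5_chip:.0f} GB/s")
```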


The first iteration of HBM on the flagship AMD Radeon GPU will include four stacks of HBM, a total of 4GB of GPU memory. That should put total bandwidth for the new AMD Fiji GPU in the neighborhood of 512 GB/s; compare that to the R9 290X today at 320 GB/s and you are looking at a raw increase of roughly 60%. Memory power efficiency improves at an even greater rate: AMD claims that HBM will deliver more than 35 GB/s of bandwidth per watt consumed by the memory system, while GDDR5 manages just over 10 GB/s per watt.
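A quick back-of-the-envelope check on those numbers (the per-watt figures are AMD's claims, not independent measurements):

```python
# Four HBM stacks vs. the R9 290X's 512-bit GDDR5 memory system.
hbm_total  = 4 * 128                  # four stacks at 128 GB/s each = 512 GB/s
gddr5_290x = 320                      # GB/s on the R9 290X today

print(f"Bandwidth increase: {(hbm_total / gddr5_290x - 1) * 100:.0f}%")   # ~60%

# AMD's claimed efficiency figures imply rough memory-system power draws:
print(f"HBM:   ~{hbm_total / 35:.0f} W")     # >35 GB/s per watt -> ~15 W for 512 GB/s
print(f"GDDR5: ~{gddr5_290x / 10:.0f} W")    # ~10 GB/s per watt -> ~32 W for 320 GB/s
```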


Physical space savings are just as impressive for HBM over current GDDR5 configurations. A 1GB GDDR5 configuration takes about 28mm x 24mm of space on a PCB, with all four 256MB packages laid out on the board. A 1GB HBM stack takes only 7mm x 5mm, a savings of roughly 94% in surface area. Obviously the HBM stack has to be placed on the interposer itself, not on the PCB of the graphics card, but the area saved is still real. Comparing the full Hawaii implementation with its 16 GDDR5 packages to Fiji with its HBM configuration shows why AMD was adamant that form factor changes are coming soon. What an HBM-enabled system with 4GB of graphics memory can do in under 4900 mm2 would take 9900 mm2 to implement with GDDR5 memory technology. It’s easy to see why board vendors and GPU designers are excited about the new places discrete GPUs could find themselves.
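The area numbers are easy to verify from the dimensions AMD quotes:

```python
# Footprint for 1 GB of memory, using AMD's quoted dimensions.
gddr5_area = 28 * 24      # four 256 MB GDDR5 packages on the PCB: 672 mm^2
hbm_area   = 7 * 5        # one 1 GB HBM stack on the interposer:   35 mm^2

# ~95%, in line with AMD's ~94% claim
print(f"Area saved per GB: {(1 - hbm_area / gddr5_area) * 100:.0f}%")

# Whole-implementation comparison AMD gives for a 4 GB configuration:
print(f"GDDR5 needs {9900 / 4900:.1f}x the area of the HBM layout")   # ~2.0x
```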


Besides the space savings and bandwidth improvements, there will likely be some direct changes to the GPUs that integrate support for HBM. The GPU die should shrink to some degree because of the smaller memory interface. With simpler clocking mechanisms, lower required clock rates and much finer pitches coming through the GPU’s PHYs, integrating memory on an interposer changes the die area required for memory connections. Macri indicated that it would be nearly impossible for any competent GPU designer to build a GPU that doesn’t save die space by moving from GDDR5 to HBM.

Because AMD isn’t announcing a specific product using HBM today, it’s hard to talk specifics, but the question of total power consumption did come up. Even though the memory system sees drastic improvements in power consumption, the overall effect on the GPU will be muted somewhat, as the memory controller likely accounts for under 10% of a GPU’s total power draw. Don’t expect a 300 watt GPU built on GDDR5 to translate into a 200 watt GPU with HBM. Also interesting: Macri commented that the HBM DRAM stacks will act as a heatsink for the GPU, improving the heat dissipation of the total package and heat spreader. I don’t think this will mean much in the grand scheme of high performance GPUs, but it may help AMD deal with the power consumption concerns that have plagued it over the last couple of generations.
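To make that "muted" point concrete, here is a rough, illustrative estimate; the 300 watt board figure and the use of AMD's claimed GB/s-per-watt numbers are assumptions for the sake of the example, not measurements of any specific card.

```python
# Illustrative only: why a big memory-efficiency win barely moves total board power.
board_power_w  = 300          # assumed high-end card power budget (not an AMD figure)
gddr5_memory_w = 320 / 10     # 320 GB/s at ~10 GB/s per watt -> ~32 W
hbm_memory_w   = 512 / 35     # 512 GB/s at ~35 GB/s per watt -> ~15 W

saved = gddr5_memory_w - hbm_memory_w
print(f"Memory power saved: ~{saved:.0f} W "
      f"(~{saved / board_power_w * 100:.0f}% of a {board_power_w} W board)")
```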

Moving to a GPU platform with more than 500 GB/s of memory bandwidth gives AMD the opportunity to really improve performance in key areas where memory utilization is at its peak. I would expect to see 4K and higher resolution performance improvements over previous generation GPUs, where memory bandwidth is crucial. GPGPU applications could also see performance scaling above what we normally see as new GPU generations release.

An obvious concern is the 4GB memory limit for the upcoming Fiji GPU – even though AMD didn’t confirm that spec for the upcoming release, implementing HBM today all but guarantees it will be the case. Is that enough for a high end GPU? After all, both AMD and NVIDIA have been crusading for larger and larger memory capacities, including AMD’s 8GB R9 290X offerings released last year. Will gaming suffer on the high end with only 4GB? Macri doesn’t believe so, mainly because of a renewed interest in optimizing frame buffer utilization. Macri admitted that in the past very little effort was put into measuring and improving the utilization of the graphics memory system, calling it “exceedingly poor.” The solution was to just add more memory – it was easy to do and relatively cheap. With HBM that isn’t the case, as there is a ceiling on what can be offered this generation. Macri told us that with just a couple of engineers it was easy to find ways to improve utilization, and he believes that modern resolutions and gaming engines will not suffer at all from a 4GB graphics memory limit. It will require some finesse from the marketing folks at AMD though…

 

The Future

High bandwidth memory is clearly the future of high performance GPUs, with both AMD and NVIDIA integrating it relatively soon. AMD’s Fiji GPU will include it this quarter, and NVIDIA’s next-generation Pascal architecture will use it too, likely in 2016. NVIDIA will have to do a bit of expectation management with AMD first out of the gate, and AMD will be doing all it can to tout the advantages HBM offers over GDDR5. And there are plenty.


HBM has been teased for a long time...

I’ll be very curious how long it takes HBM to roll out to the entire family of GPUs from either company. The performance advantages high bandwidth memory offers come at some additional cost, at least today, and there is no clear roadmap for getting HBM into non-flagship products. AMD and the memory industry see HBM as a wide scale adoption technology, and Macri expects to see not only other GPUs using it but HPC applications, servers, APUs and more. Will APUs see an even more dramatic and important performance increase when they finally get HBM? With system memory as the primary bottleneck for integrated GPU performance, it’s hard not to see that being the case.

When NVIDIA gets around to integrating HBM we’ll have another generational jump to HBM 2 (cleverly named). The result will be stacks of 4GB each, with bandwidth increasing by a similar multiplier. That would alleviate any concerns over memory capacity on GPUs using HBM and improve the overall bandwidth story yet again; and all of that should be available in the next calendar year. (AMD will integrate HBM 2 at that time as well.)
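As a rough projection of what that second generation could look like, assuming the same four-stack layout and the eventual JEDEC HBM2 rate of up to 2 Gbps per pin (double HBM1's 1 Gbps) - neither of which AMD or NVIDIA confirmed here:

```python
# Speculative HBM2 numbers: four 4 GB stacks, 1024 bits per stack at up to 2 Gbps/pin.
stacks        = 4
capacity_gb   = stacks * 4                  # 16 GB total
stack_bw_gb_s = 1024 * 2 / 8                # 1024 pins x 2 Gbps / 8 bits = 256 GB/s

print(f"{capacity_gb} GB, ~{stacks * stack_bw_gb_s:.0f} GB/s (~1 TB/s)")
```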

AMD has sold me on HBM for high end GPUs; I think that comes across in this story. I am excited to see what AMD has built around it and how it improves their competitive stance against NVIDIA. Don’t expect dramatic decreases in total power consumption with Fiji simply due to the move away from GDDR5, though every bit helps when you are trying to offer improved graphics performance per watt. How a 4GB memory limit on a flagship card in 2015-2016 will pan out is still an open question, but the additional bandwidth HBM provides offers never before seen flexibility to the GPU and to software developers.

June everyone. June is going to be the shit.


May 19, 2015 | 08:29 AM - Posted by Tim Morgan (not verified)

So in June we'll get lowly 4GB cards and no hairworks gg.

June 10, 2015 | 03:03 PM - Posted by Anonymous (not verified)

Hair??? Seriously????

May 19, 2015 | 08:29 AM - Posted by Rustknuckle (not verified)

"June everyone. June is going to be the shit." you say at the end there.

There may be a "the" that should not have been there in the last sentence there if the rumors of the 390 being a rebrand and Fiji coming later turns out to be true.

May 19, 2015 | 08:46 AM - Posted by Rustknuckle (not verified)

A little edit. I missed it just being pushed back a bit and is still in June possibly.

May 19, 2015 | 08:44 AM - Posted by Anonymous (not verified)

http://www.fudzilla.com/news/graphics/37790-amd-fiji-aims-at-849-retail-...

$849 for 4GB HBM, good one, AMD.

May 19, 2015 | 09:53 AM - Posted by Anonymous (not verified)

Would you like them to give you one for free?
Cry about it more peasant.

May 19, 2015 | 09:58 AM - Posted by obababoy

fudzilla.com ... Good one.

May 19, 2015 | 11:06 AM - Posted by heydan83

So it is ok for nvidia to charge you a lot for a product but amd cant do it even if it is packed with new technology... the fan boy logic

May 19, 2015 | 12:58 PM - Posted by H1tman_Actua1

Pascal...AMD=10 steps behind.

May 19, 2015 | 01:42 PM - Posted by JohnGR

Because 2015 is AFTER 2016.
Nice one.

May 19, 2015 | 12:51 PM - Posted by Heavy (not verified)

Personally I think this card is going to be between $600-700. And for the Nvidia thing, their cards cost less to produce but they have better architecture, so I'm not sure if people should be mad or okay with that, since you're paying more than the AMD guys

May 19, 2015 | 05:51 PM - Posted by Arkamwest

I hope it's a $600 card, I really want this card. Right now I have a GTX 780 and it's fine, but I need a little bit more for 1440p gaming, and the Titan X is too expensive.

May 19, 2015 | 02:22 PM - Posted by arbiter

Thing is, reading that story, they are forgetting about the 980 Ti, which will be on the level of the Titan X. If that 4GB model is $850, they haven't just left a door open a bit, they've left a garage door open big enough to drive a train through and run them over.

May 19, 2015 | 08:54 AM - Posted by Chronium (not verified)

Just curious would the HBM design offer any improvements to CPU's?

May 19, 2015 | 09:39 AM - Posted by Josh Walrath

Yes, but most especially for APUs as they are more memory bound in most graphics applications.  The physical structure of the interposer also allows improvements in latency (no pins and PCB routing to contend with, plus physical DIMMs) and power.  The only issue right now is the inability to socket HBM.  Motherboards would have to have the CPU/HBM module soldered onto the board.

May 19, 2015 | 10:32 AM - Posted by YTech

That reminds me of the older PentiumII which were a soldered CPU on a PCB module :)

Maybe a smaller form factor of such will return for the APU. Something like the M.2 cards.

May 19, 2015 | 10:46 AM - Posted by Anonymous (not verified)

That's pretty much a no brainer, a 1024 bit wide channel between CPU and memory(per Stack if more than one stack is used) is going to provide any processor(CPU, GPU, Other) with plenty of bandwidth, you are talking 128 bytes per half clock, and DDR provides data on the rising and falling edge of the clock cycle so that's 256 bytes per full clock.
So if you have a wider channel (1024 bits or more) to memory, that memory can be clocked lower and still provide more aggregate bandwidth, and lower clocked memory saves on power usage. So say a 64 bit CPU has HBM: for every memory transfer the CPU is going to be able to receive 16 64-bit words, or 32 per full clock (per stack, if more than one stack of HBM is used). More bandwidth is always going to improve a CPU's/APU's performance. And talk about some APUs with big GPUs: with silicon interposers some of those dies placed around the GPU could be CPU cores, and because the dies are all able to be fabricated on separate fab lines, all of the different dies can use the fab process best suited for their functionality.

Hell with Plenty of HBM, and CPU/GPU a whole gaming system could be fitted on a PCI card and a tailored to the APU, and the on card gaming OS could host the entire gaming engine, game and other necessary code on the PCI based system. Those HPC/server grade APUs may lead to an entire consumer derived gaming system, that users could plug into their PC's gaming rig and have the beginnings of a clustered gaming system, with the general purpose OS running on the main board and each discrete gaming APU based card running its own gaming engine, OS, etc. It would not be too hard for more than one of these PCI based gaming systems to be hosted on a PC, with the card based systems able to split the workloads, or one of the card's CPU cores used for ray tracing/lighting code, while the other card's CPU cores used for physics, and both GPUs could share the raster, tessellation, etc. workloads.

When I say OS running on the PCI based APU system, I'm talking about almost an embedded type of OS specifically tailored to gaming, and hosting the gaming engine in a client mode with some form of mainboard master program initializing the necessary client/s engine/s to one or more PCI based gaming systems, after which the PCI based gaming system/s would take over running the game, and/or load balancing the game between more than one system if multi systems were installed. The mainboard's CPU and OS would be little more needed than mostly for getting the systems started, and so would not have to be all that powerful, as the gaming APU/s on the PCI card/s would be running the game.

for laptops having an APU with HBM will save on space, and allow more room for other components, even if the APU only came with 4 GB of HBM, some regular RAM could be added as a second tier memory while the HBM hosted the OS, and other essential code. I'm sure an APUs memory controller and Cache system could be designed to treat the HBM as a 4th level Cache. HBM could be used to provide an extra layer of security for a system, with no DMA access allowed directly into HBM, with DMA only allowed access into a second tier RAM to be handled, and loaded, by the APUs more secured memory subsystem. At some point in time the cost of HBM will fall to a low enough point that HBM will replace off module RAM on most systems like Laptops, but for workstation, and server workloads, or on some PCs that may need more than 16GB of memory there will still be regular RAM for data staging/code swap. For sure getting the memory as close to the Processor, CPU, GPU, other is going to reduce latency as the signals will have much shorter distances to travel, and the clock speeds can be made higher on HBM should there be a need in the future, as extra bandwidth is needed.

May 19, 2015 | 01:02 PM - Posted by Heavy (not verified)

dude you just blown my mind.

May 19, 2015 | 04:45 PM - Posted by Coupe

Awesome post. One thing though. CPUs are mostly serial type so HBM won't do much for that part of things. CPUs haven't been memory bottlenecked in a long time.

May 19, 2015 | 07:58 PM - Posted by arbiter

It would be the GPU part that benefits, but the CPU side won't

May 19, 2015 | 10:58 PM - Posted by BillDStrong (not verified)

CPUs have been more memory bottlenecked as time goes on. They simply hide it. Most of the time, the CPU sits idle. It can cost 400 or more cycles to get items from memory. This is why CPUs have a L1/L2 and L3 cache. The cpu "guesses" what will be needed next, and calls into memory to "cache" that data. If it guesses wrong, and it often does thanks to poorly optimized programs, then it has to sit and wait for those 400 cycles for the next bit of memory.

The CPU hides this partly by being so fast, and partly by doing more than one thing at once, in addition to hyperthreading. It basically computes multiple paths of a program, and hopes that it actually did work that will be used.

If HBM is used as a replacement for, say, the L3 cache, say 1GB of it, the CPU's memory retrieval logic would need to be redesigned to account for it to see the most benefit, but it could average out as a net benefit. If it were used in addition to L3, the only changes would be interconnects.

It could also be used in place of system memory, but this would be the end of modular, upgradable systems.

I already dislike the fact that memory is so high priced, and it shouldn't be.

I can buy a computer for my pocket with 4GB of memory for $200. Why are most cheap systems still shipping with 4GB, 8GB if you are lucky? And 16GB is $100? Since we are clearly making more memory than ever before, and cost to produce should go down with such scale, why are those prices so high?

Speaking of price, how much more expensive will this memory be to produce? The lower clock speed requirements should lower the price, but the new form factor will raise it until it is mass produced.

May 21, 2015 | 12:00 AM - Posted by Anonymous (not verified)

The cache hit rate is near 100% for most non-streaming consumer applications, which is why increasing memory speed has not helped much lately. HBM would help with streaming applications, but I think most CPU-based streaming applications are limited by CPU execution width, rather than memory bandwidth. There are server applications which would benefit from such a large L4 cache, things that deal with large data sets such as databases.

May 21, 2015 | 11:06 AM - Posted by Josh Walrath

Yeah, both AMD and Intel have circumvented most of the memory latency issues.  It is kind of a solved problem when we look at it from L1, L2, and L3 caches.  Sure, APUs will benefit more, but increasing memory speeds for CPUs has pretty much resulted in no tangible gains (other than synthetic streaming benchmarks).

May 22, 2015 | 04:21 PM - Posted by MRFS (not verified)

I see a direct analogy to "quad-pumped" LGA-775 chipsets, only now we lower the clock rate and buy more bandwidth with huge "bursts of bytes" per clock tick.

Then, would CPUs do better with 8-channels to such memory,
up from quad-channel?

I LIKE IT!

May 19, 2015 | 09:11 AM - Posted by Nick (not verified)

Great article. Really looking forward to you guys putting HBM through its paces. I'm sure it'll make for some very interesting discussion.
I'm very keen to see how heat dissipation is handled - especially on non flagship parts. If AMD decides to re-brand this next year or in 5 years time (hehe), will we see an R7 850 with an expensive watercooler? Surely not - but a true re-brand will not have dramatic thermal improvements.
Maybe the re-branding is just a stop gap while they pour R&D into HBM...

May 19, 2015 | 12:14 PM - Posted by BBMan (not verified)

Heat has always been the "stacking" problem. The materials they use to place the active components on have to be able to pipe the heat out as well as layer the levels. Honestly, I'm not sure how well it will work.

We've been upping PSU capability as it is and AMD is notorious for creating the best room heaters on a single card. We've been upping the power ante with that company anyhow. I never thought to see a 1500W PSU- but I was beginning to be afraid that's what you might need to drive a single card.

May 19, 2015 | 03:02 PM - Posted by Anonymous (not verified)

Actually AMD have had cooler and more power efficient cards much longer than nvidia ever has. Its been two gens where nvidia have had a lead on AMD. Maybe you just got in to this game two years ago, but people that know will tell you what I just said.

Must I really list all the cards from Nvidia every gen that were power hogs and HOT like a barbecue grill (literally cook on them)?

May 19, 2015 | 03:22 PM - Posted by Anonymous (not verified)

Quite right. NVIDIA's Geforce 480 had crazy power consumption up against its AMD contemporaries.

The Geforce FX series? Hah, terrible.

It seems like a lot of people who comment on things are 20 years old or something, and have absolutely no idea about any part history.

May 19, 2015 | 09:39 PM - Posted by arbiter

Looking back at the power numbers, AMD's competing GPU at the time was 20 watts less than the GTX 480, so I wouldn't say that is crazy.
250 watts at the time was a lot, but now it's pretty normal.

May 20, 2015 | 11:26 AM - Posted by chizow (not verified)

No doubt the 480 was power hungry as hell, but it was also the fastest GPU without question. Same cannot be said for AMD's power hungry chips of the last few years.

May 20, 2015 | 10:50 AM - Posted by BBMan (not verified)

The point of this article has been to change the power curve. While you're talking years-old tech, BOTH AMD and nVidia have been upping the power draw until recently - nVidia making a marked turnaround. Sorry, but AMD is owning the rapacious pig crown today, and if you caught it, you might have noticed that I'm cautiously optimistic that they might turn the corner too. It's not like AMD hasn't disappointed me before ....

May 19, 2015 | 04:17 PM - Posted by ppi (not verified)

They will probably update their GPU line with 14/16nm process.

May 19, 2015 | 10:00 AM - Posted by obababoy

Good article! Excited news for all gamers, regardless of Green or Red.

May 19, 2015 | 10:24 AM - Posted by puppetworx

June is going to be the shit.

Great read. Consider me hyped.

May 19, 2015 | 10:29 AM - Posted by mLocke

So... will it have drivers on launch?

May 19, 2015 | 09:19 PM - Posted by arbiter

Will it have drivers? Yes

Will they be any good? That is the question.

May 19, 2015 | 10:52 AM - Posted by nevzim (not verified)

I think 4GB is enough to make sure that upcoming games do not stutter 80% of the time ;-)
Poor AMD.

May 19, 2015 | 11:04 AM - Posted by obababoy

Hey (insert any insulting yet accurate name here), is it that hard for you to stay on topic and talk about the article and not your teenage social tendencies to troll about a fucking computer chip being better than the other...

Anyways, Witcher 3 on my computer uses 2gb Vram on Ultra everything so go on about what you think you know about upcoming games..

May 19, 2015 | 11:16 AM - Posted by nevzim (not verified)

Please read the article first so you know what I refer to.

Witcher 3 is for sure nice looking game but I doubt it is the ultimate pinnacle of PC gaming.

May 19, 2015 | 12:32 PM - Posted by obababoy

I did read it but your comment was so vague and far from any real statement other than trolling which is why I mentioned that.

May 19, 2015 | 01:05 PM - Posted by Heavy (not verified)

Yeah, I heard they gimped the PC version of The Witcher 3. I saw side by side videos of PS4 and PC and they look the same; the only difference is if you have Nvidia and use their HairWorks

May 19, 2015 | 01:37 PM - Posted by obababoy

Not true. The Witcher does have a few things that are not as good as the first gameplay videos we saw. The PS4 version is having framerate issues and slowdowns, not to mention it has lower settings overall. There are things like draw distance, foliage density, framerate, godrays, and so on that are either non-existant in PS4 or are quite a bit lower. Comparison videos don't show this. Comparison videos are also compressed and you can't even make out detail that well. Ill leave this right here. http://www.reddit.com/r/PS4/comments/36hvgn/your_thoughts_on_the_witcher_3/

May 19, 2015 | 06:03 PM - Posted by Arkamwest

Yesterday I played Witcher 3 (4 hrs) on my PC at 30 fps, 1440p, post-processing on high and graphics on ultra, without HairWorks and no AA, and it's great..!!! My GPU is a GTX 780

May 19, 2015 | 06:04 PM - Posted by Arkamwest

Witcher 3 uses a little bit more than 2GB of VRAM and like 5GB of system RAM

May 20, 2015 | 07:53 AM - Posted by obababoy

Interesting. I didn't check my system ram yet. So hairworks is also a bit hit for NVIDIA cards?

May 19, 2015 | 12:52 PM - Posted by heydan83

Yeah that´s why the 980 have more than 4gb of vram.....

May 19, 2015 | 12:59 PM - Posted by nevzim (not verified)

980 will have more then 4GB the moment sales slow down a bit. Remember, Nvidia has ~75% of the market.

May 19, 2015 | 12:00 PM - Posted by chizow (not verified)

Not going to poo poo AMD because they are pushing the envelope here with HBM, but I think they are going to have trouble selling this at $850, if that rumor is true, just my opinion though of course!

However, it does sound like Ryan got some real exclusive confirmation directly from Joe Macri that Fiji single-GPU version is stuck at 4GB, and that the Fiji VR X2 will have 8GB, but still logically 4GB per GPU. Very interesting.

"Macri told us that with just a couple of engineers it was easy to find ways to improve utilization and he believes that modern resolutions and gaming engines will not suffer at all from a 4GB graphics memory limit."

In any case, will be interested to see what has holds in store for AMD and their fans. Will the wait be worth it??? We'll see.

In the meantime, still feeling pretty good about my Titan X purchase almost 2 months ago! :)

May 19, 2015 | 12:38 PM - Posted by obababoy

I don't think $850 will stick at all and I agree that if it does it will hurt AMD pretty bad. Hopefully a 980ti comes out and adds competing prices.

"and that the Fiji VR X2 will have 8GB, but still logically 4GB per GPU." Wasn't microsoft mentioning that DX12 will solve the Vram sharing issues for SLI and Crossfile solutions?

May 19, 2015 | 02:35 PM - Posted by chizow (not verified)

Yep, we'll see what it firms up to be, but yeah $850 for AMD is really sky high.

As for the DX12 thing solving VRAM sharing, we will see, I don't think it will be that easy even if memory is unified resources as they claim. I think memory locked to a read state can be accessed by both GPUs, ie. textures, but I think the frame buffers (relatively small part of VRAM) will need to remain independent of one another.

I just start thinking Hydra and failure anytime I start hearing mix-vendor rendering controlled by API.

May 21, 2015 | 12:12 AM - Posted by Anonymous (not verified)

I was hoping that they would have higher speed interconnect between the GPUs since they will have plenty of interconnect space available with the memory on package. Nvidia has been talking about their nvlink which will be much faster than PCI-e. This will not be available for a while though. I would suspect AMD would have similar plans, but it will probably not be available unti second or third generation HBM. For the current generation, multi gpu performance scaling seems to be getting quite good anyway, but I don't see how they can avoid having duplicated resources loaded in memory of both GPUs. Some duplication could be avoided simply by developers using a larger number of small draw calls which do not require as many or the same resources.

May 19, 2015 | 12:56 PM - Posted by heydan83

So you're ok paying $1000 for your Titan X, but it is too high (if that rumor is true) to pay $850... fan boy logic...

May 19, 2015 | 02:26 PM - Posted by arbiter

yea "pay 850$ for a 4gb vram card is fine, yet 1000$ for a card with 12gb vram is absurd" Yea that is pretty terrible logic for certain fans.

May 19, 2015 | 06:13 PM - Posted by Arkamwest

I don't think this is an $850 GPU, a sweet $600 would be great...
...sweet sweet victory (Van Halen style)

May 19, 2015 | 02:29 PM - Posted by chizow (not verified)

Nah, because its not just about raw performance, and hasn't been for a long time with Nvidia vs. AMD.

There's certainly other features and technologies that are always going to justify and drive that premium myself and others are willing to spend on Nvidia, but more hesitant to do the same for AMD. Drivers for example, lots of concerns there especially lately with AMD. Game support, outstanding issues with FreeSync (ghosting, CF), just some examples. Then you get a number of features like G-Sync, DSR, 3D Vision, GameWorks that you come to enjoy.

But beyond that, Titan X is a premium card in every way, no compromises. Full GM200 front and center, 12GB VRAM, awesome cooler, awesome performance. 4GB of VRAM on Fiji just doesn't really measure up, imo, so it's going to be really hard even before you get into actual performance.

May 19, 2015 | 02:46 PM - Posted by Anonymous (not verified)

VSR is superior to DSR, and Freesync is on the same footing if not better over Gsync as freesync is also open. You also have TrueAudio, Mantle, DX12.3 vs nvidia DX12.2, AMD DX12 will have more features and better performance. 99% of nvidia cards sold this and last year have been 4GB, so I see no difference here, and there are no games that demand over 4GB at the moment (Maybe the odd game here). At 4K you may need more, but that is a very small market. But lets wait and see how HBM does against DDR5.

DDR5 is old news.. bring on the latest and greatest HBM technology for the masses and enthusiasts, bcoz Everybody loves new tech.

May 19, 2015 | 03:52 PM - Posted by chizow (not verified)

lol of course you would post this nonsense anonymously.

VSR limited to some janky 3200x1600 resolution is good?

FreeSync is not on equal footing as G-Sync, given all the problems that are still outstanding with ghosting, broken overdrive, and limited refresh windows. Oya, still no FreeSync + CF support.

Who cares if it is Open if it is still broken and bad? 2 months and counting since promises of simple driver fixes remember.

TrueAudio and Mantle? LMAO, did you post this as a reverse-psychology joke? If so you got me! Dead tech is dead. Please throw TressFX, HD3D and Enduro in there too so we can all laugh at more dead/unused AMD tech, thanks.

99% of Nvidia cards sold at $550 and below, you are happy AMD is going to ask $850 for a part that has 4GB of VRAM? If so, don't be disappointed when you get what you ask for. :)

May 20, 2015 | 08:04 AM - Posted by obababoy

I wont defend DSR vs VSR but the 3200x1800 in Witcher 3 looked incredible on my 1080p monitor(broken Asus 3D monitor).

How many games are using Hairworks right now? I didn't use TressFX in Tombraider because I had an nvidia card at the time and it went to like 3fps.

Either way I don't feel like you are attacking AMD per say, but you are pointing out only the negatives. I feel I got a steal for my Sapphire Vapor-X R9 290 and it has the longevity and power to keep my 1080p monitor pushing max graphics for a while. I have no cooling issues and performance/dollar is fantastic!

BUT, the 4GB of the AMD fiji card is not bothering me as much as the price. A gap from $400 290x to $850 is bad juju! I wont touch the thing until it is under $700.

May 20, 2015 | 10:59 AM - Posted by chizow (not verified)

Yes of course any form of SSAA is going to look good, even if it is only ~1.5x or whatever 3200x1600 is from 1080p, but for someone to claim they are on equal footing simply isn't the case. Some might argue more validly that AMD's filter is less blurry, but from what I have seen, AMD also suffers from much more shimmering in motion due to using a sharper LoD filter, so it is a trade-off. But in terms of actual support and support among all GPUs that use this feature, Nvidia wins, hands down.

For Hairworks games, there's 3 that I know of, FC4, CoD: Ghosts and now Witcher 3. Its one of their newer libraries though and it obviously looks awesome, so I wouldn't be surprised to see it used more. As for TressFX, did you play right at release? Or did you wait the week or so for Nvidia to fix performance in a new driver? I played it in 3D with TressFX maxed with just 2x670 and it ran great. Again, I have no qualms with TressFX, it doesn't look as good as Hairworks but it is still better than turning it off. That's why its funny reading these comments from AMD fanboys who all claim Hairworks and Gameworks sucks. If it sucks, just turn it off!

I'm simply clearing up the nonsense that you frequently see from AMD fanboys. I mean obviously, if an AMD fan says there is no difference between Nvidia and AMD to justify the price, I am going to point out the very real differences and more often than not, that is going to rightfully cast AMD in a negative light.

I think you have a level head and approach when it comes to AMD's next card, I just don't see how all these AMD fans feel obligated to defend and protect something that has to be disappointing. I mean I was thoroughly disappointed by the initial Titan launch and wasn't afraid to voice my displeasure about it, but its obvious to me some of these very vocal AMD fans are just going to defend AMD regardless because they're not in the market for one, they're just fanboys of AMD "just because".

May 20, 2015 | 06:35 PM - Posted by Anonymous (not verified)

VSR is getting updated soon with Fiji..
Mantle is the future, and its also DX12, Vulcan, Metal and many more to come. And after a year of DX12/windows 10.. Mantle will rise again..

True-audio is superior bcoz nvidia have nothing in comparison.

TressFX 2.0 is also superior in performance and looks vs hairworks

If you think G-Sync with one input is better or even acceptable, that's hilarious.. G-Sync is no better than FreeSync.. And wait for the Asus monitor and updated FreeSync drivers.. then all the nvidia fanboys are gonna cry (remember me when you're bawling)

May 19, 2015 | 04:43 PM - Posted by arbiter

Everyone does love new tech, but they won't love it when it's $850. If a 980 Ti starts at $700 with what looks like matching performance, AMD is between a rock and a hard place. Shoot, Nvidia could really stick it to AMD with a 980 Ti at $600, a price drop on the 980 and a small drop on the 970.

AMD fans should be happy Nvidia didn't use HBM, not because it makes AMD better for using it, but because the price of that 390X you are all fawning over isn't $1100 from both sides buying up every single chip for sale

May 21, 2015 | 12:18 AM - Posted by Anonymous (not verified)

It would be hilarious if Nvidia becomes the cheap option after the release of HBM.

May 19, 2015 | 02:49 PM - Posted by Anonymous (not verified)

Titan X is history and will not be a premium card when Fiji releases with HBM.

May 19, 2015 | 03:52 PM - Posted by chizow (not verified)

lol k whatever

May 19, 2015 | 02:49 PM - Posted by Anonymous (not verified)

Titan X is history and will not be a premium card when Fiji releases with HBM.

May 19, 2015 | 04:32 PM - Posted by ppi (not verified)

Of the features you mentioned, only GameWorks (and possibly PhysX if too integrated into the game engine - Project Cars) are worth mentioning.

nVidia tends to have better drivers, though that is typically an issue only if you must play a game right after launch. And a game must be badly tuned for AMD. Lower CPU utilisaiton on DX11 is sure a plus.

Otherwise:
CF/SLI - nobody with single card cares
G-Sync/FreeSync - Current offer of screens is lackluster for both camps, so better wait a year more. Ghosting must be fixed via scaler chip.
3D - 3D with glasses is auto-fail
DSR - buzzword for ineffective AA

May 20, 2015 | 08:08 AM - Posted by obababoy

Hey now, I loved my Asus 3D monitor. Tombraider was incredible in 3D but unfortunately my DVI slot died on it and HDMI can't carry the 120hz signal :( So now I have an R9 290 and enjoyed a bit jump from SLI 460's :)

May 20, 2015 | 11:01 AM - Posted by chizow (not verified)

Again, these are your opinions, to anyone who uses these features that are more likely to drive premium purchases, they are absolutely going to be the difference.

So yes, if you want console+ version of a game that allows you to run a higher resolution, textures, and maybe built-in AA, then sure Nvidia and AMD are almost equal footing. But if you want more advanced features as I've listed, its not even close, Nvidia wins hands down.

May 19, 2015 | 02:51 PM - Posted by Anonymous (not verified)

I will buy two, no matter the cost, just to have the latest and greatest technology.. Yes I am an enthusiast

May 19, 2015 | 02:52 PM - Posted by Anonymous (not verified)

And for the biggest EPEEN!!!

May 19, 2015 | 03:55 PM - Posted by chizow (not verified)

Hope so, AMD needs more of their die hard fanboys putting their money where their mouth is.

May 19, 2015 | 04:22 PM - Posted by ppi (not verified)

I was kind of hoping that nVidia will make Titan X the first card on the market able to play at 4K resolution with all the bells and whistles.

But that did not happen.

If Fiji claims it, I do not know, but one thing is certain: For current games, 4GB RAM will not hold Fiji back.

May 19, 2015 | 04:45 PM - Posted by arbiter

Funny how when the 980 launched with 4GB people said it wouldn't be enough for the future, and here we are, what, 6-8 months later, with AMD doing a 4GB card that costs $300 more and it's fine. Pretty funny logic from some people.

May 19, 2015 | 05:53 PM - Posted by Anonymous (not verified)

Don't forget to mention that when Nvidia admitted the 970 had segmented memory, all of a sudden memory was a non-issue, even when 1 of 2 games this site tested showed stutters at 1440p

May 19, 2015 | 07:33 PM - Posted by ppi (not verified)

While I was not among those people laughing at 4GB (especially after seeing like no gains for 8GB 290X), here's a few conflicting points of view that I happen to have on the matter:

1. 4GB RAM is by no means future-proof (with current consoles 8GB shared RAM anyway), though it will likely mean that in future games some detail settings will have to be on "High" rather than "Ultra";

2. We still have no idea how Fiji actually performs (and costs) and how/whether the 4GB limits its performance in practice in current games; and

3. 970/980 are still the best reasonable cards to buy, even with 4GB.

May 19, 2015 | 12:38 PM - Posted by Anonymous (not verified)

Looks awesome so far! Can't come soon enough.

May 19, 2015 | 12:53 PM - Posted by Heavy (not verified)

I was giddy as a school girl reading this article. Doesn't matter if it's Nvidia or AMD, seeing something new like this on the PC market is great

May 19, 2015 | 01:38 PM - Posted by obababoy

Thank you!

May 19, 2015 | 12:57 PM - Posted by H1tman_Actua1

ummm pascal....

May 19, 2015 | 01:39 PM - Posted by obababoy

Your point? hahaha

May 19, 2015 | 02:38 PM - Posted by Anonymous (not verified)

Pascal is 99% guaranteed to be no earlier than Q3 2016 (easily a year out). Nvidia will have lots of problems with HBM, and the Pascal architecture needs a lot of work for HBM to work to its potential.

AMD have been working on HBM for 7 years, whereas Nvidia only a year at most. Do the math..

May 19, 2015 | 04:39 PM - Posted by arbiter

Well Nvidia has good R&D staff so what AMD needed 7 years to do, Nvidia probably will catch up in a year. But kinda sad that 2x memory bandwidth still doesn't make it much faster then current chip from nvidia.

May 19, 2015 | 02:29 PM - Posted by arbiter

Serious question, at least for air cooled versions of the card: how much of an issue will the heat coming off the GPU be for the memory chips if they are as close to the GPU as they look like they will be?

May 19, 2015 | 02:56 PM - Posted by Ryan Shrout

I think in total it should be a bit less than a typical GDDR5 implementation.

May 20, 2015 | 03:01 AM - Posted by JohnGR

In total yes, but I think he means that now all the heat will be concentrated in a smaller area and the memory chips will be IN that area.

May 20, 2015 | 08:14 AM - Posted by obababoy

The two major heat issues are GPU and VRM's. I don't think there will be an issue with cooling. While large surface area would ideally be better, I think having them on the GPU will cause them to put off less heat due to their far less use of power. They could also expand the actual chip and make the casing larger to have more surface area in contact with the heatsink.

May 21, 2015 | 11:35 AM - Posted by Josh Walrath

I believe Macri actually said that heat will not be an issue for the memory, and in fact they can help act as a heatsink for the GPU.  It wouldn't amount to much heat, mind... but it is not adding to the heat issue.

May 19, 2015 | 02:41 PM - Posted by Anonymous (not verified)

Xbox One already have it in some form.

This summer Microsoft should reveal it.

May 19, 2015 | 02:57 PM - Posted by Anonymous (not verified)

I bet People cant wait to drop DDR5 for superior HBM Memory the first chance they get. Just imagine how small that card will be, and most importantly how fast? The future is looking good especially for enthusiasts and epeen`ers and Gamers.

May 19, 2015 | 04:37 PM - Posted by arbiter

Um, I doubt it's gonna be that many people if the card goes for $850. Most people will say hell with that. Funny part though: even though AMD fanboyz will still attack nvidia over GPU pricing, the reality is that if the 390X 4GB starts at $850, any future beef about nvidia pricing is dead since AMD is doing it too; in fact AMD is a bit worse with only 4GB of RAM on the card.

May 20, 2015 | 02:01 AM - Posted by Heavy (not verified)

Dude, didn't you read the article? This stuff is new, there's barely any mass production, and AMD is the first to have it. It has always been the same: new tech always costs more, but as soon as more people start buying it the price goes down, and if no one becomes the first to use it then no one will use it. Enthusiasts don't care about price anyway, they care about top of the line high end stuff. Do you care if people buy a Titan or an R9 290X2?

May 19, 2015 | 03:35 PM - Posted by DaKrawnik

AMD should have let nvidia take the risk of introducing HBM to dGPUs first, waited until the next GPU process node, or at the very least tested it on a GPU that wasn't a flagship (Fiji), like they did with GDDR4 on the 4770 and nvidia did with Maxwell and the 750 Ti. HBM is most likely the cause of the delay for Fiji. A delay that is hurting them so badly that they can't say anything about it for another month, when they will be showing it.

*taps AMD on the shoulder*
you done [expletive] it up!

May 19, 2015 | 04:21 PM - Posted by Mac (not verified)

Without AMD pushing these new technologies, who would? Certainly not nvidia.

May 19, 2015 | 04:33 PM - Posted by arbiter

Yea Nvidia hasn't done anything to push new techs ever. Its always been AMD with new tech first with things like GPU recording of game play and Freesync. /sarcasm

May 19, 2015 | 04:43 PM - Posted by Mac (not verified)

I'm talking important stuff here, like gddr5 memory and now HBM, not some proprietary,self-serving gimmicky contributions. AMD is representing for you'll in more meaningful ways.

May 19, 2015 | 05:01 PM - Posted by arbiter

I guess its wrong for a company wanting to do something Right from day 1 and not a total train wreck.

May 19, 2015 | 05:48 PM - Posted by Anonymous (not verified)

Whats this doing things right from day 1 ?

The GPU recording you mentioned was an still is limited. You couldn't even change the resolution when it was first introduced. Took several forum bashing for them to change it and they haven't adjusted the bit. The more they change it the more performance hit it taxes.

May 19, 2015 | 08:04 PM - Posted by arbiter

Least Nvidia is doing the WORK themselves not expecting someone else to do it all for them for next to nothing.

May 19, 2015 | 10:04 PM - Posted by Anonymous (not verified)

Hate to break it to you, but you could do software/hybrid recording from the CPU/APU long before Nvidia thought about it.

The only thing new they did is package it to GeForce Experience and they haven't imported much of the decent recording features because it taxes the GPU too much.

May 22, 2015 | 05:32 PM - Posted by dahippo (not verified)

Your memory is like a goldfish's. There are lots of issues with G-Sync, Google it, referring to your WORK... HBM is a very interesting technology which (whatever it, or the tech that follows it, ends up being called) is futureproof and will benefit GPUs initially and later CPUs/APUs/SoCs. With Win10/DX12 it will benefit even more, because you can feed those 4k SPs and probably make games at 4K work flawlessly, even if you scream that it won't be enough. Let's see in 2-3 weeks, and stop trolling.

May 19, 2015 | 04:35 PM - Posted by arbiter

Well, the only thing so far keeping AMD in the competition is the extra memory bandwidth making it so they can compete. Put the 290X at the lower bandwidth of, say, a 780 or even a 980 and I bet you are looking at a much slower product.

May 19, 2015 | 04:23 PM - Posted by Nossgrr (not verified)

Considering most x70+ NVidia cards still have less than 3GB, I dont see the 4gb limit for this gen a problem at all.

Heck my gtx 770 runs pretty much everything at max so 4GB HBMs, yeah bring it!

May 19, 2015 | 05:00 PM - Posted by arbiter

This card is more focused on 1440p res and up. With 4GB of RAM to run, say, 3x 1080p or multiple 1440p monitors or even one 4K monitor, you tend to max out that 4GB pretty quick.

May 19, 2015 | 05:37 PM - Posted by DaveSimonH

AMD announcing they will be going from 290X with 4GB of GDDR5 to 4GB HBM on the 390X, surely that's got to be low hanging fruit for Nvidia to go after? Wouldn't be surprised to see Nvidias next mid-high end cards sporting 6GB or 8GB GDDR5;
"Even our budget cards now have at least 4GB, the 'minimum' required for today's gaming" etc.

May 19, 2015 | 08:05 PM - Posted by arbiter

It's been pretty much confirmed that a 980 Ti, which is pretty much a Titan X, will be coming with 6GB of RAM. Which could likely be priced lower than the 390X

May 20, 2015 | 08:26 AM - Posted by obababoy

Which is great because that would drive the "still rumored" $850 price down. I am all for competition and would still prefer the 390x if I plan on upgrading.

May 19, 2015 | 05:52 PM - Posted by Anonymous (not verified)

the nvidia is strong with the fanboys

May 19, 2015 | 07:29 PM - Posted by Anonymous (not verified)

I love how every commenter does his/her best to find and highlight the single drawback (4gb) of a completely new evolutionary improvement.

May 19, 2015 | 08:06 PM - Posted by arbiter

Well, a certain side's fanboyz did the same thing when the GTX 980 was released with 4GB of RAM, yet since it's AMD doing it this time it's fine. Pretty clear where the problem is.

May 19, 2015 | 08:37 PM - Posted by skline00

I think the 4GB of HBM1 on the 390X will probably perform better than 4GB of GDDR5. However, how will AMD explain the huge gap between the 390X's 4GB of HBM1 and the Nvidia Titan X with 12GB of GDDR5?

BTW, I have 2 Sapphire Tri-X R9 290s water cooled in CF in my rig, so I'm hardly an Nvidia fanboy. Nonetheless, that's going to be a lot of memory territory to make up.

May 19, 2015 | 09:18 PM - Posted by arbiter

4GB of HBM will be better than 4GB of GDDR5, but it's still gonna start to suffer when a game needs 4GB+. You start to see that some at 1440p/4K, which these cards are more and more the focus for.

June 1, 2015 | 01:02 AM - Posted by Disturbed Jim (not verified)

Quite easy really: HBM has 3 times the data transfer rate of GDDR5 (HBM=1Tb/s, GDDR5=336Gb/s), so you don't need 12GB of RAM, because GDDR5 has already got to the point of diminishing returns, which is why both AMD and Nvidia have until now gone the "add more VRAM" route.

In short, it's the data transfer rate that's been holding cards back, so 4GB of HBM should be able to read and access its data in the same amount of time as 12GB of GDDR5.

May 19, 2015 | 09:42 PM - Posted by Anonymous (not verified)

"At only 100 microns thick, the interposer will not add much to the z-height of the product and with tricks like double exposures you can build a interposer big enough for any GPU and memory requirement. As an interesting side note, AMD’s Joe Macri did tell me that the interposer is so thin that holding it in your fingers will result in a sheet-of-paper-like flopping."

Tech report states this differently:

"Macro said those storage chips are incredibly thin, on the order of 100 microns, and that one of them "flaps like paper" when held in the hand."

It is the memory die which are super thin, not the interposer. It makes sense that memory die are really thin since a stack of 5 die (4 dram + 1 logic) is the same height as the gpu die. I believe these die are made by etching holes into the silicon wafer and filling with metal for the TSV. Then they build the dram on top. The wafer is then flipped over and polished down to expose the TSVs. The bottom logic die has TSVs also, but the GPU does not need them, so the gpu die is much thicker. I would expect the interposer to be quite thick for mechanical and thermal stability.

Also, I don't know if the interposer size is that big of a limitation. From a previous discussion, after Josh's comments about interposer production (which seems to have been totally wrong), it seems that the maximum size using a single reticule is over 800 square mm. This would only limit the gpu to 600 square mm, which is huge.

HBM could be a massive change in many different market segments and it is going to cause a lot of confusion. The media really needs to try and keep the facts straight and avoid semantic difficulties with the terminology.
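
As a rough sanity check on the reticle-size point in the comment above, the sketch below subtracts an assumed footprint for four HBM stacks from an assumed single-reticle interposer area; the stack dimensions and keep-out figure are illustrative assumptions, not published numbers.

```python
# Rough sanity check on the single-reticle interposer area argument above.
# All dimensions are assumptions for illustration, not published figures.

RETICLE_AREA_MM2 = 830.0        # assumed single-reticle limit (">800 mm^2" per the comment)
HBM_STACK_AREA_MM2 = 5.5 * 7.5  # assumed footprint of one HBM stack (~41 mm^2)
NUM_STACKS = 4
KEEP_OUT_MM2 = 30.0             # assumed routing/spacing overhead around the stacks

memory_area = NUM_STACKS * HBM_STACK_AREA_MM2 + KEEP_OUT_MM2
gpu_budget = RETICLE_AREA_MM2 - memory_area

print(f"Memory + keep-out area: ~{memory_area:.0f} mm^2")
print(f"Remaining GPU budget:   ~{gpu_budget:.0f} mm^2")
# With these assumed numbers the GPU budget lands in the ~600 mm^2 range,
# consistent with the comment above -- i.e. the reticle is unlikely to be the
# first limit a big GPU runs into.
```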

May 20, 2015 | 06:32 AM - Posted by Anonymous (not verified)

I have seen this reported in yet another way, so it is unclear what was actually said. The interposer uses TSVs as well, so it could be very thin. The TSVs in the interposer are much larger than those in the memory stack, though, and they are only for connections routed outside the package, so there are comparatively few of them: just the PCIe interface, the display outputs, and power.

May 19, 2015 | 10:11 PM - Posted by Anonymous (not verified)

The signal-to-noise ratio is so low in threads involving AMD that it is difficult to find any post I actually want to read. It would be nice to be able to collapse all of these fanboy, troll, and FUD threads. I have worked at tech companies before where the sales guys admitted to getting on forums and spreading FUD about the competition in their spare time. Add in the fanboys and trolls and it is a total mess. If you are just a normal enthusiast interested in the technology, you might as well not bother reading any of these forum posts.

I have occasionally found these posts interesting from a psychological perspective, though. Are they posting because they have one of the products and feel the need to defend it through some kind of post-purchase rationalization ("I made the decision to buy it, so it must be the best")? Are they Nvidia or AMD employees who hold a bunch of company stock? Are they just trolls stirring up trouble because they enjoy arguing?

Anyway, I am always surprised that people don't recognize marketing tactics. Nvidia comes out with a really nice new game bundle right before a big AMD release. This is obviously to get people to buy their product now instead of waiting to see what AMD releases. If you buy a couple-hundred-dollar GPU now, are you going to upgrade to a new card a month later, even if it is significantly better?

May 19, 2015 | 10:41 PM - Posted by Anonymous (not verified)

HBM is going to be interesting for a lot of markets. It will make APUs as powerful as dedicated graphics, so for mobile we will probably see single-package APUs which include the CPU, GPU, and southbridge. The only off-package interconnect would be the IO; it wouldn't need a PCIe link for graphics, only some PCIe links for storage. That would make a very powerful system in a very small size. Also, there is still nothing stopping them from routing a memory controller off the SoC package for more memory and using the HBM as a giant L4 cache, the way Intel's Crystal Well integrated graphics work. The package pin count would be quite low without an external memory interface, so technically they could add some memory on the PCB if the on-package HBM isn't large enough.

The HBM would not really be that useful on the CPU for consumer applications which are not streaming. Most non-streaming applications run from on-die caches with hit rates in the 99% range, which is why increasing system memory speed has not been increasing performance much. It obviously will help with streaming apps, but those are likely to be running on the on-die GPU, not the CPU.

HBM does not replace CPU caches. It will have much lower latency than system memory, but higher latency than on-die SRAM caches; HBM is still DRAM, which has a higher base latency than SRAM. The ability to keep a much larger number of "pages" (not sure what terminology they are using) open does have latency advantages, but only for applications that are not already well served by the on-die caches. That mostly means big server/workstation applications which require random access to large data structures.

I can see why Intel would be dragging their feet on this type of memory tech. It would reduce the dependence on large on-die caches, which could reduce demand for CPUs with large on-die L3 caches. Those can cost up to around $7000. I suspect a performance-competitive HBM server chip would be a lot cheaper, so Intel would lose their huge margins. This, plus the IGP rising to dedicated-card performance levels, is why HBM could be such a disruptive technology.
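
To put a rough number on the hit-rate argument in the comment above, here is a minimal average-memory-access-time (AMAT) sketch; all latency values are assumed for illustration, not measured figures.

```python
# Minimal AMAT sketch for the comment above. All latencies are assumed,
# illustrative values rather than measured figures.

def amat_ns(cache_latency_ns: float, hit_rate: float, backing_latency_ns: float) -> float:
    """Average access time when cache misses fall through to the backing memory."""
    return cache_latency_ns + (1.0 - hit_rate) * backing_latency_ns

ON_DIE_CACHE_NS = 10.0   # assumed effective on-die cache latency
SYSTEM_DRAM_NS = 80.0    # assumed conventional system DRAM latency
HBM_NS = 60.0            # assumed HBM latency (still DRAM, so nowhere near SRAM)

for hit_rate in (0.90, 0.99):
    ddr = amat_ns(ON_DIE_CACHE_NS, hit_rate, SYSTEM_DRAM_NS)
    hbm = amat_ns(ON_DIE_CACHE_NS, hit_rate, HBM_NS)
    print(f"hit rate {hit_rate:.0%}: DRAM-backed ~{ddr:.1f} ns, HBM-backed ~{hbm:.1f} ns")

# At a 99% hit rate the backing memory barely moves the average, which is the
# commenter's point; low-hit-rate streaming or GPU-style workloads are where a
# faster backing store pays off.
```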

May 20, 2015 | 11:04 AM - Posted by (>_<) (not verified)

This is only half of the real story.

AMD can use the interposer as a replacement for the system memory bus while placing HBM on-package with the APU or CPU. This eliminates the need for system RAM, with a huge cost benefit gained from a smaller motherboard.

HBM on-package with a tablet APU would also bring a huge energy-savings benefit by eliminating separate tablet RAM.

Placing 64-128GB of HBM on a Zen server CPU would give STAGGERING performance, not to mention enormous energy savings!

I think that AMD has an opportunity to redefine the SoC with Integrated High Bandwidth System Memory: IHBSM.

May 20, 2015 | 11:58 AM - Posted by Anonymous (not verified)

This AMD vs. Nvidia shyt is just so crazy; way too many ignorant fanboys on both ends. If AMD dies, Nvidia becomes more of a monopoly and can screw PC gaming over on price and innovation. We need both to push new things and compete on pricing, so this "one must die" level of BS needs to stop.

May 20, 2015 | 01:34 PM - Posted by Anonymous (not verified)

I like the potential of what this new tech can bring, so I am waiting on AMD's Fiji XT GPU to see if AMD can actually implement it FULLY and take full advantage of it. AMD/ATI has over the years been a leader of hardware innovation in the CPU/GPU space, but they have been victims of poorly implementing their innovations, to the point of falling prey to the competition (Intel, Nvidia) doing a better job of IMPLEMENTATION of those innovations.

I really hope that this time AMD does a better job of implementing their tech. We will see when it comes out.

May 22, 2015 | 04:14 PM - Posted by MRFS (not verified)

Can this development be adapted to non-volatile RAM, e.g. Everspin's ST-MRAM, HP's memristor, and Crossbar's RRAM?

A high-density 3D NVRAM sounds like a worthy challenge for the "bleeding edge" R&D folks.

See: http://www.technologyreview.com/featuredstory/536786/machine-dreams/

"Failure is a postponed success."
-- Fr. John Eugene O'Toole
(my seminary Latin teacher)

MRFS

May 23, 2015 | 06:58 AM - Posted by Martin Trautvetter

Anyone else think it's hilarious for AMD to rely on software (engine/driver) optimizations to avoid running out of memory on a new high-end card in 2015?

"Macri doesn’t believe so; mainly because of a renewed interest in optimizing frame buffer utilization."

If the past is any indication, AMD's interests don't correlate well with their actual abilities.

"Macri admitted that in the past very little effort was put into measuring and improving the utilization of the graphics memory system, calling it “exceedingly poor.”"

I'm going to make a prediction:

AMD is going to dump this card on the market and forget about drivers for at least another 6 months, at which point they'll consider WHQL-ing whatever the poor interns in the basement have come up with.
