Introduction, How PCM Works, Reading, Writing, and Tweaks

I’ve seen a bit of flawed logic floating around in discussions of 3D XPoint technology. Some are directly comparing the cost per die to NAND flash (you can’t - 3D XPoint likely has fewer fab steps than NAND, especially when compared with 3D NAND). Others are repeating a bunch of terminology and element names without taking the time to actually explain how it works, and far too many folks out there can't even pronounce it correctly (it's spoken 'cross-point'). My plan is to address as much of the confusion as I can with this article, and I hope you walk away understanding how XPoint and its underlying technologies (most likely) work. While we do not have absolute confirmation of the precise material compositions, there is a significant amount of evidence pointing to one particular set of technologies. With Optane Memory now out in the wild and purchasable by folks wielding electron microscopes and mass spectrometers, I have seen enough additional information come across to assume XPoint is, in fact, PCM-based.

XPoint.png

XPoint memory. Note the shape of the cell/selector structure. This will be significant later.

While we were initially told at the XPoint announcement event Q&A that the technology was not phase change based, there is overwhelming evidence to the contrary, and it is likely that Intel did not want to let the cat out of the bag too early. The funny thing about that is that both Intel and Micron were briefing on PCM-based memory developments five years earlier, and nearly everything about those briefings lines up perfectly with what appears to have ended up in the XPoint that we have today.

comparison.png

Some die-level performance characteristics of various memory types. source

The above figures were sourced from a 2011 paper and may be a bit dated, but they do a good job of putting some actual numbers to the die-level performance of the various solid state memory technologies. We can also see where the ~1000x speed and ~1000x endurance comparisons of XPoint to NAND flash came from. Now, of course, those performance characteristics do not directly translate to the performance of a complete SSD package containing those dies. Controller overhead and management must take their respective cuts, as is shown by the performance of the first generation XPoint SSD we saw come out of Intel:

gap.png

The ‘bridging the gap’ Latency Percentile graph from our Intel SSD DC P4800X review.
(The P4800X comes in at 10us above).

There have been a few very vocal folks out there chanting 'not good enough', without the basic understanding that the first publicly available iteration of a new technology never represents its ultimate performance capabilities. It took NAND flash decades to make it into usable SSDs, and another decade before climbing to the performance levels we enjoy today. Time will tell if this holds true for XPoint, but given Micron's demos and our own observed performance of Intel's P4800X and Optane Memory SSDs, I'd argue that it is most certainly off to a good start!

XPoint Die.jpg

A 3D XPoint die, submitted for your viewing pleasure.

You want to know how this stuff works, right? Read on to find out!

FMS 2016: Micron QuantX XPoint Prototype SSD Spotted

Subject: Storage | August 11, 2016 - 12:06 PM |
Tagged: FMS, FMS 2016, XPoint, micron, QuantX, nand, ram

Earlier this week, Micron launched their QuantX branding for XPoint devices and gave us some good detail on the expected IOPS performance of solutions containing these new parts:

U.2.jpg

Thanks to the very low latency of XPoint, the QuantX solution delivers high IOPS at low queue depths, and its random performance scales quickly enough to fully saturate PCIe 3.0 x4 with only four queued commands. Micron's own 9100 MAX SSD (reviewed here) requires QD=256 (a 64x increase) just to come close to this level of performance! At that same presentation, a PCIe 3.0 x8 QuantX device was able to double that throughput at QD=8, but what are these things going to look like?
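To put the queue depth comparison in perspective, here is a rough sketch using Little's Law (mean latency = commands in flight / completion rate). The ~900K IOPS figure for the x4 device is my assumption, taken as half of the 1.8 million IOPS Micron showed for the x8 card, and the function name is just for illustration. These are latencies under load, including time spent waiting in the queue, not idle access times.

```python
def implied_latency_us(queue_depth: int, iops: float) -> float:
    """Little's Law: mean latency = outstanding commands / completion rate."""
    return queue_depth / iops * 1e6  # seconds -> microseconds


# Assumption: roughly 900K IOPS saturates the PCIe 3.0 x4 link
# (half of the 1.8M IOPS figure quoted for the x8 card).
X4_IOPS = 900_000

xpoint_us = implied_latency_us(4, X4_IOPS)    # QuantX prototype at QD=4
flash_us = implied_latency_us(256, X4_IOPS)   # flash SSD at QD=256

print(f"XPoint at QD=4:  {xpoint_us:.1f} us per command")
print(f"Flash at QD=256: {flash_us:.1f} us per command")
print(f"Ratio: {flash_us / xpoint_us:.0f}x")
```

Under those assumptions, each command completes in roughly 4.4 us on the XPoint prototype versus roughly 284 us on the flash device at the same throughput, which is the same 64x ratio as the queue depths.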

DSC02634.jpg

The real answer is that they will look just like modern-day SSDs, but for the time being, we have the prototype unit pictured above. This is essentially an FPGA development board that Micron is using to prototype potential controller designs. Dedicated ASICs based on the final designs may be faster, but those take a while to ramp up to volume production.

DSC02636.jpg

So there it is, in the flesh, nicely packaged and installed on a complete SSD. Sure it's a prototype, but Intel has promised we will see XPoint before the end of the year, and I'm excited to see this NAND-to-DRAM performance-gap-filling tech come to the masses!

DSC02095.jpg

FMS 2016: Micron Keynote Teases XPoint (QuantX) Real-World Performance

Subject: Storage | August 9, 2016 - 03:33 PM |
Tagged: XPoint, QuantX, nand, micron

Micron just completed their keynote address at Flash Memory Summit, and as part of the presentation, we got our first look at some raw IOPS figures, scaled across queue depth, from devices utilizing XPoint memory:

U.2.jpg

These are the performance figures from a U.2 device with a PCIe 3.0 x4 link. Note the outstanding ramp up to full saturation of the bus at a QD of only 4. Slower flash devices require much more parallelism and a deeper queue to achieve sufficient IOPS throughput to saturate that same bus. That 'slow' device on the bottom there, I'm pretty certain, is Micron's own 9100 MAX, which was the fastest thing we had tested to date, and it's just being walked all over by this new XPoint prototype!

Ok, so that's damn fast, but what if you had an add-in card with PCIe 3.0 x8?

HHHL.jpg

Ok, now that's just insane! While the queue had to climb to ~8 to reach these figures, that's 1.8 MILLION IOPS from a single HHHL add-in card. That's greater than 7 GB/s worth of 4KB random performance!
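For those who want to check the math, here is a quick sketch of where the 7 GB/s figure comes from and how close it sits to the link limit. I'm assuming 4KB means 4096 bytes, and the link figure is the raw PCIe 3.0 rate (8 GT/s per lane with 128b/130b encoding) before any packet or protocol overhead, so real-world headroom is even smaller than shown.

```python
IOPS = 1_800_000          # figure quoted for the HHHL add-in card
BLOCK_BYTES = 4096        # assumption: 4KB = 4096 bytes

throughput_gb_s = IOPS * BLOCK_BYTES / 1e9


def pcie3_raw_gb_s(lanes: int) -> float:
    """Raw PCIe 3.0 bandwidth: 8 GT/s per lane, 128b/130b line encoding.

    Ignores packet headers and other protocol overhead.
    """
    bits_per_second = lanes * 8e9 * (128 / 130)
    return bits_per_second / 8 / 1e9  # bits -> bytes -> GB


link_gb_s = pcie3_raw_gb_s(8)

print(f"4KB random throughput: {throughput_gb_s:.2f} GB/s")
print(f"PCIe 3.0 x8 raw limit: {link_gb_s:.2f} GB/s")
print(f"Link utilization:      {throughput_gb_s / link_gb_s:.0%}")
```

That works out to about 7.37 GB/s of payload against a raw link limit of about 7.88 GB/s, so the card is effectively bus-limited.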

latency.jpg

In addition to the crazy throughput and IOPS figures, we also see latencies running at 1/10th that of flash-based NVMe devices.

x10.jpg

So it appears that while the cell-level performance of XPoint boasts 1000x improvements over flash, once you build it into an actual solution that must operate within the bounds of current systems (NVMe and PCIe 3.0), we currently get only a 10x improvement over NAND flash. Given how fast NAND already is, 10x is no small improvement, and XPoint still opens the door for further improvement as the technology and implementations mature over time.
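A toy model helps show how a 1000x faster cell can turn into a 10x faster device. The sketch below treats device latency as media time plus a fixed overhead for the controller, NVMe, and PCIe. All of the numbers are hypothetical values I picked to illustrate the effect. They are not measurements from Micron or Intel.

```python
# Hypothetical numbers, chosen only to illustrate the effect.
OVERHEAD_US = 10.0                        # controller + NVMe + PCIe cost per command
NAND_MEDIA_US = 90.0                      # assumed NAND media access time
XPOINT_MEDIA_US = NAND_MEDIA_US / 1000    # a cell that is 1000x faster


def device_latency_us(media_us: float, overhead_us: float = OVERHEAD_US) -> float:
    """Total latency seen by the host: media time plus fixed overhead."""
    return media_us + overhead_us


nand_total = device_latency_us(NAND_MEDIA_US)
xpoint_total = device_latency_us(XPOINT_MEDIA_US)
speedup = nand_total / xpoint_total

print(f"NAND device:   {nand_total:.2f} us")
print(f"XPoint device: {xpoint_total:.2f} us")
print(f"Device-level speedup: {speedup:.1f}x from a 1000x faster cell")
```

With these made-up inputs the device-level gain lands just under 10x, because the fixed overhead dominates once the media itself is nearly free. Shrinking that overhead is where future gains would have to come from.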

More to follow as FMS continues!

FMS 2016: Micron Launches 3D UFS SSDs, Brands 3D XPoint as QuantX

Subject: Storage | August 9, 2016 - 01:09 PM |
Tagged: XPoint, UFS, QuantX, micron, FMS 2016, FMS

The UFS standard aims to bring us lightning-fast microSD-sized memory cards that perform on par with SATA SSDs. Samsung introduced theirs earlier this month, and now Micron has announced their solution:

Mobile 3D NAND UFS with specs and logo.jpg

As you can see, UFS is not just for SD cards. These are going to be able to replace embedded memory in mobile devices, displacing the horror that is eMMC with something way faster. These devices are smaller than a penny, with a die size of just over 60 mm², and boast a 32GB capacity.

UFS.png

One version of the UFS 2.1 devices also contains Micron's first packaged offering of LPDDR4X. This low power RAM offers an additional 20% power savings over existing LPDDR4.

Also up is an overdue branding of Micron's XPoint (spoken 'cross-point') products:

MIcron_QuantX_Logo_Black.Gray_60%.png

QuantX will be the official branding of Micron products using XPoint technology. This move is similar to the one Intel made at IDF 2015, where they dubbed their solutions with the Optane moniker.

More to follow from FMS 2016. A few little birdies told me there will be some good stuff presented this morning (PST), so keep an eye out, folks!

Press blast for Micron's UFS goodness appears after the break.