Introduction and Specifications

XPoint Tested. Finally!

Introduction

XPoint. Optane. QuantX. We've been hearing these terms thrown around for two years now: a form of 3D stackable non-volatile memory that promised 10x the density of DRAM and 1000x the speed and endurance of NAND. These were bold statements, and over the following months, we would see them misunderstood and misconstrued by many in the industry. These misconceptions were further amplified by some poor demo choices on the part of Intel (countered by some better choices made by Micron). Fortunately, cooler heads prevailed as Jim Handy and other industry analysts helped explain that a 1000x improvement at the die level does not translate to the same improvement at the device level, especially when the first round of devices must comply with what will soon become a legacy method of connecting a persistent storage device to a PC.

Did I just suggest that PCIe 3.0 and the NVMe protocol, developed just for high-speed storage, are already legacy tech? Well, sorta.

That 'Future NVM' bar at the bottom of the chart was a two-year-old prototype iteration of what is now Optane. Note that while NVMe was able to shrink down the yellow bar a bit, as you introduce faster and faster storage, the rest of the equation (meaning software, including the OS kernel) starts to have a larger and larger impact on limiting the ultimate speed of the device.
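To put that in rough numbers, here is a minimal sketch of the effect. The latency values are arbitrary units that I made up purely for illustration (they are not taken from the chart or from any measurement), and `software_share` is a hypothetical helper name. The only assumption is that the software overhead stays fixed while the media gets faster.

```python
def software_share(media_latency: float, software_overhead: float) -> float:
    """Fraction of total request latency spent in software rather than in the media."""
    return software_overhead / (media_latency + software_overhead)


# Hypothetical, unitless values: software overhead is held constant
# while the storage media gets 10x faster at each step.
SOFTWARE_OVERHEAD = 1.0

for media_latency in (100.0, 10.0, 1.0):
    share = software_share(media_latency, SOFTWARE_OVERHEAD)
    print(f"media latency {media_latency:6.1f} -> software share {share:.1%}")
```

The same fixed overhead that is a rounding error against slow media becomes a large share of the total once the media latency drops to the same order of magnitude.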

NAND Flash simplified schematic (via Wikipedia)

Before getting into the first retail product to push all of these links in the storage chain to the limit, let's explain how XPoint works and what makes it faster. Taking random writes as an example, NAND Flash (above) must program cells in pages and erase cells in blocks. As modern flash has increased in capacity, the sizes of those pages and blocks have scaled up roughly proportionally. Today we are at pages >4KB and block sizes in the megabytes. When it comes to randomly writing to an already full section of flash, simply changing the contents of one byte on one page requires the clearing and rewriting of the entire block. The ratio of what the flash had to rewrite to what you actually wanted to write is called the write amplification factor. It's something that must be dealt with when it comes to flash memory management, but for XPoint it is a completely different story:

XPoint is bit addressable. The 'cross' structure means you can select very small groups of data via Wordlines, with the ultimate selection resolving down to a single bit.

Since the programmed element effectively acts as a resistor, its output is read directly and quickly. Even better, none of that write amplification nonsense mentioned above applies here at all. There are no pages or blocks. If you want to write a byte, go ahead. On top of that, the bits can be changed regardless of their former state, meaning no erase or clear cycle must take place before writing. You just write directly over what was previously stored. Is that 1000x faster / 1000x more write endurance than NAND thing starting to make more sense now?
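For a feel of the scale involved, here is a minimal sketch of the worst case described above, where an in-place change on full flash forces a whole-block rewrite. The 4 MiB block and 16 KiB page sizes are assumptions I picked to match "pages >4KB and block sizes in the megabytes", not figures for any specific part, and the function names are hypothetical. Real SSD controllers soften this with remapping and garbage collection, so treat it as an upper bound rather than typical behavior.

```python
def nand_write_amplification(host_bytes: int, block_bytes: int) -> float:
    """Worst case: every block touched by the write is erased and fully rewritten.

    Assumes the write starts on a block boundary.
    """
    blocks_touched = -(-host_bytes // block_bytes)  # ceiling division
    return (blocks_touched * block_bytes) / host_bytes


def direct_overwrite_amplification(host_bytes: int) -> float:
    """No pages, blocks, or erase cycle: the media writes exactly what was requested."""
    return host_bytes / host_bytes


BLOCK_BYTES = 4 * 1024 * 1024  # assumed 4 MiB erase block
PAGE_BYTES = 16 * 1024         # assumed 16 KiB page

for label, size in (("1 byte", 1), ("1 page", PAGE_BYTES), ("1 block", BLOCK_BYTES)):
    nand = nand_write_amplification(size, BLOCK_BYTES)
    direct = direct_overwrite_amplification(size)
    print(f"{label:>8}: NAND worst case {nand:>10.1f}x, direct overwrite {direct:.1f}x")
```

Under these assumptions, the penalty only disappears for NAND when the write happens to cover an entire block, while media that can be overwritten in place stays at 1x regardless of write size.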

Ok, with all of the background out of the way, let's get into the meat of the story. I present the P4800X:

Yes I know, don't tell me, that's not a photo that I took. Turns out that Intel only has enough of these to currently sample those who are actually developing software that can take full advantage of the new tech (like VMware). When I was at Intel's Folsom campus a few weeks back, I was shown a server loaded with a P4800X and a P3700 for comparison. For the past few weeks, I have had remote access to this server and have tested the P4800X with extreme prejudice.

(Editor's Note: It is worth pointing out that this testing method is not ideal and is not something we would have recommended or suggested to Intel. However, with the alternative being not testing the product at all, we decided it was worth telling the story of Optane to our readers regardless of the testing process involved. As a sanity check, since Intel had a P3700 in the remote system, we double-checked its performance against a local system with the exact same processor and P3700. Results were within a 1% margin, giving us as good an indication as any that nothing "funny" was going on with the test system.)

Specifications

Despite repeated requests, Intel was unwilling to share the complete datasheet with us. Above are the specs listed on their abbreviated product brief.
