TinkerTry Gets a Real Look at the Intel Optane SSD DC P4800X

Subject: Storage | August 14, 2017 - 08:09 AM |
Tagged: P4800X, XPoint, NVMe, HHHL, Optane, Intel, ssd, DC

We reviewed the Intel P4800X, Intel's first 3D XPoint SSD, back in April of this year. The one thing missing from that review was product pictures. Sure, we had stock photos, but we did not have the product in hand due to the extremely limited number of samples and the need for Intel to be able to make more real-time updates to the hardware based on our feedback during the testing process (reviewers making hardware better FTW!). After the reviews were done, sample priority shifted to the software vendors who needed time to further develop their code bases to take better advantage of the very low latency that Optane can offer. One of those companies is VMware, and one of our friends from over there was able to get some tinker time with one of their samples.

Intel-Optane-DC-P4800X-Series-SSD-in-oem-box-view-by-TinkerTry-Aug-06-2017.JPG

Paul whipped up a few videos showing the installation process as well as timing a server boot directly from the P4800X (something we could not do in our review since we were testing on a remote server). I highly encourage those interested in the P4800X (and the upcoming consumer versions of the same) to check out the article on TinkerTry. I also recommend those wanting to know what Optane / XPoint is and how it works to check out our article here.

X points to the spot; in 3D!

Subject: Storage | July 18, 2017 - 07:31 PM |
Tagged: XPoint, srt, rst, Optane Memory, Optane, Intel, hybrid, CrossPoint, cache, 32GB, 16GB

It has been a few months since Al looked at Intel's Optane and its impressive performance and price, so it seems appropriate to revisit the 2280 M.2 stick with a PCIe 3.0 x2 interface. It is not just the performance which is interesting but also the technology behind Optane and its limitations. For anyone looking to utilize Optane, it is worth remembering the compatibility limitations Intel imposes: only Kaby Lake processors with Core i7, i5 or i3 heritage are supported. If you do qualify already or are planning a system build, you can revisit the performance numbers over at Kitguru.

Intel-Optane-32GB-Memory-Review-on-KitGuru-INTRODUCTION-650.jpg

"Optane is Intel’s brand name for their 3D XPoint memory technology. The first Optane product to break cover was the Optane PC P4800X, a very high-performance SSD aimed at the Enterprise segment. Now we have the second product using the technology, this time aimed at the consumer market segment – the Intel Optane Memory module."

Source: Kitguru

Introduction, How PCM Works, Reading, Writing, and Tweaks

I’ve seen a bit of flawed logic floating around related to discussions about 3D XPoint technology. Some are directly comparing the cost per die to NAND flash (you can’t - 3D XPoint likely has fewer fab steps than NAND - especially when compared with 3D NAND). Others are repeating a bunch of terminology and element names without taking the time to actually explain how it works, and far too many folks out there can't even pronounce it correctly (it's spoken 'cross-point'). My plan is to address as much of the confusion as I can with this article, and I hope you walk away understanding how XPoint and its underlying technologies (most likely) work. While we do not have absolute confirmation of the precise material compositions, there is a significant amount of evidence pointing to one particular set of technologies. With Optane Memory now out in the wild and purchasable by folks wielding electron microscopes and mass spectrometers, I have seen enough additional information come across to assume XPoint is, in fact, PCM based.

XPoint.png

XPoint memory. Note the shape of the cell/selector structure. This will be significant later.

While we were initially told at the XPoint announcement event Q&A that the technology was not phase change based, there is overwhelming evidence to the contrary, and it is likely that Intel did not want to let the cat out of the bag too early. The funny thing about that is that both Intel and Micron were briefing on PCM-based memory developments five years earlier, and nearly everything about those briefings lines up perfectly with what appears to have ended up in the XPoint that we have today.

comparison.png

Some die-level performance characteristics of various memory types. source

The above figures were sourced from a 2011 paper and may be a bit dated, but they do a good job putting some actual numbers to the die-level performance of the various solid state memory technologies. We can also see where the ~1000x speed and ~1000x endurance comparisons of XPoint to NAND flash came from. Now, of course, those performance characteristics do not directly translate to the performance of a complete SSD package containing those dies. Controller overhead and management must take their respective cuts, as is shown with the performance of the first generation XPoint SSD we saw come out of Intel:

gap.png

The ‘bridging the gap’ Latency Percentile graph from our Intel SSD DC P4800X review.
(The P4800X comes in at 10us above).

There have been a few very vocal folks out there chanting 'not good enough', without the basic understanding that the first publicly available iteration of a new technology never represents its ultimate performance capabilities. It took NAND flash decades to make it into usable SSDs, and another decade before climbing to the performance levels we enjoy today. Time will tell if this holds true for XPoint, but given Micron's demos and our own observed performance of Intel's P4800X and Optane Memory SSDs, I'd argue that it is most certainly off to a good start!

XPoint Die.jpg

A 3D XPoint die, submitted for your viewing pleasure (click for larger version).

You want to know how this stuff works, right? Read on to find out!

Intel Persistent Memory Using 3D XPoint DIMMs Expected Next Year

Subject: General Tech, Memory, Storage | May 26, 2017 - 10:14 PM |
Tagged: XPoint, Intel, HPC, DIMM, 3D XPoint

Intel recently teased a bit of new information on its 3D XPoint DIMMs and gave its first public demonstration of the technology at the SAP Sapphire conference, where SAP’s HANA in-memory data analytics software was shown working with the new “Intel persistent memory.” Slated to arrive in 2018, the new Intel DIMMs based on the 3D XPoint technology developed by Intel and Micron will work in systems alongside traditional DRAM to provide a pool of fast, low latency, and high density nonvolatile storage that is a middle ground between expensive DDR4 and cheaper NVMe SSDs and hard drives. When looking at the storage stack, the storage density increases along with latency as it gets further away from the CPU. The opposite is also true: as storage and memory get closer to the processor, bandwidth increases, latency decreases, and costs increase per unit of storage. Intel is hoping to bridge the gap between system DRAM and PCI-E and SATA storage.

Intel persistent memory DIMM.jpg

According to Intel, system RAM offers 10 GB/s per channel and approximately 100 nanoseconds of latency. 3D XPoint DIMMs will offer 6 GB/s per channel and about 250 nanoseconds of latency. Below that are the 3D XPoint-based NVMe SSDs (e.g. Optane) on a PCI-E x4 bus, where they max out the bandwidth of the bus at ~3.2 GB/s with 10 microseconds of latency. Intel claims that non-XPoint NVMe NAND solid state drives have around 100 microseconds of latency, and of course, it gets worse from there when you go to NAND-based SSDs or even hard drives hanging off the SATA bus.
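
To put Intel's latency figures side by side, here is a small Python sketch that expresses each tier as a multiple of DRAM latency. The numbers are the ones quoted above; the tier labels and the `relative_latency` helper are names made up for this illustration.

```python
# Latency figures as quoted by Intel above, in nanoseconds.
# The tier labels are informal names chosen for this sketch.
TIER_LATENCY_NS = {
    "DDR4 DRAM": 100,
    "3D XPoint DIMM": 250,
    "3D XPoint NVMe SSD": 10_000,   # 10 microseconds
    "NAND NVMe SSD": 100_000,       # 100 microseconds
}


def relative_latency(tiers, baseline="DDR4 DRAM"):
    """Return each tier's latency as a multiple of the baseline tier."""
    base = tiers[baseline]
    return {name: ns / base for name, ns in tiers.items()}


if __name__ == "__main__":
    for name, ratio in relative_latency(TIER_LATENCY_NS).items():
        print(f"{name:20s} {ratio:8.1f}x DRAM latency")
```

On these figures the XPoint DIMM sits at 2.5x DRAM latency, while the XPoint and NAND NVMe parts land at 100x and 1000x respectively.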

Intel’s new XPoint DIMMs have persistent storage and will offer more capacity than is possible and/or cost effective with DDR4 DRAM. In giving up some bandwidth and latency, enterprise users will be able to have a large pool of very fast storage for storing their databases and other latency and bandwidth sensitive workloads. Intel does note that there are security concerns with the XPoint DIMMs being nonvolatile, in that an attacker with physical access could easily pull the DIMM and walk away with the data (it is at least theoretically possible to grab some data from RAM as well, but it will be much easier to grab the data from the XPoint sticks). Encryption and other security measures will need to be implemented to secure the data, both in use and at rest.

Intel Slide XPoint Info.jpg

Interestingly, Intel is not positioning the XPoint DIMMs as a replacement for RAM, but instead as a supplement. RAM and XPoint DIMMs will be installed in different slots of the same system. The DDR4 RAM will be used for the OS and system critical applications, while the XPoint pool of storage will be used for storing data that applications will work on, much like a traditional RAM disk but without needing to load and save the data to a different medium for persistent storage, and offering a lot more GBs for the money.

While XPoint is set to arrive next year along with Cascade Lake Xeons, it will likely be a couple of years before the technology takes off. It is going to require hardware and software support in workstations and servers, as well as developers willing to take advantage of it when writing their specialized applications. Fortunately, Intel started shipping the memory modules to its partners for testing earlier this year. It is an interesting technology, and the DIMM solution and direct CPU interface will really let the 3D XPoint memory shine and reach its full potential. It will primarily be useful for the enterprise, scientific, and financial industries where there is a huge need for faster and lower latency storage that can accommodate massive (multiple terabyte+) data sets that continue to get larger and more complex. It is a technology that likely will not trickle down to consumers for a long time, but I will be ready when it does. In the meantime, I am eager to see what kinds of things it will enable the big data companies and researchers to do! Intel claims it will be useful not only for supporting massive in-memory databases and accelerating HPC workloads but also for things like virtualization, private clouds, and software defined storage.

What are your thoughts on this new memory tier and the future of XPoint?

Source: Intel

Spent all your money on a new CPU and couldn't afford an SSD? Intel Optane Memory is here

Subject: Storage | April 24, 2017 - 05:20 PM |
Tagged: XPoint, srt, rst, Optane Memory, Optane, Intel, hybrid, CrossPoint, cache, 32GB, 16GB

At $44 for 16GB or $77 for a 32GB module, Intel's Optane Memory will cost you less in total than an M.2 SSD, though at a significantly higher price per gigabyte. The catch is that you need to have a Kaby Lake Core system to be able to utilize Optane, which means you are unlikely to be using a HDD. Al's tests show that Optane will also benefit a system using an SSD, reducing latency noticeably although not as significantly as with a HDD.

The Tech Report tested it differently, by sourcing a brand new desktop system with a Kaby Lake Core APU that did not ship with an SSD. Once installed, the Optane drive enabled the system to outpace an affordable 480GB SSD in some scenarios; very impressive for a HDD. They also took a peek at the difference Optane makes when paired with the aforementioned affordable SSD in their full review.

requirements.png

"Intel's Optane Memory tech purports to offer most of the responsiveness of an SSD to systems whose primary storage device is a good old hard drive. We put a 32GB stick of Optane Memory to the test to see whether it lives up to Intel's claims."

Subject: Storage
Manufacturer: Intel

Introduction, Specifications, and Requirements

Introduction:

170421-115336a.jpg

Finally! Optane Memory sitting in our lab! Sure, it’s not the mighty P4800X we remotely tested over the past month, but this is right here, sitting on my desk. It’s shipping, too, meaning it could be sitting on your desk (or more importantly, in your PC) in just a matter of days.

Intel-3D-Xpoint.png

The big deal about Optane is that it uses XPoint Memory, which has fast-as-lightning (faster, actually) response times of less than 10 microseconds. Compare this to the fastest modern NAND flash at ~90 microseconds, and the differences are going to add up fast. What’s wonderful about these response times is that they still hold true even when scaling an Optane product all the way down to just one or two dies of storage capacity. When you consider that managing fewer dies means less work for the controller, we can see latencies fall even further in some cases (as we will see later).

Read on for our full review of Optane Memory!

Intel Officially Launches Optane Memory, Shows Performance

Subject: Storage | March 27, 2017 - 12:16 PM |
Tagged: XPoint, Optane Memory, Optane, M.2, Intel, cache, 3D XPoint

We are just about to hit two years since Intel and Micron jointly launched 3D XPoint, and there have certainly been a lot of stories about it since. Intel officially launched the P4800X last week, and this week they are officially launching Optane Memory. The base level information about Optane Memory is mostly unchanged; however, we do have a slide deck we are allowed to pick from to point out some of the things we can look forward to once the new tech starts hitting devices you can own.

Optane Memory-6.png

Alright, so this is Optane Memory in a nutshell. Put some XPoint memory on an M.2 form factor device, leverage Intel's SRT caching tech, and you get a 16GB or 32GB cache laid over your system's primary HDD.

Optane Memory-15.png

To help explain what good Optane can do for typical desktop workloads, first we need to dig into Queue Depths a bit. Above are some examples of the typical QD various desktop applications run at. This data is from direct IO trace captures of systems in actual use. Now that we've established that the majority of desktop workloads operate at very low Queue Depths (<= 4), let's see where Optane performance falls relative to other storage technologies:

Optane Memory-22.png

There's a bit to digest in this chart, but let me walk you through it. The ranges tapering off show the percentage of IOs falling at the various Queue Depths, while the green, red, and orange lines ramping up to higher IOPS (right axis) show relative SSD performance at those same Queue Depths. The key to Optane's performance benefit here is that it can ramp up to full performance at very low QDs, while the other NAND-based parts require significantly more parallel requests to achieve full rated performance. This is what will ultimately lead to much snappier responsiveness for, well, just about anything hitting the storage. Fun fact - there is actually a HDD on that chart. It's the yellow line that you might have mistaken for the horizontal axis :).
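
One rough way to see why low latency pays off at low Queue Depths is Little's Law, which says the IOPS a device can sustain is bounded by the number of outstanding IOs divided by the time each one takes. The Python sketch below is an idealized model, not a measurement: the `ideal_iops` helper is made up for illustration, and the ~10 microsecond and ~90 microsecond latencies are round figures of the sort quoted for XPoint and fast NAND, assumed constant at every QD.

```python
def ideal_iops(queue_depth, latency_us):
    """Upper bound on IOPS from Little's Law: IOPS = QD / latency.

    Assumes every IO takes exactly `latency_us` microseconds and the
    device never saturates, which real SSDs do not guarantee.
    """
    return queue_depth * 1_000_000 / latency_us


if __name__ == "__main__":
    # Illustrative latencies: ~10 us for XPoint, ~90 us for fast NAND.
    for qd in (1, 2, 4):
        xpoint = ideal_iops(qd, 10)
        nand = ideal_iops(qd, 90)
        print(f"QD{qd}: XPoint ~{xpoint:,.0f} IOPS, NAND ~{nand:,.0f} IOPS")
```

Under this simplification, a 10 microsecond device could in principle deliver 100,000 IOPS with a single outstanding request, while a 90 microsecond device would need roughly nine requests in flight to match it.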

Optane Memory-11.png

As you can see, we have a few integrators on board already. Official support requires a 270 series motherboard and Kaby Lake CPU, but it is possible that motherboard makers could backport the required NVMe v1.1 and Intel RST 15.5 support into older systems.

Optane Memory-7.png

For those wondering whether caching is the only way power users will be able to go with Optane: it is not. Atop that pyramid there sits an 'Intel Optane SSD', which should basically be a consumer version of the P4800X. It is sure to be an incredibly fast SSD, but that performance will most definitely come at a price!

We should be testing Optane Memory shortly and will publish results on this new tech as soon as we can!

Source: Intel

Intel Officially Kicks Off Optane Launch with SSD DC P4800X

Subject: Storage | March 19, 2017 - 12:21 PM |
Tagged: XPoint, SSD DC P4800X, Optane Memory, Optane, Intel, client, 750GB, 3D XPoint, 375GB, 1.5TB

Intel brought us out to their Folsom campus last week for some in-depth product briefings. Much of our briefing is still under embargo, but the portion that officially lifts this morning is the SSD DC P4800X:

Intel_SSD_4800_FlatFront_OnWhite_RGB_Small.jpg

optane-4.png

optane-9.png

MSRP for the 375GB model is estimated at $1520 ($4/GB), which is rather spendy, but given that the product has shown it can effectively displace RAM in servers, we should be comparing the cost/GB with DRAM and not NAND. It should also be noted that this is nearly half the cost/GB of the X25-M at its launch. Capacities will go all the way up to 1.5TB, and U.2 form factor versions are also on the way.

For those wanting a bit more technical info, the P4800X uses a 7-channel controller, with the 375GB model having 4 dies per channel (28 total). Overprovisioning does not do for Optane what it did for NAND flash, as XPoint can be rewritten at the byte level and does not need to be programmed in (KB) pages and erased in larger (MB) blocks. The only extra space on Optane SSDs is for ECC, firmware, and a small spare area to map out any failed cells.
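
As a back-of-the-envelope check on how small that extra space is, the Python sketch below multiplies out the channel and die counts. It assumes each XPoint die holds 16GB (the per-die figure mentioned elsewhere on this page for Optane Memory) and treats all capacities as decimal GB; both function names are made up for illustration.

```python
def raw_capacity_gb(channels, dies_per_channel, die_gb):
    """Total raw media capacity across all channels."""
    return channels * dies_per_channel * die_gb


def reserved_fraction(raw_gb, usable_gb):
    """Share of raw capacity not exposed to the user."""
    return (raw_gb - usable_gb) / raw_gb


if __name__ == "__main__":
    # 7 channels x 4 dies per channel, assuming 16GB per die.
    raw = raw_capacity_gb(channels=7, dies_per_channel=4, die_gb=16)
    spare = reserved_fraction(raw, usable_gb=375)
    print(f"raw: {raw} GB, reserved: {spare:.1%}")
```

Under those assumptions the 375GB model carries 448GB of raw XPoint, leaving roughly 16% for ECC, firmware, and spares.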

Those with a keen eye (and calculator) might have noted that the early TBW values only put the P4800X at 30 DWPD for a 3-year period. At the event, Intel confirmed that they anticipate the P4800X to qualify at that same 30 DWPD for a 5-year period by the time volume shipment occurs.

Read on for more about the SSD DC P4800X (and other upcoming products!)

Intel Quietly Launches Official Optane Memory Site

Subject: Storage | February 15, 2017 - 08:58 PM |
Tagged: XPoint, ssd, Optane, memory, Intel, cache

We've been hearing a lot about Intel's upcoming Optane memory over the past two years, but the information had all been in the form of press announcements and leaked roadmap slides.

optane-memory-marquee-16x9.png.rendition.intel_.web_.1072.603.png

We now have an actual Optane landing page on the Intel site that discusses the first iteration of 'Intel Optane Memory', which appears to be the 8000p Series that we covered last October and saw as an option on some upcoming Lenovo laptops. The site does not cover the upcoming enterprise parts like the 375GB P4800X, but instead, focuses on the far smaller 16GB and 32GB 'System Accelerator' M.2 modules.

intel-optane-memory-8000p.jpg

Despite using only two lanes of PCIe 3.0, these modules turn in some impressive performance, but the capacities when using only one or two (16GB each) XPoint dies preclude an OS install. Instead, these will be used, presumably in combination with a newer form of Intel's Rapid Storage Technology driver, as a caching layer meant as an HDD accelerator:

While the random write performance and endurance of these parts blow any NAND-based SSD out of the water, the 2-lane bottleneck holds them back compared to high-end NVMe NAND SSDs, so we will likely see this first consumer iteration of Intel Optane Memory in OEM systems equipped with hard disks as their primary storage. A very quick 32GB caching layer should help speed things up considerably for the majority of typical buyers of these types of mobile and desktop systems, while still keeping the total cost below that for a decent capacity NAND SSD as primary storage. Hey, if you can't get every vendor to switch to pure SSD, at least you can speed up that spinning rust a bit, right?

Source: Intel

A Closer Look at Intel's Optane SSD DC P4800X Enterprise SSD Performance

Subject: Storage | February 10, 2017 - 04:22 PM |
Tagged: Optane, XPoint, P4800X, 375GB

Over the past few hours, we have seen another Intel Optane SSD leak rise to the surface. While we previously saw a roadmap and specs for a mobile storage accelerator platform, this time we have some specs for an enterprise part:

optane-leak.png

The specs are certainly impressive. While they don't match the maximum theoretical figures we heard at the initial XPoint announcement, we do see an endurance rating of 30 DWPD (drive writes per day), which is impressive given competing NAND products typically run in the single digits for that same metric. The 12.3 PetaBytes Written (PBW) rating is even more impressive given the capacity point that rating is based on is only 375GB (compare with 2000+ GB of enterprise parts that still do not match that figure).
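
The DWPD and PBW ratings are two views of the same endurance budget, and the warranty period they imply can be worked out directly. The Python sketch below does that arithmetic with decimal units (1 PB = 1,000,000 GB); the function names are made up for illustration and are not from any Intel tool.

```python
def rated_days(pbw, capacity_gb, dwpd):
    """Days the drive can absorb `dwpd` full-drive writes before hitting `pbw`."""
    total_gb = pbw * 1_000_000  # 1 PB = 1,000,000 GB (decimal units)
    return total_gb / (capacity_gb * dwpd)


def required_pbw(capacity_gb, dwpd, years):
    """Petabytes written needed to sustain `dwpd` for `years` (365-day years)."""
    return capacity_gb * dwpd * 365 * years / 1_000_000


if __name__ == "__main__":
    days = rated_days(pbw=12.3, capacity_gb=375, dwpd=30)
    print(f"12.3 PBW at 30 DWPD lasts {days:.0f} days ({days / 365:.2f} years)")
    print(f"5 years at 30 DWPD needs {required_pbw(375, 30, 5):.1f} PBW")
```

For the 375GB part, 12.3 PBW at 30 DWPD works out to about 1,093 days, or almost exactly three years. Holding 30 DWPD for five years would take roughly 20.5 PBW.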

Now I could rattle off the rest of the performance figures, but those are just numbers, and fortunately we have ways of showing these specs in a more practical manner:

rnd.png

Assuming the P4800X at least meets its stated specifications (very likely given Intel's track record there), and also with the understanding that XPoint products typically reach their maximum IOPS at Queue Depths far below 16, we can compare the theoretical figures for this new Optane part to the measured results from the two most recent NAND-based enterprise launches. To say the random performance leaves those parts in the dust is an understatement. 500,000+ IOPS is one thing, but doing so at lower QDs (where real-world enterprise usage actually sits) just makes this more of an embarrassment to NAND parts. The added latency of NAND translates to far higher/impractical QDs (256+) to reach their maximum ratings.

server workload QD.png

Intel research on typical Queue Depths seen in various enterprise workloads. Note that a lower latency device running the same workload will further 'shallow the queue', meaning even lower QD.

Another big deal in the enterprise is QoS. High IOPS and low latency are great, but where the rubber meets the road here is consistency. Enterprise tests measure this in varying degrees of "9's", which exponentially approach 100% of all IO latencies seen during a test run. The plot method used below acts to 'zoom in' on the tail latency of these devices. While a given SSD might have very good average latency and IOPS, it's the outliers that lead to timeouts in time-critical applications, making tail latency an important item to detail.
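
To make the "9's" idea concrete, here is a minimal Python sketch that pulls the latency at each level of nines out of a list of per-IO latencies using the nearest-rank method. The `latency_at_nines` helper and the synthetic sample are made up for illustration; real test tools may interpolate or bucket their data differently.

```python
def latency_at_nines(latencies_us, nines):
    """Nearest-rank latency at the percentile with `nines` nines.

    nines=2 means 99%, nines=3 means 99.9%, and so on.
    Expects a non-empty list of latencies.
    """
    ordered = sorted(latencies_us)
    n = len(ordered)
    # Nearest rank: ceil(n * (1 - 10**-nines)) == n - n // 10**nines
    rank = n - n // 10**nines
    return ordered[rank - 1]


if __name__ == "__main__":
    # Synthetic run: mostly fast IOs plus a handful of slow outliers.
    sample = [10] * 99_900 + [100] * 90 + [5000] * 10
    print(f"mean: {sum(sample) / len(sample):.2f} us")
    for nines in (2, 3, 4, 5):
        print(f"{nines} nines: {latency_at_nines(sample, nines)} us")
```

In this made-up sample the mean and the 99.9% point look excellent, yet the 99.999% point lands on the slowest outliers, which is exactly the behavior that averages hide.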

qos-r.png

qos-w.png

I've taken some liberties in my approximations below the 99.999% point in these plots. Note that the spec sheet does claim typical latencies "<10us", which falls off to the left of the scale. Not only are the potential latencies great with Optane, the claimed consistency gains are even better. Translating what you see above, the highest percentile latency IOs of the P4800X should be 10x-100x (log scale above) faster than Intel's own SSD DC P3520. The P4800X should also easily beat the Micron 9100 MAX, despite its IOPS being 5x higher than the P3520 at QD16. These lower latencies also mean we will have to add another decade to the low end of our Latency Percentile plots when we test these new products.

Well, there you have it. The cost/GB will naturally be higher for these new XPoint parts, but the expected performance improvements should make it well worth the additional cost for those who need blistering fast yet persistent storage.