Subject: Storage | July 18, 2017 - 07:31 PM | Jeremy Hellstrom
Tagged: XPoint, srt, rst, Optane Memory, Optane, Intel, hybrid, CrossPoint, cache, 32GB, 16GB
It has been a few months since Al looked at Intel's Optane Memory and its impressive performance and price, so it seems appropriate to revisit the 2280 M.2 stick with its PCIe 3.0 x2 interface. It is not just the performance that is interesting but the technology behind Optane and its limitations. For anyone looking to utilize Optane, it is worth remembering the compatibility restrictions Intel imposes: only Kaby Lake processors of Core i7, i5, or i3 heritage are supported. If you already qualify, or are planning a system build, you can revisit the performance numbers over at Kitguru.
"Optane is Intel’s brand name for their 3D XPoint memory technology. The first Optane product to break cover was the Optane PC P4800X, a very high-performance SSD aimed at the Enterprise segment. Now we have the second product using the technology, this time aimed at the consumer market segment – the Intel Optane Memory module."
Here are some more Memory articles from around the web:
- G.SKILL TridentZ RGB 3600 MHz C16 DDR4 @ techPowerUp
- GSKill Trident Z 4133Mhz RGB CL19 DDR4 Dual Channel Memory Review @ Hardware Asylum
- Ballistix Elite 3466 MHz DDR4 @ techPowerUp
Introduction, How PCM Works, Reading, Writing, and Tweaks
I’ve seen a bit of flawed logic floating around in discussions of 3D XPoint technology. Some are directly comparing the cost per die to NAND flash (you can't: 3D XPoint likely has fewer fab steps than NAND, especially when compared with 3D NAND). Others are repeating a bunch of terminology and element names without taking the time to actually explain how it works, and far too many folks out there can't even pronounce it correctly (it's spoken 'cross-point'). My plan is to address as much of the confusion as I can with this article, and I hope you walk away understanding how XPoint and its underlying technologies (most likely) work. While we do not have absolute confirmation of the precise material compositions, there is a significant amount of evidence pointing to one particular set of technologies. With Optane Memory now out in the wild and purchasable by folks wielding electron microscopes and mass spectrometers, I have seen enough additional information to conclude that XPoint is, in fact, PCM (phase change memory) based.
XPoint memory. Note the shape of the cell/selector structure. This will be significant later.
While we were initially told at the XPoint announcement event Q&A that the technology was not phase change based, there is overwhelming evidence to the contrary, and it is likely that Intel did not want to let the cat out of the bag too early. The funny thing about that is that both Intel and Micron were briefing on PCM-based memory developments five years earlier, and nearly everything about those briefings lines up perfectly with what appears to have ended up in the XPoint that we have today.
Some die-level performance characteristics of various memory types. source
The above figures were sourced from a 2011 paper and may be a bit dated, but they do a good job putting some actual numbers with the die-level performance of the various solid state memory technologies. We can also see where the ~1000x speed and ~1000x endurance comparisons with XPoint to NAND Flash came from. Now, of course, those performance characteristics do not directly translate to the performance of a complete SSD package containing those dies. Controller overhead and management must take their respective cuts, as is shown with the performance of the first generation XPoint SSD we saw come out of Intel:
The ‘bridging the gap’ Latency Percentile graph from our Intel SSD DC P4800X review.
(The P4800X comes in at 10us above).
There have been a few very vocal folks out there chanting 'not good enough', without the basic understanding that the first publicly available iteration of a new technology never represents its ultimate performance capabilities. It took NAND flash decades to make it into usable SSDs, and another decade before climbing to the performance levels we enjoy today. Time will tell if this holds true for XPoint, but given Micron's demos and our own observed performance of Intel's P4800X and Optane Memory SSDs, I'd argue that it is most certainly off to a good start!
A 3D XPoint die, submitted for your viewing pleasure (click for larger version).
Subject: General Tech, Memory, Storage | May 26, 2017 - 10:14 PM | Tim Verry
Tagged: XPoint, Intel, HPC, DIMM, 3D XPoint
Intel recently teased a bit of new information on its 3D XPoint DIMMs and launched its first public demonstration of the technology at the SAP Sapphire conference where SAP’s HANA in-memory data analytics software was shown working with the new “Intel persistent memory.” Slated to arrive in 2018, the new Intel DIMMs based on the 3D XPoint technology developed by Intel and Micron will work in systems alongside traditional DRAM to provide a pool of fast, low latency, and high density nonvolatile storage that is a middle ground between expensive DDR4 and cheaper NVMe SSDs and hard drives. When looking at the storage stack, the storage density increases along with latency as it gets further away from the CPU. The opposite is also true, as storage and memory gets closer to the processor, bandwidth increases, latency decreases, and costs increase per unit of storage. Intel is hoping to bridge the gap between system DRAM and PCI-E and SATA storage.
According to Intel, system RAM offers up 10 GB/s per channel and approximately 100 nanoseconds of latency. 3D XPoint DIMMs will offer 6 GB/s per channel and about 250 nanoseconds of latency. Below that are the 3D XPoint-based NVMe SSDs (e.g. Optane) on a PCI-E x4 bus, where they max out the bandwidth of the bus at ~3.2 GB/s with 10 microseconds of latency. Intel claims that non-XPoint NVMe NAND solid state drives have around 100 microseconds of latency, and of course, it gets worse from there when you move to SSDs or even hard drives hanging off the SATA bus.
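To put Intel's quoted numbers side by side, here is a small sketch. The figures are the approximate marketing numbers above, not measurements:

```python
# Approximate latency/bandwidth tiers as quoted by Intel (illustrative only).
tiers = {
    "DDR4 DRAM":       {"latency_ns": 100,     "bandwidth_gbs": 10.0},
    "XPoint DIMM":     {"latency_ns": 250,     "bandwidth_gbs": 6.0},
    "XPoint NVMe SSD": {"latency_ns": 10_000,  "bandwidth_gbs": 3.2},
    "NAND NVMe SSD":   {"latency_ns": 100_000, "bandwidth_gbs": 3.2},
}

dram_latency = tiers["DDR4 DRAM"]["latency_ns"]
for name, t in tiers.items():
    slowdown = t["latency_ns"] / dram_latency
    print(f"{name:16s} {t['latency_ns']:>8} ns  ({slowdown:.1f}x DRAM latency)")
```

The takeaway: the XPoint DIMM sits only ~2.5x behind DRAM on latency, while even the fastest NAND NVMe drive is roughly 1000x behind.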
Intel’s new XPoint DIMMs have persistent storage and will offer more capacity than would be possible and/or cost effective with DDR4 DRAM. In giving up some bandwidth and latency, enterprise users will be able to have a large pool of very fast storage for their databases and other latency- and bandwidth-sensitive workloads. Intel does note that there are security concerns with the XPoint DIMMs being nonvolatile, in that an attacker with physical access could easily pull the DIMM and walk away with the data (it is at least theoretically possible to grab some data from RAM as well, but it will be much easier to grab it from the XPoint sticks). Encryption and other security measures will need to be implemented to secure the data, both in use and at rest.
Interestingly, Intel is not positioning the XPoint DIMMs as a replacement for RAM, but as a supplement. RAM and XPoint DIMMs will be installed in different slots of the same system, with the DDR4 RAM used for the OS and system-critical applications while the XPoint pool holds the data that applications work on. It will act much like a traditional RAM disk, but without needing to load and save the data to a different medium for persistence, and it will offer a lot more GBs for the money.
While XPoint is set to arrive next year along with Cascade Lake Xeons, it will likely be a couple of years before the technology takes off. Adoption will require hardware and software support in workstations and servers, as well as developers willing to take advantage of it when writing their specialized applications. Fortunately, Intel started shipping the memory modules to its partners for testing earlier this year. It is an interesting technology, and the DIMM solution and direct CPU interface will really let the 3D XPoint memory shine and reach its full potential. It will primarily be useful for the enterprise, scientific, and financial industries, where there is a huge need for faster, lower latency storage that can accommodate massive (multiple terabyte+) data sets that continue to get larger and more complex. It is a technology that likely will not trickle down to consumers for a long time, but I will be ready when it does. In the meantime, I am eager to see what kinds of things it will enable the big data companies and researchers to do! Intel claims it will be useful not only for supporting massive in-memory databases and accelerating HPC workloads but also for things like virtualization, private clouds, and software defined storage.
What are your thoughts on this new memory tier and the future of XPoint?
Subject: Storage | April 24, 2017 - 05:20 PM | Jeremy Hellstrom
Tagged: XPoint, srt, rst, Optane Memory, Optane, Intel, hybrid, CrossPoint, cache, 32GB, 16GB
At $44 for 16GB or $77 for a 32GB module, Intel's Optane Memory will cost you less in total than an M.2 SSD, though at a significantly higher price per gigabyte. The catch is that you need a Kaby Lake Core system to be able to utilize Optane, which means you are unlikely to be using an HDD. Al's tests show that Optane will also benefit a system using an SSD, reducing latency noticeably, although not as significantly as with an HDD.
The Tech Report tested it differently, sourcing a brand new Kaby Lake Core desktop system that did not ship with an SSD. Once installed, the Optane drive enabled the system to outpace an affordable 480GB SSD in some scenarios; very impressive for an HDD-based machine. They also peeked at the difference Optane makes when paired with the aforementioned affordable SSD in their full review.
"Intel's Optane Memory tech purports to offer most of the responsiveness of an SSD to systems whose primary storage device is a good old hard drive. We put a 32GB stick of Optane Memory to the test to see whether it lives up to Intel's claims."
Here are some more Storage reviews from around the web:
- Intel Optane Memory Review - 1.4GB/s Speed & 300K IOPS for $44 @ The SSD Review
- The Intel Optane Memory Module Review @ Hardware Canucks
- Kingston DCP1000 NVMe SSD Reaches 7GB/s @ Kitguru
- WD Blue 1,000 GiB SSD @ Hardware Secrets
- Synology DiskStation DS916+ 4-Bay NAS @ Kitguru
- Drobo 5N2 NAS @ Kitguru
- Kingston Ultimate GT 2TB Flash Drive @ The SSD Review
- Toshiba X300 6TB HDD @ Kitguru
Introduction, Specifications, and Requirements
Finally! Optane Memory sitting in our lab! Sure, it’s not the mighty P4800X we remotely tested over the past month, but this is right here, sitting on my desk. It’s shipping, too, meaning it could be sitting on your desk (or more importantly, in your PC) in just a matter of days.
The big deal about Optane is that it uses XPoint Memory, which has fast-as-lightning (faster, actually) response times of less than 10 microseconds. Compare this to the fastest modern NAND flash at ~90 microseconds, and the differences are going to add up fast. What’s wonderful about these response times is that they still hold true even when scaling an Optane product all the way down to just one or two dies of storage capacity. When you consider that managing fewer dies means less work for the controller, we can see latencies fall even further in some cases (as we will see later).
Subject: Storage | March 27, 2017 - 12:16 PM | Allyn Malventano
Tagged: XPoint, Optane Memory, Optane, M.2, Intel, cache, 3D XPoint
We are just about to hit two years since Intel and Micron jointly launched 3D XPoint, and there have certainly been a lot of stories about it since. Intel officially launched the P4800X last week, and this week they are officially launching Optane Memory. The base level information about Optane Memory is mostly unchanged, however, we do have a slide deck we are allowed to pick from to point out some of the things we can look forward to once the new tech starts hitting devices you can own.
Alright, so this is Optane Memory in a nutshell. Put some XPoint memory on an M.2 form factor device, leverage Intel's SRT caching tech, and you get a 16GB or 32GB cache laid over your system's primary HDD.
To help explain what Optane can do for typical desktop workloads, first we need to dig into Queue Depths a bit. Above are some examples of the typical QD various desktop applications run at. This data is from direct IO trace captures of systems in actual use. Now that we've established that the majority of desktop workloads operate at very low Queue Depths (<= 4), let's see where Optane performance falls relative to other storage technologies:
There's a bit to digest in this chart, but let me walk you through it. The ranges tapering off show the percentage of IOs falling at the various Queue Depths, while the green, red, and orange lines ramping up to higher IOPS (right axis) show relative SSD performance at those same Queue Depths. The key to Optane's performance benefit here is that it can ramp up to full performance at very low QDs, while the other NAND-based parts require significantly higher parallel requests to achieve full rated performance. This is what will ultimately lead to much snappier responsiveness for, well, just about anything hitting the storage. Fun fact - there is actually an HDD on that chart. It's the yellow line that you might have mistaken for the horizontal axis :).
As you can see, we have a few integrators on board already. Official support requires a 200-series motherboard and a Kaby Lake CPU, but it is possible that motherboard makers could backport the required NVMe v1.1 and Intel RST 15.5 support to older systems.
For those wondering whether caching is the only way power users will be able to go with Optane: that's not the case. Atop that pyramid sits an 'Intel Optane SSD', which should basically be a consumer version of the P4800X. It is sure to be an incredibly fast SSD, but that performance will most definitely come at a price!
We should be testing Optane Memory shortly and will finally have some publishable results of this new tech as soon as we can!
Subject: Storage | March 19, 2017 - 12:21 PM | Allyn Malventano
Tagged: XPoint, SSD DC P4800X, Optane Memory, Optane, Intel, client, 750GB, 3D XPoint, 375GB, 1.5TB
Intel brought us out to their Folsom campus last week for some in-depth product briefings. Much of our briefing is still under embargo, but the portion that officially lifts this morning is the SSD DC P4800X:
MSRP for the 375GB model is estimated at $1520 ($4/GB), which is rather spendy, but given that the product has shown it can effectively displace RAM in servers, we should be comparing the cost/GB with DRAM and not NAND. It is also worth noting that this is nearly half the cost/GB of the X25-M at its launch. Capacities will go all the way up to 1.5TB, and U.2 form factor versions are also on the way.
For those wanting a bit more technical info, the P4800X uses a 7-channel controller, with the 375GB model having 4 dies per channel (28 total). Overprovisioning does not do for Optane what it did for NAND flash, as XPoint can be rewritten at the byte level and does not need to be programmed in (KB) pages and erased in larger (MB) blocks. The only extra space on Optane SSDs is for ECC, firmware, and a small spare area to map out any failed cells.
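A toy model helps show why byte-level rewrites remove the need for a large spare area. The NAND page and block sizes below are typical values for illustration, not the specs of Optane or any particular NAND part:

```python
# Toy model: bytes physically written to service a small in-place update.
NAND_PAGE  = 16 * 1024        # NAND program granularity (typical, assumed)
NAND_BLOCK = 4 * 1024 * 1024  # NAND erase granularity (typical, assumed)

def nand_worst_case_write(nbytes):
    """Worst case on NAND: a tiny update forces a whole block erase + rewrite."""
    return max(nbytes, NAND_BLOCK)

def xpoint_write(nbytes):
    """XPoint rewrites in place at byte granularity."""
    return nbytes

update = 64  # update 64 bytes of user data
print("NAND worst-case amplification:", nand_worst_case_write(update) // update)
print("XPoint amplification:", xpoint_write(update) // update)
```

With no block-erase cycle to hide, there is no garbage collection to feed, which is why the only extra space on these drives is for ECC, firmware, and failed-cell sparing.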
Those with a keen eye (and calculator) might have noted that the early TBW values only put the P4800X at 30 DWPD for a 3-year period. At the event, Intel confirmed that they anticipate the P4800X to qualify at that same 30 DWPD for a 5-year period by the time volume shipment occurs.
Subject: Storage | February 15, 2017 - 08:58 PM | Allyn Malventano
Tagged: XPoint, ssd, Optane, memory, Intel, cache
We now have an actual Optane landing page on the Intel site that discusses the first iteration of 'Intel Optane Memory', which appears to be the 8000p Series that we covered last October and saw as an option on some upcoming Lenovo laptops. The site does not cover the upcoming enterprise parts like the 375GB P4800X, but instead, focuses on the far smaller 16GB and 32GB 'System Accelerator' M.2 modules.
Despite using only two lanes of PCIe 3.0, these modules turn in some impressive performance, but the capacities when using only one or two (16GB each) XPoint dies preclude an OS install. Instead, these will be used, presumably in combination with a newer form of Intel's Rapid Storage Technology driver, as a caching layer meant as an HDD accelerator:
While the random write performance and endurance of these parts blow any NAND-based SSD out of the water, the 2-lane bottleneck holds them back compared to high-end NVMe NAND SSDs, so we will likely see this first consumer iteration of Intel Optane Memory in OEM systems equipped with hard disks as their primary storage. A very quick 32GB caching layer should help speed things up considerably for the majority of typical buyers of these types of mobile and desktop systems, while still keeping the total cost below that for a decent capacity NAND SSD as primary storage. Hey, if you can't get every vendor to switch to pure SSD, at least you can speed up that spinning rust a bit, right?
Subject: Storage | February 10, 2017 - 04:22 PM | Allyn Malventano
Tagged: Optane, XPoint, P4800X, 375GB
Over the past few hours, we have seen another Intel Optane SSD leak rise to the surface. While we previously saw a roadmap and specs for a mobile storage accelerator platform, this time we have some specs for an enterprise part:
The specs are certainly impressive. While they don't match the maximum theoretical figures we heard at the initial XPoint announcement, we do see an endurance rating of 30 DWPD (drive writes per day), which is impressive given that competing NAND products typically run in the single digits for that same metric. The 12.3 PetaBytes Written (PBW) rating is even more impressive given that the capacity point it is based on is only 375GB (compare with 2000+ GB enterprise parts that still do not match that figure).
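The PBW rating follows directly from the DWPD figure; a quick back-of-envelope check against the 3-year warranty period:

```python
# Back-of-envelope check of the leaked endurance rating:
# 30 drive writes per day on a 375 GB drive over a 3-year warranty.
capacity_gb = 375
dwpd = 30
years = 3

pbw = capacity_gb * dwpd * 365 * years / 1e6  # GB written -> PB written
print(f"{pbw:.1f} PBW")  # ~12.3 PBW, matching the spec sheet
```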
Now I could rattle off the rest of the performance figures, but those are just numbers, and fortunately we have ways of showing these specs in a more practical manner:
Assuming the P4800X at least meets its stated specifications (very likely given Intel's track record there), and also with the understanding that XPoint products typically reach their maximum IOPS at Queue Depths far below 16, we can compare the theoretical figures for this new Optane part to the measured results from the two most recent NAND-based enterprise launches. To say the random performance leaves those parts in the dust is an understatement. 500,000+ IOPS is one thing, but doing so at lower QDs (where real-world enterprise usage actually sits) just makes this more of an embarrassment for NAND parts. The added latency of NAND translates to far higher/impractical QDs (256+) to reach their maximum ratings.
Intel research on typical Queue Depths seen in various enterprise workloads. Note that a lower latency device running the same workload will further 'shallow the queue', meaning even lower QD.
Another big deal in the enterprise is QoS. High IOPS and low latency are great, but where the rubber meets the road here is consistency. Enterprise tests measure this in varying degrees of "9's", which exponentially approach 100% of all IO latencies seen during a test run. The plot method used below acts to 'zoom in' on the tail latency of these devices. While a given SSD might have very good average latency and IOPS, it's the outliers that lead to timeouts in time-critical applications, making tail latency an important item to detail.
I've taken some liberties in my approximations below the 99.999% point in these plots. Note that the spec sheet does claim typical latencies "<10us", which falls off to the left of the scale. Not only are the potential latencies great with Optane, the claimed consistency gains are even better. Translating what you see above, the highest percentile latency IOs of the P4800X should be 10x-100x (log scale above) faster than Intel's own SSD DC P3520. The P4800X should also easily beat the Micron 9100 MAX, even despite its IOPS being 5x higher than the P3520 at QD16. These lower latencies also mean we will have to add another decade to the low end of our Latency Percentile plots when we test these new products.
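For reference, computing those "nines" from a set of per-IO latency samples is straightforward. This sketch uses a made-up synthetic distribution purely to show the mechanics; it is not modeled on any real drive:

```python
import random

# Synthetic per-IO latencies (microseconds) from a skewed distribution,
# chosen only to demonstrate how tail-latency "nines" are computed.
random.seed(0)
latencies = [random.lognormvariate(2.3, 0.5) for _ in range(100_000)]

def percentile(samples, pct):
    """Latency below which pct% of IOs complete (simple sorted-index method)."""
    s = sorted(samples)
    idx = min(len(s) - 1, int(len(s) * pct / 100.0))
    return s[idx]

for nines in (99.0, 99.9, 99.99, 99.999):
    print(f"{nines}% of IOs complete under {percentile(latencies, nines):.1f} us")
```

Each added "9" zooms further into the tail, which is why outliers that barely move the average can still dominate these plots.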
Well, there you have it. The cost/GB will naturally be higher for these new XPoint parts, but the expected performance improvements should make it well worth the additional cost for those who need blistering fast yet persistent storage.
Subject: Memory | February 3, 2017 - 08:42 PM | Tim Verry
Tagged: XPoint, server, Optane, Intel Optane, Intel, big data
Last week Hexus reported that Intel has begun shipping Optane memory modules to its partners for testing. This year should see the launch of both these enterprise products designed for servers and the tiny application accelerator M.2 solid state drives, all based on the Intel and Micron joint 3D memory venture. The modules that Intel is now shipping are the former type: Optane memory that can replace DDR4 DIMMs (RAM) with a solution that is not as fast but is cheaper and offers much larger capacities. The Optane modules are designed to slot into DDR4-type memory slots on server boards. The benefit of such a product lies in big data and scientific workloads, where massive datasets can be held in primary memory and the processor(s) can access them at much lower latencies than if they had to reach out to mass storage on spinning rust or even SAS or PCI-E solid state drives. Holding all the working data in one pool of memory will also be cheaper with Optane, as it is allegedly priced closer to NAND than RAM, and the cost of RAM adds up extremely quickly when you need many terabytes of it (or more!). Technologies attempting to bring higher capacity non-volatile and/or flash-based storage into memory module form have been theorized or in the works in various forms for years now, but it appears that Intel will be the first to roll out actual products.
It will likely be years before the technology trickles down to consumer desktops and notebooks, so slapping what would effectively be a cheap RAM disk into your PC is still a ways out. Consumers will get a small taste of Optane memory in the form of tiny storage drives that were rumored for a first quarter 2017 release, following the launch of Intel's Kaby Lake Z270 motherboards. Previous leaks suggest that the Intel Optane Memory 8000P will come in 16 GB and 32 GB capacities in an M.2 form factor. With a single 128 Gb (16 GB) die, Intel is able to hit speeds that current NAND flash based SSDs can only hit with multiple dies. Specifically, the 16GB Optane application accelerator drive is allegedly capable of 285,000 random 4K read IOPS, 70,000 random 4K write IOPS, sequential 128K reads of 1400 MB/s, and sequential 128K writes of 300 MB/s. The 32GB Optane drive is a bit faster at 300,000 read IOPS, 120,000 write IOPS, 1600 MB/s, and 500 MB/s, respectively.
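Those random figures are roughly consistent with the sequential ratings once converted to throughput; a quick sanity check assuming 4 KB IOs and decimal MB/s:

```python
# Convert the leaked random 4K IOPS figures into equivalent throughput.
def iops_to_mbs(iops, io_size_kb=4):
    """Throughput in MB/s (decimal) for a given IOPS rate and IO size."""
    return iops * io_size_kb / 1000.0

print(iops_to_mbs(285_000))  # 16GB module, 4K random read -> 1140.0 MB/s
print(iops_to_mbs(70_000))   # 16GB module, 4K random write -> 280.0 MB/s
```

1140 MB/s of random reads against a 1400 MB/s sequential rating means random access barely costs anything here, which is unheard of for NAND.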
Unfortunately, I do not have any numbers on how fast the Optane memory that will slot into the DDR4 slots will be, but seeing as two dies already max out the x2 PCI-E link used by the M.2 Optane SSD, a dual-sided memory module packed with rows of Optane dies on the significantly wider memory bus is very promising. It should lie somewhere closer to (but slower than) DDR4, while remaining much faster than NAND flash and still being non-volatile (it does not need constant power to retain data).
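For context on why two dies can saturate the link: PCIe 3.0 runs at 8 GT/s per lane with 128b/130b encoding, leaving roughly 985 MB/s of raw bandwidth per lane before protocol overhead. A quick sketch of the x2 budget:

```python
# PCIe 3.0 raw bandwidth budget (ignores packet/protocol overhead).
GT_PER_S = 8e9            # 8 gigatransfers/s per lane
ENCODING = 128 / 130      # 128b/130b encoding efficiency

lane_mbs = GT_PER_S * ENCODING / 8 / 1e6  # bits -> bytes -> MB/s (~984.6)
x2_link = 2 * lane_mbs
print(f"x2 link: ~{x2_link:.0f} MB/s")
```

With the link topping out just under 2 GB/s and the 32GB module already rated at 1600 MB/s, there is little headroom left, so a wide memory-bus interface is the obvious next step.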
I am interested to see what the final numbers are for Intel's Optane RAM and Optane storage drives. The company has certainly dialed down the hype for the technology as it approached fruition, though that may have more to do with what they are able to do right now versus what the 3D XPoint memory technology is potentially capable of enabling. I look forward to what it will enable in the HPC market and, eventually, what will be possible for the desktop and gaming markets.
What are your thoughts on Intel and Micron's 3D XPoint memory and Intel's Optane implementation (Micron's implementation is QuantX)?
- IDF 2016: Intel To Demo Optane XPoint, Announces Optane Testbed for Enterprise Customers
- Intel Optane (XPoint) First Gen Product Specifications Leaked
- Intel Z270 Express and H270 Express Chipsets Support Kaby Lake, More PCI-E 3.0 Lanes