IDF Shenzhen: Intel Demos 3D XPoint Optane File Copy at 2 GB/s

Subject: Storage | April 14, 2016 - 03:56 PM |
Tagged: Optane, NVMe, Intel, idf

At IDF Shenzhen, Intel talked more about 3D XPoint (pronounced 'cross-point'). First announced in July of last year, 3D XPoint is essentially a form of phase change memory with speeds closer to those of DRAM.

It can be addressed at the byte level, unlike flash, which transfers in pages (~8KB) and erases in blocks (~6MB). There have been a few demos since the initial announcement, and this morning there was another:

It is great to see XPoint / Optane technology being demonstrated again, but as far as demos go, this was not the best or fairest example that Intel could have put together. First of all, the 'NAND SSD' they are using is a Thunderbolt 3 connected external drive, which was clearly bottlenecked badly somewhere else in the chain (when was the last time you saw a 6 Gbit/s SATA SSD limited to only 283 MB/s?). Also, using SATA for the NAND example while using PCIe x4 NVMe for the Optane example seems a bit extreme to me.
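To put the 283 MB/s figure in context, here is a minimal Python sketch of the theoretical ceiling of a 6 Gbit/s SATA link. It assumes the standard 8b/10b line encoding used by SATA and ignores protocol overhead, so real drives land somewhat below the computed ceiling. The variable names are illustrative only.

```python
# Rough ceiling of a 6 Gbit/s SATA link versus the speed seen in the demo.
SATA_LINE_RATE_BPS = 6e9        # raw line rate in bits per second
ENCODING_EFFICIENCY = 8 / 10    # 8b/10b: 8 payload bits per 10 bits on the wire
OBSERVED_MB_S = 283             # NAND SSD speed shown in the demo

# Payload bits per second, converted to bytes, then to decimal MB/s.
sata_ceiling_mb_s = SATA_LINE_RATE_BPS * ENCODING_EFFICIENCY / 8 / 1e6
fraction_of_ceiling = OBSERVED_MB_S / sata_ceiling_mb_s

print(f"SATA ceiling: {sata_ceiling_mb_s:.0f} MB/s")
print(f"Demo drive reached {fraction_of_ceiling:.0%} of that ceiling")
```

Under these assumptions the NAND drive in the demo was running at less than half of what the SATA link alone allows, which supports the suspicion of a bottleneck elsewhere in the chain.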

The Optane side of the demo is seen running at 1.94 GB/s. That is an impressive figure for sure, but it is important to note that a faster Intel 'NAND SSD' product has already been shipping for over a year:

Yes, the P3700 (reviewed by us here) can reach the speeds seen in this demo, as evidenced by this ATTO run on one of our 1.6TB samples:

Looking at the P3700 specs, we can see that the 2TB model performs even better and would likely beat the Optane SSD used in today's demo:

Further, in the IDF 2015 demo (where they launched the Optane brand), Intel showed off Optane's random IO performance:

This demo showed 464,300 4K random IOPS, and if you do the math, that works out to 1.9 GB/s *worth of random IO*, which is far more impressive than sequential speeds that basically match those of the current generation NVMe product of the same form factor and interface.
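For anyone who wants to check that math, here is a minimal Python sketch. It assumes "4K" means 4096-byte transfers and reports decimal gigabytes (1 GB = 10^9 bytes). The function name is a hypothetical helper, not part of any tool.

```python
# Convert a random-IO rate into the equivalent data throughput.
IOPS = 464_300           # 4K random IOPS shown in the IDF 2015 demo
BLOCK_BYTES = 4 * 1024   # assumption: "4K" means 4096-byte transfers


def iops_to_gb_per_s(iops: int, block_bytes: int) -> float:
    """Return throughput in decimal GB/s (1 GB = 10**9 bytes)."""
    return iops * block_bytes / 1e9


random_gb_per_s = iops_to_gb_per_s(IOPS, BLOCK_BYTES)
print(f"{random_gb_per_s:.2f} GB/s of 4K random IO")  # prints "1.90 GB/s of 4K random IO"
```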

I'm still happy to see these demos happen, as it means we are absolutely going to see 3D XPoint in our hands sooner rather than later. That said, I'd also like to see demos that better demonstrate the strengths of the technology. If today's demo had compared apples to apples, it would have shown a P3700 matching the speed of Optane, which does not make the previously stated 1000x speed improvement nearly as obvious as it should be.


April 14, 2016 | 05:24 PM - Posted by vicky (not verified)

wait.....
so a cross point wire has to both be able to apply voltage to read below, while also receiving a signal from above?

how's that gonna work? it was single spin conductors, so you can't really both get voltage and apply voltage at the same time....

April 14, 2016 | 05:57 PM - Posted by Allyn Malventano

Sure you can. You can apply a low current to 'sense' the resistance of the junction, and later apply a higher 'program' current to cause the phase change to occur (by local heating, for example). The specific ramp / profile of the voltage/current applied to write is what determines the value stored in that bit for later reading.

April 15, 2016 | 08:28 AM - Posted by vicy (not verified)

yes, I understood that it can, but it now means that this is not very efficient: you can't write and read the cell above and below at the same time

which means it's not very stackable in this design.... because you'd need to wait for every call cycle to get 1 cell on a column

April 14, 2016 | 06:52 PM - Posted by Anonymous (not verified)

It is hard to tell whether matching the sequential speed of a high-end enterprise SSD is impressive without knowing more about the form factor of the device. The x-point device should be able to achieve such speeds with a much smaller number of channels. This may be like comparing the sequential speed of a single SSD to an array of hard disks. You can achieve similar speeds with enough disks, but it isn't comparable.

April 15, 2016 | 12:44 AM - Posted by Allyn Malventano

I'm fairly certain that the Optane device was connected via NVMe PCIe x4 (over TB3), which is the same link that a P3700 uses. Agreed on possibly fewer controller channels, but this is a detail that must be included when a demo shows results that are slower than they should be (to those who know what it is capable of).

April 15, 2016 | 03:01 AM - Posted by Anonymous (not verified)

On one hand I'm a bit disappointed XPoint isn't going to be much of a leap in storage performance.
On the other hand I'm glad Intel isn't set to acquire an effective monopoly on another class of computing component.

April 15, 2016 | 03:08 AM - Posted by JesperA (not verified)

This is most likely an engineering sample so we should not really have any final opinion about XPoint based on this test.

XPoint will most likely be a big leap in storage performance. This is not even gen 1 yet, and it has still proven to be really performant in some tests. Imagine gen 2, 3 and beyond; XPoint is a great leap for a gen 1 device built on a completely new storage architecture.

People seem to be too focused on performance. I, on the other hand, am really looking forward to XPoint due to its durability compared to flash.

April 15, 2016 | 10:34 AM - Posted by Anonymous (not verified)

It's not the first time a company really wanted to show they are taking huge leaps in speed when in actuality they are not. I'm saddened that they didn't talk much about the actual longevity benefits. I suppose that important aspect isn't sexy enough. The speeds CAN be impressive, but their setups were obviously skewed to anyone with a technical eye.
When are we really going to see this tech available, in what sizes, and for how much?
It's still a step ahead, even if it's not as big a difference as they would like to portray.

April 15, 2016 | 04:30 PM - Posted by MRFS (not verified)

As Allyn is superbly qualified to explain in detail,
there are several inter-dependent considerations e.g.
power consumption, capacity(s), longevity and reliability.

I'll take a "leap ahead" here and speculate that
Optane will put pressure on the DMI 3.0 link
to work at the 16 GT/s transfer rate in the PCIe 4.0 spec,
and hopefully widen the DMI link to x8 or x16 lanes.

As it is right now, a single NVMe SSD has the
same upstream bandwidth as the DMI 3.0 link.

Intel chipsets should accommodate a minimum
of four x4 NVMe SSDs in any of several modern
RAID modes or JBOD: 4 @ x4 = x16 edge connector.

Also, I would like to see competitors like HP
acquire leaders like Everspin, to give Intel
a run for their money. As things stand today,
Everspin's ST-MRAM is too expensive and densities
are still too low.

April 15, 2016 | 04:59 PM - Posted by Anonymous (not verified)

Totally agree...this demo was severely lacking in many ways. When it's all said and done, I expect it to need PCIe 3.0 x16 level bandwidth...muahaha

April 15, 2016 | 05:45 PM - Posted by MRFS (not verified)

Here's a short pcper.com article on PCIe 4.0:
http://www.pcper.com/news/General-Tech/PCIe-40-2x-Bandwidth-30-30-20-and...

See also:
https://pcisig.com/faq

p.s. Intel can afford to take the "long view"
and vector their R&D in anticipation of PCIe 4.0.

For myself, I'm very happy that serious engineering
attention is now being given to upstream bandwidth
for high-speed storage, after so many years of
progress with multiple GPUs e.g. SLI and Crossfire.

On the other hand, I'm very disappointed that
it took this long to receive serious attention.

The SATA clock rate should have been increased
to sync with PCIe 3.0 i.e. 8 GT/s minimum; and,
a SATA-IV spec should also allow for the
128b/130b "jumbo frame" now in the PCIe 3.0 standard.

As proof that the SATA standards group is lagging,
compare USB 3.1's 10G clock and 128b/132b jumbo frame.

If SAS can jump to 12G, so can SATA.

Even better, turn DIY / Enthusiasts like us loose --
by allowing "overclocking data storage subsystems":
http://www.supremelaw.org/patents/overclocking.storage.subsystems.versio...

If DRAM can support multiple JEDEC settings,
why not enable such options for high-speed storage too?

(I know: I must be dreaming -- again.)

Allyn, your comments / criticisms are ALWAYS welcome here!

April 15, 2016 | 08:33 PM - Posted by Anonymous (not verified)

Why claim 3D XPoint is a phase change memory when Intel never explicitly said so? The 3D PowerPoint marketing only says that the technology is more durable (than what?). From a pragmatic POV, I consider 3D XPoint technology to be only a NAND PCI-E SSD whose performance relies on the PCI-E bus rather than the NAND chips. Actually there is nothing new under the sun...

April 16, 2016 | 02:32 PM - Posted by JesperA (not verified)

XPoint has NO parts in common with NAND, maybe do some slight research before stating such absurd things. Even the main pic of this article gives you a hint that this has nothing to do with flash.

Your last sentence is especially wrong. There has been no mass production of this kind of device before. Companies have touted RRAM, memristor, and PCM kinds of devices for ages, but this is the first time a product of this kind will come out, even to consumers.

April 17, 2016 | 10:39 AM - Posted by Allyn Malventano

The speed issues at present are mainly due to controller tech not being where it needs to be. Also, XPoint is about as opposite from NAND as you can get really. My frustration with their demo stems from their poor choice of workload based on the current limitations. 

April 16, 2016 | 03:56 PM - Posted by Anonymous (not verified)

"Intel XPoint emperor has no clothes, only soiled diapers

Micron's deafening XPoint silence"

The Register article states:

"Latest XPoint stats

An examination of the reported Hady presentation and the IDF Shenzhen demo revealed these XPoint Optane gen 1 numbers:
• 20nm process
• SLC (1 bit/cell)
• 7 microsec latency, or 7,000 nanoseconds
• 78,500 (70:30 random) read/write IOPS
• NVMe interface

Well, at last, real numbers. So XPoint is 1,000 times faster than SSDs, with an Intel PC3700 PCIe flash card having a latency of 85 microseconds; yeah, right, prepare to be under-freaking-whelmed by XPoint's latency.

It is only 12 times faster than a modern Intel PCIe flash card, 16 times faster than a Micron NVMe 7100 or 9100 flash drive's read latency, and a mere six times faster than said drives' write latency.

The random IOPS numbers are a miserly five times faster than the PC3700's 70:30 random read/write 15,000 IOPS.

So here is a nuke detonated under the XPoint-is-1,000-times-faster claim, which is shown to overstate the speed difference 990-fold. This is a brass neck of giraffe proportions.

With such a devastating difference between reality and claim on the latency and IOPS fronts, we are really doubtful about the 1,000 times longer endurance claim as well.

Please stop this unrealistic marketing hype around XPoint. It's shiny, brown and creamy BS; everyone knows it, and your XPoint emperor has no clothes, only soiled diapers. Enough already."(1)

(1)

http://www.theregister.co.uk/2016/04/15/intel_xpoint_emperor_has_no_clot...
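The latency and IOPS ratios in the quoted passage can be reproduced with a few lines of Python. This is only a sketch that uses the figures exactly as quoted above (they are not independently measured), and the variable names are illustrative.

```python
# Figures as quoted from The Register; not independently verified.
XPOINT_LATENCY_US = 7      # quoted XPoint latency in microseconds
FLASH_LATENCY_US = 85      # quoted latency of the Intel PCIe flash card
XPOINT_IOPS = 78_500       # quoted 70:30 random read/write IOPS for XPoint
FLASH_IOPS = 15_000        # quoted 70:30 random read/write IOPS for the flash card

latency_ratio = FLASH_LATENCY_US / XPOINT_LATENCY_US   # lower latency is better
iops_ratio = XPOINT_IOPS / FLASH_IOPS                  # higher IOPS is better

print(f"Latency advantage: {latency_ratio:.1f}x")  # prints "Latency advantage: 12.1x"
print(f"IOPS advantage: {iops_ratio:.1f}x")        # prints "IOPS advantage: 5.2x"
```

Both results line up with the "12 times" and "five times" figures in the quote.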

April 17, 2016 | 01:26 AM - Posted by Anonymous (not verified)

Anonymous: The author of that Register article doesn't know what he's talking about.

That XPoint demonstration is on an NVMe interface, which is a limiter before it reaches even remotely close to that 1000x max performance.

Prepare to have your socks blown off when it's on a proper interface with the Purley platform, aka Skylake-EP. When XPoint technology goes on a DIMM interface, the performance will be such that the claims of being an eventual DIMM/Storage replacement will seem much closer to reality.

April 17, 2016 | 11:47 AM - Posted by Anonymous (not verified)

Why replace a fast, nearly immortal volatile memory (aka DDRx-SDRAM) with a far slower non-volatile memory (aka 3D XPoint) that has the life expectancy of a butterfly? I think you don't understand the purpose of volatile memory (e.g. register, buffer, cache, etc). Maybe Intel could use this built-in obsolescence technology to sell more processors...

April 17, 2016 | 12:22 PM - Posted by Anonymous (not verified)

It's more about Intel's marketing department's fibs related to XPoint, and The Register covers a lot more of the professional server/HPC market than most gaming websites. The professional websites will not be fooled, as they will ask the hard questions to get at the proper information. It's Intel's marketing department that needs to be careful, as Micron, the co-developer of XPoint, has made no claims that come close to what Intel's marketing has stated.

"Yet from Micron, listen as hard as we can, all we hear is a deadening silence. Why is this?

It's the way the Intel-Micron joint venture is structured. According to people not a million miles away from Intel and Micron XPoint activities, their XPoint JV is 51 per cent owned by Micron, and 49 per cent by Intel. Micron has the right to buy out Intel's share, but Intel doesn't have a reciprocal right. This asymmetry affects the marketing/PR side of the XPoint JV as well, with Intel allowed to do as it's doing and Micron effectively hobbled for some period of time." (1)

(1)
"Intel XPoint emperor has no clothes, only soiled diapers

Micron's deafening XPoint silence"

http://www.theregister.co.uk/2016/04/15/intel_xpoint_emperor_has_no_clot...

April 17, 2016 | 10:29 PM - Posted by MRFS (not verified)

Whether or not the current crop of NVMe cables
will need to be better shielded, just as IDE ribbon
cables needed a "null" conductor to dampen cross-talk
at ATA-133 speeds, the visible horizon is pointing to:

(a) a single x1 PCIe 4.0 lane running at 16 GT/s;
(b) a single x1 PCIe 4.0 lane using the 128b/130b frame;
(c) motherboards with 4+ U.2 ports + (a) + (b);
(d) multiple Intel DMI 4.0+ links with x8 or x16 lanes;
(e) all modern RAID modes with NVMe TRIM support;
(f) dynamic clock speeds a la SpeedStep for data channels;
(g) a "Format RAM" option in BIOS/UEFI subsystems;
(h) hybrid DIMM slots, some volatile / some non-volatile;
(i) JEDEC-compatible SO-DIMMs with NVRAM;
(j) other form factors like (i).

These are just a few features that come to mind.

April 17, 2016 | 11:18 PM - Posted by MRFS (not verified)

Here's another way of looking at the potential:
G.SKILL have recently announced DDR4-4000 SDRAM:
that's a raw clock rate of 2,000 MHz per conductor
(doubled, because DDR is double-data rate).
But, remember that SDRAM edge connectors operate
over a parallel bus transmitting 8 bytes per clock tick.
4000 @ 8 = 32 GB/sec. Compare that with DMI 3.0's
4 GB/sec. upstream bandwidth. Now, take note that
a full x16 edge connector at PCIe 4.0 is also 32 GB/sec.
If Intel can design add-in cards with very high density
Optane and x16 edge connectors running at 2 GB/sec. per lane,
they will have achieved 2-4+ TB of NON-Volatile DRAM
that runs at a speed very close to G.SKILL's DDR4-4000!
That prospect excites me no end: it means, quite simply,
mass storage that is non-volatile and just as fast as
late model DDR4 SDRAM. p.s. Is anybody planning to
attend this trade show: http://www.highpoint-tech.com/USA_new/nabshow2016.htm ? I'm literally dying to see
photos of the Highpoint "RocketStor 3830A (PCIe 3.0 x16 lane controller)".
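The bandwidth figures in the comment above can be roughly checked with a short Python sketch. The assumptions are labeled in the code: a DDR4 DIMM is taken to have an 8-byte data bus (as the comment states), PCIe 4.0 is taken as 16 GT/s per lane with 128b/130b encoding, and DMI 3.0 is treated as equivalent to a PCIe 3.0 x4 link at 8 GT/s. The function names are hypothetical helpers, and the numbers are peak link rates only, so they say nothing about latency.

```python
def ddr_gb_per_s(mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Peak DDR bandwidth in GB/s; assumes an 8-byte (64-bit) data bus."""
    return mega_transfers_per_s * 1e6 * bus_bytes / 1e9


def pcie_gb_per_s(giga_transfers_per_s: float, lanes: int) -> float:
    """Peak PCIe 3.0/4.0 bandwidth in GB/s, one direction.

    Assumes 128b/130b encoding: 128 payload bits per 130 bits on the wire.
    """
    payload_bits_per_s = giga_transfers_per_s * 1e9 * (128 / 130) * lanes
    return payload_bits_per_s / 8 / 1e9


ddr4_4000 = ddr_gb_per_s(4000)          # DDR4-4000 on one 64-bit channel
pcie4_x16 = pcie_gb_per_s(16, lanes=16)  # PCIe 4.0 x16 edge connector
dmi3 = pcie_gb_per_s(8, lanes=4)         # assumption: DMI 3.0 ~ PCIe 3.0 x4

print(f"DDR4-4000:    {ddr4_4000:.1f} GB/s")
print(f"PCIe 4.0 x16: {pcie4_x16:.1f} GB/s")
print(f"DMI 3.0:      {dmi3:.1f} GB/s")
```

Under these assumptions the x16 PCIe 4.0 link comes out slightly below the DDR4-4000 figure once encoding overhead is counted, and DMI 3.0 lands just under the 4 GB/sec. quoted in the comment.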

April 18, 2016 | 07:16 AM - Posted by Anonymous (not verified)

Let me guess, you're dreaming that non-volatile memory can be as fast as volatile memory... I can only confirm that you're dreaming, since the capacitor is the fastest memory technology ever made by humanity!

April 18, 2016 | 01:27 PM - Posted by Anonymous (not verified)

Seems great! I have read about this in this article too: http://techfox.us/intel-optane-ssd-tranfers-files-2-gb-per-second/

June 2, 2016 | 12:21 PM - Posted by MRFS (not verified)

My Comment today at:
http://www.tomshardware.com/news/intel-3d-xpoint-kaby-lake,31966.html

Hey, Intel, are you reading this? You just laid off thousands of employees, and now you want to lock up a proprietary architecture, hoping future users will be impressed enough and have money enough to switch to entirely new motherboards (again)? Aren't you the guys who invented PCI-Express, with its modularity and expandability? Why you haven't already started manufacturing SATA-, SAS- and NVMe-compatible 2.5" SSDs with Optane is TOTALLY BEYOND ME! It's no wonder that journalists e.g. Forbes/Business blame the layoffs on poor management. By now, you could have had millions of happy Optane users, hungry for more of the same in a variety of different hardware settings. But, instead, we have to see ridiculous re-runs of unrealistic projections of a THOUSAND TIMES FASTER. Heck, I had upstream bandwidth of 4.0 GB/s 5 YEARS ago with a PCIe 2.0 chipset and an x8 Highpoint RocketRAID 2720SGL == SAME AS the current upstream bandwidth of the Intel DMI 3.0 link. But, I guess you will refuse to read this, refuse to listen to WHAT WE WANT and stubbornly persist in telling us what we should be buying -- to keep your stockholders happy.