A little Optane for your HDD
Intel's Optane Memory caching solution, launched in April 2017, was a straightforward feature. On supported platforms, namely 7th and 8th generation Core processor-based systems, users could add a 16GB or 32GB Optane M.2 module to their PC and enable acceleration for a slower boot device (generally a hard drive). Beyond that, there were no additional options; the caching solution could only be enabled or disabled.
Users looking for more flexibility, however, were out of luck. If you already had a fast boot device, such as an NVMe SSD, you had no use for these Optane Memory modules, even if you had a slow hard drive in your system for mass storage that you wanted to speed up.
At GDC this year, alongside the announcement of 64GB Optane Memory modules, Intel announced that it is bringing support for secondary drive acceleration to the Optane Memory application.
Now that we've gotten our hands on this new 64GB module and the appropriate software, it's time to put it through its paces and see if it was worth the wait.
The full test setup is as follows:
| Test System Setup | |
|---|---|
| Processor | Intel Core i7-8700K |
| Motherboard | Gigabyte H370 Aorus Gaming 3 |
| Memory | 16GB Crucial DDR4-2666 (running at DDR4-2666) |
| Storage | Intel SSD Optane 800P, Intel Optane Memory 64GB and 1TB Western Digital Black |
| Graphics Card | NVIDIA GeForce GTX 1080 Ti 11GB |
| Graphics Drivers | NVIDIA 397.93 |
| Power Supply | Corsair RM1000x |
| Operating System | Windows 10 Pro x64 RS4 |
In coming up with test scenarios to properly evaluate drive caching on a secondary, mass storage device, we had a few criteria. First, we were looking for scenarios that require lots of storage, meaning workloads that wouldn't fit on a smaller SSD. Second, those applications must also genuinely benefit from fast storage.
NVMe RAID and StoreMI
With Ken testing all of the new AMD X470 goodness that we had floating around the office here at PCPer, I snuck in some quick storage testing to get a look at just how the new platform handled a typical power user NVMe RAID configuration. We will be testing a few different platform configurations:
- ASUS Z270 w/ 7700K
  - 1x SSD behind chipset (PCH)
  - 2x SSD (RAID-0) behind chipset (PCH)
  - 1x SSD directly connected to CPU
- AMD X470 w/ 2600X
  - 1x SSD via RAIDXpert bottom driver
  - 2x SSD (RAID-0) via RAIDXpert
  - 1x SSD via MS InBox NVMe driver
For the AMD system we tested, all M.2 ports were directly connected to the CPU. This should be the case for most systems, since the AMD chipset has only a PCIe 2.0 x4 uplink, which would cut most NVMe SSD bandwidth roughly in half if traffic passed through it. The difference on AMD is that installing the RAIDXpert software also installs a 'bottom driver' which replaces the Windows NVMe driver, while Intel's RST platform handles this process more in the chipset hardware (but is limited to the PCIe 3.0 x4 DMI link's bandwidth). Now onto the results:
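The bandwidth claim above is easy to verify with back-of-the-envelope math from the PCIe spec figures (PCIe 2.0 runs at 5 GT/s with 8b/10b encoding, PCIe 3.0 at 8 GT/s with 128b/130b). The 3.2 GB/s SSD figure below is a hypothetical value for a fast NVMe drive, not a measurement from this review:

```python
# Usable one-direction PCIe bandwidth, showing why routing an NVMe SSD
# through the AMD chipset's PCIe 2.0 x4 uplink would roughly halve throughput.

def pcie_bandwidth_gbs(gt_per_s, encoding_efficiency, lanes):
    """Usable bandwidth in GB/s for one direction of a PCIe link."""
    return gt_per_s * encoding_efficiency / 8 * lanes

gen2_x4 = pcie_bandwidth_gbs(5.0, 8 / 10, 4)     # 8b/10b encoding
gen3_x4 = pcie_bandwidth_gbs(8.0, 128 / 130, 4)  # 128b/130b encoding

print(f"PCIe 2.0 x4: {gen2_x4:.2f} GB/s")  # 2.00 GB/s
print(f"PCIe 3.0 x4: {gen3_x4:.2f} GB/s")  # 3.94 GB/s

ssd_peak = 3.2  # assumed sequential read speed of a fast NVMe SSD, GB/s
print(f"Fraction of SSD peak usable through Gen2 x4: {gen2_x4 / ssd_peak:.0%}")
```

A ~2 GB/s chipset link simply cannot carry a modern NVMe drive at full speed, which is why AMD's CPU-direct M.2 routing matters here.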
Random Read IOPS
For random IO, we see expected scaling from AMD, but note that IOPS comes in ~40% lower than the same configuration on Intel's platform. This is critical, as much of the IO seen in general use is random reads at lower queue depths. We'd like to see AMD do better here, especially in the single-SSD case running without the interference of the RAIDXpert driver, which fared better but still could not match Intel.
Random Read Latency
This latency chart should better explain the IOPS performance seen above. Note that latency increases across the board by ~10us on the X470 platform, followed by another ~20us when switching to the RAIDXpert driver. That combined ~30us is 50% of the 60us QD1 latency seen on the Z270 platform (regardless of configuration).
Sequential Throughput
OK, now we see the AMD platform stretch its legs a bit. Since Intel NVMe RAID is bottlenecked by its DMI link while AMD has all NVMe SSDs directly connected to the CPU, AMD is able to trounce Intel on sequentials, but there is a catch. Note the solid red line, which represents the configuration without the RAIDXpert software. That line tracks as it should, leveling off at the SSD's maximum throughput. Now look at the two dashed red lines and note how they fall off at ~QD8/16. It appears the RAIDXpert driver is interfering and limiting the ultimate throughput possible. This was even the case for a single SSD passing through the RAIDXpert bottom driver (configured as a JBOD volume).
AMD has also launched its answer to Intel RST caching. StoreMI is actually a more flexible solution that offers some unique advantages over Intel's approach. Instead of copying a section of HDD data to the SSD cache, StoreMI combines the total available storage space of both the HDD and SSD into a single tiered volume, and is able to seamlessly shuffle the more active data blocks onto the SSD. StoreMI also offers more cache capacity than Intel - up to 512GB SSD caches are possible (60GB limit on Intel). Lastly, the user can opt to donate 2GB of RAM as an additional caching layer.
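The distinction between tiering and caching is worth a quick illustration. The sketch below is a deliberately simplified model (not AMD's actual algorithm, and the block-heat bookkeeping is invented for illustration): in a tier, each block lives on exactly one device, so capacities add together, and hot blocks get migrated to the SSD rather than duplicated there:

```python
# Toy model of SSD/HDD tiering: combined capacity is SSD + HDD, and the
# hottest blocks are migrated (not copied) to the fast tier.

class TieredStore:
    def __init__(self, ssd_blocks, hdd_blocks):
        self.ssd_capacity = ssd_blocks
        self.total_capacity = ssd_blocks + hdd_blocks  # capacities add up
        self.location = {}  # block id -> "ssd" or "hdd"
        self.heat = {}      # block id -> access count

    def access(self, block):
        self.heat[block] = self.heat.get(block, 0) + 1
        self.location.setdefault(block, "hdd")  # new data lands on the HDD
        self._rebalance()
        return self.location[block]

    def _rebalance(self):
        # Promote the hottest blocks to the SSD; everything else stays on HDD.
        hottest = sorted(self.heat, key=self.heat.get, reverse=True)
        for i, block in enumerate(hottest):
            self.location[block] = "ssd" if i < self.ssd_capacity else "hdd"

store = TieredStore(ssd_blocks=1, hdd_blocks=6)
for _ in range(3):
    store.access("game")      # frequently used -> migrated to the SSD
store.access("cold-archive")  # touched once -> stays on the HDD
print(store.access("game"))   # "ssd"
```

A plain cache like Intel's RST, by contrast, keeps the HDD copy authoritative, so the SSD's capacity never adds to the usable total.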
AMD claims the typical speedups that one would expect from an SSD caching a much slower HDD. We have done some testing with StoreMI and can confirm the above slide's claims. Actively used applications and games end up running at close to SSD speeds (after the first execution, which comes from the HDD). StoreMI is not yet in a final state, but a final release is expected within the next week or two. We will revisit the topic with hard data once we have the shipping product on hand.