Deathwish RAID racing; hit single channel DDR4 transfer rates with WD Black NVMe drives

Subject: Storage | June 19, 2018 - 04:13 PM
Tagged: wd black nvme, RAID-0, raid, kingston, Hyper M.2 X16 Card, deathwish, ddr4-2400, asus

This will cost you a bit to set up but will provide you with almost unbelievable transfer rates. Simply combine eight 1 TB WD Black NVMe SSDs at roughly $400 a pop with a pair of ASUS' Hyper M.2 expansion cards at $60 each and build up a deathwish RAID of doom! TechARP just posted a look at how Andrew Vo managed to pull this off.

As pointed out by several readers who ... well, actually watched the video instead of just reading the article ... this was done on Threadripper, which makes far more sense than a PCIe lane-starved Intel system. Ignore me and make your Threadripper roar.

Unfortunately, this trick will not work the same on AMD platforms; it is limited to Intel Skylake or Coffee Lake with VROC support. It will be interesting to see how a properly configured Threadripper system would compare.


"To hit 19 GB/s, you need to create a RAID 0 array of those eight 1 TB WD Black NVMe SSDs, but you can’t use the motherboard’s RAID feature because you would be limited by the 32 Gbps/4GB/s DMI bottleneck."
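A quick back-of-the-envelope check of that quote (a sketch; the ~3.4 GB/s per-drive sequential-read figure is an assumed spec for the 1 TB WD Black NVMe, not taken from the article):

```python
# Rough arithmetic behind the quote above. The per-drive sequential read
# speed is an assumed spec (~3.4 GB/s), not a measured number.
drive_seq_read_gb_s = 3.4
num_drives = 8
dmi3_gb_s = 32 / 8        # DMI 3.0 is ~32 Gbps, i.e. ~4 GB/s

ideal_raid0 = drive_seq_read_gb_s * num_drives
print(f"Ideal RAID-0 aggregate: {ideal_raid0:.1f} GB/s")  # ~27 GB/s ceiling
print(f"DMI 3.0 ceiling:        {dmi3_gb_s:.1f} GB/s")
```

Even with generous overhead, eight drives exceed the chipset link by a factor of ~6, which is why the array has to hang off CPU-attached lanes rather than motherboard RAID.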


Source: TechARP

June 19, 2018 | 05:52 PM - Posted by Paul A. Mitchell (not verified)

Jeremy, In Andrew's video, doesn't he say he is using a Threadripper system for this demonstration?

June 19, 2018 | 05:54 PM - Posted by Paul A. Mitchell (not verified)

p.s. Here's that YouTube video:

https://www.youtube.com/watch?v=X0bO0V-POS4

Right at the beginning: "Here we have our AMD Threadripper system."

June 19, 2018 | 06:43 PM - Posted by Jeremy Hellstrom

Well smeg, serves me right for taking the doctor at his word that VROC was needed.  This makes much more sense now.

June 19, 2018 | 06:10 PM - Posted by saget (not verified)

I'm a little confused about the platform compatibility of this setup...

The guy in the video specifically states that the system used to create the RAID benchmark is an AMD Threadripper setup?

June 19, 2018 | 06:20 PM - Posted by Prodeous

This is definitely a Threadripper system, not Intel. Just look at the waterblock... pure TR4 cooler.

June 19, 2018 | 07:07 PM - Posted by Paul A. Mitchell (not verified)

It would help some of us if more details about that TR system were available. There are two relevant reasons that come to mind:

(1) another YouTube video by der8auer measures 2 of those ASUS AICs with 8 x Samsung 960 Pro NVMe SSDs; and,

(2) if you study his video very carefully, he tweaks a BIOS/UEFI setting in the ASUS Zenith Extreme which appears to enable interleaving of the 2 PCIe slots that host those ASUS AICs.

Here's that YouTube video:

https://www.youtube.com/watch?v=9CoAyjzJWfw&t=2s

Start at 8:00 on the counter, and see "memory interleaving".

I don't know for sure if that setting refers only to DRAM. He does mention that setting in the context of needing to overclock the CPU and DRAM, in order to reach the READ speed that he measured in that video.

June 19, 2018 | 08:02 PM - Posted by James

I would think memory interleaving refers to how DRAM is accessed. I know the BIOS has some settings for this.

June 19, 2018 | 08:22 PM - Posted by Paul A. Mitchell (not verified)

> memory interleaving would refer to how DRAM

Thanks! That makes more sense than interleaving PCIe slots.

June 19, 2018 | 07:31 PM - Posted by Paul A. Mitchell (not verified)

Here's a much more scientific comparison of apples-to-apples and apples-to-oranges, using the ASRock Ultra Quad M.2 card with different hardware and software combinations:

https://www.tweaktown.com/reviews/8542/asrock-ultra-quad-2-card-16-lane-...

On Page 10 of that review, AMD's RAIDXpert gets criticized severely, however:

"... proof positive of AMD's incompetence when it comes to RAID of any flavor."

That result was really unfortunate, particularly for any Prosumers who want to do a fresh OS install to 4 x NVMe SSDs hosted by such "4x4" AICs.

June 21, 2018 | 07:22 AM - Posted by psuedonymous

Sadly not much of a surprise; in PCPer's own testing (https://www.pcper.com/reviews/Storage/Quick-Look-AMD-Ryzen-X470-NVMe-Sto...) Intel's RAID over the PCH link outperformed AMD's RAID on CPU in all metrics other than raw sequential throughput.

June 19, 2018 | 08:30 PM - Posted by Paul A. Mitchell (not verified)

Here's another idea for Intel users:

a 4x4 AIC can be installed in the first PCIe 3.0 slot, and an x8 video card can be installed in one of the other PCIe 3.0 slots.

I believe Ken did this same thing recently; I remember him discussing it in a recent podcast.

Here's an AMD workstation video card with an x16 edge connector that is actually populated with only x8 electrical contacts (see photo here):

https://www.newegg.com/Product/Product.aspx?Item=N82E16814105071&Tpk=N82...

Threadripper systems support a lot more PCIe lanes, so the above setup may not be necessary.

June 21, 2018 | 07:25 AM - Posted by psuedonymous

On Intel consumer ('LGA115x') platforms, you can do x8/x4/x4 bifurcation, but you cannot do x8/x2/x2/x2/x2 or even x4/x4/x4/x4 bifurcation. You can do 2 NVMe drives + a GPU (as Ken tested: https://www.pcper.com/reviews/Storage/Intel-CPU-attached-NVMe-RAID-Uncov...) but not 4 drives and a GPU. 3 drives and no GPU may be possible depending on lane assignments (one drive would need to take the x8 assignment, with the next drive offset by 4 lanes).
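A small sketch of the lane arithmetic in that comment (the lists of supported and unsupported splits are taken from the comment itself, not from an official Intel reference):

```python
# Hypothetical sketch of LGA115x PEG-lane bifurcation as described above.
LANE_BUDGET = 16  # CPU PEG lanes on Intel consumer platforms

supported   = [(16,), (8, 8), (8, 4, 4)]         # modes the root port offers
unsupported = [(4, 4, 4, 4), (8, 2, 2, 2, 2)]    # what a 4x4 AIC (+ GPU) needs

def fits_budget(split, budget=LANE_BUDGET):
    """Arithmetic validity only: the split must not exceed the lane budget."""
    return sum(split) <= budget

# Every split here fits the 16-lane budget; the real limitation is which
# bifurcation modes the root port supports, not the raw lane count.
print(all(fits_budget(s) for s in supported + unsupported))  # True
```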

June 19, 2018 | 10:28 PM - Posted by James

You need a relatively high-end platform for this because it needs 2 full x16 slots; an x16 slot operating at x8 or x4 will not work. So if you want a video card, you probably need a platform that supports 3 full x16 slots plus bifurcation to 4 x4. You might be able to put the video card in an x8 slot. The Intel comment may be about not being able to do it with ROC; it may require VROC to expand to that many devices? Also, systems without VROC probably don't have enough PCI Express lanes anyway.

It does sound like AMD's drivers need a bit more work, but the ROC (RAID on CPU) features are very new things. I would expect a few hiccups. PCPer just recently tried to test Intel NVMe RAID on Z370 chipsets. That had a few issues also. I believe they ended up removing the GPU and using the GPU x16 slot because of some PCI Express issue. I think NVMe RAID on Intel parts probably got pushed up for an earlier release, like a bunch of other Intel projects. I don't think they expected to have to sell a 6-core mainstream part this soon, so their entire product line-up, including chipsets, got disrupted by the AMD Ryzen release. At least AMD competition is pushing Intel to release this stuff to the mainstream market. Intel actively does not want to give that kind of bandwidth to consumer-level customers. They would much rather keep it behind a slow chipset, such that you can't have fast Ethernet and fast storage, or fast whatever, without upgrading to a higher-end Xeon-level platform.

June 19, 2018 | 11:25 PM - Posted by Paul A. Mitchell (not verified)

Good points. On the other hand, as competition and mass production result in reducing the retail prices of NVMe M.2 SSDs, the few remaining rationales for limiting these "4x4" AICs to the high end should evaporate.

We have several workstations that host Windows on a RAID-0 of 4 x 6G SATA SSDs: the primary NTFS partition is ~50GB, and the remainder is formatted for data storage.

This setup continues to work very well for our everyday computing needs.

If AMD would simply fix existing problems with their RAIDXpert software, I would seriously consider a TR system with the ASRock X399M and the ASRock Ultra Quad M.2 AIC.

When we asked ASRock's tech support for help, they responded the same day with instructions for doing a fresh Windows install on that AIC in an ASRock X399M motherboard.

We archived the relevant files here on the Internet:
http://supremelaw.org/systems/asrock/X399/

June 20, 2018 | 01:28 PM - Posted by msroadkill612

There is a neat solution for you then: simply use the 3 native NVMe ports on most TR mobos as a triple RAID 0.

You have an interesting rig there. A take-home for me is that it is a testament to the reliability of SSDs that such a perceived "risky" configuration has held up.

IMO, the risks of RAID 0 with SSDs vs. HDDs are not in the same league at all. HDDs have high failure rates, and RAID 0 multiplies the risk, so it has a bad name.

What you do seems a good solution for a fast, cheap, and capacious workspace (you called it storage).

Do similar with NVMe, and it will scream vs. SATA.

If you do the rough math on the 970 EVO, e.g., there isn't much difference in price between one 1 TB drive and 4 x 250 GB NVMe drives, so RAID is only a small premium.

June 20, 2018 | 07:25 PM - Posted by Paul A. Mitchell (not verified)

Thanks for your excellent advice. FYI: I've been studying and using RAID-0 arrays for several years now, and my own empirical experience has not encountered the failure rate that many other forum users predicted. And, despite the vocal opposition I encountered to this idea, a RAID-0 array of 4 x 6G SSDs was an obvious way to do my own brand of "wear leveling", while at the same time experiencing much faster sequential speeds. Also, we have come to love RamDisk Plus from superspeed.com: that software has an option to SAVE and RESTORE the ramdisk contents at SHUTDOWN and STARTUP, respectively. As our ramdisk has grown in size, having fast non-volatile storage really helps to accelerate those routine SAVEs and RESTOREs of our ramdisk's contents. A TR system with 32GB of DDR4 will nicely complement a RAID-0 array of 3x or 4x NVMe SSDs, e.g. Samsung 970 EVO. I really like the engineering elegance that obtains from 4 x NVMe M.2 SSDs @ x4 PCIe 3.0 lanes = one x16 edge connector: 4x4=16 :)

June 21, 2018 | 12:35 AM - Posted by msroadkill612

Cool. I don't presume to tell you your business; just sharing snippets I have gleaned in the hope it helps. I am time rich, not smart :)

Your new NVMe array is ~5-7 times faster or more, so maybe you don't need the ramdisk?

You are bang in the middle of a new paradigm that seems remarkably unremarked upon.

The once huge distinction between memory and storage has been blurred.

Prev-gen desktops were 150 MB/s HDD vs. 25,000 MB/s dual-channel DDR3 (166x slower).

A typical AM4 with a 970 EVO is 3500/2400 MB/s read/write vs. ~35,000 MB/s DDR4-3200 RAM: 10x slower, or 2.5x slower than memory for a 4x RAID-0 (which is well able to saturate an x8 PCIe slot).
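Re-deriving those ratios (the bandwidth figures are rounded, assumed vendor specs):

```python
# Checking the storage-vs-memory ratios quoted above (assumed round numbers).
hdd_mb_s       = 150      # prev-gen hard drive, sequential
ddr3_dual_mb_s = 25_000   # dual-channel DDR3
nvme_read_mb_s = 3_500    # Samsung 970 EVO sequential read
ddr4_mb_s      = 35_000   # rough dual-channel DDR4-3200 figure

print(round(ddr3_dual_mb_s / hdd_mb_s))        # ~167x: DDR3 vs one HDD
print(ddr4_mb_s / nvme_read_mb_s)              # 10.0x: DDR4 vs one NVMe drive
print(ddr4_mb_s / (4 * nvme_read_mb_s))        # 2.5x:  DDR4 vs a 4x RAID-0
```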

"~Unlimited memory" seems an exciting resource for future apps. It's not memory, but cleverly used it can free up memory without being noticed too much.

It's irrelevant to those who have plenty of RAM, but for offices with 32GB who need to upgrade for an app which occasionally demands more, some pseudo-virtual RAM could save cost and hassle.

I too suspect an inner harmony to pairs, and 4x especially. I ~neurotically don't quite trust odd numbers not to be a pest somehow (as in the suggested 3x NVMe TR native RAID; but even a 2x native NVMe array would still be 3-4x faster than what you have currently).

Yep, the Samsung 970 EVO 250GB @ ~$120 is a no-brainer for you.

The 970 EVO is head and shoulders above other brands in so many metrics, and at only a small premium.

ASRock also has 4x NVMe cards, and they seem better engineered than the ASUS.

Note you have 48 lanes free on TR after the onboard NVMe ports etc., but you may only have two full 16-lane (~16 GB/s bandwidth) cards on TR.
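A quick check of that 48-lane figure, under the usual assumptions for Threadripper (64 CPU lanes, 4 reserved for the chipset link, three onboard x4 M.2 ports):

```python
# Threadripper lane budget, per the assumptions stated above.
total_lanes  = 64
chipset_link = 4                            # lanes reserved for the X399 chipset
usable       = total_lanes - chipset_link   # 60 lanes for slots and devices
onboard_nvme = 3 * 4                        # three native x4 M.2 ports

free_for_slots = usable - onboard_nvme
print(free_for_slots)  # 48 -- matches the figure quoted above
```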

A GPU and a 16-lane 4x NVMe card means, e.g., you can't have a 2nd full-strength GPU.

Using the 3 native NVMe ports as an array would still allow two full GPU cards.

I was a bit shocked to hear recently that adding the RAID drivers to make a small NVMe array on X370 AM4 results in a 40% latency hit for the NVMe, which RAID 0 makes up for in bandwidth, but still ...

It may still be handy to keep a single native NVMe drive available as a very low-latency workspace (another $120 - meh), to minimise interruptions to the array's main tasks. Even SATA SSDs have impressive IOPS specs and may take heat off the array in some apps.

RAID 0 does not improve IOPS, only throughput.

A weird suggestion: if you have trust issues (and who could blame you), there is always the safe all-rounder of RAID 10.

It gives you ~2x NVMe write speed, 2x (and potentially 4x) read speed, full mirroring redundancy, and very fast recovery (you can even keep using the array) should a drive fail.
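An idealized model of those RAID 10 numbers (per-drive speeds are assumed 970 EVO-class specs; real arrays lose some of this to controller and driver overhead):

```python
# Idealized 4-drive RAID 10 throughput: stripes over mirrored pairs.
n_drives  = 4
seq_read  = 3.5   # GB/s per drive (assumed spec)
seq_write = 2.4   # GB/s per drive (assumed spec)

pairs = n_drives // 2                  # RAID 10 = RAID 0 across mirrored pairs
write_speed    = pairs * seq_write     # writes hit both mirrors: ~2x one drive
read_speed_min = pairs * seq_read      # reading one side of each mirror: ~2x
read_speed_max = n_drives * seq_read   # reads balanced across mirrors: ~4x

print(write_speed, read_speed_min, read_speed_max)  # 4.8 7.0 14.0
```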

Once your drives are bedded in to your satisfaction, switch to a 4x RAID 0 like you have now.

cheers

June 21, 2018 | 03:01 AM - Posted by msroadkill612

fyi:

> If anyone is interested, ASRock replied to our query with simple instructions for doing a fresh install of Windows 10 to an ASRock Ultra Quad M.2 card installed in an AMD X399 motherboard. We uploaded that .pdf file to the Internet here:
> http://supremelaw.org/systems/asrock/X399/

June 21, 2018 | 11:47 AM - Posted by Paul A. Mitchell (not verified)

If money were not a major issue, then I could easily see some major advantages to be had from the redundancy provided by the following:

* install OS on RAID-0 hosted by one ASRock Ultra Quad M.2 card (ideally 4 x NVMe M.2 SSDs)

* "migrate OS" to a second RAID-0 hosted by a second ASRock Ultra Quad M.2 card

This might be a little overkill; nevertheless, this setup would provide both speed AND redundancy.

For research purposes, one of those AICs could be populated with Samsung M.2 SSDs, and the other could be populated with Intel Optane M.2 SSDs.

There is an enterprise M.2 22110 Optane in the works which uses an x4 PCIe interface.

And, one of the extra features of the ASRock AIC is the auxiliary 12V input power connector; so, power consumption should not be a problem with that AIC:

https://www.asrock.com/mb/spec/product.asp?Model=ULTRA%20QUAD%20M.2%20CARD

On this same theme, I've also suggested that ASUS offer a future DIMM.2 AIC with room for 4 x NVMe M.2 SSDs. Their current DIMM.2 card has room for only 2 x M.2 at present. See photos of the ASUS Zenith Extreme: the DIMM.2 slot is directly adjacent to the DIMM bank to the right of the CPU socket.

June 20, 2018 | 07:56 PM - Posted by Paul A. Mitchell (not verified)

Along similar lines, here Chris Ramseyer reviews the Highpoint SSD7120 with 4 x Optane 900P 2.5" SSDs:

https://www.tomshardware.com/reviews/highpoint-ssd7120-raid,5509.html

Although there are obvious extra costs for the cabling and backplane enclosure, the cooling provided by that enclosure should prevent M.2 throttling.

Highpoint keeps telling us that their SSD7120 will be bootable at some future date, but they do not give us any dates when that feature will be available with that AIC.

Their SSD7110 is bootable, and several months back another forum user found that the driver for that AIC also made the Highpoint SSD7101 bootable. Somehow, Highpoint found out about this, and removed that feature from the SSD7101's device driver (afaik).

June 21, 2018 | 10:23 AM - Posted by Paul A. Mitchell (not verified)

> It may still be handy to keep a single native NVMe drive available as a very low-latency workspace

Yes! In fact, after much trial-and-error, over several years, I've come to settle on this general approach:

(a) fresh install Windows to a JBOD drive wired to a native SATA port on the motherboard; this OS will become a backup after the following steps are done ...

(b) install all extra software to that C: partition;

(c) after (b) is finished to full satisfaction, then "migrate OS" to a RAID-0 array, using Partition Wizard freeware.

The only current problem with (c) is that it requires the target drive to be unformatted; as such, it cannot clone a drive image of a 50GB C: partition "in place".

So, I do "migrate OS" to completion; then, I run Partition Wizard again to shrink the C: drive that "migrate OS" created, and then format the remainder as a data partition e.g. drive letter E: .

The big advantage of (a) + (c) is that a second OS is available to boot with a quick change to "Boot Drive" in the BIOS.

I've used this option with success, because Symantec GHOST runs much faster as a Windows application, as compared to running that recovery task from an optical disc.

Same is true of Acronis True Image Western Digital Edition.

And, whenever C: is corrupted with malware etc., it's a relatively quick and easy task to restore a working C: to the RAID-0 array hosting the OS by default.

And, one of the reasons why I prefer the ramdisk to the RAID-0 partition, is the reduction in wear that results from doing routine database work on the ramdisk.

For example, with over 130,000 discrete files which mirror our website, navigating that file system and doing routine searching both occur MUCH faster e.g. in Command Prompt:

attrib usps.tracking*.htm /s

executes extremely quickly on the ramdisk, but much slower on all other non-ramdisk partitions.

Similarly, I always move the Firefox cache to the ramdisk, because I use Firefox so much on a daily basis.

Hope this helps: many thanks for your valuable additions!

June 21, 2018 | 10:40 AM - Posted by Paul A. Mitchell (not verified)

p.s. With RamDisk Plus installed, our primary database (the website mirror) should be stored in four redundant places:

(1) ramdisk
(2) non-volatile RAID-0 array
(3) copy of ramdisk written at SHUTDOWN
(4) copies of database on backup storage server(s)

And, again using Command Prompt, it's very quick and easy to confirm that (1) and (2) above are identical after a routine STARTUP:

xcopy R:\folder E:\folder /s/e/v/d/l
xcopy E:\folder R:\folder /s/e/v/d/l

Where,

R: = ramdisk drive letter
E: = RAID-0 drive letter
