A paper titled “The Bleak Future of NAND Flash Memory” was recently published jointly by the University of California and Microsoft Research. It has been picked up by many media outlets, all of which seem to be beating the same morbid drum, spinning tales of an apocalyptic end to the reign of flash-based storage devices. While I agree with some of what the authors have to say, I have reservations about the methods upon which the paper is based.
TLC and beyond?
The paper kicks off by declaring steep increases in latency and drops in lifetime associated with increases in bits-per-cell. While this is true, flash memory manufacturers are not making large pushes to increase bits-per-cell beyond standard MLC (2 bits per cell) tech. Sure, some have dabbled in 3-bit MLC, also called Triple Level Cell (TLC), which is a bit of a misnomer, since storing three bits in a cell actually requires eight voltage level bands, not three as the name implies. Moving from SLC to MLC doubles density, but the returns diminish sharply after that: MLC to TLC increases capacity by only another 1.5x, yet brings a 2-4x reduction in performance and endurance. In light of this, there is little demand for TLC flash, and where there is demand, the use cases make it clear that it is not meant for anything beyond light usage. There's nothing wrong with the paper going down this road, but the reality is that increasing bits per cell is not the envelope being pushed by the flash memory industry.
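The bits-per-cell arithmetic above is easy to check with a quick sketch. The function names here are my own, chosen for illustration; the only facts used are that n bits need 2^n distinguishable voltage bands, and that capacity scales with the bit count.

```python
def voltage_levels(bits_per_cell: int) -> int:
    """Number of distinct voltage bands needed to store this many bits in one cell."""
    return 2 ** bits_per_cell


def capacity_gain(old_bits: int, new_bits: int) -> float:
    """Capacity multiplier from storing new_bits instead of old_bits per cell."""
    return new_bits / old_bits


if __name__ == "__main__":
    cell_types = [("SLC", 1), ("MLC", 2), ("TLC", 3)]
    previous_bits = None
    for name, bits in cell_types:
        line = f"{name}: {bits} bit(s) per cell, {voltage_levels(bits)} voltage bands"
        if previous_bits is not None:
            # Gain relative to the previous cell type in the list.
            line += f", {capacity_gain(previous_bits, bits)}x the capacity of the previous type"
        print(line)
        previous_bits = bits
```

Each added bit doubles the number of bands that must fit into the same voltage window, while the capacity payoff shrinks from 2x to 1.5x. That is the diminishing return in a nutshell.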
Wait a second – where is 25nm MLC?
Looking at the above, we see a glaring omission: 25nm MLC flash, which has been around for close to two years now and constitutes the majority of flash memory parts currently in production. SLC was also omitted, but I can see the reason for this, as it’s hard to get your hands on 25nm SLC these days. Why? Because MLC technology has improved to the point where ‘enterprise MLC’ (eMLC) is rapidly replacing SLC, despite its supposedly lower reliability and endurance. The reasons for this are simple, and are completely sidestepped or otherwise overlooked by the paper:
- SSD controllers employ write combination and wear leveling techniques.
- Some controllers even compress data on the fly to further reduce writes and provisioning.
- Controller-level Error Correction (ECC) has improved dramatically with each process shrink.
- SSD controllers can be programmed to compensate for the drift of data stored in a cell (eMLC).
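To give a feel for why the first of these points matters so much, here is a toy model of wear leveling. It is a deliberately simplified sketch and not how any real controller works: the function name and the "always pick the least-erased block" policy are my own assumptions, and it ignores logical-to-physical mapping, garbage collection and static data.

```python
def max_erase_count(num_blocks: int, writes: int, wear_leveling: bool) -> int:
    """Return the erase count of the most-worn block after a number of block rewrites.

    Hypothetical model: every write rewrites one whole block. Without wear
    leveling, the same physical block (block 0) absorbs every write. With wear
    leveling, each write goes to the block that has been erased the fewest times.
    """
    erase_counts = [0] * num_blocks
    for _ in range(writes):
        if wear_leveling:
            # Pick the first block with the lowest erase count so far.
            target = erase_counts.index(min(erase_counts))
        else:
            target = 0
        erase_counts[target] += 1
    return max(erase_counts)


if __name__ == "__main__":
    blocks, writes = 100, 10_000
    print("Worst block without wear leveling:", max_erase_count(blocks, writes, False))
    print("Worst block with wear leveling:   ", max_erase_count(blocks, writes, True))
```

In this toy model, hammering one logical location wears a single block 10,000 times without wear leveling, but only 100 times per block once writes are spread across all 100 blocks. Raw per-cell endurance figures say little about drive lifetime unless the controller is taken into account.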