Long-term performance analysis of Intel Mainstream SSDs
Background, Leveling, and Combining
A bit of background is in order before we go down this particular rabbit hole, especially if you are one of the many eyeballing solid state as your next hardware purchase. We begin with just how Intel has been laying the smack down on the competition in the SSD market. The X25-M employs a custom-built 10-channel flash memory controller that can take advantage of Native Command Queuing (NCQ), enabling it to slaughter any and all competition in the area of small file access (a notorious weakness of flash-based storage). NCQ was originally developed to optimize hard disk access patterns: the host hands the drive a list of what is needed and leaves it to the drive itself to figure out the optimal sequence so as to minimize the total seek time (like sorting your shopping list to prevent unnecessary trips across the store).
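To make the shopping list analogy concrete, here is a minimal Python sketch of the hard disk case. It is not how any real drive firmware works: the track numbers, the starting head position, and the greedy "nearest request first" policy are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def total_seek(start, order):
    """Sum of head movement (in tracks) when servicing `order` from `start`."""
    position, travelled = start, 0
    for track in order:
        travelled += abs(track - position)
        position = track
    return travelled


def reorder_nearest_first(start, queue):
    """Greedy reordering: always service the pending request closest to the head."""
    pending, position, order = list(queue), start, []
    while pending:
        nearest = min(pending, key=lambda t: abs(t - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


# Hypothetical queue of requests, in arrival order, expressed as track numbers.
queue = [90, 10, 80, 20, 70]
head = 50  # assumed starting head position

fifo_cost = total_seek(head, queue)             # service in arrival order
ncq_order = reorder_nearest_first(head, queue)  # let the "drive" pick the order
ncq_cost = total_seek(head, ncq_order)

print(fifo_cost, ncq_order, ncq_cost)  # 300 [70, 80, 90, 20, 10] 120
```

With this made-up queue, servicing requests in arrival order drags the head across 300 tracks, while the reordered list needs only 120.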
Applied to the Intel controller, NCQ tells it what lies ahead, enabling it to prefetch data from inactive flash chips. While other manufacturers also use multichannel controllers in their drives, their logic is too immature to handle NCQ, forcing their access patterns to appear more serial than parallel. This custom Intel logic, when combined with NCQ, brings the effective access time of the X25-M down to a blistering 0.085ms, arguably one of the largest contributors to its fame.
Intel’s custom 10-channel logic (4 channels pictured).
Quick Wear Leveling Primer
Most operating systems tend to focus activity within small areas of the file system. While hard disks can tolerate repeated focused writes, this type of activity quickly ‘burns out’ flash memory. A typical MLC flash cell will last about 10,000 erasures. SLC fares better, but only by a factor of 10, meaning it will also eventually call it quits. With flash-aware file systems still too far off on the horizon, manufacturers worked to solve the problem by adding special firmware and controllers that distribute the wear evenly across the available flash. The actual technique varies by manufacturer – some work in ‘zones’ while others write in a continuous loop, from start to end, before starting over again. With such schemes in place, repeated writes to the same location by the OS are rotated around to different flash cells. The controller keeps track of the relocated data by means of an on-board lookup table. An effective wear leveling routine combined with a lookup table means that if your OS reads from sector 0, that data will not necessarily be at the physical start of the flash.
Rudimentary regionalized wear leveling (left) compared to the X25-M (right).
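The lookup table idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any manufacturer's actual scheme: the class name `WearLeveledFlash`, the four-block device, and the "pick the least-worn free block" policy are all hypothetical, and a real controller would also have to deal with bad blocks, power loss, and garbage collection.

```python
class WearLeveledFlash:
    """Toy flash device: logical blocks are remapped to the least-worn free block."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks  # wear per physical block
        self.data = [None] * num_blocks       # contents per physical block
        self.lookup = {}                      # logical block -> physical block
        self.free = set(range(num_blocks))    # erased blocks ready for writing

    def write(self, logical, payload):
        # Choose the free physical block with the fewest erasures (ties: lowest index).
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        self.data[target] = payload
        old = self.lookup.get(logical)
        self.lookup[logical] = target
        if old is not None:
            # The previous copy is stale: erase it and return it to the free pool.
            self.data[old] = None
            self.erase_counts[old] += 1
            self.free.add(old)

    def read(self, logical):
        # The OS asks for a logical block; the table says where it really lives.
        return self.data[self.lookup[logical]]


flash = WearLeveledFlash(num_blocks=4)
for i in range(8):
    flash.write(0, f"v{i}")  # the OS hammers the same logical block

print(flash.read(0))        # v7
print(flash.lookup[0])      # 3
print(flash.erase_counts)   # [2, 2, 2, 1]
```

Eight writes to logical block 0 end up spread across all four physical blocks, so no single block absorbs all seven erasures, and the latest data sits in physical block 3 rather than at the physical start of the flash.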
The Key to MLC Small Write Performance: Write Combining
While most modern flash controllers employ the wear leveling methods noted above, the X25-M takes it to the next level. By expanding its remap table beyond individual flash blocks, it can take groups of small files (smaller than the flash block size) and combine them on the fly, merging them into single flash blocks. Using this technique, small file writes get much closer to the ‘sequential write’ speed of the flash. This is a rather ingenious way to get the best possible performance out of writes that would usually cause an MLC SSD to choke.
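A rough Python sketch of the idea follows. Intel has not published the X25-M's internals, so everything here is an assumption made for illustration: the `WriteCombiner` class, the page-granular remap table, and the figure of 128 pages per block (a 512KB block divided into 4KB pages) are hypothetical, and stale pages are simply left behind rather than garbage collected.

```python
PAGES_PER_BLOCK = 128  # assumed: 512KB block / 4KB page


class WriteCombiner:
    """Toy model: scattered small writes are appended into the currently open block."""

    def __init__(self):
        self.blocks = [[]]  # each block is a list of programmed pages
        self.remap = {}     # logical page address -> (block index, page index)

    def write(self, logical_page, payload):
        if len(self.blocks[-1]) == PAGES_PER_BLOCK:
            self.blocks.append([])  # open a fresh, already erased block
        block = len(self.blocks) - 1
        self.blocks[block].append(payload)
        # Point the logical address at the newest copy; any older copy becomes stale.
        self.remap[logical_page] = (block, len(self.blocks[block]) - 1)

    def read(self, logical_page):
        block, page = self.remap[logical_page]
        return self.blocks[block][page]


combiner = WriteCombiner()
# Four small writes to widely scattered logical addresses (one address repeated).
for address, payload in [(7000, "a"), (12, "b"), (530000, "c"), (12, "d")]:
    combiner.write(address, payload)

print(combiner.remap)     # {7000: (0, 0), 12: (0, 3), 530000: (0, 2)}
print(combiner.read(12))  # d
```

From the flash's point of view, the four random writes become four consecutive page programs in one block, which is what lets small writes approach sequential speed in this model.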
Write combining also significantly reduces the effects of what Intel calls “Write Amplification”. While MLC flash can be written in small pages (~4KB), it can only be erased in blocks (~512KB). Think of each flash block as a CD-RW – you can write a little at a time to it, but when it is full you must erase the whole disc and start over. If you were to write only 1KB to a previously occupied flash block, that entire block of flash (512KB) would have to be read into memory, erased, and rewritten, just to get your 1KB of data saved to it. This inefficiency is why most MLC flash drives have such a hard time with small writes, and it is also why Intel’s Write Combining controller gives it such a clear advantage over the competition.
Write Combining in action to reduce Write Amplification. Being close to 1 is a good thing.
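The arithmetic behind those numbers is simple enough to work through in Python. Write amplification is taken here as data physically written to flash divided by data the host asked to write. The 4KB page and 512KB block sizes are the approximate figures quoted above, and the "combined" case is an idealized best case that ignores the later cleanup of stale pages, so a real drive would land somewhat above 1.

```python
BLOCK_KB = 512  # approximate erase block size quoted in the text
PAGE_KB = 4     # approximate page size quoted in the text


def write_amplification(host_kb, flash_kb):
    """Flash actually written divided by what the host asked to write."""
    return flash_kb / host_kb


# Worst case from the text: a 1KB write forces a whole 512KB block rewrite.
naive_1kb = write_amplification(host_kb=1, flash_kb=BLOCK_KB)

# 128 scattered 4KB writes, each triggering its own full block rewrite.
writes = BLOCK_KB // PAGE_KB
naive_4kb = write_amplification(host_kb=writes * PAGE_KB,
                                flash_kb=writes * BLOCK_KB)

# Idealized write combining: the same 128 writes packed into a single block.
combined = write_amplification(host_kb=writes * PAGE_KB, flash_kb=BLOCK_KB)

print(naive_1kb, naive_4kb, combined)  # 512.0 128.0 1.0
```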