Some great analysis by Cullen Logan at Amazon.com appeared on LinkedIn over the weekend in response to my post “Are Enterprise SSDs a “bad” idea? Four tips and counter-tips for your consideration”:
“To put some raw data out there, consider a single 8Gb MLC NAND with 4096 blocks. If a naive approach were taken that performs an ERASE/REPROGRAM every minute, the typical 10,000 cycle limit would be reached in just 7 days (60 x 24 x 7 = 10,080 = good-bye block).
But as Steve points out, there are really smart people working on this stuff. A perfect wear-leveling scenario that evenly distributes ERASE/REPROGRAM cycles across all 4096 blocks would result in each block being written a maximum of 3 times in that 7-day period (10,080/4096 = ~2.46). Put another way, wear leveling can extend the life of the MLC NAND to over 77 years using this simplistic example.
(10,000 x 4096)/(60 x 24) = 28,444 days = ~77.87 years
Other topics that should be considered include whether or not the controller utilizes partial page writes (which increase read disturbances), background operations, and many other features defined in the JEDEC specs for eMMC. Although eMMC is not exactly the same as SSD, they share many things in common.
While some anecdotal comments in this thread suggest that power is not something to consider with SSDs, I tend to disagree. As bus frequencies increase, we must compensate by attempting to lower the voltage supplied, to help the overall power equation. In that equation voltage is squared, so any reduction in voltage is a big win. Lowering the voltage on HDDs is a larger hurdle due to mechanical parts, while wafer processing tends to lean in favor of SSDs when it comes to required voltage.
I could probably write for another hour about this, but my meta-point is that SSDs are in no way bad. As Steve mentioned, ECC selection will need to evolve or perhaps become application-dependent, which will result in extra bits per block, but will provide the right ECC for an application’s target BER.
My only complaint is the lack of transparency in the methods used by controller vendors in how exactly they manage bad blocks. It makes robust testing nearly impossible if not done in an oven, and even then you can’t get raw data on block failures, because the controller takes care of it by design and how it does so is secret sauce for the controller vendor.
SSDs are here to stay. Any enterprising company will have backup measures in place already for critical and perhaps non-critical data, so price will probably be a large determining factor for large-scale corporate purchases. Just my $0.02.
Thanks, Steve, for a great post to generate some activity.”
Posted by Cullen Logan.
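For anyone who wants to play with the numbers in Cullen's comment, here is a minimal Python sketch of the same back-of-the-envelope arithmetic. The cycle limit, block count, and one-write-per-minute rate come straight from the comment. The function names are my own, and the power helper assumes the textbook CMOS dynamic power relation (power proportional to capacitance x voltage squared x frequency), which I take to be the "power equation" he refers to. Treat it as an illustration, not a model of any real controller.

```python
# Figures taken from the comment above.
CYCLE_LIMIT = 10_000        # ERASE/REPROGRAM cycles a block tolerates
BLOCKS = 4096               # blocks in the 8Gb MLC NAND example
WRITES_PER_DAY = 60 * 24    # one ERASE/REPROGRAM per minute


def naive_lifetime_days(cycle_limit=CYCLE_LIMIT, writes_per_day=WRITES_PER_DAY):
    """Days until one block wears out if every write hits that same block."""
    return cycle_limit / writes_per_day


def leveled_lifetime_days(cycle_limit=CYCLE_LIMIT, blocks=BLOCKS,
                          writes_per_day=WRITES_PER_DAY):
    """Days until wear-out if writes are spread perfectly across all blocks."""
    return cycle_limit * blocks / writes_per_day


def relative_dynamic_power(voltage_ratio, frequency_ratio=1.0):
    """Power relative to baseline, ASSUMING P is proportional to C * V^2 * f
    with capacitance held constant (an assumption, not stated in the comment)."""
    return voltage_ratio ** 2 * frequency_ratio


if __name__ == "__main__":
    print(f"naive:   {naive_lifetime_days():.2f} days")
    days = leveled_lifetime_days()
    print(f"leveled: {days:.0f} days, about {days / 365.25:.1f} years")
    print(f"10% lower voltage: {relative_dynamic_power(0.9):.2f}x power")
```

Changing `BLOCKS` or `WRITES_PER_DAY` shows how strongly the lifetime estimate depends on the assumed workload, which is why the comment calls the example simplistic.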