> "Intel doesn't have confidence in the drive at that point, so the 335 Series is designed to shift into read-only mode and then to brick itself when the power is cycled."
I don't understand why Intel wouldn't just configure these drives to go into read-only mode permanently. If I realized my hard drive had become read-only and didn't suspect hard drive failure, my first inclination would be to reboot my computer, not immediately back up all data.
The article is wrong on this point, and on Intel's intentions, as far as I can tell. Intel has a "Supernova" feature (http://itpeernetwork.intel.com/data-integrity-in-solid-state...) which will cause some drive models to brick themselves if certain conditions are met - errors in the control path, for example, which basically mean you cannot trust the drive at all. The supernova feature is only claimed for enterprise drives, and the 335 series is not an enterprise drive.
I have a lot of experience with long-running Intel SSDs of various models, including pushing them to the same kinds of extreme that the SSD endurance experiment did, and I have never observed them to self-brick simply because they reached their flash endurance point.
What I have observed is a number of firmware bugs (or possibly just the supernova feature) that caused the drive to brick on power cycle, even for drives in perfect health.
I liked the SSD endurance articles, because they went a long way to allaying fears about SSDs, but I think it's a shame they've left this point in.
My data point: I got an Intel SSD back in 2010 (one of the first affordable ones, $500 for 160 GB) and it started showing bad sectors in 2013. I immediately copied all my data off and sent it to Intel, who sent me a replacement for free. The replacement has been working fine ever since.
I don't know Intel's reasoning for this policy, but if there's a sound technical reason for it, I would guess that it has to do with the drive not wanting to flush its NAND mapping information from DRAM to flash that it has deemed worn out. However, the Intel 335 Series uses SandForce controllers that don't have an external DRAM buffer, so they never have much data cached or in flight. It's more likely this policy was decided upon for enterprise products and was deemed not worth revising for client products given how few customers would exhaust the drive's write endurance to be affected by this.
EDIT: And, as pointed out by cuchulain, much of the information about the intended end-of-life behavior of Intel's SSDs is unreliable; they don't publish that information on a per-model basis, so some of what you read is based on mere speculation.
I remember having a contentious discussion at work about the design of an emulated EEPROM driver for an embedded product. The flash memory hardware was rated for a certain number of erase/write cycles, and the question was what we should do once the cycle count exceeded that rating. I said we should keep functioning and eventually raise a warning or something, but my colleagues said we should simply kill the hardware and brick ourselves. I was adamantly against this, but they cited safety concerns: maybe the flash could get corrupted, and we weren't supposed to support that long a lifetime anyway. We had a lot of safety mechanisms and redundancies baked in, so data corruption would not happen. My argument was mainly that, yes, if unrecoverable data corruption happens, brick it, but until the hardware forces you to close up shop, the software should keep running as long as possible. I don't know what they ended up implementing because I soon left, but I think they went with the self-bricking option.
Anyway, I just wanted to share this anecdote, and I can't help but think that maybe somewhere, some Intel engineers had a discussion very similar to my own.
When I read that, I also thought "That's horrible; guess I won't buy that drive." Reading further, though, I discovered that all the drives in his test became unreadable ("bricked"?) when they eventually failed.
Well, for the others, if you really care you can see sectors starting to get remapped and think "ah, OK, time to start backing this data up and replacing the drive," whereas if I understand correctly, on the Intel one you pretty much immediately need to back up the data and hope you don't need to restart or lose power before you've backed up what you need.
I'm sure it's more nuanced than that but my reaction was definitely "steer clear of the Intel drives ..." when I read this so perhaps someone can clarify.
Backup shouldn't be something you do when the drive is exhaling its last breath, or even when it shows its first symptoms; it should be done often and transparently. On a laptop the best practice is to arrange a sync with a server (NAS, etc.) when you get home. Done incrementally, it takes seconds to minutes and is fully automatic. Unfortunately, making backups still isn't common practice; most users see a NAS or even an external drive as wasted money. They feel safe "backing up" some data on a USB key, only to discover how volatile and unsafe it can be when it's too late (breakage, washing machine, theft, loss, etc.)
Continuous Integration systems can really burn through SSD endurance. If you have a large, compiled code base which rebuilds on every checkin, you will be creating and deleting object code constantly. Use smartmontools or HDD Guardian to keep an eye on endurance.
Our code base creates around half a gig of compilation product on every build. We used up the endurance on a consumer-level Micron SSD in about a year. No data loss occurred.
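A minimal sketch of how you might automate that watch with smartmontools: parse `smartctl -A` output for a wear attribute and alert when it drops. The sample output and the `Media_Wearout_Indicator` attribute name below are illustrative only; attribute names and columns vary by vendor and model.

```python
# Illustrative sample of `smartctl -A /dev/sda` output; real attribute
# names and columns differ between vendors and models.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   0
233 Media_Wearout_Indicator 0x0032   097   097   000    Old_age   0
241 Host_Writes_32MiB       0x0032   100   100   000    Old_age   1048576
"""

def attribute_value(smart_output, name):
    """Return the normalized VALUE column for a named SMART attribute, or None."""
    for line in smart_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == name:
            return int(fields[3])
    return None

# In a cron job you would feed in the output of
# subprocess.run(["smartctl", "-A", "/dev/sda"], ...) instead of SAMPLE,
# and alert when the wear indicator falls below some floor you choose.
wear = attribute_value(SAMPLE, "Media_Wearout_Indicator")
print(wear)  # 97
```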
Indeed, my Ubuntu install recently created multiple 22 GB log files several days in a row (some USB issue or other, fixed by updating the kernel). It wouldn't have been an issue, but the disk was nearly full.
Maybe there's room for a 'file write filter' that avoids writing identical data back to the same file. To save SSD lifetime. Sounds like it would have application.
There's also something to be said for having a build system that can correctly do incremental rebuilds and caching of outputs, which could massively ease the SSD write load.
You guys are all adorable software devs :-). You're trying to solve a problem with software that just isn't that big a deal: high endurance drives exist, or just buy a new one every year.
That's cute. Require maintenance where none is actually needed. Everyone hosting a CI now needs a hardware guy, too.
Bandaid solutions (replace every x months, buy something bigger/faster, etc) are not the way to go. The hardware solution to this is not buy a high-endurance drive but to buy more RAM and set up a tmpfs build directory - or buy a ram drive and use that for build instead of you want to eliminate even that software configuration step.
There is no "solution" to speak of: tmpfs can be mounted on any directory on Linux, so there's no difference from the normal build process, at least on Linux. I can also, for example, serialize build jobs in Jenkins, so the peak space required is spread out over time.
RAM is more expensive than SSDs but it's still very cheap at roughly $5 per GB. Higher density 32GB DIMMs cost only a bit more at $7 per GB. You can buy terabytes of RAM for a few thousand dollars.
Well, yeah. Reproducible builds are valuable. If you don't think so, then have fun trying to rollback to a commit that can only be built by building a specific set of commits leading up to it in the right order.
These are all just ratings though. The theory is that over a population of drives, you'll see a higher failure rate than predicted if you do higher than the rated workload per year. WDC used to have a whitepaper on it called "Why Specify Workload", but it's no longer on their site.
I have in some cases seen enterprise sata drives pushed to the kinds of workload you're talking about - 2.5PB in a year - and seen in the order of 10% fail over that time, with a drive that normally has a ~0.5% AFR.
80 MB/s of sequential reads or writes is probably something consumer HDDs can survive for several years. The platters are always spinning; the only difference is that now the drive is continuously reading or writing what's under the head. It's the random accesses (and associated seeks) which stress them.
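The back-of-the-envelope arithmetic linking that 80 MB/s figure to the ~2.5 PB/year workload mentioned above:

```python
# Sustained 80 MB/s, around the clock, for one year
rate_bytes_per_s = 80e6
seconds_per_year = 365 * 24 * 3600           # 31,536,000 s
total_bytes = rate_bytes_per_s * seconds_per_year
print(total_bytes / 1e15)  # ~2.52 PB in a year
```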
There are various comparisons out there which conclude "datacenter-grade" is largely a marketing/warranty thing; the drives themselves may be nearly identical in design.
I modified some programs of mine that generate a lot of files to read the old version of the file first, compare it with the new version in the buffer, and only write out the new file if it is actually different. This cuts way down on the write cycles to the SSD. It's faster, too!
Why do the SSDs all brick themselves when this happens? It seems like a huge mis-feature; HDDs are almost never recoverable when they fail but if you can't reallocate blocks on flash just go into read-only mode.
In principle, going into read-only mode should work and it should take a while for read disturb errors to corrupt the data. But there's a trade-off that if you're trying to keep servicing writes as long as possible (and retiring bad blocks as they wear out), the risk rises that an earlier-than-expected unrecoverable error will corrupt the critical data structures that keep track of the mapping between logical and physical addresses. Playing it safe means quitting early and thus giving your drive an endurance rating that suggests it is less reliable than the competition.
And it's no surprise that the aspects of SSD firmware that by nature get the least real-world testing and are the most tricky to design would be quite buggy in practice. Even ZFS doesn't try to avoid catastrophic data loss in the face of unreliable RAM.
The only trouble is, I have subscription overload. Every newspaper and their dog wants to sell me a subscription, but I generally don't read newspapers daily.
I'd love to have access to this data through some spotify-for-text service or Blendle or something though.
I guess I'm not alone in wanting to pay researchers, bloggers, journalists, etc., based on what I read, rather than on a monthly subscription to every company I ever want to read something from?
I keep my recurring subscriptions to a minimum too, so I understand. But, funding the procurement of statistically significant numbers of multiple models of SSD drives, running them through to end-of-life characterization, and keeping that all updated as new models come out is a higher spending profile than your typical blogger. It seems more like a business research report or recurring lab test type of service.
Maybe Flattr [0] is close enough? You set a monthly budget, pick things to support over the course of the month, and at the end of the month those things automatically get their slice of your pre-set budget.
I loved this series. It inspired us to do similar experiments with SSDs as we were spec'ing out new servers. I highly recommend doing this so you get a feel for what SMART looks like for your specific SSDs. It's nice to be able to monitor that and have some idea when your SSDs are going to die, especially if most of your drives are aging together.
If you divide the total data written at the point where reallocated sectors start appearing by the drive's capacity, you can figure out the actual average endurance of the flash, in program/erase cycles:

 ~400 cycles   Samsung 840 Series
~2344 cycles   Samsung 840 Pro
~2400 cycles   Kingston HyperX 3K
~2800 cycles   Intel 335 Series
~4400 cycles   Corsair Neutron GTX