Disk drive reliability in detail

I tend to get abstract and philosophical about data here, and it’s good to have an occasional splash in the cold water of how the stuff gets stored.

Jon Elerath’s article on hard disk reliability in the June 2009 Communications of the ACM (may require ACM login) gives a lot of detail about the different kinds of disk storage error and appropriate countermeasures. He gets down to the level of things that scratch vs. things that smear (kind of like H.P. Lovecraft for the archivally-minded).

The main takeaway is that there are two kinds of failure. The obvious ones are crash failures, like the bearing getting wobbly or servo tracks being trashed: these make the drive stop working, and you rebuild from something you still trust. The insidious ones are quiet read/write failures that you can only counteract with a policy of "scrubbing" drives proactively, that is, periodically reading the data back and checking it so that latent errors turn up while you still have a good copy.
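To make the scrubbing idea concrete, here is a rough sketch at the file level. It is not how a RAID controller or filesystem scrubs (those work on blocks and parity), and none of it comes from Elerath's article: the function names, the SHA-256 choice and the in-memory manifest are all my own assumptions for illustration. The idea is just to record a checksum per file once, then periodically re-read everything and flag anything that no longer matches.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Read the whole file in chunks and return its SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its current digest.

    Hypothetical helper: run this once while the data is still trusted.
    """
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def scrub(root, manifest):
    """Re-read every file in the manifest; return those missing or changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != expected:
            bad.append(rel)
    return bad
```

Two caveats on this sketch. A mismatch only says the bytes differ from what was recorded, so a legitimate edit looks the same as silent corruption unless the manifest is rebuilt after intended changes. And a recently read file may be served from the operating system's cache rather than the platters, so a pass like this exercises the disk best when the data has not been touched for a while.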

Now that I buy storage at 1.5 terabytes per spindle, I really should do something to dissuade Those Whose Names Are Random from assimilating my data to the Outer Abyss of Maximum Entropy. If you should happen to find some of your goats missing, better not to ask…
