Sunday, August 28, 2011

ParanoidFS

There is only one thing that I expect from a filesystem -- I expect it to store my files. If a hard drive is like a storage warehouse, a filesystem is like installing filing cabinets in that warehouse. While it is possible to just keep papers in a large pile on the floor, this is not recommended.
When it's totally unacceptable to lose data, the current orthodox solution is to use RAID-1, a system in which all data is written to two hard drives. If one drive fails, it can be replaced and the data copied back from the surviving drive. My new idea is an impractical way of achieving this on only one hard drive. (Because your company has a bizarre policy of not ordering new hardware and a crippling hard drive shortage.)
ParanoidFS would store every item in five clusters, one full copy per cluster. At read time, the five copies would be read and compared block by block. A quorum of three would decide whether any blocks were defective. (That is, the blocks "vote" on what the correct value is, and if one or two of them have a different value than the others, then they are wrong and are marked as defective.) For performance, a read could even load a fifth of the file from each cluster and do the comparison in the background afterwards. After a certain number of blocks had been declared bad, the filesystem would only allow itself to work in read-only mode, and a warning message, one you can take to the accounting department, would tell you to buy a new hard drive.
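To make the voting concrete, here is a minimal sketch in Python. ParanoidFS doesn't exist, so everything in it is an assumption for illustration: the names `vote` and `ParanoidReader`, the idea that a block arrives as five `bytes` copies, and the `max_bad` threshold of 100. It also counts a bad copy again every time the same block is re-read, which a real filesystem would want to avoid.

```python
from collections import Counter

COPIES = 5  # copies kept of every block
QUORUM = 3  # copies that must agree for a value to win


def vote(copies):
    """Return (winning_value, indexes_of_defective_copies) for one block.

    Raises ValueError if no value is held by at least QUORUM copies.
    """
    if len(copies) != COPIES:
        raise ValueError("expected exactly %d copies" % COPIES)
    value, count = Counter(copies).most_common(1)[0]
    if count < QUORUM:
        raise ValueError("no quorum: block is unrecoverable")
    bad = [i for i, c in enumerate(copies) if c != value]
    return value, bad


class ParanoidReader:
    """Counts defective copies and goes read-only past a threshold."""

    def __init__(self, max_bad=100):  # hypothetical threshold
        self.max_bad = max_bad
        self.bad_count = 0
        self.read_only = False

    def read(self, copies):
        value, bad = vote(copies)
        self.bad_count += len(bad)
        if self.bad_count > self.max_bad:
            # This is where the "buy a new hard drive" warning would go.
            self.read_only = True
        return value
```

Calling `vote([b"abc", b"abc", b"abX", b"abc", b"Ybc"])` picks `b"abc"` and flags copies 2 and 4 as defective. If the copies split 2-2-1, nobody reaches three votes and the block is simply lost, which is the one case the scheme can't paper over.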
On the downside, you would get at best one fifth of the capacity of the hard drive. A typical 1TB drive on the market, for instance, would only provide 200GB of ParanoidFS, but it would be a totally immortal 200GB.
