Advanced RAID
Today’s storage systems guard against data loss by using RAID technology, which was designed to handle disk drive failures but can also cope with some uncorrectable media errors. However, the continued growth in storage density and disk capacity has not been accompanied by commensurate improvements in bit error rates. As a result, disk failures are more frequent and rebuild procedures must read vast amounts of data, so the risk of hitting a hard error during a rebuild is no longer negligible.
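To give a feel for the size of this risk, the following minimal sketch estimates the probability of hitting at least one unrecoverable bit error while reading all surviving disks during a rebuild. It assumes independent bit errors at a fixed rate; the function names, the array size, the disk capacity, and the error rate of 1e-14 per bit are illustrative assumptions, not figures from our studies.

```python
import math

def rebuild_read_bytes(num_disks: int, disk_bytes: float) -> float:
    """Bytes read when rebuilding one failed disk of a single-parity array.

    Assumption: every surviving disk is read in full.
    """
    return (num_disks - 1) * disk_bytes

def prob_unrecoverable_error(bytes_read: float, ber: float) -> float:
    """Probability of at least one unrecoverable bit error in bytes_read bytes.

    Assumption: bit errors are independent with probability ber per bit.
    Computes 1 - (1 - ber)**bits using log1p/expm1 for numerical stability.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ber))

if __name__ == "__main__":
    # Hypothetical array: 8 disks of 1 TB each, 1e-14 errors per bit read.
    to_read = rebuild_read_bytes(num_disks=8, disk_bytes=1e12)
    print(f"P(hard error during rebuild) = {prob_unrecoverable_error(to_read, 1e-14):.3f}")
```

With these assumed inputs the probability is a little over 40%, which illustrates why a hard error during rebuild can no longer be ignored as disk capacities grow.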
Our activities in advanced RAID technologies investigate novel methods to address issues encountered in large-scale storage installations. In particular, we focus on how such data losses occur and on methods to improve reliability (measured in terms of mean time to data loss, MTTDL) without sacrificing performance or storage efficiency.
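As a rough illustration of how MTTDL depends on hard errors, the sketch below uses a common first-order approximation for a single-parity (RAID 5 style) array: data is lost if, after a first disk failure, either a second disk fails or an unrecoverable error is hit before the rebuild completes. The function name, the parameter values, and the simplified model (independent, exponentially distributed failures) are assumptions made for illustration; this is not the detailed model used in our analysis.

```python
def mttdl_raid5(num_disks: int, mttf_hours: float, mttr_hours: float,
                p_uf: float = 0.0) -> float:
    """First-order MTTDL estimate (in hours) for a single-parity array.

    num_disks  : disks in the array (assumed value supplied by the caller)
    mttf_hours : mean time to failure of one disk
    mttr_hours : mean time to rebuild a failed disk
    p_uf       : probability of an unrecoverable error during one rebuild

    Assumes mttr_hours << mttf_hours, so the chance of a second disk
    failure during a rebuild is about (num_disks - 1) * mttr / mttf.
    """
    first_failure_rate = num_disks / mttf_hours
    p_second_failure = (num_disks - 1) * mttr_hours / mttf_hours
    p_loss_per_rebuild = p_second_failure + p_uf
    return 1.0 / (first_failure_rate * p_loss_per_rebuild)

if __name__ == "__main__":
    # Hypothetical parameters: 8 disks, 500,000 h MTTF, 12 h rebuild.
    without_errors = mttdl_raid5(8, 500_000, 12)
    with_errors = mttdl_raid5(8, 500_000, 12, p_uf=0.43)
    print(f"MTTDL ignoring hard errors: {without_errors:.3e} h")
    print(f"MTTDL with hard errors:     {with_errors:.3e} h")
```

Under these assumed numbers the hard-error term dominates the probability of loss per rebuild, so reducing it has a much larger effect on MTTDL than shortening the rebuild time.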
We have proposed a novel protection mechanism that is complementary to existing RAID schemes and further improves the MTTDL by two to three orders of magnitude. This mechanism is known as Sector Protection via Intra-Disk Redundancy (or SPIDRe). We have compared SPIDRe with existing methods such as disk scrubbing and developed an in-depth understanding of their respective benefits and tradeoffs.
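The general idea of intra-disk redundancy can be sketched as follows: sectors on a single disk are grouped into segments, and each segment carries redundant sectors so that an unreadable sector can be reconstructed locally, without involving the other disks in the array. The example below uses one XOR parity sector per segment, which is the simplest possible choice and is only an assumption for illustration; it is not necessarily the coding scheme used in SPIDRe, and all names and sizes are hypothetical.

```python
from functools import reduce

SECTOR_SIZE = 512  # bytes per sector (assumed for this sketch)

def xor_sectors(sectors):
    """Byte-wise XOR of equally sized sectors."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*sectors))

def make_segment(data_sectors):
    """Return the data sectors followed by one XOR parity sector."""
    return list(data_sectors) + [xor_sectors(data_sectors)]

def recover_sector(segment, lost_index):
    """Rebuild the sector at lost_index from the other sectors of the segment.

    Works for exactly one unreadable sector per segment, since the XOR of
    all sectors in a segment (data plus parity) is zero.
    """
    survivors = [s for i, s in enumerate(segment) if i != lost_index]
    return xor_sectors(survivors)

if __name__ == "__main__":
    data = [bytes([value]) * SECTOR_SIZE for value in (1, 2, 3, 4)]
    segment = make_segment(data)
    # Pretend sector 2 hit an uncorrectable media error and rebuild it.
    assert recover_sector(segment, 2) == data[2]
    print("sector 2 recovered from intra-disk parity")
```

The cost of this kind of protection is the extra space for the redundant sectors and the extra writes needed to keep them up to date, which is the tradeoff against performance and storage efficiency mentioned above.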
