Modern data storage systems are extremely large, consisting of tens or hundreds of storage nodes. In such systems, node failures are daily events, and safeguarding data against them poses a serious design challenge. Data is protected by redundancy, in the form of replication or advanced erasure codes: because redundant data is spread across several nodes, the copies on surviving nodes can be used to rebuild the data lost by a failed node. Because these rebuild processes take time to complete, additional node failures may occur during a rebuild, which can eventually lead to some of the data becoming irrecoverably lost from the system.
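The rebuild mechanism described above can be illustrated with a minimal sketch, assuming a toy three-node system with single-parity (RAID-5-like) erasure coding; the block contents and helper name are illustrative, not part of any particular system:

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together (single-parity code)."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# Data blocks stored on two nodes, parity block on a third node.
d0 = b"\x01\x02\x03"
d1 = b"\x10\x20\x30"
parity = xor_blocks([d0, d1])  # the redundant data

# The node holding d1 fails; the surviving nodes rebuild its data.
rebuilt = xor_blocks([d0, parity])
assert rebuilt == d1  # d0 XOR (d0 XOR d1) recovers d1
```

If a second node fails before this rebuild completes, only one block of each stripe survives and the lost data cannot be reconstructed, which is exactly the data-loss scenario discussed above.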
Our activities in storage system reliability investigate novel methods to address issues encountered in large-scale storage installations. In particular, we focus on the occurrence of data loss and on methods to improve reliability without sacrificing performance or storage efficiency.
We have shown that spreading the redundant data associated with each node across a larger number of other nodes, combined with a distributed and intelligent rebuild process, improves the system's mean time to data loss (MTTDL). In particular, declustered placement, in which the redundant data of each node is spread equally across all other nodes of the system, can achieve significantly higher MTTDL values than other placement schemes, especially for large storage systems.
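The intuition behind declustered placement can be sketched as follows, assuming mirrored (two-way replicated) data in an n-node system; the function names and the pairing rule used for the clustered case are hypothetical, chosen only to contrast the two schemes:

```python
def declustered_rebuild_sources(failed_node, n):
    """Declustered placement: the redundant copies of a node's data are
    spread equally across all n-1 other nodes, so every surviving node
    can contribute to the rebuild in parallel."""
    return [m for m in range(n) if m != failed_node]

def clustered_rebuild_sources(failed_node, n):
    """Clustered (mirrored-pair) placement: all of a node's redundant
    data lives on a single partner node, which alone must supply the
    entire rebuild. Nodes are paired as (0,1), (2,3), ..."""
    return [failed_node ^ 1]

n = 8
print(len(declustered_rebuild_sources(3, n)))  # 7 parallel rebuild sources
print(len(clustered_rebuild_sources(3, n)))    # 1 rebuild source
```

With n-1 nodes reading and rebuilding in parallel, the rebuild window shrinks roughly in proportion to the system size, which narrows the exposure to further failures and is what drives the higher MTTDL of declustered placement.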