In tiered archival storage, we address new storage requirements posed by companies and organizations whose operations and mission-critical business depend on storing and processing vast amounts of data efficiently and cost-effectively. The common requirements are:
- Data needs to be easily available through a standard interface and via a single name space.
- Data needs to be protected continuously and stored for a long time.
- Storage costs and access requirements need to be optimized based on time-varying data usage or value.
- The system should scale to a very large number of files or data objects.
Scalable active archive is another term used for storage systems that satisfy these requirements.
Our research on this topic focuses on integrating solid-state drive, disk, and tape tiers under a single name space, and providing additional management functions for moving the data between the tiers.
To provide a single name space, reliability, scalability, and data management, we leverage IBM’s General Parallel File System (GPFS) technology and OpenStack Swift.
To add a reliable and cheap storage tier, we integrate the open-standard Linear Tape File System (LTFS) technology.
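The idea of moving data between tiers as its usage changes over time can be illustrated with a minimal sketch. This is not the GPFS policy language or any LTFS EE interface; the tier names, the `FileInfo` record, the function names, and the age thresholds are all hypothetical and chosen only to show an age-based placement rule.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical per-file record; not part of any real GPFS or LTFS API."""
    path: str
    days_since_access: int
    tier: str  # one of "flash", "disk", "tape"


def target_tier(days_since_access, disk_after_days=30, tape_after_days=180):
    """Pick a tier from the time since last access.

    The thresholds are illustrative assumptions, not recommended values.
    """
    if days_since_access >= tape_after_days:
        return "tape"
    if days_since_access >= disk_after_days:
        return "disk"
    return "flash"


def plan_migrations(files):
    """Return (path, from_tier, to_tier) for every file whose tier should change."""
    plan = []
    for f in files:
        wanted = target_tier(f.days_since_access)
        if wanted != f.tier:
            plan.append((f.path, f.tier, wanted))
    return plan


if __name__ == "__main__":
    files = [
        FileInfo("/archive/a.dat", 2, "flash"),    # recently used, stays on flash
        FileInfo("/archive/b.dat", 45, "flash"),   # cooling down, moves to disk
        FileInfo("/archive/c.dat", 400, "disk"),   # cold, moves to tape
        FileInfo("/archive/d.dat", 1, "tape"),     # recently recalled, moves to flash
    ]
    for path, src, dst in plan_migrations(files):
        print(f"{path}: {src} -> {dst}")
```

Because all tiers sit under a single name space, a plan like this would change only where the data resides, not the path through which applications reach it. A real system would typically also weigh file size, tier capacity, and recall cost.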
See LTFS EE for more details about our approach to building a clustered file system on top of flash, disk, and tape.
See IceTier for more details about our approach to building object storage on top of disk and tape.
DOME is a related project that covers tiered storage aspects such as tier dimensioning and data placement optimization.
Figure 1. Tiered storage combines different types of storage media, preferably under a single name space and using a standard interface, equipped with data lifecycle management functions for migrating data between different storage tiers.