SmartTier is a VxFS feature that enables you to allocate file storage space from different storage tiers according to rules you create. SmartTier provides a more flexible alternative to current approaches for tiered storage. Static storage tiering involves a manual one-time assignment of application files to a storage class, which is inflexible over the long term. Hierarchical Storage Management solutions typically require files to be migrated back into a file system name space before an application access request can be fulfilled, leading to latency and run-time overhead. In contrast, SmartTier allows organizations to:
Optimize storage assets by dynamically moving a file to its optimal storage tier as the value of the file changes over time
Automate the movement of data between storage tiers without changing the way users or applications access the files
Migrate data automatically based on policies set up by administrators, eliminating operational requirements for tiered storage and downtime commonly associated with data movement
SmartTier leverages two key technologies included with Veritas Storage Foundation: support for multi-volume file systems and automatic policy-based placement of files within the storage managed by a file system. A multi-volume file system occupies two or more virtual storage volumes and thereby enables a single file system to span multiple, possibly heterogeneous, physical storage devices. For example, the first volume could reside on EMC Symmetrix DMX spindles, and the second volume could reside on EMC CLARiiON spindles. Because it presents a single name space, a multi-volume file system is transparent to users and applications. At the same time, it remains aware of each volume's identity, making it possible to control the locations at which individual files are stored. When combined with the automatic policy-based placement of files, the multi-volume file system provides an ideal storage tiering facility, which moves data automatically without any downtime for applications or users.
In a database environment, the access age rule can be applied to some files. However, some data files, for instance, are updated every time they are accessed, so access age rules cannot be used for them. For such cases, SmartTier provides mechanisms to relocate portions of files, as well as entire files, to a secondary tier.
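The access age rule described above can be illustrated with a short Python sketch. This is not SmartTier's policy syntax or API: the tier names, the 30-day threshold, and the `choose_tier` function are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical tier names and threshold, chosen only for illustration.
PRIMARY_TIER = "tier1"
SECONDARY_TIER = "tier2"
MAX_ACCESS_AGE = timedelta(days=30)


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Return the tier an access age rule would place a file on."""
    access_age = now - last_access
    # Files not accessed within the threshold move to the secondary tier.
    return SECONDARY_TIER if access_age > MAX_ACCESS_AGE else PRIMARY_TIER


now = datetime(2024, 1, 31)
print(choose_tier(datetime(2023, 11, 1), now))  # file untouched for about three months
print(choose_tier(datetime(2024, 1, 30), now))  # file accessed the previous day
```

A data file that is updated on every access always has an access age close to zero, so a rule of this shape would never relocate it. That is the situation in which relocating portions of a file, rather than the whole file, becomes useful.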
Automated tiered storage (also automated storage tiering) is the automated promotion or demotion of data across different tiers (types) of storage devices and media. The movement of data takes place automatically, under the control of software or embedded firmware, and data is assigned to media according to performance and capacity requirements. More advanced implementations include the ability to define rules and policies that dictate if and when data can be moved between the tiers, and in many cases provide the ability to pin data to tiers permanently or for specific periods of time. Implementations vary, but fall into two broad categories: pure software-based implementations that run on general-purpose processors and support most forms of general-purpose storage media, and embedded automated tiered storage controlled by firmware as part of a closed embedded storage system such as a SAN disk array. Software Defined Storage architectures commonly include tiered storage as one of their primary functions.
In the most general definition, automated tiered storage is a form of Hierarchical Storage Management (HSM). However, the term automated tiered storage has emerged to accommodate newer forms of real-time, performance-optimized data migration driven by the proliferation of solid state disks and storage class memory. Furthermore, where traditional HSM systems act on files and move data between storage tiers in a batch, scheduled fashion, automated tiered storage systems are capable of operating at sub-file level in both batch and real-time modes. In real-time mode, data is moved almost as soon as it enters the storage system, or is relocated based on its activity levels within seconds of being accessed, whereas more traditional tiering tends to operate on an hourly, daily or even weekly schedule. Some more background on the relative differences between HSM, ILM and automated tiered storage is available on the SNIA web site. A general comparison of different approaches can also be found in the 'comparison article on auto tiered storage'.
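The rules and pinning described above can be sketched as a small decision function. This is a minimal Python sketch under stated assumptions, not any vendor's algorithm: the `Extent` record, the `next_tier` function, and the hot and cold access thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Extent:
    """A unit of data tracked by a hypothetical tiering engine."""
    name: str
    tier: int                             # 0 is the fastest tier
    accesses: int                         # accesses in the last sampling window
    pinned_until: Optional[float] = None  # timestamp; None means not pinned


def next_tier(extent: Extent, now: float, hot: int = 100, cold: int = 5,
              lowest_tier: int = 2) -> int:
    """Apply a simple promote/demote rule, honouring any active pin."""
    if extent.pinned_until is not None and now < extent.pinned_until:
        return extent.tier                        # pinned: leave in place
    if extent.accesses >= hot:
        return max(extent.tier - 1, 0)            # promote one tier
    if extent.accesses <= cold:
        return min(extent.tier + 1, lowest_tier)  # demote one tier
    return extent.tier                            # neither hot nor cold
```

In this sketch a permanent pin can be expressed by setting `pinned_until` to `float("inf")`, while a pin for a specific period uses a finite timestamp after which the normal promote/demote rule applies again.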
OS and Software Based Automated Tiered Storage
Most vendors of server-oriented software for automated tiered storage offer tiering as a component of a general storage virtualization stack, an example being Microsoft with its Tiered Storage Spaces. However, automated tiering is now becoming a common part of industry-standard operating systems such as Linux and Microsoft Windows and, in the case of consumer PCs, Apple OS X with its Fusion Drive. Fusion Drive combines a single SSD and a hard disk drive into a single automated tiered storage drive that keeps the most frequently accessed data on the SSD portion of the virtual disk. A more OS-agnostic version was introduced by Enmotus, whose FuzeDrive product supports real-time tiering for Linux and Windows operating systems and extends support to storage class memory offerings such as NVDIMM and NVRAM devices.
SAN Based Tiered Storage
An example of automated tiered storage in a hardware storage array is a feature called Data Progression from Compellent Technologies. Data Progression has the capability to transparently move blocks of data between different drive types and RAID groups such as RAID 10 and RAID 5. The blocks are part of the "same virtual volume even as they span different RAID groups and drive types. Compellent can do this because they keep metadata about every block -- which allows them to keep track of each block and its associations." Another strong example of SAN based tiering is DotHill's Autonomous Tiered Storage, which moves data between tiers of storage within the SAN disk array with decisions made every few seconds.
Automated Tiered Storage vs. SSD Caching
While tiering solutions and caching may look the same on the surface, the fundamental differences lie in the way the faster storage is utilized and the algorithms used to detect and accelerate frequently accessed data. SSD caching operates much like SRAM-DRAM caches do: it makes a copy of frequently accessed blocks of data, for example in 4K cache pages, stores the copy on the SSD, and uses this copy instead of the original data on the slower backend storage. Every time a storage IO occurs, the caching software uses one of a variety of algorithms to check whether a copy of the data already exists, and services the host request from the SSD if it is found. The SSD is used in this case as a lookaside device, as it is not part of the primary storage. While some good caching algorithms can demonstrate native SSD performance on reads and short bursts of writes, caching typically operates well below the maximum sustainable rate of the underlying SSD devices, because the CPU overhead added to each host IO command increasingly impacts performance as the amount of cached data grows.
Tiering, on the other hand, operates very differently. Using the specific case of SSDs, once data is identified as frequently used, the identified blocks are moved to the SSD in the background rather than copied, because the SSD is being utilized as a primary storage tier, not a lookaside copy area. When the data is subsequently accessed, the IOs occur at or near the native performance of the SSDs, as few if any CPU cycles are needed for the simpler virtual-to-physical address translation.
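The copy-versus-move distinction can be made concrete with a toy Python model. This is an illustrative sketch only: the `CachedVolume` and `TieredVolume` classes are hypothetical, the cache has no eviction policy, and the code models where blocks live rather than how fast they are served.

```python
class CachedVolume:
    """SSD as a lookaside cache: hot blocks are copied, the original stays put."""

    def __init__(self, hdd_blocks):
        self.hdd = dict(hdd_blocks)   # primary storage: block id -> data
        self.ssd_cache = {}           # copies of recently read blocks

    def read(self, block_id):
        # Every IO first checks whether a cached copy exists.
        if block_id in self.ssd_cache:
            return self.ssd_cache[block_id], "ssd"
        data = self.hdd[block_id]
        self.ssd_cache[block_id] = data   # copy; the HDD still holds the block
        return data, "hdd"


class TieredVolume:
    """SSD as a primary tier: hot blocks are moved, each block lives in one place."""

    def __init__(self, hdd_blocks):
        self.tiers = {"ssd": {}, "hdd": dict(hdd_blocks)}
        self.location = {b: "hdd" for b in hdd_blocks}  # virtual-to-physical map

    def promote(self, block_id):
        # Background move: the block leaves the HDD tier entirely.
        if self.location[block_id] == "hdd":
            self.tiers["ssd"][block_id] = self.tiers["hdd"].pop(block_id)
            self.location[block_id] = "ssd"

    def read(self, block_id):
        tier = self.location[block_id]    # a single map lookup
        return self.tiers[tier][block_id], tier
```

After a block is read twice through `CachedVolume`, it exists on both devices, so the SSD adds no usable capacity. After `TieredVolume.promote`, the block exists only on the SSD, and the space it occupied on the slower tier is free for other data.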
- Russ Taddiken – Senior Storage Architect (2006). Automating Data Movement Between Storage Tiers. Retrieved from the UW Records Management Web site: http://www.compellent.com/