Object-Oriented Storage: 14 Things You May Not Know
by Chris Preimesberger
Aligns Storage Costs with the Value of Data
Object storage systems remove the complexity and management costs associated with keeping an enterprise storage system in production-ready status. Object storage is based on a single, flat address space that enables data to be routed automatically to the right storage systems, and to the right tier and protection levels within those systems, according to the data's value and its stage in the data life cycle.
Better Data Availability than RAID
In a properly configured object storage system, content is replicated so that a minimum of two replicas ensures continuous data availability. If a disk dies, all other disks in the cluster join in to re-create the lost replicas while the system still runs at nearly full speed. Recovery takes only minutes, with no interruption of data availability and no noticeable performance degradation. By contrast, when a RAID disk fails, the system slows to a crawl while it takes hours or days to rebuild the array.
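The recovery behavior described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's algorithm: the function names `place_replicas` and `recover`, the random placement, and the "least-busy survivor" policy are all hypothetical.

```python
import random


def place_replicas(object_ids, disks, copies=2, seed=0):
    """Assign each object to `copies` distinct disks (hypothetical random placement)."""
    rng = random.Random(seed)
    return {oid: set(rng.sample(disks, copies)) for oid in object_ids}


def recover(placement, failed_disk, disks, copies=2):
    """Restore the replica count after `failed_disk` dies.

    Returns how many new replicas each surviving disk had to write,
    which shows the rebuild work being shared across the cluster.
    """
    survivors = [d for d in disks if d != failed_disk]
    written = {d: 0 for d in survivors}
    for holders in placement.values():
        holders.discard(failed_disk)
        while len(holders) < copies:
            # Assumed policy: pick the least-busy survivor that lacks this object.
            candidates = [d for d in survivors if d not in holders]
            target = min(candidates, key=lambda d: written[d])
            holders.add(target)
            written[target] += 1
    return written
```

Calling `recover(placement, "d0", disks)` on a five-disk cluster returns a per-disk count of rewritten replicas. Every survivor can take part, which is the contrast with a RAID rebuild that funnels all the work onto one replacement disk.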
Provides Unlimited Capacity and Scalability
In object storage systems, there is no directory hierarchy (or "tree"), and an object is retrieved by its identifier rather than by a directory path that has to be known in advance. This enables object storage systems to scale to petabytes and beyond without limits on the number of files (objects), file size or file system capacity, such as the 2-terabyte restriction found in some older Windows and Linux file system configurations.
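A flat address space can be pictured as a single key-value map. The sketch below is an assumption-laden toy, not a real product's API: the class name `FlatObjectStore` and its `put` and `get` methods are hypothetical, and an in-memory dictionary stands in for the cluster.

```python
import uuid


class FlatObjectStore:
    """Toy store: one flat map from object ID to (data, metadata)."""

    def __init__(self):
        self._objects = {}

    def put(self, data, metadata=None):
        # The store, not the caller, chooses the address: a random 128-bit UUID.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata or {}))
        return object_id

    def get(self, object_id):
        # Retrieval needs only the ID; there is no path to walk.
        return self._objects[object_id][0]
```

A caller keeps the ID returned by `put` and passes it back to `get`. Because no directory tree has to be traversed or kept balanced, adding the billionth object looks the same as adding the first.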
Performance Scales Linearly as Cluster Grows
As new servers running on commodity hardware are added to an object storage cluster, performance scales linearly, providing massively parallel boosts in both processing and I/O capacity to support the vast number of reads and writes for small files as well as the throughput (bytes per second) demanded by large files, such as videos or medical images.
Leverages Metadata in Ways File Systems Cannot
Object storage systems can easily search for data without knowing specific filenames, dates or traditional file designations. They can also use the metadata to apply service-level agreements (SLAs), policies for routing, distribution and disaster recovery, retention and deletion, as well as automate storage management. These are functions that file systems just cannot address.
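As a rough illustration of metadata-driven search and policy, the sketch below queries a small catalog by attribute and applies a retention rule. The catalog entries, the field names (`type`, `retain_until`) and the helper names `find` and `due_for_deletion` are invented for this example.

```python
from datetime import date

# Hypothetical catalog: each object carries its own descriptive metadata.
CATALOG = [
    {"id": "obj-1", "meta": {"type": "mri", "patient": "A17", "retain_until": date(2030, 1, 1)}},
    {"id": "obj-2", "meta": {"type": "video", "owner": "marketing", "retain_until": date(2020, 6, 30)}},
    {"id": "obj-3", "meta": {"type": "mri", "patient": "B42", "retain_until": date(2021, 3, 15)}},
]


def find(catalog, **criteria):
    """Return IDs of objects whose metadata matches every given key/value pair."""
    return [
        obj["id"]
        for obj in catalog
        if all(obj["meta"].get(key) == value for key, value in criteria.items())
    ]


def due_for_deletion(catalog, today):
    """Return IDs of objects whose retention period ended before `today`."""
    return [obj["id"] for obj in catalog if obj["meta"]["retain_until"] < today]
```

Here `find(CATALOG, type="mri")` selects objects without any filename, and `due_for_deletion` shows how a retention policy could be evaluated from the same metadata.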
Built-In Archiving and Compliance
Reliable archiving is a must-have for any storage system. By some estimates, 70 percent of data generated is never accessed after its initial creation and remains static, while another 20 percent is categorized as semi-active and is rarely accessed. For compliance requirements, state-of-the-art object storage systems establish the authenticity of a specific content object by first creating a universally unique ID (128-bit UUID) for the location-transparent address. A digital fingerprint (hash or digest) can be combined with this, and these values can be stored as a content seal. Active access and long-term archiving co-exist in the same single object-based storage tier.
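The content seal idea can be sketched with Python's standard `uuid` and `hashlib` modules. This is one plausible reading of the description, not a documented product format: the dictionary layout, the choice of SHA-256 and the function names `seal` and `verify` are assumptions.

```python
import hashlib
import uuid


def seal(content):
    """Build a hypothetical content seal: a 128-bit UUID plus a digest of the bytes."""
    return {
        "uuid": str(uuid.uuid4()),      # location-transparent address
        "algorithm": "sha256",          # assumed hash; the article names none
        "digest": hashlib.sha256(content).hexdigest(),
    }


def verify(content, content_seal):
    """Check that `content` still matches the digest recorded in its seal."""
    return hashlib.sha256(content).hexdigest() == content_seal["digest"]
```

If even one byte of the stored object changes, `verify` returns `False`, which is what lets an archive demonstrate that a record is the one originally written.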
Backups Are Eliminated
With a well-designed object storage system, backups are not required. Multiple replicas ensure that content is always available and an offsite disaster recovery replica can be automatically created if desired. If the primary cluster becomes unavailable, the replica can be used transparently, since the UUID of all content is identical in all clusters where the replicas are stored. This operation is simply impossible in a file system, which often has to overcome the challenges of cumbersome backup windows and long and difficult restore operations.
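The transparent use of a disaster recovery replica follows from the shared UUID, as this small sketch suggests. The `Cluster` class, the `replicate` and `read` helpers and the `available` flag are hypothetical stand-ins for whatever a real system uses.

```python
class Cluster:
    """Toy cluster: a name, an availability flag and a UUID-to-data map."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self.objects = {}


def replicate(primary, replica):
    """Copy objects to the replica cluster, keeping every UUID unchanged."""
    replica.objects.update(primary.objects)


def read(object_id, clusters):
    """Return the object from the first available cluster that holds it."""
    for cluster in clusters:
        if cluster.available and object_id in cluster.objects:
            return cluster.objects[object_id]
    raise KeyError(object_id)
```

Because the caller asks for the same ID in both cases, nothing in the request changes when the primary goes offline; only the cluster that answers does.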
Automatic Load Balancing
A well-designed object storage cluster is totally symmetrical, which means that each node is independent, provides an entry point into the cluster and runs the same code. This allows the workload to be evenly distributed across all nodes in the cluster and avoids hot spots that are prevalent in NAS and clustered file systems. Automatic load balancing ensures that I/O requests are automatically routed to the optimal node, keeping performance at a high level.
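One simple way to picture this routing is a least-loaded choice among identical nodes. The sketch below assumes that policy for illustration only; the article does not say how "optimal" is decided, and the `Node` class and `route` function are hypothetical.

```python
class Node:
    """Every node is identical: same code, and each can accept requests."""

    def __init__(self, name):
        self.name = name
        self.active_requests = 0


def route(nodes):
    """Send a request to the node with the fewest active requests (assumed policy)."""
    target = min(nodes, key=lambda node: node.active_requests)
    target.active_requests += 1
    return target
```

Routing nine requests across three idle nodes leaves each node with three, so no single node becomes a hot spot.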
Migration Made Routine
In object storage infrastructures, a traditional hardware migration or even a large-scale upgrade is no longer a requirement. Instead, object storage architecture simply adopts migration as a daily fact of life. On a continuous basis, new units can be added and will automatically join the cluster, and old units can be retired with a single command.
Hardware Lock-In Eliminated
For archival storage and regulatory compliance requirements where content is maintained for years, the cost and complexity of refreshing technology are a major consideration, particularly for systems tied to expensive proprietary hardware platforms. Deploying a software-only object storage system agnostic to the underlying hardware allows customers to use any commodity server technology they choose, as well as to non-disruptively upgrade it when new hardware is introduced.
Provides Better Disk Utilization
Object storage provides better hard disk utilization than block storage because files are laid down contiguously on disk. Object storage knows the size of a file, so there is never a need to over-provision as with a block-based solution. This means that object storage can run 90 percent or greater in terms of disk utilization, whereas a block-based system, even highly optimized, can only achieve up to 70 percent at best.
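To see what those utilization figures would mean in practice, the arithmetic can be worked through directly. The 90 and 70 percent values are the article's claims, taken at face value here; the function names and the 63 TB workload are made up for the example.

```python
def usable_tb(raw_tb, utilization):
    """Data that fits on `raw_tb` of disk at the given utilization (0.0 to 1.0)."""
    return raw_tb * utilization


def raw_tb_needed(data_tb, utilization):
    """Raw disk required to hold `data_tb` of data at the given utilization."""
    return data_tb / utilization


OBJECT_UTILIZATION = 0.90  # figure claimed in the article
BLOCK_UTILIZATION = 0.70   # figure claimed in the article
```

Under those assumptions, 100 TB of raw disk holds about 90 TB of data as object storage versus about 70 TB as block storage. Put the other way, storing 63 TB takes roughly 70 TB of raw disk at 90 percent utilization and roughly 90 TB at 70 percent.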
High Availability and Disaster Recovery
High availability and disaster recovery are built into the object storage architecture. No special HA configuration, clustering or administrator intervention is required for failover or failback. Object storage, coupled with publishers and subscribers used for content distribution, can easily be set up in a high-availability and disaster-recovery configuration. Both publishers participate in an object-based internode protocol and both process all events, so no coordination or failover is necessary.
Old Stuff Never Interferes with New Stuff
The typical trigger for organizations to start archiving their traditional file server-based information is hitting response time issues. This leaves IT management no option but to remove some "old stuff" from the server to create space for the "new stuff." Since object storage doesn't suffer the performance degradation of hierarchical file systems as a function of object count, there is never a technical reason to move content off the object storage cluster. It can simply be "archived in place."
Less Is More
In conventional storage there are a few standard protocols, but the rest are proprietary interfaces and stovepipe architectures, which most observers can see are reaching their performance limits. Object storage can be seen as the parallel scalable bottom layer of an emerging open and layered storage architecture, modeled after the successful network stack. The concept of the universally unique identifier allows its content to be solidly hooked into higher-level functions and databases.