Data deduplication goes a long way toward reducing data storage costs by making storage much more efficient, which in turn can reduce the overall footprint inside the data center. Knowledge Center contributor Chris Poelker explains data deduplication's benefits, including how leveraging data deduplication can help green your data center.
What is data deduplication? What are its benefits? In simplified terms, data
deduplication means comparing objects (usually files or blocks) and
removing the redundant copies, so that only one instance of each object
is stored. The basic benefits of data deduplication can be summarized
as follows: reduced hardware costs, reduced data center footprint,
reduced backup costs, reduced costs for disaster recovery, and more
efficient use of storage.
If you look at the left side of the figure below, you will see
several stored blocks that are duplicates of one another. The data
deduplication process removes the redundant copies and keeps one
instance of each block, resulting in the smaller group of blocks on
the right.
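The process described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular storage product works: it assumes blocks are compared by their SHA-256 digest, and the names `deduplicate_blocks`, `restore_blocks`, `store` and `recipe` are hypothetical, chosen for this example.

```python
import hashlib

def deduplicate_blocks(blocks):
    """Keep one copy of each distinct block, plus a recipe to rebuild the input."""
    store = {}    # digest -> block contents, each distinct block stored once
    recipe = []   # ordered digests needed to reconstruct the original sequence
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # only the first copy is kept
        recipe.append(digest)
    return store, recipe

def restore_blocks(store, recipe):
    """Rebuild the original block sequence from the deduplicated store."""
    return [store[digest] for digest in recipe]

# Six blocks on the "left side", three of which are distinct.
blocks = [b"AAAA", b"BBBB", b"AAAA", b"CCCC", b"BBBB", b"AAAA"]
store, recipe = deduplicate_blocks(blocks)
print(len(blocks), "blocks stored as", len(store), "unique blocks")
```

The recipe is what makes the removal safe: the duplicates are replaced by references, so the original data can still be reconstructed exactly.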
You can apply data deduplication in multiple places. Wherever you
apply it, data deduplication can affect costs not only for your Storage
Area Network (SAN), but also for your entire IT infrastructure.
In an enterprise environment running typical applications, you could
probably reclaim between 10 and 20 percent of your storage space
just by getting rid of duplicate and unnecessary files. Files are
commonly known as "unstructured data," and the data residing in
databases is commonly known as "structured data." Simple unstructured
data in files can be deduplicated at the file system level,
but the structured data residing in large databases is typically
deduplicated underneath the operating system's file system, at
the block level.
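As a rough illustration of deduplication at the file system level, the sketch below walks a directory tree and groups files whose contents are identical. It assumes that a matching SHA-256 digest means matching contents; the function names `file_digest` and `find_duplicate_files` are hypothetical, and the sketch only reports duplicates rather than removing them.

```python
import hashlib
import os
from collections import defaultdict

def file_digest(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicate_files(root):
    """Return {digest: [paths]} for every group of files with identical contents."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[file_digest(path)].append(path)
    # Only groups with more than one path contain redundant copies.
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicate_files` on a directory returns one entry per set of identical files; in each set, all but one path is a candidate for removal or replacement by a link.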
Interestingly, though, since block-level deduplication does not need
to understand the file system, and can also catch identical blocks shared
by files that are not exact copies, it is sometimes even more efficient to
deduplicate files at the block level. Whether you choose a solution
that works at the block level, file level or both, you will find that
it can pay for itself quickly through savings in storage, media, power,
cooling and floor space costs.
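To make the file-level versus block-level comparison concrete, here is a small, hedged sketch that counts how many bytes would remain stored under each approach. It assumes fixed-size blocks and an unrealistically small `BLOCK_SIZE` of 4 bytes so the example stays readable; the function names and the sample data are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny, for illustration only

def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def file_level_stored_bytes(files):
    """Bytes kept when only whole-file duplicates are removed."""
    unique = {hashlib.sha256(f).digest(): len(f) for f in files}
    return sum(unique.values())

def block_level_stored_bytes(files, size=BLOCK_SIZE):
    """Bytes kept when duplicate blocks are removed across all files."""
    unique = {}
    for f in files:
        for block in split_blocks(f, size):
            unique[hashlib.sha256(block).digest()] = len(block)
    return sum(unique.values())

# The third file is an exact copy of the first; the second differs
# from the first only in its last block.
files = [b"AAAABBBBCCCCDDDD", b"AAAABBBBCCCCEEEE", b"AAAABBBBCCCCDDDD"]
raw_bytes = sum(len(f) for f in files)
print("raw:", raw_bytes)
print("file level:", file_level_stored_bytes(files))
print("block level:", block_level_stored_bytes(files))
```

With this sample data, 48 raw bytes shrink to 32 at the file level and to 20 at the block level, because the near-identical second file shares three of its four blocks with the others. Real savings depend entirely on the data and the block size chosen.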