How backup environments benefit
Let's look at a typical backup environment as an example, since backup is an area that benefits greatly from data deduplication. Data deduplication solutions can be implemented in many places, but data backup and data archiving are the areas where the benefits are immediately apparent. The more data you have, and the longer you need to retain it for business or regulatory reasons, the better the results you see from your data deduplication solution.
The figure below shows a sample dataset of 20 TB being retained over five weeks, with typical data growth and change rates. If you use a traditional backup solution (such as Veritas NetBackup, CommVault, IBM Tivoli Storage Manager (TSM), EMC Legato or HP Data Protector) to back up the data to media (disk or tape) with no deduplication, you'll need to store about 110 TB of data in only five weeks. [Okay, for you IBMers out there, TSM is a progressive backup solution, so you will probably store less on tape, but don't get me started on all the disk-based file systems being used for the D2D (disk-to-disk) part of the backup!]
In the figure below, you can see that after five weeks with no deduplication going on, you will have stored about 110 TB of data.
Now let's take the same metrics and apply a deduplication ratio of a little over 6-to-1. Instead of storing 110 TB, we now only need to store a little more than 24 TB for the exact same amount of information.
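To make the arithmetic behind these totals concrete, here is a rough sketch in Python. It assumes one full backup is kept per week and that the dataset grows by 5% a week; that growth rate is my own assumption, chosen only because it lands near the 110 TB total, and is not a number taken from the figure. The function names are made up for illustration.

```python
def cumulative_full_backups_tb(initial_tb, weeks, weekly_growth):
    """Total TB stored if one full copy is kept each week, with no deduplication."""
    total = 0.0
    size = initial_tb
    for _ in range(weeks):
        total += size              # keep this week's full backup
        size *= 1 + weekly_growth  # dataset grows before the next backup
    return total


def reduction(before_tb, after_tb):
    """Return (TB saved, percent saved) when storage drops from before_tb to after_tb."""
    saved = before_tb - after_tb
    return saved, 100.0 * saved / before_tb


# 20 TB dataset, five weekly fulls; 0.05 is an assumed growth rate.
raw = cumulative_full_backups_tb(20, 5, 0.05)

# Before and after totals as quoted in the text.
saved_tb, saved_pct = reduction(110, 24)

print(f"raw total without deduplication: {raw:.1f} TB")
print(f"saved with deduplication: {saved_tb} TB ({saved_pct:.1f}%)")
```

Plug in your own dataset size, retention period and growth rate to get a first estimate for your environment. Real results depend heavily on change rates and the type of data being backed up.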
All things being equal, we can see that data deduplication can offer dramatic savings in data center floor space, tape media costs, and tape storage and shipping costs. And if you back up to disk, it also gives you much faster recovery if something goes wrong.
The green aspects of data deduplication even extend outside the data center, to the trucks that are no longer required to ship bulky tapes offsite. I haven't even mentioned yet how data deduplication can improve disaster recovery. Needing less WAN bandwidth to replicate data is a major benefit. Another benefit is that if you send less, you store less on the other side, which lowers the cost of storage, power and cooling at the DR location. So you can see that the value and the benefits add up fast, and that means a greener world for you in more ways than one.
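For a feel of what the WAN savings can look like, here is a back-of-the-envelope sketch in Python. It assumes a dedicated 1 Gbit/s link running at full line rate and decimal terabytes, and it treats the whole five-week total as one transfer. Those are all simplifying assumptions of mine, not measurements, and the function name is made up for illustration.

```python
def transfer_days(data_tb, link_mbps, efficiency=1.0):
    """Days needed to push data_tb terabytes over a link of link_mbps megabits per second.

    efficiency is the usable fraction of the link (1.0 = full line rate).
    Uses decimal units: 1 TB = 10**12 bytes, 1 Mbit = 10**6 bits.
    """
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86400  # seconds per day


# Totals quoted earlier in the text; the 1000 Mbit/s link is an assumption.
without_dedup = transfer_days(110, 1000)
with_dedup = transfer_days(24, 1000)

print(f"without deduplication: {without_dedup:.1f} days")
print(f"with deduplication:    {with_dedup:.1f} days")
```

Lower the efficiency argument to model a shared or congested link. Either way, the transfer time shrinks in proportion to the data you no longer have to send.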