Data Storage - eWeek



How to Choose the Right Deduplication Technology






A well-chosen, thoughtfully deployed deduplication solution can deliver an impressive ROI for organizations of all sizes. Choosing the right deduplication approach can make a big difference in results. Here, Knowledge Center contributor Janae Lee explains four basic rules about deduplication technology that will help organizations choose the deduplication approach that is best for them.


Deduplication is a hot technology. In response, vendors have produced a proliferation of approaches and terminologies that seem designed more to confuse than to explain. Global deduplication. Content-aware. Target-based. Source-based. ISV-integrated. So, what does it all mean? And how can businesses know when and how to deploy this new offering?

When it comes to deduplication, it helps to focus on the basics. For example, just what is deduplication and what benefits come from using it?

Deduplication explained

First, deduplication is a data discovery and indexing technology that decreases the volume of data in a storage or communication system while maintaining complete data access.

By reducing data volume, deduplication decreases the hardware, software, communications and administration costs associated with maintaining and managing the data. Unlike tools such as data classification, which require human analysis and intervention, data deduplication happens automatically.

A deduplication system finds strings of data that are exactly the same, saves the first instance of each unique string, and stores a pointer (index) for every successive copy. Generally, this process operates at the sub-file level. The definitive measure of ROI for any deduplication product is its deduplication ratio, that is, the degree to which it extracts common data and reduces volume. A 100TB data set with a 2:1 deduplication ratio will result in approximately 50TB of data needing to be stored. That same data set at a 20:1 ratio will result in storing approximately 5TB, while still maintaining application access to all the same information.
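The store-once, point-many-times idea and the ratio arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than any vendor's implementation. It assumes fixed-size 4 KiB chunks and uses SHA-256 digests as a stand-in for exact string comparison, and the function names (deduplicate, restore, dedup_ratio, stored_volume) are hypothetical. The space taken by the pointers themselves is ignored.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep one copy of each unique chunk."""
    store = {}     # digest -> chunk bytes (first instance of each unique chunk)
    pointers = []  # one digest per chunk, in original order
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # later copies are not stored again
        pointers.append(digest)
    return store, pointers


def restore(store, pointers) -> bytes:
    """Rebuild the original data by following the pointers."""
    return b"".join(store[digest] for digest in pointers)


def dedup_ratio(data: bytes, store) -> float:
    """Logical size divided by the size of the unique chunks actually kept."""
    stored = sum(len(chunk) for chunk in store.values())
    return len(data) / stored


def stored_volume(logical_tb: float, ratio: float) -> float:
    """Capacity needed for a data set at a given deduplication ratio."""
    return logical_tb / ratio


if __name__ == "__main__":
    print(stored_volume(100, 2))   # 50.0 (TB at 2:1)
    print(stored_volume(100, 20))  # 5.0 (TB at 20:1)
```

Because restore() rebuilds the input byte for byte from the pointer list, the sketch also shows why application access to the full data set is preserved even though only the unique chunks are kept.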

Different deduplication products accomplish the string discovery and indexing process in different ways. Despite this, there are four basic rules driving which approach is the best fit for an organization.

Rule No. 1: Higher deduplication ratios are good

Higher deduplication ratios are good; these are delivered by data intelligence and system scalability. Different deduplication approaches and products deliver different deduplication ratios. The success of an approach rests on the solution's effectiveness in finding common strings. Products that operate at the sub-file level and account for variable-length strings tend to discover and extract more duplicate data. Results vary by product and by application usage.
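One way to see why variable-length strings matter is to compare fixed-size chunking with content-defined chunking on data that has shifted by a single byte. The sketch below is an assumption-laden illustration, not a description of any particular product: it uses a simple gear-style rolling hash to choose chunk boundaries, and the names GEAR, MASK, fixed_chunks, variable_chunks and shared_fraction are all hypothetical.

```python
import hashlib
import random

MASK = (1 << 10) - 1      # cut when the low 10 bits are zero: roughly 1 KiB chunks
_rng = random.Random(42)  # fixed seed so the table is reproducible (arbitrary choice)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def fixed_chunks(data: bytes, size: int = 1024):
    """Cut at fixed offsets, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data: bytes):
    """Cut wherever a rolling hash of the most recent bytes hits a pattern."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # The low 10 bits of h depend only on the last 10 bytes seen,
        # so boundaries follow the content rather than the offset.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if h & MASK == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared_fraction(chunker, old: bytes, new: bytes) -> float:
    """Fraction of the new data's unique chunks that the old data already holds."""
    old_ids = {hashlib.sha256(c).digest() for c in chunker(old)}
    new_ids = {hashlib.sha256(c).digest() for c in chunker(new)}
    if not new_ids:
        return 0.0
    return len(old_ids & new_ids) / len(new_ids)


if __name__ == "__main__":
    rng = random.Random(0)
    base = bytes(rng.getrandbits(8) for _ in range(200_000))
    shifted = b"X" + base  # the same data with one byte inserted at the front
    print("fixed:   ", shared_fraction(fixed_chunks, base, shifted))
    print("variable:", shared_fraction(variable_chunks, base, shifted))
```

With fixed offsets, the one-byte insertion moves every later chunk boundary, so almost nothing matches the earlier copy. With content-defined boundaries, the cut points realign shortly after the insertion and most chunks are recognized as duplicates.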

For example, backup natively creates many copies of data both across and within systems over time, but the resulting deduplication ratios can vary widely depending on data type, data change rate and even the customer's backup model. An average deduplication ratio of 20:1 or higher is not unusual, but underneath this average may be virtual machine file backups at 40:1, e-mail backups at 15:1 and transactional database backups at 3:1. Solutions claiming "content awareness" often promise higher deduplication ratios. Organizations should ignore the lingo and assess the results. Most vendors offer a tool or consulting approach to help businesses estimate the results their product will deliver in a given environment.
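To see how per-workload ratios combine into an overall figure, the sketch below computes a blended ratio as total logical data divided by total stored data. The per-workload ratios are the ones quoted above; the data volumes are hypothetical and chosen only for illustration, so the result is not a prediction for any real environment. The function name blended_ratio is likewise an assumption.

```python
def blended_ratio(workloads) -> float:
    """Overall ratio = total logical TB / total stored TB.

    workloads maps a name to (logical_tb, dedup_ratio).
    """
    logical = sum(tb for tb, _ in workloads.values())
    stored = sum(tb / ratio for tb, ratio in workloads.values())
    return logical / stored


if __name__ == "__main__":
    # Ratios from the article; the TB volumes are hypothetical.
    mix = {
        "vm_backups": (60, 40),       # 60 TB at 40:1
        "email_backups": (30, 15),    # 30 TB at 15:1
        "database_backups": (10, 3),  # 10 TB at 3:1
    }
    print(round(blended_ratio(mix), 1))
```

In this hypothetical mix the overall ratio comes out at roughly 14.6:1, and the database backups, although only a tenth of the logical data, account for nearly half of what is actually stored. That is why it pays to assess results per data type rather than rely on a single headline average.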



 
 