    How to Implement a Successful Data Deduplication Strategy

    Written by

    Eric Schou
    Published August 30, 2010

      The IT organizations of today cannot rely on the data protection model of yesteryear, which can be characterized as tape-based, decentralized and populated primarily with physical servers. Virtualization and the large amounts of data to protect mandate a new approach to protecting and managing information.

      These days, with 50 percent annual data growth, how can organizations protect all of their data within an ever-shrinking backup window? How quickly can virtual machines or complex applications such as SharePoint actually be restored? And how much data can businesses really afford to lose in the event of an outage?

      Just as next-generation tools such as disk-based backup are revolutionizing data protection, data deduplication is enabling a new era of information management. Now, with the ability to deduplicate data everywhere and manage it centrally, organizations are able to not only improve data protection operations and lower costs but move towards a more systematic approach for managing information growth.

      Why data deduplication?

      Simply stated, data deduplication is the process of eliminating redundant data. Deduplication backs up only unique data at the sub-file level. Needless to say, in environments where storage needs continue to intensify and holding down costs remains a key issue, deduplication offers welcome relief for today’s IT organizations.
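To make "only unique data at the sub-file level" concrete, here is a minimal sketch in Python. It assumes fixed-size chunks and SHA-256 fingerprints; neither choice comes from the article, and commercial products often use variable-size segments and their own fingerprinting schemes. The names `deduplicate`, `restore` and `CHUNK_SIZE` are hypothetical, and the 4-byte chunk size is deliberately tiny so the effect is visible on a short input.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # assumption: unrealistically small, for illustration only


def deduplicate(data: bytes, store: Dict[bytes, bytes],
                chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Split data into fixed-size chunks and keep one copy of each unique chunk.

    Returns a "recipe": the ordered list of chunk fingerprints needed to
    rebuild the original data from the store.
    """
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        # Store the chunk only if this fingerprint has not been seen before.
        store.setdefault(digest, chunk)
        recipe.append(digest)
    return recipe


def restore(recipe: List[bytes], store: Dict[bytes, bytes]) -> bytes:
    """Rebuild the original data by looking up each fingerprint in order."""
    return b"".join(store[digest] for digest in recipe)


store: Dict[bytes, bytes] = {}
data = b"ABCDABCDABCDEFGH"
recipe = deduplicate(data, store)
stored_bytes = sum(len(chunk) for chunk in store.values())
```

In this toy input the four chunks contain only two distinct values, so the store holds 8 bytes instead of 16, while the recipe still allows a full restore.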

      Once you understand data deduplication, it should come as no surprise that eliminating redundant data reduces storage costs. What many do not know, however, is that deduplication has other benefits, such as bandwidth savings, faster backups, backup consolidation and easier disaster recovery, depending on where and how it is used.

      Measurable benefits from data deduplication

      By looking at all the ways and places one can benefit from data deduplication, an IT organization can make the right decision on where to begin using this powerful technology. IT organizations have seen a range of measurable benefits from deduplication that includes the ability to:

      1. Move up to 90 percent less VM data

      2. Reduce backup storage by as much as 95 percent

      3. Minimize backup windows and reduce network utilization by up to 90 percent

      4. Eliminate 80 percent of tape costs and obviate the need to invest in virtual tape libraries
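Percentage reductions like those above are often quoted by vendors as deduplication ratios instead. The short Python sketch below shows the arithmetic that links the two. The function names are hypothetical, and the 10,000 GB figure is an invented example, not a number from the article.

```python
def reduction_to_ratio(reduction_percent: float) -> float:
    """Convert a storage reduction percentage into an N:1 deduplication ratio."""
    if not 0 <= reduction_percent < 100:
        raise ValueError("reduction_percent must be in the range [0, 100)")
    return 100.0 / (100.0 - reduction_percent)


def stored_size(logical_size: float, reduction_percent: float) -> float:
    """Physical capacity needed after deduplication, in the units of logical_size."""
    return logical_size * (100.0 - reduction_percent) / 100.0


# A 95 percent reduction corresponds to a 20:1 ratio,
# and a 90 percent reduction to a 10:1 ratio.
ratio_95 = reduction_to_ratio(95)
ratio_90 = reduction_to_ratio(90)

# Hypothetical example: 10,000 GB of backup data at a 95 percent reduction.
physical_gb = stored_size(10_000, 95)
```

The relationship is nonlinear: going from a 90 to a 95 percent reduction doubles the ratio, which is why small differences in quoted percentages can mean large differences in required capacity.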

      Data deduplication can be performed in two places: at the source or at the target. Deduplication as close to the information source as possible delivers the most value and can enhance a large part of many environments. Of course, every environment is different and these decisions should be based on the type of data, the volume and, of course, the recovery service-level agreements (SLAs) of that environment.

      Data deduplication at the source

      With source (often referred to as client-side) data deduplication, data is deduplicated before it is transmitted across the network and stored. By eliminating redundant data before it is sent across the network, deduplication at the source improves the efficient utilization of bandwidth, storage and VM resources across the entire infrastructure.
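The exchange described above can be sketched as follows in Python. This is an illustration under stated assumptions, not any vendor's protocol: `BackupServer` is a hypothetical in-memory stand-in for the backup target, chunks are fixed-size, and fingerprints are SHA-256 hex digests. The client fingerprints its data locally, asks the server which fingerprints it lacks, and transmits only those chunks.

```python
import hashlib
from typing import Dict, Iterable, Set


class BackupServer:
    """Hypothetical in-memory stand-in for a backup target."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}

    def missing(self, digests: Iterable[str]) -> Set[str]:
        """Return the fingerprints the server does not yet hold."""
        return {d for d in digests if d not in self.chunks}

    def upload(self, new_chunks: Dict[str, bytes]) -> None:
        """Store chunks the client decided to send."""
        self.chunks.update(new_chunks)


def client_backup(data: bytes, server: BackupServer, chunk_size: int = 4) -> int:
    """Back up data and return the number of chunk bytes sent over the network."""
    # Fingerprint locally; duplicate chunks within this backup collapse here.
    chunks: Dict[str, bytes] = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        chunks[hashlib.sha256(chunk).hexdigest()] = chunk

    # Only fingerprints cross the network at this point, not chunk data.
    needed = server.missing(chunks)
    to_send = {digest: chunks[digest] for digest in needed}
    server.upload(to_send)
    return sum(len(chunk) for chunk in to_send.values())


server = BackupServer()
first = client_backup(b"ABCDABCDEFGH", server)   # two unique chunks sent
second = client_backup(b"ABCDEFGHIJKL", server)  # only the new chunk sent
third = client_backup(b"ABCDEFGHIJKL", server)   # nothing new to send
```

A real client would also record the ordered list of fingerprints for each backup so the data can be restored; that bookkeeping is omitted here to keep the focus on what crosses the network.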

      It is likely that many organizations could use client-side data deduplication for as much as 60 to 80 percent of their data. This would result in faster backups, dramatically less network usage and reduced storage consumption.

      Some client-side data deduplication solutions work the same across virtual and physical environments. As a result, regardless of whether it is a VM or a physical machine, less data is stored. This not only reduces storage costs in the data center, it also makes it easier to move data to a disaster recovery site using replication.

      Data deduplication at the target

      Data deduplication can also occur at the target such as a media server or a storage appliance. With media server deduplication, backup data moves from a client (the system protected) to the backup software’s server (the media server). The media server performs the deduplication and sends only the unique data segments to the back-end storage. This leads to savings in back-end storage as well as a reduction in the infrastructure needed to store backup data.
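For contrast with the source-side approach, the sketch below models media server deduplication in Python. The function name `media_server_ingest`, the dictionary used as back-end storage, and the fixed 4-byte chunks are all assumptions made for illustration. The point it shows is that the full backup stream still reaches the media server, and the savings appear only in what is written to back-end storage.

```python
import hashlib
from typing import Dict, Tuple


def media_server_ingest(stream: bytes, backend: Dict[str, bytes],
                        chunk_size: int = 4) -> Tuple[int, int]:
    """Deduplicate a full backup stream on arrival at the media server.

    Returns (bytes received from the client, bytes written to back-end storage).
    """
    written = 0
    for offset in range(0, len(stream), chunk_size):
        chunk = stream[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        # Write a segment to the back end only if it is not already there.
        if digest not in backend:
            backend[digest] = chunk
            written += len(chunk)
    return len(stream), written


backend: Dict[str, bytes] = {}
first_run = media_server_ingest(b"ABCDABCDEFGH", backend)
second_run = media_server_ingest(b"ABCDABCDEFGH", backend)
```

In both runs the client sends all 12 bytes; the first run writes 8 bytes to the back end and the second writes none.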

      Media server data deduplication is well suited to use cases such as off-host VM backups, Network Data Management Protocol (NDMP) backups and data center workloads such as high-transaction databases that tend to have high data change rates.

      Like data deduplication at the media server, deduplication by an appliance is also considered target-side data deduplication. With a disk-based deduplication appliance, backup data moves across a network from a client to a backup server and then to the appliance. The appliance performs deduplication and writes only the unique data to its own storage, resulting in an overall reduction in backup storage.

      While most backup software products see these appliances as native disk, some vendors have begun to offer solutions with tighter integration between the software and the storage appliance. The additional integration allows organizations to further improve the performance and savings they derive from these appliances. For example, tighter integration can enhance the use of replication, improve the speed of recovery or enhance disaster recovery operations by better integrating with tape devices.

      Next steps to a data deduplication strategy

      Clearly, data deduplication is a cost-effective information management tool that organizations can use virtually anywhere in their enterprise to address pressing IT challenges. From remote offices to VMs to data center workloads, deduplication can play a role in controlling storage costs, increasing reliability and simplifying operations. Here are four questions to ask to help prioritize the approach:

      1. What percentage of data is backed up across the network?

      2. Are VM backup or recovery times satisfactory?

      3. Is there storage that could be redeployed for backup data deduplication?

      4. How much savings would be realized if 50 percent of tapes were eliminated?

      Broadly speaking, data deduplication helps organizations meet increasingly strict SLAs associated with backup windows, recovery time objectives (RTOs) and recovery point objectives (RPOs). But remember that organizations can benefit from deduplication in more than one place. Client-side deduplication can improve backup times for both physical machines and VMs and reduce bandwidth requirements. Target-side deduplication offers similar storage benefits and may not require updates to existing backup clients.

      Finally, there are solutions on the market that offer a combination of both source and target data deduplication to achieve even greater storage savings and ROI. Find the approach that works best for you. You’ll soon realize that deduplication is no longer a “nice to have.” It is a requirement in the data center.

      Eric Schou is a Senior Manager with Symantec Corporation. Before joining Symantec, Eric spent over 10 years in the storage industry, working for both Maxtor Corporation and Quantum Corporation in a marketing capacity. Prior to that, Eric worked for Arrow Electronics for five years as a senior sales representative, managing Tier 1 distribution customers. He can be reached at eric_schou@symantec.com.
