    What Is the Difference Between Data Deduplication, File Deduplication, and Data Compression?

    By
    eWEEK EDITORS
    -
    August 15, 2007

      Q: Can you explain the differences between compression, file deduplication and data deduplication?
      A: All of these techniques fit into one overall market and technical concept: capacity optimization, or data reduction. This refers to a broad group of products that seek to reduce the amount of data that has to be stored. Roughly speaking, you can rank these techniques by the amount of data reduction they yield. Compression might typically get you a 2-to-1 reduction. File deduplication, which is commonly known as content addressable storage or CAS, might yield a 3-to-1 or 4-to-1 reduction. But data deduplication, which is deduplication at the level of individual disk blocks or “chunks” rather than entire files, can often give you a 20-to-1 reduction or better, depending on the type of data. Remember, we're talking about the aggregate reduction in the total amount of data stored on your backup storage device, not necessarily the reduction in any particular file or block, which can vary considerably.
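
To put those ratios in perspective, here is a small arithmetic sketch in Python. The 10 TB of logical backup data is an assumed figure chosen purely for illustration, and the ratios are the rough ones quoted above, not measurements.

```python
def stored_tb(logical_tb: float, ratio: float) -> float:
    """Physical capacity needed for `logical_tb` of data at an N-to-1 reduction."""
    return logical_tb / ratio


def space_saved_percent(ratio: float) -> float:
    """Share of capacity saved by an N-to-1 reduction, as a percentage."""
    return 100.0 * (ratio - 1) / ratio


if __name__ == "__main__":
    LOGICAL_TB = 10.0  # assumed amount of backup data, for illustration only
    for label, ratio in [
        ("compression", 2),
        ("file deduplication", 4),
        ("data deduplication", 20),
    ]:
        print(
            f"{label}: {stored_tb(LOGICAL_TB, ratio)} TB stored, "
            f"{space_saved_percent(ratio)}% saved"
        )
```

Note that the savings do not grow in step with the ratio: going from 2-to-1 to 20-to-1 moves the space saved from 50 percent to 95 percent.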

      Q: Why is data deduplication so much more effective in reducing data than file deduplication?
      A: Data deduplication examines all your data on the block level and eliminates redundant blocks. So obviously it will take care of entire files that are redundant, but unlike file deduplication it will also eliminate the redundant pieces that occur when many slightly different versions of the same file are created by users or by applications like Microsoft Exchange. If users have been e-mailing back and forth a PowerPoint file while making minor changes, you can end up storing 10 or 20 files whose content is 95 percent identical. Data deduplication will catch that.
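
The PowerPoint scenario can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: it assumes fixed-size 4 KB blocks, SHA-256 fingerprints, and synthetic "files" in which each of nine edited versions overwrites 5 of 100 blocks in place. All function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; real products may chunk differently


def _digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def file_level_stored_bytes(files):
    """Bytes kept when only whole-file duplicates are eliminated."""
    unique = {_digest(data): len(data) for data in files}
    return sum(unique.values())


def block_level_stored_bytes(files, block_size=BLOCK_SIZE):
    """Bytes kept when every distinct block is stored only once."""
    unique = {}
    for data in files:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            unique[_digest(block)] = len(block)
    return sum(unique.values())


def make_versions(num_versions=10, num_blocks=100, edits_per_version=5):
    """Build one base file plus edited copies that each overwrite a few blocks."""
    base = [bytes([i]) * BLOCK_SIZE for i in range(num_blocks)]
    versions = [b"".join(base)]
    fill = num_blocks  # next unused byte value, so every edited block is new
    step = num_blocks // edits_per_version
    for _ in range(num_versions - 1):
        edited = list(base)
        for j in range(edits_per_version):
            edited[j * step] = bytes([fill]) * BLOCK_SIZE
            fill += 1
        versions.append(b"".join(edited))
    return versions


if __name__ == "__main__":
    versions = make_versions()
    print("file-level :", file_level_stored_bytes(versions), "bytes")
    print("block-level:", block_level_stored_bytes(versions), "bytes")
```

Because no two versions are byte-for-byte identical, the file-level pass keeps all ten files, while the block-level pass keeps the 100 original blocks plus the 45 edited ones. The sketch relies on in-place edits; an insertion would shift every later fixed-size block, which is one reason some products use variable-size chunking instead.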

      Q: When should you use data deduplication and when should you use file deduplication?
      A: A very short answer would be that file deduplication is often used for backup solutions in so-called ROBO environments (remote office, branch office). Data deduplication can be used either in the data center itself, as a software function installed on the intelligent disk target, or on the backup client side in a ROBO environment.

      Q: Who are some of the more commonly used data deduplication vendors?
      A: There are plenty of vendors, because data deduplication is a very hot area these days, especially now that the VTL (virtual tape library) vendors are getting involved. There is Avamar (acquired by EMC), Symantec PureDisk, Asigra, Data Domain, Diligent Technologies, FalconStor, Sepaton and Quantum. Network Appliance has a product in beta.

      Q: Who are some of the more commonly used file deduplication or content addressable storage vendors?
      A: EMC has the Centera product line. Then there is Archivas (recently acquired by Hitachi Data Systems) and Caringo.

      Q: What accounts for the difference in yield between compression and file deduplication?
      A: With compression, you use an algorithm to reduce the size of a particular file by eliminating redundant bits. But if your users or applications have stored the same file multiple times, then no matter how good your compression method is, your backup storage will end up with multiple copies of the compressed file. File deduplication goes a step further and eliminates these redundant copies, storing only one. So it gives you more reduction than compression alone.
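
A short Python sketch makes the point concrete. It assumes zlib as a stand-in compression algorithm and SHA-256 to detect identical files; the function names and the sample document are hypothetical, and actual numbers depend entirely on the data.

```python
import hashlib
import zlib


def compressed_only_bytes(files):
    """Compress every file independently; duplicate files are stored again."""
    return sum(len(zlib.compress(data)) for data in files)


def _unique_files(files):
    """Keep one copy of each distinct file, keyed by its SHA-256 digest."""
    return {hashlib.sha256(data).digest(): data for data in files}.values()


def file_dedup_only_bytes(files):
    """Store each distinct file once, uncompressed."""
    return sum(len(data) for data in _unique_files(files))


def dedup_then_compress_bytes(files):
    """Store each distinct file once, then compress the surviving copy."""
    return sum(len(zlib.compress(data)) for data in _unique_files(files))


if __name__ == "__main__":
    document = b"quarterly report line\n" * 1000  # assumed sample file
    backups = [document] * 5  # five users saved the same file
    print("compression only :", compressed_only_bytes(backups))
    print("file dedup only  :", file_dedup_only_bytes(backups))
    print("dedup + compress :", dedup_then_compress_bytes(backups))
```

With five identical copies, compression alone stores five compressed files, while deduplication collapses them to one. The two techniques are complementary, so the last function applies both.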

      Q: Where does delta block optimization fit in?
      A: This is another capacity optimization technique. It's used by incremental remote backup products like Connected (acquired by Iron Mountain) and EVault (acquired by Seagate). When you go to back up the most recent version of a file that has already been backed up, the software looks at it and tries to figure out which blocks are new. Then it writes only these blocks to backup and ignores the blocks in the file that haven't changed. But again, compared with file deduplication, this technique has the same shortcoming as compression. If two users sitting in the same office have identical copies of the same file, then delta block optimization will create two identical backups instead of storing just one, as file deduplication would.
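
The idea can be sketched in Python as follows. This is a simplified illustration, not how Connected or EVault actually work: it assumes fixed-size 4 KB blocks compared by position using SHA-256, and the function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE):
    """Fingerprint each fixed-size block of a file, in order."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).digest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(previous: bytes, current: bytes, block_size: int = BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that are new or differ.

    Only the current version is scanned, so a file that shrank would need
    separate handling that this sketch leaves out.
    """
    old = block_hashes(previous, block_size)
    delta = {}
    for index, offset in enumerate(range(0, len(current), block_size)):
        block = current[offset:offset + block_size]
        if index >= len(old) or hashlib.sha256(block).digest() != old[index]:
            delta[index] = block
    return delta


if __name__ == "__main__":
    last_backup = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
    new_version = b"A" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"C" * BLOCK_SIZE + b"D" * 100
    delta = changed_blocks(last_backup, new_version)
    print("blocks to write:", sorted(delta))
```

Each file is compared only with its own previous backup, which is why two users holding identical copies of a file still produce two separate backups under this scheme.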
