Research Specifies How Compounding of Data Causes Storage Problems
Not having a specified plan to edit, deduplicate and/or compress stored business data in backup or in archives is like compound interest at a bank, but in a not-so-good way.
At the bank, money slowly builds upon itself to result in higher income for the client. However, in storage, data builds upon itself to slowly but surely undermine an entire storage system.
If duplicate and unnecessary business data and files keep multiplying upon themselves, the mass of content slowly becomes too unwieldy for systems to handle properly. This doesn't happen overnight; rather, it insidiously poisons a storage system as backup builds upon backup, tapes pile upon tapes, and control over everything becomes lost.
Ultimately, what's left are piles of digital tapes or racks of loaded arrays with no clear way to access specific files.
"We're dealing with a situation in which companies are celebrating that they recycle paper, plastic and whatever else, but have the utmost worst policies for information management in the data center," Symantec Director of Product Marketing Sean Reagan told eWEEK.
Symantec recently released the findings of its 2010 Information Management Health Check Survey, the main message of which is that a majority of enterprises are not following their own advice when it comes to information management.
Eighty-seven percent of respondents believe in the value of a formal information retention plan, but only 46 percent actually have one, the survey found. The survey also found that too many enterprises save information indefinitely instead of implementing policies that allow them to confidently delete unimportant data or records. As a result, they suffer from rampant storage growth, unsustainable backup windows, increased litigation risk, and expensive and inefficient discovery processes.
The survey covered 1,700 companies, each with more than 500 employees, in 26 countries. Just over 90 percent of the respondents said they believe they should have a policy to delete data whenever it needs to be deleted.
Simple idea, but not so easy to implement
"The fact is, while most people think it's a good idea to just keep everything, it is not a good idea to keep everything forever," Reagan said. "The way companies are dealing with this today doesn't work. It doesn't work from a storage perspective because you just can't afford to keep everything forever. Infinite retention policies lead to infinite waste."
This is where the compounding comes into play. Companies that do not have clear data deletion policies keep building up data stores that are backed up each night or week, with nonsignificant files and other data lumped in that shouldn't be there.
The more that data compounds upon itself at each backup, the slower and more inefficient the backup and subsequent storage become.
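To make the compounding concrete, here is a rough Python sketch of how retained backup volume grows with and without an expiry policy. It is an illustration only: the function name, the 10TB starting size, the 1 percent weekly growth rate and the four-week retention window are all hypothetical assumptions, not figures from the Symantec survey. It also assumes a full backup every cycle, so it overstates what a shop using incremental backups or deduplication would actually hold.

```python
def retained_backup_tb(primary_tb, growth_rate, periods, retention=None):
    """Return the total TB held in retained full backups after each cycle.

    primary_tb  -- size of the live data set at the first backup (hypothetical)
    growth_rate -- fractional growth of the live data set per cycle (hypothetical)
    periods     -- number of backup cycles to simulate
    retention   -- how many of the most recent backups to keep; None keeps all
    """
    backups = []  # sizes of the full backups still on hand
    totals = []   # total retained volume after each cycle
    size = primary_tb
    for _ in range(periods):
        backups.append(size)              # take a full backup of the live data
        if retention is not None:
            backups = backups[-retention:]  # expire everything older
        totals.append(sum(backups))
        size *= 1 + growth_rate           # live data grows before the next cycle
    return totals


# Assumed scenario: 10TB of live data, 1 percent weekly growth, one year of weekly backups.
keep_everything = retained_backup_tb(10.0, 0.01, 52)
keep_four_weeks = retained_backup_tb(10.0, 0.01, 52, retention=4)

print(f"Week 52, nothing expired:  {keep_everything[-1]:,.0f} TB retained")
print(f"Week 52, 4-week retention: {keep_four_weeks[-1]:,.0f} TB retained")
```

Under these assumptions, the keep-everything total climbs with every cycle, while the four-week policy holds only the four most recent backups and so tracks the size of the live data.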
"It also doesn't work from a recovery and discovery perspective," Reagan said. "Companies that have just kept everything on file with no real organizing factor to it end up in a serious disadvantage when it comes time to refute allegations that they've deleted e-mail inappropriately or haven't managed their information."
Why aren't companies getting a better handle on managing their data?
"People are taking the traditional way out," Reagan said. "In the past, companies have been buying storage, and they have a pretty well-thought-out process for that. It's very easy to throw storage at the problem. Deduplication can be added to help people store data better, but dedupe won't be the answer to everything. Dedupe will do what it can, but eventually companies are going to have to look closely at their data and start expiring some of it.
"I think people are deferring to the processes they've known and understood, and not taking it forward."
Eventually, companies will reach a point where this strategy is a "fail," Reagan said. They will completely run out of options for managing their data and will be forced to make rapid decisions about what to delete, he said.
All this disorganized storage can end up being a real cost center for businesses, especially when files have to be found within a fixed window of time for litigation.
"The time it takes to recover some of these massive backups, or to find information for litigation or internal investigations, is enormous, and the cost is enormous. We've got some math that shows it's somewhere between 1,500 and 3,000 times more expensive to search and review information than to actually store it," Reagan said.
"The more we store, the bigger this downstream problem gets. And there is only a subset of companies that are actually aware of this and figuring it out."
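For a sense of scale, the short Python sketch below applies the 1,500-to-3,000 multiple Reagan cited to a storage cost. The function name and the sample storage figure are hypothetical; only the two multiples come from the quote, and Symantec's underlying math was not published with the survey.

```python
def review_cost_range(storage_cost, low_multiple=1500, high_multiple=3000):
    """Return (low, high) search-and-review cost for data that costs
    storage_cost to store, using the multiples quoted by Symantec."""
    return storage_cost * low_multiple, storage_cost * high_multiple


# Assumed figure for illustration only: $100 to store a given set of data.
low, high = review_cost_range(100)
print(f"Storing it: $100; searching and reviewing it: ${low:,} to ${high:,}")
```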