Five 'Dirty Little Secrets' to Know When Buying a Data Archive

 
 
By Chris Preimesberger  |  Posted 2008-10-10

It turns out there are several so-called dirty little secrets about archiving products that not every vendor will tell you ahead of time. These "secrets" fall into five categories: scalability, data protection, performance, data migration and energy efficiency. If you're in the market, you need to read this first.

Dirty Little Secret No. 1: Scalability. CAS (content-addressable storage) archives have a hard limit on the number of objects that can be stored.

This object limit is a very different metric from the total amount of usable storage a system has.

"What nobody tells you is that as you grow the number of your stored objects, you're going to run into a few challenges," said Bob Woolery, senior vice president of marketing at Nexsan, which makes SANs (storage area networks) and archiving packages. "Let's say you have 5 terabytes of space. You say, 'Great, when I run out of 5TB, I'll buy 5TB more.' And you purchase it based on that. But the other constraint is your object count.

"Why this is important is that you can grow your archive so large in terms of object count that the system will give you an 'all full up,' when you still may have plenty of capacity left," Woolery said.

So you call up your local vendor and tell him that your system thinks it is full when you still have, say, 2TB of capacity left. "That's when you find out that the object count is what really determines how much capacity you use," Woolery said.

The object limit can be reached long before the storage limit, which means customers then have to invest in a second expensive database even though they technically still have space available.

A good example of this is e-mail. A company may archive all e-mail for compliance purposes. The vast majority of these e-mail objects may be small, but their sheer number can max out the archive's object limit quickly, leaving gigabytes or terabytes of storage space unused. This is usually a big shock for companies.
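The arithmetic behind that shock can be sketched in a few lines of Python. This is an illustration only: the 5TB capacity and the 2TB left over come from Woolery's example, while the 100 million object limit, the 30KB average e-mail size and the names ArchiveLimits and first_limit_hit are assumptions made up here, not figures or interfaces from any vendor.

```python
from dataclasses import dataclass

TB = 10**12  # decimal terabyte, in bytes
KB = 10**3   # decimal kilobyte, in bytes


@dataclass
class ArchiveLimits:
    capacity_bytes: int  # usable storage the customer bought
    max_objects: int     # hypothetical hard limit on stored objects


def first_limit_hit(limits: ArchiveLimits, avg_object_bytes: int):
    """Return (limit reached first, objects stored, bytes left unused)."""
    # How many objects of this average size would fit by capacity alone.
    objects_by_capacity = limits.capacity_bytes // avg_object_bytes

    if limits.max_objects < objects_by_capacity:
        # The object count runs out while storage is still free.
        used = limits.max_objects * avg_object_bytes
        return "object count", limits.max_objects, limits.capacity_bytes - used

    # Otherwise the archive fills up by capacity first.
    used = objects_by_capacity * avg_object_bytes
    return "capacity", objects_by_capacity, limits.capacity_bytes - used


# Assumed figures: 5TB archive, 100 million object limit, 30KB average e-mail.
archive = ArchiveLimits(capacity_bytes=5 * TB, max_objects=100_000_000)
reason, stored, unused = first_limit_hit(archive, avg_object_bytes=30 * KB)
print(f"Full by {reason} after {stored:,} objects; {unused / TB:.1f}TB unused")
```

Under these assumed numbers, the archive reports full by object count with 2TB still free. With much larger objects, say 1MB each, the same archive would fill by capacity long before the object limit mattered.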

Dirty Little Secret No. 2: Performance degradation. As objects pile up in an archive, the archive slows down dramatically.

"What they don't want to tell you is that all of a sudden when you get near your object limit, you get this 'crawl' effect," Woolery said. "When you look under the hood of an archive, you see a single database. With the exception of [Nexsan's] Assureon, which has a dual [database], all of those systems have a single database. It can be a small or a large one, but it is still a single database."

A single database simply becomes overwhelmed by managing a large number of objects and all their corresponding metadata.

"Because it had to manage an ever-growing number of objects and process them, the processors within the archive end up spending so much time managing those objects that they're not able to take in as many files and push them out the door when you need them," Woolery said.

A dual-database setup alleviates this issue, he said.



 
 
 
 
Chris Preimesberger was named Editor-in-Chief of Features & Analysis at eWEEK in November 2011. Previously he served eWEEK as Senior Writer, covering a range of IT sectors that include data center systems, cloud computing, storage, virtualization, green IT, e-discovery and IT governance. His blog, Storage Station, is considered a go-to information source. Chris won a national Folio Award for magazine writing in November 2011 for a cover story on Salesforce.com and CEO-founder Marc Benioff, and he has served as a judge for the SIIA Codie Awards since 2005. In previous IT journalism, Chris was a founding editor of both IT Manager's Journal and DevX.com and was managing editor of Software Development magazine. His diverse resume also includes: sportswriter for the Los Angeles Daily News, covering NCAA and NBA basketball, television critic for the Palo Alto Times Tribune, and Sports Information Director at Stanford University. He has served as a correspondent for The Associated Press, covering Stanford and NCAA tournament basketball, since 1983. He has covered a number of major events, including the 1984 Democratic National Convention, a Presidential press conference at the White House in 1993, the Emmy Awards (three times), two Rose Bowls, the Fiesta Bowl, several NCAA men's and women's basketball tournaments, a Formula One Grand Prix auto race, a heavyweight boxing championship bout (Ali vs. Spinks, 1978), and the 1985 Super Bowl. A 1975 graduate of Pepperdine University in Malibu, Calif., Chris has won more than a dozen regional and national awards for his work. He and his wife, Rebecca, have four children and reside in Redwood City, Calif. Follow on Twitter: editingwhiz
 
 
 
 
 
 
 
