Spring cleaning — taking a long, hard look at your possessions to weed out the junk, so as to free up living space — is a longtime tradition in many cultures. Since we spend a good portion of our lives on computers, it also makes good sense to “spring clean” personal desktop/laptop systems, as well as business-use computers and servers.
Let’s face it: How often during the day do you pull aside a file of some kind, intending to return to it when time allows? Subsequently, how often do you not return to it? If you do happen to return, do you trash the file when you’re done? Those orphaned files, whether Word documents, photos, music, videos or shortcuts to something on the Web, simply become junk after a short while.
This doesn’t even take into account all the automatic junk files an operating system such as Windows creates.
In business, this type of faux storage likely doubles or triples in intensity. And it all adds up very quickly.
Like anything else, if you do a little at a time regularly instead of a major once-in-a-great-while assault, it helps avoid a lot of headaches. Unfortunately, when most people start up a PC that has 80GB or 160GB of storage capacity on the main drive — not to mention adjunct drives that may hold up to 250GB or more — the immediate thought usually is: “I’m swimming in capacity. I’ll not have to worry about space getting tight for a few years.”
While that may be true at this time, it is definitely the wrong approach.
“IDC [and other analysts] talk about data doubling every 18 to 24 months,” Dave Roberson, senior vice president and general manager of Hewlett-Packard StorageWorks, told eWEEK. “Most of that is being driven by the content, or unstructured data.
“Database data is growing, but not nearly as fast as unstructured data. The drivers of that are Web 2.0 companies; a perfect example is our own SnapFish, where they’re adding a million customers a month and a petabyte a month of capacity,” Roberson said.
SnapFish is HP’s online photo sharing/developing/printing service.
“Yes, a million [new subscribers] per month, and I have no reason not to believe it,” Roberson said. “A petabyte a month is also a pretty amazing statistic.”
Numerous Web 2.0 companies, including Amazon, eBay, Facebook, MySpace, Flickr and others, are facing the same issues.
How does such a business keep track of all those files?
When you’re dealing with that kind of data stock, there’s only one answer: install storage archiving software. EMC, IBM, Symantec, NetApp and HP are among the larger names offering this now, and the market, analysts say, is only going to keep growing for at least the next five years.
Business systems need to have specific cataloging and file-saving protocols that must be followed by everyone, all of the time. How many of your employees do you think are saving their files incorrectly and in the wrong places? And how many are saving files they shouldn’t be saving, such as MP3s, videos, personal photographs and other documents, on business computers?
Spot-checks of all company computers — with the full knowledge of all employees — should be scheduled to keep everybody honest and the system lean.
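A spot-check of this kind can be partly automated. What follows is a minimal sketch in Python, not a finished audit tool: the function name audit_tree and the list of "personal" file extensions are assumptions made for illustration, and a real policy would be set by each business. The script only reports what it finds; it deletes nothing.

```python
import os
from collections import defaultdict

# Hypothetical policy for this sketch: extensions treated as personal media.
PERSONAL_EXTENSIONS = {".mp3", ".wma", ".avi", ".mov", ".mp4", ".jpg", ".jpeg"}


def audit_tree(root, extensions=PERSONAL_EXTENSIONS):
    """Return {extension: (file_count, total_bytes)} for matching files under root."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare extensions case-insensitively (SONG.MP3 counts as .mp3).
            ext = os.path.splitext(name)[1].lower()
            if ext not in extensions:
                continue
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            counts[ext] += 1
            sizes[ext] += size
    return {ext: (counts[ext], sizes[ext]) for ext in counts}
```

Calling audit_tree on a shared folder returns one entry per extension found, which is enough to show at a glance how much space personal media is taking up before anyone decides what to remove.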
There are few things more frustrating than spending a lot of time looking for something that should have been easy to locate. It’s also irritating to search for a file that someone has accidentally deleted. If you already have a storage archive system, an audit should be on the list of things to do for this year’s spring cleaning.
For very large corporate storage/archiving needs, HP will be coming out later this year with a new system called StorageWorks 9100 Extreme Data Storage System (ExDS9100), designed specifically for businesses with multi-petabyte data storage requirements such as Web 2.0, digital media and other large enterprise customers.
This system is built on blade servers in order to provide the performance needed to drive extreme capacity requirements, Roberson said. The base package starts with four blades, each of which can deliver up to 200MB per second of performance. The system can scale up to a maximum configuration of 16 blades with up to 128 cores per unit, for a 3.2GB/second performance level, Roberson said. More information will be available at a later date.
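The quoted throughput figures are consistent with one another if performance scales linearly with the number of blades. The short Python check below makes that arithmetic explicit. Linear scaling and the decimal convention of 1GB = 1,000MB are assumptions of this sketch, not HP specifications, and the function name is hypothetical.

```python
MB_PER_SECOND_PER_BLADE = 200  # per-blade figure quoted above


def aggregate_throughput_gb(blades, mb_per_blade=MB_PER_SECOND_PER_BLADE):
    """Aggregate throughput in GB/s, assuming linear scaling and 1GB = 1,000MB."""
    return blades * mb_per_blade / 1000


print(aggregate_throughput_gb(4))   # base package: 0.8
print(aggregate_throughput_gb(16))  # maximum configuration: 3.2
```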
The Beast of the Enterprise
As far as e-mail archiving is concerned, Microsoft Exchange is the big beast that enterprises have to deal with, about 85 to 90 percent of the time, in fact. Outlook e-mail is often kept in large .pst files, which are like big balls of yarn that just keep growing and taking up valuable storage space.
But Exchange is well known for being complicated and difficult to deal with, largely because the code base has been added to and patched over so many times in the last decade.
A number of companies have sprung up around Exchange — including such firms as Azaleos and PostPath — just to make it “so Exchange doesn’t suck so badly,” PostPath CEO Duncan Greatwood told eWEEK.
PostPath, based in Mountain View, Calif., calls itself the “creator of the only e-mail and collaboration server that is a drop-in alternative to Exchange.” On May 5, the company launched its Server Archive Edition.
“For the first time, simple, inexpensive and efficient e-mail archiving is available to small- and medium-sized businesses [SMBs] that need to retain copies of some or all of their e-mail communications, but don’t want to deal with the complexities and expense of Exchange,” Greatwood said.
PostPath’s secret sauce is called “simple-forward archiving,” which creates a simple repository for e-discovery searches and, the company says, ends the expense of third-party legal searches in the event of litigation.
PostPath’s Server Archive Edition also gives administrators immediate access to every old and current message in the system, enabling on-demand message recovery. Used in tandem, PostPath’s Server Archive Edition and its standards-based backup and/or high-availability mechanisms complete customers’ messaging data protection strategy.
Now, for some suggestions on how to clean out old and irrelevant files from personal computers before looking at an archive system or service:
As a first step, if you don’t have your own on-site backup system, you should sign up for an unlimited-capacity online backup service, such as Mozy (Home or Pro), Carbonite, Amazon S3, Iron Mountain or Google’s new offering, AppEngine. That way you have a safety net for everything that comes through each computer or server. Costs average about $5 per month per computer.
Provision at least two full days to allow the backup service to do its job. Once that is done, it’s time for the dirty work: going into the computer or server, finding all the files that shouldn’t be there and trashing them. It’s pretty simple to find all the MP3 files in a computer, for example, simply by using the computer’s own search function. But it’s very time-consuming to weed out all the files that shouldn’t be there.
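One way to shorten the weeding-out is to let a script draw up the list of candidates. The Python sketch below is one possible approach, not a recommended tool: it lists files that have not been modified within a given number of days, largest first, so the biggest and oldest leftovers get reviewed first. The function name find_stale_files and the one-year default are assumptions chosen for illustration. The script deletes nothing; deciding what to trash is left to a person.

```python
import os
import time


def find_stale_files(root, max_age_days=365, now=None):
    """Return (size_bytes, path) pairs for files under root that have not been
    modified within max_age_days, sorted largest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86,400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime < cutoff:
                stale.append((info.st_size, path))
    stale.sort(reverse=True)
    return stale
```

Last-modified time is used here because it is reliably recorded; last-access time is often not updated by the operating system, so it makes a poor signal for whether a file is still in use.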
Getting a good piece of maintenance software such as Webroot’s Window Washer is a good idea, at least for Windows machines. A free version is available for download that cleans out the trash, temporary Windows and Internet files, and a number of other areas on a regularly scheduled basis, if you choose to do it that way. One person can save hundreds of megabytes’ worth of capacity on a daily basis that way.
A drastic method — to be used as a last resort if the computer is so full of garbage that it just isn’t working very well anymore — is to simply clean off the main drive and reinstall the operating system and all the applications. This can take hours to do, but sometimes it’s the best way to go.
If you do decide to go down that road, be sure that all your bookmarks and e-mail contacts are backed up. It can take years to build up a large pool of that kind of information and a mere few minutes to destroy it all.
Cleaning up e-mail .pst files, which often can be a gigabyte or more in size, is more complicated. It’s time-consuming, but the best thing to do is simply go through all your e-mail, perhaps alphabetically, and delete everything that is irrelevant.
Tedious, yes. Does this cry out for an e-mail archiving system, especially if you have hundreds or thousands of seats in an enterprise? You bet.