A Trial by Fire
Take this copy of eWeek, your laptop and cell phone to the nearest cafe. As you settle in with a coffee, imagine that this is all that's left of your entire office.
Disaster recovery planning became a top priority after the events of Sept. 11, 2001, but one company's recent experience shows that the best of recovery plans can be skewed by seemingly mundane events and that good disaster recovery planning isn't a one-time event but a constantly evolving strategy.
In the wee hours of July 15, a fire started on a deck of the two-story wood-frame building in Walnut Creek, Calif., that had been home to WildPackets Inc. for the past 12 years.
The fire department was on the scene 5 minutes after smoke and heat sensors went off, but that's all the time it took for the fire to take hold. By daylight, the building was a total loss.
Fortunately, except for some spiders that have pestered WildPackets employees since they moved in, no one was in the building, and no emergency personnel were injured while fighting the fire. However, aside from two hard drives (one from a Macintosh and the other from a PC), nothing of value was recovered from the former home of this 60-plus-employee network traffic monitoring company. The losses included the most recent 70-tape backup library of the company's intellectual property, customer records and financial reports.
"I got the call from the alarm company at 12:43 a.m.," said Mahboud Zabetian, WildPackets' president and CEO, "but even as I left my house, which was just a couple miles away from the office, I could see an orange glow in the sky."
Before passing by the rest of this story, make sure you can answer these simple questions: Where are the passwords for your company's encrypted tape backups? When was the last backup taken off-site? How long would it take to restore data if the source equipment were gone? Has anyone in the organization even practiced a restore from tape? Is the loss plan based on single machines or on whole departments? For that matter, when was the last time a fire drill was conducted at the organization?
WildPackets didn't have the answers to many of these questions before the fire and is paying the price now.
Consider this: It took WildPackets nearly two weeks to recover the data from a month-old tape backup that was stored off-site. And the company's Web site was down for three days while a DNS (Domain Name System) change worked its way through the bureaucracy at VeriSign Inc.
The company's experience should prompt many IT managers to shake up their approach to business continuity.
First, make sure to set aside time to really pick apart standard IT policies. For example, since its start in business, WildPackets had the habit of keeping tape libraries month by month. This was fine when the small company filled only one or two tapes a month. But, by the time of the fire, the library was a mammoth 70 tapes. "We should have looked at that policy and said, 'When the library is 10 tapes, that's it, time to start a new one,'" Zabetian said.
As it was, WildPackets received a lot of special assistance from backup software company Dantz Development Corp. First, WildPackets needed to recover the encryption keys for its backups. Second, and much more time-consuming, employees had to manually search the backup catalog for the essential files they needed to get back to work.
Backup recovery was complicated by the fact that at the beginning of July, just weeks before the fire, WildPackets started using a new AIT-3 (Advanced Intelligent Tape-3) backup system. Naturally, the tape in the high-capacity AIT-3 drive at the time of the fire was destroyed.
WildPackets' hopes of a speedy restore were dashed when it discovered that an archived AIT-3 tape in the company's on-site fireproof box looked OK but had been made so brittle by the intense heat that it crumbled when salvage experts at Computer Conversions Inc. tried to load it.
DriveSavers Data Recovery Inc. gave WildPackets its first break. The data salvage company was able to spin up the PC disk drive salvaged from the site. It was a pure stroke of luck that the disk contained WildPackets' Intuit Inc. QuickBooks accounting package. WildPackets was able to restore its accounts receivable, invoices and many customer records from this system.
Now, said Zabetian, backup tapes will be going off-site more frequently. The company is also considering adding another T-1 communications circuit just to handle moving incremental backups out of the building. In addition, WildPackets is looking at products and Windows operating system tools that will centralize data on its servers, not on individual desktop systems.
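For IT managers who want to turn these backup lessons into something checkable, here is a minimal Python sketch of a policy check. It is an illustration only, not WildPackets' actual procedure: the 10-tape cap echoes Zabetian's remark, while the seven-day off-site window, the class and function names, and the sample dates are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Hypothetical thresholds. The 10-tape cap echoes Zabetian's remark;
# the 7-day off-site window is an assumption chosen for illustration.
MAX_TAPES_PER_LIBRARY = 10
MAX_OFFSITE_AGE = timedelta(days=7)


@dataclass
class BackupStatus:
    tapes_in_library: int      # tapes in the current backup library
    last_offsite_copy: date    # date the newest off-site backup was made


def policy_warnings(status: BackupStatus, today: date) -> List[str]:
    """Return a warning for each policy limit that has been reached."""
    warnings = []
    if status.tapes_in_library >= MAX_TAPES_PER_LIBRARY:
        warnings.append("start a new tape library")
    if today - status.last_offsite_copy > MAX_OFFSITE_AGE:
        warnings.append("send a backup off-site")
    return warnings


# Placeholder dates: a 70-tape library and a month-old off-site copy.
status = BackupStatus(tapes_in_library=70, last_offsite_copy=date(2020, 6, 15))
for warning in policy_warnings(status, today=date(2020, 7, 15)):
    print(warning)
```

Run on a schedule, a check like this would flag a library that has outgrown its policy long before anyone has to restore from it.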
Before the fire, WildPackets used Genuity Inc.'s hosting service for some Web site and FTP support but still maintained its own DNS and e-mail servers.
Now, everything's going to Genuity, whose disaster-preparedness strategy includes redundant power and fire suppression systems.
To file insurance claims properly and promptly, WildPackets needed detailed records of the equipment it lost in the fire, including information such as the purchase price, date of purchase and replacement value.
WildPackets reconstructed most of this information from state tax records (a good incentive to accurately list all business assets) and from receipts stored at the company's tax preparer's office.
Larger enterprises should take this lesson to heart and ensure that all hardware and software assets are inventoried regularly. This information should be stored off-site, just like tape backups.
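As a rough sketch of what such an inventory might record, the Python below captures the fields insurers asked WildPackets for (purchase price, date of purchase and replacement value) and writes them out as CSV text that could be copied off-site. The `Asset` class, the function names and the two sample entries are hypothetical and exist only to show the shape of the data.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date
from typing import List


@dataclass
class Asset:
    name: str
    purchase_date: date
    purchase_price: float
    replacement_value: float


def inventory_csv(assets: List[Asset]) -> str:
    """Render the inventory as CSV text suitable for off-site storage."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(Asset)])
    writer.writeheader()
    for asset in assets:
        writer.writerow(asdict(asset))
    return buf.getvalue()


def total_replacement_value(assets: List[Asset]) -> float:
    """Sum the replacement values, the figure an insurance claim starts from."""
    return sum(asset.replacement_value for asset in assets)


# Hypothetical entries, for illustration only.
sample = [
    Asset("File server", date(2020, 3, 1), 4000.0, 4500.0),
    Asset("Tape drive", date(2020, 7, 1), 1500.0, 1700.0),
]
print(inventory_csv(sample))
print(total_replacement_value(sample))
```

The point is less the format than the habit: a record like this is only useful after a fire if a current copy lives somewhere other than the building that burned.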
WildPackets executives had to reconstruct this crucial claims information from a scattering of sources at the same time that they were trying to secure new offices, rally employees, and project an air of confidence to customers, vendors, insurance companies and banks.
Today, WildPackets is finishing a bid for a new, sprinkler-equipped building and hopes to take occupancy by the end of the year. During eWeek Labs' visit, employees were sharing tight quarters with little private work space, but the atmosphere was one of business as usual as they worked to develop new products.
The biggest issue now, they say, is the lack of a full-fledged quality assurance testbed; the company is waiting until it moves into its new offices to build one.
One final note: During the recovery process, WildPackets employees did find a few non-IT survivors in the burned-out remains of an office drawer: several pesky spiders.
Senior Analyst Cameron Sturdevant can be reached at firstname.lastname@example.org.