Data storage planning for a disaster of any sort used to be a relatively simple process: The IT staff would back up all important current business data onto extra disk drive servers every night or weekend; transfer older information onto archive tape once a week or month; then watch Iron Mountain or a similar service come take the cartridges away and store them in a cool, dark place where the data would likely never be seen again.
It’s not so simple anymore, as recovery of that data is of greater concern. Planning, installing, deploying and testing disaster recovery systems is now a mandate for enterprises large, medium and small. Since the Internet began serving as a major business venue in the late 1990s, disaster recovery has become a heavily regulated part of a company’s business due to increasing litigation and liability concerns.
In fact, data recovery is now a $20 billion-per-year sector of IT. Two key factors have driven this rise: an increasing number of natural disasters around the world and a number of new data-related regulations recently enacted in the United States and Europe.
The updated Federal Rules of Civil Procedure, the 2002 Sarbanes-Oxley Act and other regulations have changed the way enterprises retain their business data and control access to it. For example, the FRCP amendments, adopted by the U.S. Supreme Court in April 2006 and in effect since December 2006, say businesses must be able to quickly find data when a federal court requires it in litigation. That means that every electronic document a business stores, including e-mail, instant messages, financial documents, computer logs, voice mail, and all text and graphical documents, must be easily retrievable.
Enterprises also must be able to show auditors or courts that they have a repeatable, predictable system in place to handle this data, a high percentage of which is personal customer information, such as contact information, Social Security numbers, buying history and credit information.
Disaster recovery processes thus have become much more automated and security-oriented.
Repeatable, Recoverable, Reportable
Lew Smith is product manager of virtualization solutions for Interphase Systems in Plymouth Meeting, Pa. The company assists enterprises with planning and managing systems infrastructure and virtualization, compliance/governance and disaster recovery and business continuity.
“Katrina, 9/11, the [Thailand] tsunami: all these events have really raised the awareness of the importance of a good DR plan with our customers,” Smith told eWEEK. “Because now it’s not a matter of if a disaster happens; customers are realizing that it’s a matter of when a disaster happens.”
As a result, companies are investing more in disaster recovery software, hardware and services.
One of the biggest drivers for disaster planning was the Sept. 11, 2001, terrorism attacks.
“I feel horrible for those businesses [that lost everything],” Smith said. “But it was an excellent lesson for businesses to learn, because they really can’t be compliant unless they have a DR system that’s repeatable, recoverable and reportable. Those are the big three pieces that have to be taken into consideration.”
Prices for servers and storage hardware generally have come down and performance has gone way up in the seven years since the 9/11 attacks. Automation of critical processes has become almost pervasive in the disaster recovery sector, making such systems easier to install and deploy.
“Hardware prices aren’t where they used to be. Have they come down a bit? Yes. Has the horsepower increased? Absolutely,” Smith said. “But if you look at the technology that’s now on top of that horsepower, specifically virtualization, that’s by far the biggest development we’ve seen in the last 10 to 15 years in the DR sector.”
The Virtualization Factor
Virtualization, certainly, is now the prime method enterprises use to consolidate the equipment in data centers, cut bottom-line power and cooling costs, and shrink carbon footprints. The portability of resources that virtualization brings provides IT managers with a lot more options for their disaster recovery strategies.
“With virtualization in the data center [and in the disaster recovery system], you now have a powerful weapon that you can use to move things faster and more efficiently,” Interphase Systems’ Smith said. “For example, from a hardware independence perspective, I now no longer have to have a physical one-to-one match between the main data center and a recovery site. I can take that virtual machine, replicate it to other locations, and bring it up in a matter of hours, if not minutes. In the past, it would take days, sometimes weeks, to do a one-to-one physical recovery.”
When you’ve got hundreds or thousands of nodes to restore or replicate, virtualization can become a major factor in getting a business itself back up and running.
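To make the hardware-independence point concrete, here is a minimal sketch in Python of the kind of orchestration Smith describes: replicate each virtual machine to a recovery site ahead of time, then bring the replicas up when the primary data center is lost. The VirtualMachine class, the replicate_to() and power_on() calls, and the RECOVERY_SITE name are hypothetical placeholders for illustration, not the API of VMware or any other hypervisor vendor.

```python
# Hypothetical sketch of VM replication and failover orchestration.
# All names and methods are invented for illustration only.

from dataclasses import dataclass

RECOVERY_SITE = "recovery-dc-east"   # assumed secondary site name

@dataclass
class VirtualMachine:
    name: str
    replicated: bool = False
    running: bool = False

    def replicate_to(self, site: str) -> None:
        # A real deployment would copy the VM image and its storage
        # to the recovery site on a schedule.
        print(f"Replicating {self.name} to {site}")
        self.replicated = True

    def power_on(self, site: str) -> None:
        print(f"Bringing up {self.name} at {site}")
        self.running = True

def fail_over(vms: list[VirtualMachine]) -> None:
    """Bring replicated VMs up at the recovery site; flag any that
    were never replicated so staff can restore them another way."""
    for vm in vms:
        if vm.replicated:
            vm.power_on(RECOVERY_SITE)
        else:
            print(f"WARNING: {vm.name} has no replica at {RECOVERY_SITE}")

if __name__ == "__main__":
    fleet = [VirtualMachine("payroll-db"), VirtualMachine("mail-01")]
    for vm in fleet:
        vm.replicate_to(RECOVERY_SITE)
    fail_over(fleet)
```

Because the recovery site only needs to run the replicated images, not mirror the primary hardware piece for piece, the one-to-one physical match Smith mentions drops out of the plan.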
VMware’s Site Recovery Manager, launched in May, has become a hot item in the disaster recovery world. VMware customers were already using the ESX virtualization platform to back up and replicate virtual machines and storage for months before Site Recovery Manager was released as a formal product.
“Virtualization really has taken DR to the next level. Customers are taking that financial leap to acquire the technology now because of the benefits on the back end, due to the extreme portability of virtualized applications and storage,” Smith said.
Systems such as Site Recovery Manager can ease a lot of the worry that comes when auditors knock on a company’s doors. Site Recovery Manager gives a CIO or data center manager an electronic record to show auditors on the spot: a report, generated in a few minutes, that outlines all the disaster recovery testing that has been run, which parts of the system passed and failed, and the tests that were rerun after fixes were put in place.
“Being able to say, ‘This is how we do it here,’ and print out this report while they’re sitting there with you, that’s an incredible weapon in your arsenal,” Smith said. “That versus pulling out a 250-page ‘big binder’ and going through every single page with them, by far, I think I’d take the first option.”
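The on-the-spot report Smith describes is essentially a roll-up of test records. The sketch below, using an invented TestRun structure and sample data, shows one way such an auditor-facing summary could be assembled; it illustrates the idea rather than how Site Recovery Manager itself works.

```python
# Hypothetical sketch: summarizing DR test history for an auditor.
# The TestRun structure and the sample data are invented.

from dataclasses import dataclass
from datetime import date

@dataclass
class TestRun:
    component: str               # e.g. "payroll failover"
    run_date: date
    passed: bool
    retest_of_fix: bool = False  # True if this run verified a fix

def audit_summary(runs: list[TestRun]) -> str:
    lines = ["Disaster recovery test history"]
    for r in sorted(runs, key=lambda r: r.run_date):
        status = "PASS" if r.passed else "FAIL"
        note = " (retest after fix)" if r.retest_of_fix else ""
        lines.append(f"{r.run_date}  {r.component:<20} {status}{note}")
    failures = [r for r in runs if not r.passed]
    lines.append(f"Total runs: {len(runs)}, failures: {len(failures)}")
    return "\n".join(lines)

if __name__ == "__main__":
    history = [
        TestRun("payroll failover", date(2008, 3, 4), passed=False),
        TestRun("payroll failover", date(2008, 3, 18), passed=True,
                retest_of_fix=True),
        TestRun("mail restore", date(2008, 6, 2), passed=True),
    ]
    print(audit_summary(history))
```

The point is the shape of the evidence: every test, its outcome and its follow-up retest in one short, printable record rather than a 250-page binder.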
Disaster Recovery in the Cloud
The disaster recovery market has grown so much that it has started branching off into specific kinds of data recovery for different verticals, such as high-performance computing, health care and education. And online backup and replication in the cloud, managed services delivered over the Internet, are also being brought into the mix.
Computing in the cloud helped Tulane University in New Orleans get back up and running three years ago when Hurricane Katrina hit.
Tulane uses products from Xythos, a San Francisco-based data recovery company that focuses most of its business on the education sector, to track all its files and archive them on a daily basis in the cloud. Adam Krob, director of end-user IT support for the university, was very thankful for that system following the big storm.
“We’d been running Xythos for several years, but we didn’t have access to our own [Xythos] server right after the hurricane,” Krob said. “It was not under water, but it was not accessible at all. We contacted Xythos, and they gave us capacity to work [online] until we were able to get our own server back up.”
Tulane’s main IT center was dark for about three weeks. “Our payroll and student systems were brought up fairly quickly at our SunGard recovery sites,” Krob said, “but others, like our Blackboard [online teachers’ site] and Xythos file storage systems, had to wait until we had reconnected with our own data center.”
The service that Xythos made available to Tulane really made a difference in keeping track of Tulane students after they were relocated “at literally hundreds of institutions,” Krob said.
“We had to track them down, record where they all were, make sure they were getting their correct amounts of financial aid, and make sure they were going to come back the next fall,” he said. “We created a spreadsheet that we sent to registrars and financial aid officers at all the universities where we had our students.
“We were able, with Xythos, to create a ‘drop box’ where they could drop in the spreadsheet [containing personal student information]. There was security on it, security in the transfer. We set up the security session so that once they dropped [the information] in, they couldn’t see it anymore. Only authorized personnel at the assisting institutions and at Tulane could see the students’ personal information.”
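Krob’s “drop box” arrangement amounts to write-only access for contributors and read access only for authorized staff. The sketch below models that rule with invented role names and a DropBox class; it is an illustration of the access pattern, not the Xythos product’s actual access-control API.

```python
# Hypothetical sketch: write-only drop box semantics.
# Roles and the DropBox class are invented for illustration.

CONTRIBUTOR = "contributor"   # e.g. registrars at assisting institutions
REVIEWER = "reviewer"         # e.g. authorized Tulane staff

class DropBox:
    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def upload(self, role: str, name: str, data: bytes) -> None:
        # Anyone with a recognized role may drop a file in.
        if role not in (CONTRIBUTOR, REVIEWER):
            raise PermissionError("unknown role")
        self._files[name] = data

    def read(self, role: str, name: str) -> bytes:
        # Once dropped, only reviewers can see the contents again.
        if role != REVIEWER:
            raise PermissionError("contributors cannot read back files")
        return self._files[name]

if __name__ == "__main__":
    box = DropBox()
    box.upload(CONTRIBUTOR, "students.xls", b"...spreadsheet bytes...")
    try:
        box.read(CONTRIBUTOR, "students.xls")
    except PermissionError as exc:
        print("Blocked as expected:", exc)
    print(len(box.read(REVIEWER, "students.xls")), "bytes visible to reviewer")
```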
The Human Factor
Even with systems and automated processes in place, staff members are still the key to making sure everything happens as it should during a crisis.
“Every company is becoming more and more dependent on [disaster recovery] technology, and it’s all important. But the No. 1 thing I see that is ignored or not addressed properly is the people [in the organization],” IBM business continuity service manager Pat Corcoran told eWEEK.
“Companies say, ‘Well, during a disaster, we’ll depend on these five, six, seven or 10 people to do our disaster recovery plans.’ They put the plan in place, do the testing, and so on, but when a disaster happens, [these people are] not going to be available. So it comes down to: Do your people know what to do during a crisis? Do you know what to do in a crisis? Do you know how to communicate with your people when the power’s gone?”
One of the most important elements of any disaster recovery plan or system is testing. A plan set up after 9/11 and never looked at again is probably a plan that won’t work today.
Corcoran said enterprises don’t test their disaster recovery systems often enough because it takes extra time out of staff schedules and can often interfere with daily production.
Corcoran recommends that organizations test their disaster recovery systems and processes at least twice a year, but adds that testing can be done in reasonable stages.
“I’d say that, in general, DR systems should be tested twice a year, at least,” he said. “Now, that being said, I’ve always tested mine four times a year, but that’s just me. You don’t have to test the entire system each time; test part of it, but test it thoroughly. Then, next time, test another part, and so on.”
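Corcoran’s advice, cover the whole system over time while exercising only part of it in any one test, amounts to a simple rotation. The sketch below, with invented component groupings, shows one way to schedule that rotation so every part is tested across a couple of years of twice-yearly exercises.

```python
# Hypothetical sketch: rotating partial DR tests so every component
# group is exercised over successive test windows. Component names
# are invented for illustration.

from itertools import cycle

COMPONENT_GROUPS = [
    ["payroll systems", "student records"],   # test window 1
    ["e-mail", "file storage"],               # test window 2
    ["web front end", "voice mail"],          # test window 3
]

def plan_tests(windows_per_year: int = 2, years: int = 2):
    """Yield (year, window, components) for the coming test windows."""
    groups = cycle(COMPONENT_GROUPS)
    for year in range(1, years + 1):
        for window in range(1, windows_per_year + 1):
            yield year, window, next(groups)

if __name__ == "__main__":
    for year, window, components in plan_tests():
        print(f"Year {year}, test window {window}: {', '.join(components)}")
```

Each exercise stays small enough not to disrupt daily production, yet the rotation guarantees that no part of the plan goes untested for long.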
Many disaster recovery software providers, such as Ecora, Orange Parachute, Compellent, EMC, IBM, NetApp, Xiotech and Hewlett-Packard, allow IT managers to test their systems without having to take down, or even slow, the entire production apparatus.
Neverfail, of Austin, Texas, takes a continuous availability approach that detects application failures and IT outages before a full recovery is required, automatically switching workloads to other servers to avoid business downtime.
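In general terms, that style of continuous availability rests on a monitor that watches application health and shifts work to a standby before users feel a prolonged outage. The loop below is a minimal, generic sketch of that pattern, with an invented health_check() and placeholder server names; it is not Neverfail’s implementation.

```python
# Hypothetical sketch: detect an outage and switch to a standby server.
# health_check() and the server names are placeholders, not a real product API.

import time

PRIMARY = "app-server-a"
STANDBY = "app-server-b"
FAILURE_THRESHOLD = 3      # consecutive failed checks before failing over

def health_check(server: str) -> bool:
    # A real monitor would probe the application, e.g. an HTTP endpoint
    # or a database query; here we simply pretend the primary has died.
    return server != PRIMARY

def monitor() -> str:
    active, failures = PRIMARY, 0
    for _ in range(10):                 # bounded loop for the example
        if health_check(active):
            failures = 0
        else:
            failures += 1
            if failures >= FAILURE_THRESHOLD:
                print(f"{active} unhealthy; failing over to {STANDBY}")
                active = STANDBY
                failures = 0
        time.sleep(0.1)                 # a real monitor would poll every few seconds
    return active

if __name__ == "__main__":
    print("Active server:", monitor())
```

The threshold keeps a single missed check from triggering a needless switch, while repeated failures move the business to the standby automatically, before anyone has to declare a disaster.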