Updating Disaster-Recovery Plans
In 2002, I wrote a "day-after" disaster-recovery story that urged IT managers to plan today for recovery tomorrow. With the terrible images still fresh in our minds from the earthquake and tsunami in Japan, now is the time to press forward with full-scale business-continuity and disaster-recovery plans for your organization.
Of course, this assumes there is a plan. If your organization doesn't have a business-continuity strategy, and you plan on staying in business after an emergency situation has passed, then the first thing to do is to put a plan in place. You can start by going to my 2002 story here. There you'll see suggested readings and resources for creating a disaster-survival plan.
Even for organizations that have plans, this is the time to make adjustments and updates. Start by asking the question: What does business continuity look like? An answer along the lines of "everyone comes to work the next day and work continues as normal" is almost as bad as "I have no idea." Both indicate a lack of serious thought about dealing with a significant loss of reliable utility power and water, inaccessible roads, fuel scarcity, and injured or missing staff members. Start your planning with a goal such as getting to 50 percent to 80 percent capability within one or two days of the damaging event.
Your disaster plan needs to account for the people who will carry it out. Don't be surprised if staff members choose their family responsibilities over work duties if your organization forces them to make a choice. Charge your human resources team with offering family-support services as a predefined benefit for crucial staff. Also, make sure your organization stocks food, drinking water and toileting supplies in areas where crucial recovery work must take place. It's no good knowing the passwords to your crucial systems if staff members are forced to forage for basic necessities away from the job site.
The cloud can play a crucial role in your disaster-recovery plan. However, recall that every single cloud service lives in a brick-and-mortar data center. Make it your business to know where the company's cloud lives and what plans are in place for its continued operation. Make sure you have a service-level agreement that defines how your cloud-based services will move from one physical location to another in an emergency. Keep in mind that you'll need to press providers for fairly detailed specifics. If you hear words to the effect that "everything will function just as it did prior to the disaster event," then be suspicious. Re-housing an entire data center is a non-trivial effort that will have some impact on performance.
I used to be a big proponent of practice drills. Given the complex interdependence of IT systems, I'm not so sure drills are a good idea for enterprise data systems. Today, I recommend making sure that spare equipment is positioned where it's needed, that run books are up to date and accessible to those who need them, and that the batteries in your UPSes and the engines in your backup generators are serviced and in good working order. And don't forget to check the fuel tanks, lines and filters for these systems. Rather than pulling the plug on your servers, it's better to spend that time firing up the generators and letting them run for a couple of hours every quarter.