It’s no secret that those of us who work on the U.S. East Coast have been subjected to a cycle of major winter storms that have hit about every four to six days. The Midwest has had a similar cycle that seems to alternate between the upper and lower Midwest.
As those storms hit with blizzard conditions or with sticky, wet snow, freezing rain and high winds, the results are basically the same: Power goes out, employees get stuck in commuter hell, phone service is spotty and cell service vanishes. The only real differences are in how skilled the regions are in getting back online after one of these events.
Of course, other regions have their own emergencies. In the South, it’s hurricanes from June through November. In the West, there are massive Pacific storms that bring mudslides, followed by droughts that bring fires. And, of course, there’s the occasional earthquake just to stir things up. But right now it’s the East Coast’s turn to be pounded, and businesses around the region are struggling to stay on top of the repeated, and relentless, attacks from the weather.
What many IT managers are finding out is that their plans for handling an emergency are inadequate. Whatever they’ve planned for, what actually happens is worse. Power outages last longer, telecom services go down, phones are out and cell service is available mostly to first responders.
To make matters worse, your employees can’t get to work. They can’t work from home either, and they may not have anything resembling a public WiFi location to work from even if your IT systems are running. In some cases, your employees may not be available at all, either because they’re stranded by the thousands on the highways or because they are effectively unreachable for other reasons.
So if it’s really important that your systems stay up, you have to do the best you can to make sure you have the triad of things that are necessary to do that. This triad is Power, People and Communications. While it can turn out that your disaster planning, no matter how carefully done, will prove fruitless, you might as well maximize your chances.
Start with looking at your power situation. If you’re using two providers and two grids, then you double your chances that one of them will stay up. But you also double your chances that one will go down. So study the track record of each of your power providers for that specific grid.
Don’t Depend on Service-Level Agreements
But don’t bother to ask the power company for dependable information, since they’ll promise you near-perfect uptime. Instead, look at the historical record by reading news reports and talking to customers that have used them longer than you have. See how often power is lost, and how long restoration usually takes.
Then, when you plan your emergency generating capacity, make sure it’s double what you think you need and that you have enough fuel to last at least twice as long as what you find the average restoration time to be. Yes, you’ll have to buy a bigger generator, but your company will grow over time, so you’ll need that anyway.
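The doubling rule above is simple arithmetic, and a minimal sketch may make it easier to apply. The function name, the parameter names and the fuel burn rate are assumptions made for illustration; the burn rate in particular has to come from your generator’s spec sheet, and the sample numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class GeneratorPlan:
    capacity_kw: float   # generator size to buy
    fuel_hours: float    # hours of runtime to stock fuel for
    fuel_gallons: float  # fuel to keep on site


def size_generator(estimated_load_kw, avg_restoration_hours,
                   burn_gal_per_hour, margin=2.0):
    """Apply the rule of thumb: double the load you think you need and
    stock fuel for at least twice the average restoration time.

    burn_gal_per_hour is a hypothetical input: the generator's fuel
    consumption while running, taken from its spec sheet.
    """
    inputs = (estimated_load_kw, avg_restoration_hours, burn_gal_per_hour)
    if min(inputs) <= 0:
        raise ValueError("all inputs must be positive")
    if margin < 2.0:
        raise ValueError("the rule of thumb calls for a margin of at least 2")
    fuel_hours = avg_restoration_hours * margin
    return GeneratorPlan(
        capacity_kw=estimated_load_kw * margin,
        fuel_hours=fuel_hours,
        fuel_gallons=fuel_hours * burn_gal_per_hour,
    )


# Invented example: a 40 kW load, outages that historically take
# 36 hours to fix, and a generator that burns 6 gallons per hour.
plan = size_generator(40, 36, 6)
print(plan)
```

The average restoration time should come from the historical record you gathered, not from what the power company promises.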
You cannot depend on any single individual to be a critical part of your emergency response plan. You must be able to respond with whoever is available, which means you must do enough cross training and create documentation that will allow those people to keep your systems running, at least for a while. You probably need to draw up a responsibility matrix so that you can see who is responsible for what tasks, what their level of training is, and who their backups are. And did I mention that those backups need backups?
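A responsibility matrix can be as simple as a table of tasks and the people trained to do them. The sketch below is one possible way to check such a matrix for weak spots; the task names, the people and both function names are made up for illustration, and it leaves out training levels to stay short.

```python
# Hypothetical responsibility matrix: each task maps to the people trained
# to do it, primary first, then backups. All names are invented.
MATRIX = {
    "start generator": ["ana", "ben", "chris"],
    "fail over database": ["ben", "dee"],
    "restore network link": ["chris"],
}


def thin_tasks(matrix, min_people=3):
    """Return tasks with fewer than min_people trained staff.

    The default of 3 means a primary, a backup, and a backup for the backup.
    """
    return sorted(task for task, people in matrix.items()
                  if len(set(people)) < min_people)


def uncovered_tasks(matrix, available):
    """Return tasks that nobody among the reachable staff is trained to do."""
    reachable = set(available)
    return sorted(task for task, people in matrix.items()
                  if not reachable.intersection(people))


# Tasks that still need more cross training.
print(thin_tasks(MATRIX))

# During a storm only ana and dee can be reached: what can't be done?
print(uncovered_tasks(MATRIX, {"ana", "dee"}))
```

Running the second check against different combinations of reachable staff is a quick way to see which single absences would leave a task with no one to do it.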
Communications problems during an emergency will likely drive you nuts, but you have to do the best you can. Assume that you won’t have reliable cell service and that even if cell service remains in place, it may not be available to you. Assume that your landline phones will be out. Assume that your broadband connection will be out.
It may be that the only way you can keep running is to contract with a company that will provide a replacement operations capability on short notice. Or you may be able to find a second broadband provider that is totally separate from your primary provider in terms of all facilities and all media. You should also consider buying satellite phones for key employees, training those employees to use the phones, and setting up a plan for contacting them. Remember that satellite phones don’t work inside a building, despite what you see on television, so both parties have to plan to be in a spot where they can access the satellite.
Finally, don’t take the word of your providers that they will have you up and running right away because you have a service-level agreement. You might get a refund if the SLA isn’t met, but their promises to restore service are meaningless. In other words, trust only what you have direct control over. Have a backup for everything else.