An investigative report issued this week by the U.S. and Canadian governments on the worst power failure in U.S. history is shedding light on the importance of business-continuity planning, with findings that the electric-system catastrophe could have been prevented.
The U.S.-Canada Power System Outage Task Force placed much of the blame for the Aug. 14-15 blackout on FirstEnergy Corp., based in Akron, Ohio.
The monstrous power outage crippled parts of Ohio, seven other states and Ontario, cutting off electricity to an estimated 50 million customers as the cascading failure mercilessly sped along transmission lines.
In some parts of the United States, power was not restored for four days. Estimates of total costs of the blackout range between $4 billion and $10 billion.
In its final report, the task force said “inadequate situational awareness” at FirstEnergy contributed in part to the blackout's beginnings in eastern Ohio. Specifically, investigators highlighted factors that served as dominoes leading to the collapse.
Those factors included faulty software and control-room procedures at FirstEnergy that led to an alarm-system failure, the inability of FirstEnergy to properly recognize and shut off 15,000 megawatts of power to its customers as a contingency measure, and vegetation affecting power lines.
Investigators offered 46 recommendations to prevent another widespread plunge into darkness, led by a recommendation to make reliability standards mandatory and enforceable, with stiff penalties for noncompliance. Currently, the private NERC (North American Electric Reliability Council) oversees the voluntary requirements.
Although FirstEnergy officials take exception to some findings in the final report, particularly the claim that its system lacked sufficient reactive power support on Aug. 14, they said the company is moving forward to enhance its reliability as part of the interconnected electric grid.
“The bulk of the recommendations aren't about us; they're industrywide issues,” said Kristen Baird, a FirstEnergy spokeswoman. “We think that speaks very widely that there was much more going on than some downed power lines and computer-system problems in Ohio.”
Baird said FirstEnergy will have a new computer-monitoring and computer-management system operational at its command center by June. In addition, an aggressive vegetation-management effort is underway, as well as increased staffing involving two additional command center operators on-site, one of whom will serve in an engineering role to examine modeled and forecasted data.
Experts say many IT operations could be in jeopardy because they upgrade software and add new devices or system hardware over time without considering the potential impact on business-continuity or disaster-recovery plans already in place.
Botched change management
“What I'm finding is when firms go to recover, they generally [haven't factored] change management on a daily basis, so they have mixed applications, mixed levels of hardware, network infrastructure, and they're not on the same page when they go to recover,” said Michael Croy, director of Business Continuity for Forsythe, an IT services company based in Skokie, Ill. “When you look at the bottom line of what happened with FirstEnergy, I think that's a prime example of how [business continuity] is not an IT issue; it's a business issue.”
Some companies understand just how devastating an unexpected disaster can be for their customers. Skip Skivington, national director for Healthcare Continuity Management for the nonprofit Kaiser Permanente, based in Oakland, Calif., said the Aug. 14 power failure in Ohio led to a water plant shutdown, which could have left Kaiser medical buildings in the region without adequate water and affected the delivery of quality medical care.
“Power and water are big things to the healthcare industry, so we've always looked at that” aspect in architecting business-continuity plans, Skivington said. “We're a community asset. We're not an island, so we have to look at it through two prisms: One is, How do we substantiate our operation internally? And secondly, How do we fit in with regional community plans?”
Skivington said his organization drills its disaster-recovery plans at least twice a year at all 30 of its medical centers, accounting for any procedural or technology changes that may have been implemented since the last run-through.