Data Storage: Data Center Disaster Preparedness: 10 Tips for Minimizing the Damage

By Chris Preimesberger  |  Posted 2011-10-18

Forecast as Best You Can

Use real-time monitoring, trend analysis and forecasting tools to plan responses to events. Understand system utilization and application tiers, and spot patterns and behaviors that help you better mitigate peak demand, unforeseen failures and outages.
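
As one illustration of trend analysis, the sketch below fits a straight line to evenly spaced utilization samples and estimates how many sampling intervals remain before the trend reaches a threshold. The function names, the sample values and the 80 percent threshold are assumptions made for this example; real forecasting tools use richer models.

```python
from statistics import mean

def linear_trend(samples):
    """Least-squares fit of y = slope * x + intercept, with x = 0, 1, 2, ..."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(samples)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def intervals_until(samples, threshold):
    """Intervals after the last sample until the trend hits threshold.

    Returns None when utilization is flat or falling.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    x_cross = (threshold - intercept) / slope
    return max(0.0, x_cross - (len(samples) - 1))

# Hypothetical hourly CPU utilization readings, in percent.
hourly_cpu_percent = [50.0, 52.0, 54.0, 56.0, 58.0]
print(intervals_until(hourly_cpu_percent, 80.0))
```

A straight-line fit is only a first approximation; it ignores daily and weekly cycles, so treat the result as an early warning, not a prediction.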

Resource Redundancy Is a Must

Structure your data center redundancy as appropriate for your application service levels. Set up backup energy supplies from various sources to respond to unplanned events. Build in the level of redundancy appropriate to each application priority tier that your data center is expected to keep delivering for the duration of the emergency.

Test Triage Procedure for Power Outage

Use run-book automation to implement power capping, throttling, or the powering up or down of servers based on demand for computing resources and available capacity. Keep the most critical applications running during outages.
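
A run-book step of this kind might look like the sketch below, which decides which servers stay powered when only a limited power budget is available. The `Server` record, the wattage figures and the greedy selection rule are all assumptions for illustration; the tier numbers follow this article's scheme, where 4 is mission critical and 2 is non-essential.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Server:
    name: str
    tier: int      # 4 = mission critical, 3 = highly desirable, 2 = non-essential
    watts: float   # assumed steady-state power draw

def plan_power_triage(servers, budget_watts):
    """Return (keep, shed) lists of server names for a given power budget.

    Greedy rule (an assumption): visit servers from highest tier to lowest,
    smallest draw first within a tier, and keep each one that still fits.
    """
    keep, shed = [], []
    remaining = budget_watts
    for s in sorted(servers, key=lambda s: (-s.tier, s.watts)):
        if s.watts <= remaining:
            keep.append(s.name)
            remaining -= s.watts
        else:
            shed.append(s.name)
    return keep, shed

# Hypothetical inventory and a 1,000 W budget while on backup power.
fleet = [
    Server("db1", 4, 400.0),
    Server("web1", 3, 300.0),
    Server("web2", 3, 350.0),
    Server("batch1", 2, 500.0),
]
keep, shed = plan_power_triage(fleet, 1000.0)
print("keep:", keep)   # keep: ['db1', 'web1']
print("shed:", shed)   # shed: ['web2', 'batch1']
```

Testing the triage procedure then amounts to running this plan against a realistic inventory before an outage and confirming that the "keep" list matches what the business expects.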

Check UPS System Regularly

Design and maintain your uninterruptible power supply (UPS) backup to ensure reliability: Size your generators and build your UPS/battery setup according to your data center's needs and its flexibility for shifting load to another site.
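
For a rough sense of the sizing arithmetic, the sketch below estimates how long a battery bank can carry a given load and checks that against the time a generator needs to come online. The formula is a deliberate simplification, and the 90 percent inverter efficiency, the capacity and load figures, and the function names are assumptions; real sizing must also account for battery aging, temperature and discharge-rate effects.

```python
def estimated_runtime_minutes(battery_wh, load_watts, inverter_efficiency=0.9):
    """Simplified runtime estimate: usable energy divided by load."""
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    return battery_wh * inverter_efficiency / load_watts * 60

def covers_generator_start(battery_wh, load_watts, generator_start_minutes,
                           safety_factor=2.0):
    """True if the estimated runtime exceeds the generator start time
    multiplied by an assumed safety factor."""
    runtime = estimated_runtime_minutes(battery_wh, load_watts)
    return runtime >= generator_start_minutes * safety_factor

# Hypothetical figures: 20 kWh of battery carrying a 60 kW critical load.
print(round(estimated_runtime_minutes(20_000, 60_000), 1))  # 18.0
print(covers_generator_start(20_000, 60_000, generator_start_minutes=5))  # True
```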

Establish Application Priorities

Identify application tiers as follows: 4=mission critical, 3=highly desirable, and 2=non-essential. There is no No. 1 in this priority framework. Adjust storage cluster sizes and resource allocation based on application tiers. Define emergency procedures and priorities based on application tiers.
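
The tier scheme above could be encoded along the lines of the sketch below so that scripts and run books share one definition. The enum name, the action wording and the sample applications are assumptions for illustration; only the tier numbers and their meanings come from the text.

```python
from enum import IntEnum

class AppTier(IntEnum):
    MISSION_CRITICAL = 4
    HIGHLY_DESIRABLE = 3
    NON_ESSENTIAL = 2
    # There is intentionally no tier 1 in this framework.

# Hypothetical emergency procedure for each tier.
EMERGENCY_ACTION = {
    AppTier.MISSION_CRITICAL: "keep running; fail over if the site is lost",
    AppTier.HIGHLY_DESIRABLE: "keep running while capacity allows",
    AppTier.NON_ESSENTIAL: "suspend until normal operations resume",
}

def recovery_order(apps):
    """Return app names sorted from highest tier to lowest, then by name."""
    return [name for name, tier in
            sorted(apps.items(), key=lambda kv: (-kv[1], kv[0]))]

# Hypothetical application inventory.
apps = {
    "wiki": AppTier.NON_ESSENTIAL,
    "billing": AppTier.MISSION_CRITICAL,
    "reports": AppTier.HIGHLY_DESIRABLE,
}
for name in recovery_order(apps):
    print(name, "->", EMERGENCY_ACTION[apps[name]])
```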

Plan for Server Capacity Triage

Create specific steps for transferring applications to and from available systems and locations, and implement load shedding, load shifting and automated failover.
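
One way to picture shifting and shedding is the sketch below, which places applications onto whatever capacity remains, highest tier first, and sheds those that no longer fit. The tuple layout, site names, core counts and first-fit rule are assumptions made for this example, with tier 4 as the most critical.

```python
def place_apps(apps, site_capacity):
    """Assign apps to sites by priority.

    apps: list of (name, tier, cores_needed), higher tier = more critical.
    site_capacity: dict of site name -> free cores, checked in dict order.
    Returns (placement dict, list of shed app names).
    """
    free = dict(site_capacity)  # copy so the caller's data is untouched
    placement, shed = {}, []
    for name, tier, cores in sorted(apps, key=lambda a: (-a[1], a[0])):
        # First fit: take the first site that still has enough free cores.
        target = next((site for site, avail in free.items()
                       if avail >= cores), None)
        if target is None:
            shed.append(name)
        else:
            placement[name] = target
            free[target] -= cores
    return placement, shed

# Hypothetical state: the primary site is degraded, the DR site has room.
apps = [("billing", 4, 16), ("reports", 3, 8), ("search", 3, 8), ("wiki", 2, 8)]
sites = {"primary": 8, "dr-site": 24}
placement, shed = place_apps(apps, sites)
print(placement)  # {'billing': 'dr-site', 'reports': 'primary', 'search': 'dr-site'}
print(shed)       # ['wiki']
```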

Keep Disaster-Recovery Apparatus Up-to-Date

Maintain an up-to-date disaster-recovery and business-continuity plan for multiple scenarios, ranging from a brief brownout to an extended blackout, including weeklong outages during which even fuel maintenance services are no longer available.

Consider Automating Disaster Recovery

Implement disaster-recovery plan automation. Automate processes to move the most critical applications, including their storage and networking, to failover sites without service interruption.
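
An automated plan can be thought of as an ordered list of steps that halts as soon as one fails, so operators know exactly where to intervene. The sketch below shows that control flow only; the step names and the placeholder actions are hypothetical stand-ins for real storage, network and application tooling.

```python
def run_failover(steps):
    """Run (name, action) steps in order and stop at the first failure.

    Each action is a callable returning True on success.
    Returns (list of completed step names, failed step name or None).
    """
    completed = []
    for name, action in steps:
        try:
            ok = action()
        except Exception:
            ok = False  # treat any exception as a failed step
        if not ok:
            return completed, name
        completed.append(name)
    return completed, None

# Hypothetical steps; each lambda stands in for a real automation call.
steps = [
    ("replicate storage", lambda: True),
    ("redirect network traffic", lambda: True),
    ("start applications at failover site", lambda: True),
]
completed, failed = run_failover(steps)
print(completed, failed)
```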

Test Failover for Reliability

Schedule failover testing regularly to ensure reliability; verify redundancy mechanisms while the system is under load, in preparation for a potential failure.

Appoint a Team to Own Business Continuity

Form a cross-functional team to own business continuity and develop new processes to improve business continuity.
