Use real-time monitoring, trend analysis and forecasting tools to plan responses to events: Understand system utilization and application tiers, and spot patterns and behaviors that help you better mitigate peak demand, unforeseen failures and outages.
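As a rough illustration of trend analysis and forecasting, the sketch below fits a straight line to evenly spaced utilization samples and estimates how many sampling intervals remain before a threshold is crossed. The function names, the sample values and the 80 percent threshold are assumptions made for this example; real forecasting tools use richer models than a linear trend.

```python
def linear_forecast(samples, steps_ahead):
    """Extrapolate a least-squares line through evenly spaced samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The last sample sits at x = n - 1, so look steps_ahead past it.
    return intercept + slope * (n - 1 + steps_ahead)


def steps_until_threshold(samples, threshold, horizon=100):
    """Return the first future step at which the trend reaches threshold.

    Returns None if the threshold is not reached within horizon steps.
    """
    for step in range(1, horizon + 1):
        if linear_forecast(samples, step) >= threshold:
            return step
    return None


# Hypothetical hourly CPU utilization readings, in percent.
utilization = [50, 55, 60, 65]
print(linear_forecast(utilization, 2))
print(steps_until_threshold(utilization, 80))
```

A rising trend that reaches the threshold within a few intervals is the kind of pattern that can prompt an early response to peak demand.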
2. Resource Redundancy Is a Must
Structure your data center redundancy as appropriate for your application service levels. Set up backup energy supplies from various sources to respond to unplanned events. Build in the appropriate redundancy levels for the application priority tier your data center is expected to continue to deliver through the duration of the emergency.
3. Test Triage Procedure for Power Outage
Use run-book automation to implement power capping, throttling, or the powering up or down of servers based on demand for computing resources and available capacity. Keep the most critical applications running during outages.
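One decision a run-book step might automate is how many servers to keep powered when both demand and the available power budget are known. The sketch below is a minimal illustration under simple assumptions: every server handles the same request rate and draws the same power, and all parameter names and numbers are hypothetical.

```python
import math


def servers_to_keep_on(demand_rps, rps_per_server, power_budget_w,
                       watts_per_server, total_servers):
    """Pick a server count that meets demand without exceeding the power budget."""
    # Servers required to serve the current demand.
    needed = math.ceil(demand_rps / rps_per_server)
    # Servers the available power can sustain.
    affordable = int(power_budget_w // watts_per_server)
    # Never exceed the demand, the power budget or the installed fleet.
    return max(0, min(needed, affordable, total_servers))


# Normal operation: power is plentiful, so demand sets the count.
print(servers_to_keep_on(900, 100, 10_000, 400, 20))
# Outage on backup power: the budget caps the count below what demand needs.
print(servers_to_keep_on(900, 100, 3_000, 400, 20))
```

When the power budget is the limiting factor, the shortfall against demand indicates how much load has to be throttled or shed.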
4. Check UPS System Regularly
Design and maintain your backup uninterruptible power supply (UPS) to ensure reliability: Size your generators and build your UPS/battery setup according to your data center's needs and your flexibility for shifting to another site.
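For a sense of the arithmetic behind battery sizing, the sketch below estimates the battery energy needed to carry a given load for a given bridge time. It is a first-order estimate only: the inverter efficiency and usable battery fraction are assumed example values, and real sizing also has to account for battery aging, temperature and discharge-rate effects, so treat vendor data as authoritative.

```python
def required_battery_wh(load_w, bridge_minutes, inverter_efficiency=0.9,
                        usable_fraction=0.8):
    """Estimate the battery energy (Wh) needed to carry load_w for bridge_minutes.

    inverter_efficiency and usable_fraction are assumed values for illustration.
    """
    if not 0 < inverter_efficiency <= 1 or not 0 < usable_fraction <= 1:
        raise ValueError("efficiency and usable fraction must be in (0, 1]")
    # Energy the load consumes during the bridge period.
    energy_delivered_wh = load_w * bridge_minutes / 60
    # Scale up for conversion losses and the share of capacity kept in reserve.
    return energy_delivered_wh / (inverter_efficiency * usable_fraction)


# Hypothetical case: a 20 kW critical load bridged for 10 minutes.
print(round(required_battery_wh(20_000, 10)))
```

Lowering the critical load, for example by running only the highest-priority applications on battery, reduces the required capacity proportionally.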
5. Establish Application Priorities
Identify application tiers as follows: 4=mission critical, 3=highly desirable, and 2=non-essential. There is no No. 1 in this priority framework. Adjust storage cluster sizes and resource allocation based on application tiers. Define emergency procedures and priorities based on application tiers.
6. Plan for Server Capacity Triage
Create specific steps that transfer applications to and from available systems and locations, and implement shedding, shifting and automated failover.
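The shifting and shedding steps can be sketched as a simple planner that places applications on sites with free capacity, highest priority tier first, and sheds whatever no longer fits. This is a minimal first-fit illustration that uses the tier numbers from the priority framework (4 = mission critical, 3 = highly desirable, 2 = non-essential). The `App` class, site names, capacity units and application loads are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    tier: int   # 4 = mission critical, 3 = highly desirable, 2 = non-essential
    load: int   # capacity units the application needs (hypothetical unit)


def plan_shift_and_shed(apps, site_capacity):
    """Place apps on sites in tier order; shed any app that fits nowhere."""
    remaining = dict(site_capacity)  # copy so the caller's dict is untouched
    placements = {}
    shed = []
    # Highest tier first; ties are broken by name for a stable order.
    for app in sorted(apps, key=lambda a: (-a.tier, a.name)):
        site = next((s for s, free in remaining.items() if free >= app.load), None)
        if site is None:
            shed.append(app.name)
        else:
            remaining[site] -= app.load
            placements[app.name] = site
    return placements, shed


apps = [App("reports", 2, 5), App("billing", 4, 6), App("search", 3, 4)]
# The primary site has lost all capacity; only the failover site is available.
placements, shed = plan_shift_and_shed(apps, {"primary": 0, "failover": 10})
print(placements)
print(shed)
```

A production plan would also need to cover data replication, network rerouting and the steps for moving applications back once the primary site recovers.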
7. Keep Disaster-Recovery Apparatus Up-to-Date
Maintain an up-to-date disaster-recovery and business-continuity plan for multiple scenarios, ranging from a brief brownout to an extended blackout, including weeklong outages when even fuel maintenance services will no longer be available.
8. Consider Automating Disaster Recovery
Implement disaster-recovery plan automation. Automate processes to move the most critical applications, including their storage and networking, to failover sites without service interruption.
9. Test Failover for Reliability
Regularly schedule failover testing to ensure reliability; verify redundancy mechanisms while the system is under load in preparation for a potential failure.
10. Appoint a Team to Own Business Continuity
Form a cross-functional team to own business continuity and develop new processes to improve business continuity.