2. Human Error
This is by far the No. 1 cause of cloud downtime. Even with perfect applications, cloud environments are only as good as the people who manage them. This means ongoing maintenance, tweaking and updating must be worked into standard operating procedures. One bad maintenance script can, and will, bring down mission-critical applications.
3. Application Bugs
While the cloud does introduce a new level of complexity, application failure still trumps cloud provider issues as a leading cause of downtime. More often than not, such failures are unrelated to the cloud infrastructure running your applications. Traditional IT practices still apply, except that you are continuously developing, testing and deploying your application in the cloud.
4. Cloud Provider Downtime
Cloud failures are routine. Whether the failure hits an instance, an availability zone or an entire region, applications should be designed to expect it. This means routinely checking performance and spinning up new instances to replace terminated machines. Amazon Web Services, for example, enables users to spread and load-balance an application across several availability zones so that when one does fail, the application does not suffer.
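To make the idea of spreading traffic across availability zones concrete, here is a minimal Python sketch. It is not how any particular provider's load balancer works; the `ZoneBalancer` class, the zone names and the instance names are all hypothetical, and a real deployment would rely on the provider's own load balancing and health checks. The sketch only shows the principle: requests rotate across instances, and instances in a zone marked as down are skipped.

```python
class ZoneBalancer:
    """Round-robin over instances, skipping zones marked unhealthy.

    Hypothetical illustration only; zone and instance names are made up.
    """

    def __init__(self, instances_by_zone):
        # Flatten {"zone-a": ["a1", "a2"], ...} into [(zone, instance), ...]
        self.instances = [
            (zone, inst)
            for zone, insts in instances_by_zone.items()
            for inst in insts
        ]
        self.unhealthy_zones = set()
        self._next = 0  # index of the next candidate instance

    def mark_zone_down(self, zone):
        self.unhealthy_zones.add(zone)

    def mark_zone_up(self, zone):
        self.unhealthy_zones.discard(zone)

    def pick(self):
        """Return the next instance that lives in a healthy zone."""
        n = len(self.instances)
        # Try each instance at most once per call.
        for _ in range(n):
            zone, inst = self.instances[self._next]
            self._next = (self._next + 1) % n
            if zone not in self.unhealthy_zones:
                return inst
        raise RuntimeError("no healthy zone available")


balancer = ZoneBalancer({"zone-a": ["a1", "a2"], "zone-b": ["b1"]})
balancer.mark_zone_down("zone-a")
survivor = balancer.pick()  # only zone-b instances can be returned now
```

With one zone down, traffic keeps flowing to the remaining zone. If every zone is down, the sketch raises an error instead of silently routing to a dead machine.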
5. Quality of Service
6. Extreme Spikes in Customer Demand
This is actually a great example of cloud superiority. If customer demand exceeds capacity, there’s not much you can do with an on-premises IT infrastructure. In a public cloud environment, you can respond to fluctuations in customer demand by automatically scaling capacity up during peaks and back down when demand levels off.
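As a rough illustration of automatic scaling, the Python sketch below computes how many instances to run from the current load. It assumes a simple proportional rule (grow or shrink capacity so that per-instance load returns to a target), which is one common approach and not a description of any specific provider's policy. The function name, the load metric and the default limits are hypothetical.

```python
import math


def desired_capacity(current, observed_load, target_load,
                     min_size=1, max_size=20):
    """Return the instance count that brings per-instance load back to target.

    current       -- number of instances running now
    observed_load -- average load per instance (for example, CPU percent)
    target_load   -- load per instance we want to hold
    min_size/max_size -- hypothetical floor and ceiling on fleet size
    """
    if current < 1 or target_load <= 0:
        raise ValueError("current must be >= 1 and target_load must be > 0")
    # Proportional rule: total load stays fixed, so scale the fleet
    # by observed/target and round up to avoid under-provisioning.
    raw = math.ceil(current * observed_load / target_load)
    # Clamp so a spike cannot scale without bound and a lull keeps a floor.
    return max(min_size, min(max_size, raw))


peak = desired_capacity(current=4, observed_load=90, target_load=60)
quiet = desired_capacity(current=6, observed_load=20, target_load=60)
```

The upper bound matters as much as the rule itself: without it, a runaway spike (or a bug that looks like one) could scale costs along with capacity.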
7. Security Breaches
Security is often raised as a red flag when it comes to hosting critical applications in the public cloud. Much like in on-premises environments, it’s up to you to meet regulatory and security requirements. However, the cloud does make it easier to check off a list of security requirements, since cloud providers have addressed these concerns repeatedly with hundreds of enterprise customers.
8. Third-Party Service Failures
The whole is greater than the sum of its parts, but all it takes to bring your cloud down is one third-party app that isn’t working. This could happen to any type of infrastructure application (sustaining, garbage collecting, security and so on) in your own or another supplier’s data center. It’s up to you to continuously monitor these applications as well and have a contingency plan in place for a rainy day.
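One simple form a contingency plan can take is a fallback path in code. The Python sketch below retries a third-party call a few times and, if it keeps failing, switches to a fallback such as a cached response. The function and parameter names are hypothetical, and the retry count and delay are placeholder values rather than recommendations.

```python
import time


def call_with_fallback(primary, fallback, attempts=3, delay=0.5):
    """Try a third-party call a few times, then fall back.

    primary  -- zero-argument callable that talks to the third-party service
    fallback -- callable taking the last exception; returns a substitute result
    attempts -- how many times to try the primary call
    delay    -- seconds to wait between attempts
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return primary()
        except Exception as exc:  # broad on purpose: any failure counts
            last_error = exc
            # Wait before retrying, but not after the final attempt.
            if attempt < attempts - 1:
                time.sleep(delay)
    # Contingency path: the dependency is considered down.
    return fallback(last_error)


def broken_service():
    raise ConnectionError("third-party service unavailable")


result = call_with_fallback(broken_service, lambda err: "cached response",
                            attempts=2, delay=0)
```

The fallback receives the last error so it can be logged or reported to whatever monitoring is in place, which keeps the failure visible even though the application stays up.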
9. Storage Failures
In a recent disaster recovery survey, storage failure was listed as a top risk to system availability. The cloud still depends on physical storage, which routinely fails. As with overall service availability and quality, storage failures can cause serious performance problems. Planning for them means setting up dedicated cloud storage applications that maintain data resiliency and meet data retrieval requirements.
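To show what maintaining data resiliency can mean in practice, the Python sketch below keeps several copies of each object and verifies a checksum on read, skipping any copy that is missing or corrupted. It is a toy model: the `ReplicatedStore` class is hypothetical, and in-memory dictionaries stand in for what would be separate storage devices or services.

```python
import hashlib


class ReplicatedStore:
    """Toy replicated store: each dict stands in for an independent device."""

    def __init__(self, replica_count=3):
        self.replicas = [dict() for _ in range(replica_count)]

    def put(self, key, data):
        """Write the bytes and their SHA-256 digest to every replica."""
        digest = hashlib.sha256(data).hexdigest()
        for replica in self.replicas:
            replica[key] = (data, digest)

    def get(self, key):
        """Return the first copy whose checksum still matches."""
        for replica in self.replicas:
            entry = replica.get(key)
            if entry is None:
                continue  # this replica lost the object
            data, digest = entry
            if hashlib.sha256(data).hexdigest() == digest:
                return data
            # Checksum mismatch: the copy is corrupted, try the next one.
        raise KeyError(key)


store = ReplicatedStore()
store.put("report", b"quarterly numbers")
del store.replicas[0]["report"]   # simulate a failed device
recovered = store.get("report")   # served from a surviving replica
```

A read succeeds as long as one intact copy remains. Only when every replica is missing or corrupted does the sketch report the object as lost.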
10. Lack of Cloud Disaster Recovery Procedures
Although disaster recovery has been a common practice for decades in physical data centers, cloud DR has only recently come under scrutiny. Few realize that customers are solely responsible for application availability. Cloud providers can help you develop failover and recovery procedures, but it’s up to you to integrate them into your applications.