How COVID-19 Is Starting to Expose Cracks in Public Clouds

eWEEK TREND ANALYSIS: AWS has the early lead in the COVID-19-related cloud expansion, but competitors are experiencing some overload issues.


The COVID-19 pandemic has changed the world in many ways. For many businesses, the shelter-in-place orders have pushed them to adopt cloud at a pace never seen before. In fact, one could argue that COVID-19 is a watershed moment of sorts for public cloud computing companies, as demand for their services has skyrocketed to unprecedented levels.

The shelter-in-place orders have caused people working from home to increase their usage of collaboration applications such as Cisco Webex, Slack, Amazon Chime and Zoom, but the business growth is just part of the picture. There has been an equally large increase in consumer-facing services, such as streaming providers Netflix, Disney+ and Hulu, and online gaming services like Epic’s Fortnite. Also, people are using online food delivery services, educational tools and community dashboards. It's a good time to be in the delivery business.

Whatever the service—business or consumer—the cloud is the underlying platform and typically it’s one of the Big 3 cloud providers enabling it: Amazon Web Services (AWS), Google Cloud Platform (GCP) or Microsoft Azure. 

Even vendors that run their own cloud will typically augment their private clouds with one of the above vendors. The increased demand from COVID-19 has created a “make or break” time for the public clouds themselves as they are now being tested like never before. Based on facts that have been publicly reported, the performance seems quite different from cloud to cloud, and it’s worth taking a look at all three in more detail.

Microsoft Azure 

It appears that Microsoft’s cloud platform, Azure, is starting to reach its limits and is showing cracks under the stress. Several news stories have covered the outages, including one in The Register reporting that “Azure appears to be full” in the UK. A second report tells a similar story but more broadly across Europe. Another recent article describes the strain on Microsoft’s cloud but also indicates that the problems started before the rise in traffic from COVID-19.

I’ve done my own digging on this topic, and it appears that Microsoft is having to ration its cloud capacity and make some hard decisions about where its resources are going. From what I understand, Microsoft has chosen to prioritize its applications that run on its cloud, such as Office 365 and Teams. If this is true, customers' mission-critical workloads are taking a back seat to Microsoft’s internal needs. This could have a long-term impact, because Microsoft won’t be able to fully capitalize on new customer demand related to COVID-19, which could limit growth. More importantly, the availability issues could cause customers to move mission-critical services off Azure until it has more capacity.

One final note on Azure. Management recently posted a blog that states: “Without knowing the true scale of the new demand, we took a cautious approach and put in place temporary resource limits on new Azure subscriptions. (Existing customer subscriptions did not experience these restrictions as each Azure customer account has a defined quota of services they can access.) This allowed us to continue to meet the promised quota for all existing Azure customers.” Reading the comments at the bottom of the post, it appears the decision to place limits on Azure is not being well received.

Google Cloud Platform

The smallest of the big three has certainly fared better than Microsoft, but it has had a few issues, showing its cloud is also feeling the strain. One report highlighted that Gmail, Snapchat and Nest all suffered short outages. The story noted the outages came “after a period of intense growth in web traffic.” 

Another report pointed out that Downdetector, which provides real-time status and outage information for cloud providers, found that an outage affected Google’s products across the entire Eastern seaboard, where the bulk of the U.S. population lives. A UK publication then reported that a Google Cloud Engine outage was caused by a large backlog of queued mutations, which in turn was caused by a lack of memory in the company’s cache servers.

The March 26 outage took down Google’s cloud services in numerous regions, including parts of Europe and Australia. Impacted services included Dataflow, BigQuery, Dialogflow, Kubernetes Engine, Cloud Firestore, App Engine and Cloud Console.

One concern for customers considering Google is that the company has made international expansion a priority and the cornerstone of its competitive strategy. The fact that the reported issues are showing up where it is expanding could foreshadow trouble. However, I will tip my hat to GCP because it has performed better than Azure with fewer outages. 

Amazon Web Services 

The biggest and most established cloud vendor is often poked at by the up-and-comers for being “legacy.” This pandemic has shown the benefits of being bigger and having a long-established cloud business. By every metric, AWS is the market leader and has built its platform to run almost every service on the planet. I’ve found AWS’ manifest destiny-type attitude to be very egotistical at times, but it’s certainly put the company in a great position at this time.

In fact, in a WSJ article, Amazon publicly stated: “We have taken measures to prepare and we are confident we will be able to meet customer demands for capacity in response to COVID-19.” In the same article, Andy Jassy, the head honcho of AWS, told his employees to “think about all the customers carrying extra load right now because of all the people at home” and use that as a rallying point to keep performance high.

AWS has also been looking to expand its reach. This week it announced it had opened up operations in Cape Town, creating the first AWS Africa region. Currently, Microsoft and Huawei have announced plans to do the same, creating a race for the region. 

AWS currently handles many of the biggest cloud service providers that have seen a surge, including Slack, Netflix, DoorDash and more. None of these has seen service disruption, so customers should have confidence they can run their workloads problem-free. 

COVID-19 is a moment of reckoning for the public cloud providers. It's time to put up or shut up. The extra loads will show any weakness anywhere and, while no cloud provider is perfect, AWS seems to have the early lead, followed by GCP, with Azure bringing up the rear. Let’s see if it stays that way. 

Zeus Kerravala is an eWEEK regular contributor and the founder and principal analyst with ZK Research. He spent 10 years at Yankee Group and prior to that held a number of corporate IT positions.