People are talking about the Amazon S3 fiasco of the evening of July 20 and what the long- and short-term implications of this breakdown might be for this popular storage service.
Amazon.com’s Amazon Simple Storage Service was beset by unexplained outages for anywhere from 2 to 6 hours Sunday night.
Users of Amazon S3 and Amazon Simple Queue Service in Europe and the United States lost service for varying lengths of time. Luckily, it wasn’t a high-transaction business day here in the United States, or the outage might have elicited a great many more complaints.
eWEEK’s Michael Hickins was one of the first to blog on this subject. Here is Amazon.com’s own report.
These online storage services are like electricity or water supplies: When there’s an outage, there’s nothing a consumer or enterprise can do about it, and all activity comes to a dead halt. A growing number of businesses now depend fully on the Amazon S3 service to store data and to run applications.
The outage bruises the trust that enterprises place in these companies’ services, as did a similar one at Amazon.com in February 2008 and a big one at the 365 Main co-location center in San Francisco in July 2007, which knocked Craigslist, RedEnvelope and Charles Schwab off the Web for several hours. None of those customers left 365 Main, however, because the co-location center came clean with an explanation of what happened the very next day.
365 Main’s backup generators didn’t function as they were supposed to after a transformer explosion cut off power to the downtown San Francisco data center. 365 Main has made major upgrades to its system as a result and has not had even a minor outage in the past 12 months.
Keeping the Faith
These incidents haven’t happened often enough for companies to turn away completely. Accidents happen, and generally companies have been forgiving. But how many incidents like this will it take for enterprises to lose faith in the cloud? After all, as storage farms get up into the thousands of nodes, the software and networking get very, very complicated, and the possibility of a breakdown becomes greater as the I/O burdens get heavier.
Sajai Krishnan, CEO of ParaScale (a startup that makes software that connects siloed servers into cloud-type computing architectures using Red Hat Linux), told me that Amazon.com is doing “great” with its new service overall but that it has “undertaken a degree of difficulty even higher than what Google has done.”
Krishnan continued, “Google has been pushing the envelope of technology, but everything is inside, so no one gets to see this. Amazon, on the other hand, is opening up their system for more general-use service so more people can use it. So the challenge they have undertaken is not a simple task.”
These kinds of outages, frankly, don’t surprise people because companies like Amazon.com and Google are pushing the envelope in terms of scale, Krishnan said. Accidents are going to happen, no matter what.
What people need to know is that you don’t need to have a cloud that spans the globe to realize the benefits of it, Krishnan said.
“One of the key takeaways [from this event] is that you can get the economies of scale with much more deployment, without really [having] to take on the challenges of this kind of scale,” Krishnan said. “The curve eventually starts flattening out in terms of benefits … you really don’t have to be Google- or Amazon-scale to realize the benefits [of the online storage service].”
Enterprises can even build their own IT system “clouds” by pooling resources, so they do not have to worry about relying on an outside service to handle their business data.
Krishnan said he believes that over time we’ll be seeing many more online service providers, but their offerings will be on a much smaller scale than the huge cloud services whose infrastructures might be beginning to crack under the strain of billions of transactions per day.
One Size Doesn’t Fit All
“You don’t have to have a one-size-fits-all cloud,” Krishnan said. “A company can have a cloud for archival [data] and another one for video streaming, for example. Each of those two services have very different characteristics, as far as the individual nodes are concerned.”
These small clouds are not simply a mashup of a bunch of amalgamated in-house and outside Web services. “You should be able to scale out your own cloud by simply adding more nodes of commodity servers as you need them,” Krishnan said.
Smaller providers can offer more hands-on service and more differentiated services, using smaller cloud networks. The same can happen for a company that builds its own cloud. Building something on the scale of Google Apps or Salesforce.com, for example, requires an enormous R&D effort.
But if you use a much smaller service provider, say a regional one, or build a cloud in your own data center, you can get the same value and possibly even better service. Certainly, you must choose carefully if you decide to go outside.
What about the trust factor here? After all, we’re talking about giving the family jewels to strangers to keep for us.
“Nothing is as secure as having all your data within the firewall,” Krishnan said. “But now, the [storage] hardware is becoming very affordable; you can start a cloud with 4TB [of capacity]. You don’t even need to start with a petabyte.
“If you’re looking at a [smaller] service provider, yes, it [trust] is a challenge. You can encrypt the data and whatever, but ultimately you are going to trade off some level of performance and latency for that [peace of mind]. If you really have reservations about putting your data out there [especially for financial companies], then you should look at building your own cloud.”