Amazon SimpleDB a Solid Choice for Simple Web-based Data Storage - Page 2


I mentioned that I chose the C# library. To be clear, though: the libraries Amazon provides are not the only ones available. SimpleDB exposes two interfaces: a Web service (using SOAP) and REST (Representational State Transfer). While the C# library includes several classes and methods for interacting with the SimpleDB data stored on Amazon's servers, behind the scenes it simply constructs URLs, sends them to Amazon's Web servers and waits for a response. The response comes back as XML, which the C# library parses and stores in a collection.
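To make that concrete, here is a rough sketch of the kind of work such a wrapper does: it builds a signed query URL, then flattens the XML reply into a collection. It is written in Python for brevity and is not Amazon's C# code. The function names, the endpoint host, the API version string, the credentials and the signing details (modeled on AWS Signature Version 2) are all assumptions made for illustration; check them against Amazon's documentation before relying on them.

```python
import base64
import hashlib
import hmac
import xml.etree.ElementTree as ET
from urllib.parse import quote

ENDPOINT_HOST = "sdb.amazonaws.com"  # assumed endpoint, not taken from the article


def _enc(value):
    """Percent-encode everything except RFC 3986 unreserved characters."""
    return quote(str(value), safe="-_.~")


def build_query_url(params, secret_key, host=ENDPOINT_HOST):
    """Return a signed GET URL for a query-style request (hypothetical helper)."""
    # Sort parameters by name and join them into a canonical query string.
    canonical = "&".join(f"{_enc(k)}={_enc(v)}" for k, v in sorted(params.items()))
    # Assumed Signature Version 2 layout: verb, host, path, canonical query.
    string_to_sign = "\n".join(["GET", host, "/", canonical])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"https://{host}/?{canonical}&Signature={_enc(signature)}"


def parse_attributes(xml_text):
    """Collect every Attribute element into a name -> list-of-values dict."""
    result = {}
    for elem in ET.fromstring(xml_text).iter():
        # Compare local names only, so any XML namespace is tolerated.
        if elem.tag.rsplit("}", 1)[-1] != "Attribute":
            continue
        name = value = None
        for child in elem:
            local = child.tag.rsplit("}", 1)[-1]
            if local == "Name":
                name = child.text
            elif local == "Value":
                value = child.text
        if name is not None:
            result.setdefault(name, []).append(value)
    return result


if __name__ == "__main__":
    # Placeholder credentials and names; nothing is sent over the network.
    request = {
        "Action": "GetAttributes",
        "DomainName": "MyDomain",
        "ItemName": "item1",
        "AWSAccessKeyId": "AKIDEXAMPLE",
        "SignatureMethod": "HmacSHA256",
        "SignatureVersion": "2",
        "Timestamp": "2010-01-01T00:00:00Z",
        "Version": "2009-04-15",  # assumed API version
    }
    print(build_query_url(request, "not-a-real-secret"))
```

The sketch only prints the URL. An actual library would also send the request, check for error responses and retry on failure.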

(Incidentally, if you're serious about REST, there is a lot of discussion online about whether Amazon broke unofficial REST rules in creating its interface. I'm not going to cover that here. But the information is online if you google SimpleDB REST.)

Since the C# library simply creates REST requests and reads responses, you could actually create your own class library. I imagine over time we'll see more (and better) libraries, as the one that Amazon created really isn't too sophisticated. But since this library is just a wrapper around REST calls, I'm not going to use this library to pass judgment on the SimpleDB product (and I don't recommend you do so either).

I do, however, have one concern that I want to raise: The responses are all in XML. And while XML works great (and those of you who read my blogs and follow me on Twitter know that I'm a proponent of it), XML is a shortcoming here. Returning a set of two numbers, say 1 and 2, takes a significant amount of space, including the XML header line and several XML tags, like so:

<?xml version="1.0"?>
<GetAttributesResponse xmlns="">
...
</GetAttributesResponse>

That's a lot of wasted space to get back two integers. Two integers can be stored in just a few bytes; this response is more than 400 bytes. If you're retrieving a large amount of data, that can really add up. (That means you'll want to request data in batches rather than one item at a time, since multiple data items can be returned in a single XML response.) And unfortunately (perhaps even unfairly), Amazon bills you for data going both ways: data moving into its servers and data coming out. You'll definitely want to count your beans carefully.
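A quick way to get a feel for the overhead is to compare an XML envelope with the same values packed as raw 32-bit integers. The sketch below uses Python and a made-up response body shaped like a GetAttributes reply; the attribute names, request ID and usage figure are placeholders, so the byte counts are illustrative rather than a measurement of what Amazon actually returns.

```python
import struct


def fake_get_attributes_response(values):
    """Build an illustrative GetAttributes-style XML body (placeholder content)."""
    attributes = "".join(
        f"<Attribute><Name>n{i}</Name><Value>{v}</Value></Attribute>"
        for i, v in enumerate(values)
    )
    return (
        '<?xml version="1.0"?>'
        '<GetAttributesResponse xmlns="">'
        f"<GetAttributesResult>{attributes}</GetAttributesResult>"
        "<ResponseMetadata>"
        "<RequestId>00000000-0000-0000-0000-000000000000</RequestId>"
        "<BoxUsage>0.0000000000</BoxUsage>"
        "</ResponseMetadata>"
        "</GetAttributesResponse>"
    )


def xml_bytes_per_value(values):
    """Average number of XML bytes spent on each returned value."""
    return len(fake_get_attributes_response(values).encode("utf-8")) / len(values)


def packed_size(values):
    """Size in bytes of the same values as little-endian 32-bit integers."""
    return len(struct.pack(f"<{len(values)}i", *values))


if __name__ == "__main__":
    for batch in ([1, 2], list(range(100))):
        print(len(batch), "values:",
              packed_size(batch), "bytes packed,",
              round(xml_bytes_per_value(batch), 1), "XML bytes per value")
```

The fixed envelope (header, wrapper tags, metadata) is paid once per response, so the cost per value drops as more values share a single reply. That is the arithmetic behind asking for data in batches.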

Experience with SimpleDB

The C# library, while somewhat simple, worked quite well, and I was able to easily put data onto the SimpleDB servers. I ran code to create domains, add data to and remove data from the domains, and list the data in the domains. It all worked well. There's really not a lot more to it. SimpleDB is just that: simple. It's for storing data. It's your job to decide what data your site needs to upload.

Of course, working in C# presents an interesting consideration: When dealing with the Web, C# is normally a server-side language. That means you might have your own Web site hosted either on Amazon's EC2 or on your own server. The site would be an ASP.NET site, and you would write C# code that interacts with the Amazon SimpleDB servers. Your users, then, would browse your site without realizing that, behind the scenes, it is storing its data on Amazon's servers.

But is that reasonable? Does it really make sense to be hosting your own site but not your own data? And does it make sense to have your server make connections to another server on another network to get the data? I can't answer that for your own situation, but it is a question that would need to be answered.

The other option for server-side code is to host your site on EC2, which makes more sense. EC2 supports Windows servers, so you can create ASP.NET sites there. (You can use other platforms and languages on EC2 as well, if you don't want to use ASP.NET.) Even so, the server still needs to connect to SimpleDB and process the raw XML data.

Of course, another possibility is to push into Web 2.0 and create a site that generates Web pages that include JavaScript code; these pages could then use AJAX to connect directly to SimpleDB. But there are some serious security risks here, because you probably don't want your users' browsers writing data directly to your SimpleDB domains. Further, by default, browsers don't let JavaScript make AJAX requests to a domain other than the one that served the Web page.

Another option, then, is a hybrid approach in which your server makes the requests to the SimpleDB server but returns the XML directly to the client browser, letting the browser process it using XPath and XSL transformations.

These are all serious issues that you'll need to explore when building a site that makes use of SimpleDB. Typically, I imagine people will be hosting their server-side code on EC2, as that's certainly what Amazon has in mind. From there you'll have to weigh the pros and cons of how to process the data and what to make the browser do, while factoring in the security issues.

Summary: A Unique Approach

In general, SimpleDB is a good approach for data that doesn't need to be relational. Will it work for all situations? No. For example, the usual textbook example of a fully normalized customer, products and sales database would not lend itself well to this approach, unless you want to do the joins manually: reading a customer ID from a customer domain, then searching the sales domain for product IDs based on that customer ID, and then searching the product domain for the list of products the customer purchased. That would be a lot of work when a simple two- or three-line SQL join would do the job nicely with a relational database.
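The manual join described above might look something like the following sketch. It is written in Python, with in-memory dictionaries standing in for the three domains; the domain contents, attribute names and the function name are all hypothetical. In a real application, each lookup step would be a separate request to SimpleDB.

```python
# In-memory stand-ins for three SimpleDB domains (hypothetical data).
CUSTOMERS = {"c1": {"name": "Ada"}, "c2": {"name": "Grace"}}
SALES = {
    "s1": {"customer_id": "c1", "product_id": "p1"},
    "s2": {"customer_id": "c1", "product_id": "p3"},
    "s3": {"customer_id": "c2", "product_id": "p2"},
}
PRODUCTS = {
    "p1": {"title": "Widget"},
    "p2": {"title": "Gadget"},
    "p3": {"title": "Gizmo"},
}


def products_bought_by(customer_id):
    """Join customers -> sales -> products by hand, one step per domain."""
    # Step 1: confirm the customer exists in the customer domain.
    if customer_id not in CUSTOMERS:
        return []
    # Step 2: scan the sales domain for this customer's product IDs.
    product_ids = [
        sale["product_id"]
        for sale in SALES.values()
        if sale["customer_id"] == customer_id
    ]
    # Step 3: look up each product ID in the product domain.
    return [PRODUCTS[pid]["title"] for pid in product_ids if pid in PRODUCTS]


if __name__ == "__main__":
    print(products_bought_by("c1"))
```

With a relational database, the same result would come from a single query joining the three tables on customer ID and product ID, with the database doing the matching.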

However, for cases where you need to quickly look up data that doesn't need to be joined (such as a list of products matching a certain set of criteria), SimpleDB would work quite well.

Senior Editor Jeff Cogswell can be reached at