Scaling Apps on the Google App Engine

 
 
By Jeff Cogswell  |  Posted 2009-01-14
 
 
 



When I first became aware of Google App Engine, I was skeptical. It seemed to have some severe shortcomings that would make it unable to match up to competitors, especially Amazon Web Services.

These perceived shortcomings included the fact that your Google applications do not have access to an entire virtual server setup in the way that applications on Amazon.com's Amazon EC2 (Elastic Compute Cloud) do. On EC2, applications get an entire virtual private server, complete with an operating system of your choice (including Linux, Microsoft Windows Server 2003 and even Sun Microsystems' Solaris). The applications can then be written and compiled in any language supported by your chosen operating system, and run on the virtual server.

Google App Engine, on the other hand, limits you to a tightly controlled Python virtual machine. Your applications can only be written in Python, and you don't have access to a file system or any of the operating system features. And Python isn't a truly compiled language.


Furthermore, unlike competitors' offerings, your Google applications don't have access to a full SQL-based database. Instead, if you want to use a database, you must use the Google Datastore.

As it happens, I was not the only one with these concerns. Many bloggers and reporters wrote about these issues. In response, many people came to Google's defense and offered explanations of why these were, in fact, not shortcomings at all, but rather just a different approach to a common problem of creating software that can easily scale and handle millions of users.

People could argue either side and present different ideas about whether Google's approach is good. However, without any real evidence, it was hard to back up either side of the debate. That said, there was one strong argument indicating the plan would work: Google's own products. It's no secret that much of the Google App Engine, or GAE, is based on the very same technology that powered Google's own products, including its search engine. For example, the Datastore was built using technology very similar to the database technology used behind Google Maps, Blogger and even the search engine itself.

Now, almost a year later, the evidence is starting to come in, although slowly. In this article I explore a few applications that developers have built with GAE, and then draw some conclusions about the current state of GAE.

Apps in action

Google itself provides a list of some "Editor's Picks" for applications. These are what people at Google consider among the best applications they've seen. For me, that's a good starting point, because such applications should (hopefully) show what the engine is capable of. You can see the list at the Application Gallery site.

Two Promising GAE Applications


 

One application that immediately caught my attention is called GAE SQL Designer, by Jason W. Miller. This application is a very cool graphical editor for designing SQL. The application relies quite heavily on JavaScript, and it's hard to guess from using it how much it depends on the Google App Engine. However, the reason I wanted to highlight this one is to show that somebody did quite a nice job of creating a powerful Web 2.0 application that runs under Google App Engine. Clearly it's possible to create nice AJAX, Web 2.0-style applications.

As I continued to search for other applications, I was surprised how many applications in the gallery were, to be blunt, rather trivial: simplistic applications that were little more than tests, even among the Editor's Picks. Let's be realistic here: I want to see sites that are as powerful as Facebook, because if people are creating a site as big and powerful as Facebook, they'll want to know whether Google App Engine is a viable platform to build it on.

Very few of the applications in the gallery struck me as applications that would grow into something with millions of users, and by no means as big as Facebook. I skipped over these, for the most part, because I wanted to see what GAE is really capable of. I wanted to see an application that would push GAE to its limits and see how it fares. (And I do have to wonder why the people at Google, in trying to encourage the use of their engine, would feature such simplistic apps. Are they really trying to sell GAE as something for little tools but not big Web sites?)

But the first item in the Editor's Picks looked like a good one. It's called PackageTrackr (no "e" before the final "r"). The site points out the application is still a beta, so I won't be overly critical of it. However, it's a cool concept. You can enter a tracking number of a package, and optionally specify a carrier or let the site determine the carrier based on the tracking number. (UPS tracking numbers start with 1Z, for example, so there's nothing particularly magical there. But it's a nice feature.) I put in a UPS number for a package I recently received, and the site immediately retrieved the official summary from the UPS site, and on the right side displayed a Google map that has a set of lines drawn on it showing the route the package traveled. That's pretty handy. I like this tool and will likely use it in the future.
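The carrier guess described above could work roughly as follows. This is a minimal sketch, not PackageTrackr's actual code (which I haven't seen): the function name `guess_carrier` and the prefix table are hypothetical, and only the UPS "1Z" rule comes from the observation above.

```python
# Hypothetical prefix table; only the UPS "1Z" rule comes from the article.
CARRIER_PREFIXES = {
    "1Z": "UPS",
}


def guess_carrier(tracking_number):
    """Return a carrier name guessed from the tracking number, or None."""
    normalized = tracking_number.strip().upper()
    for prefix, carrier in CARRIER_PREFIXES.items():
        if normalized.startswith(prefix):
            return carrier
    return None  # Unknown format: the user would pick the carrier manually.
```

A number that matches no known prefix returns `None`, which corresponds to the case where the user specifies the carrier by hand.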

In addition, PackageTrackr is pretty fast. There was a slight delay as it drew the map, and it's hard to say exactly why that was since I don't have access to the code itself. But this is a full-featured, nicely built Web site and not just a silly little test app. I don't know how many people were using the site at the same time that I was, but there are some ways to get estimates. Alexa gives it a rather high rank number (on Alexa, a higher number means less traffic), so it probably doesn't yet have a huge number of visitors. However, it was very quick in the short tests I did with it. That said, it's still nowhere near as big as Facebook.

Another application that somebody built that's featured in the Editor's Picks is called Giftag (only one "t"). This is a clever application that lets people create wish lists, much like the wish list feature on Amazon.com. This is clearly a database-driven application in that users can create accounts and then create wish lists of items such as Blu-ray Discs, including images of the items in their wish list. Using a DNS (Domain Name System) lookup, I was able to determine that the site is hosted on Google. Google applications are required to be hosted on Google; however, I checked anyway, since it's theoretically possible to have a site not hosted by Google that doesn't itself run Google App Engine, but does interact with a GAE site. This particular site is indeed hosted by Google, and that includes storing of the images.
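For readers who want to repeat that kind of DNS check, here is a rough sketch using Python's standard `socket` module. The suffix list and both function names are my own assumptions for illustration, not an official way to identify Google-hosted sites, and the lookup itself needs network access.

```python
import socket

# Assumed suffixes for Google-run hosts; not taken from the article.
GOOGLE_SUFFIXES = (".google.com", ".googlehosted.com")


def looks_google_hosted(names):
    """Return True if any resolved name ends with an assumed Google suffix."""
    return any(
        name.lower().rstrip(".").endswith(GOOGLE_SUFFIXES) for name in names
    )


def resolved_names(hostname):
    """Return the canonical name plus aliases from the resolver (needs network)."""
    canonical, aliases, _addresses = socket.gethostbyname_ex(hostname)
    return [canonical] + aliases
```

Calling `looks_google_hosted(resolved_names(hostname))` for a site's host name gives a quick yes-or-no hint, nothing more; a site could be fronted by other infrastructure and still run on GAE.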

The site is pretty easy to use and reminds me of other sites, such as Twitter. Although I've been making comparisons to Facebook, which has a heavy-duty user interface, Twitter is also a good one to consider. While it doesn't have much of a user interface (because it doesn't need one) compared with Facebook or MySpace, it does have a huge amount of power. Now, Giftag isn't likely to be nearly as huge as Twitter, but it does demonstrate similar programming techniques, and the site works quite well.

Incidentally, when you register with the site, it also offers an add-on for Firefox and Internet Explorer. This is an interesting use of Google App Engine, because the add-on, while running in your browser, is interacting with the Giftag site. I'm not going to do a full review here of the site and its capabilities due to space limitations, but I do think this is a clever use of GAE.

So far, these sites are interesting. However, what I'd really like to see is an application with thousands of people using it simultaneously, but that will probably have to wait a few more months, or even another year, as GAE picks up steam.

The Google Datastore in Action


Behind the scenes of these GAE applications is a type of database that Google created called the Datastore. Early in the development of GAE, it didn't take long to find posts listing a number of concerns about the Datastore and how limited it seemed to be. The Datastore is a radical departure from the usual RDBMS (relational database management system) we've all come to know inside and out.

Google describes it as a multidimensional sorted map. For hardcore database developers, that sounds pretty limiting. I don't know if Google anticipated the resistance, but blog comments usually revolved around, "Yes, but I can't do such-and-such." Typically somebody would post such a question, and then others would respond by showing just how you really could do such-and-such, with a little bit of shift in thinking.
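To make "multidimensional sorted map" a little more concrete, here is a toy, in-memory sketch. It is not how the Datastore is implemented, and the class and method names are hypothetical. It only illustrates the idea: values are addressed by a key with more than one dimension (here, row and column), and the keys are kept in sorted order so that a range of rows can be scanned cheaply.

```python
import bisect


class SortedMap:
    """Toy in-memory map whose keys are (row, column) tuples kept in sorted order."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column) tuples
        self._values = {}  # (row, column) -> value

    def put(self, row, column, value):
        key = (row, column)
        if key not in self._values:
            bisect.insort(self._keys, key)  # keep the key list sorted
        self._values[key] = value

    def get(self, row, column):
        return self._values.get((row, column))

    def scan_rows(self, start_row, end_row):
        """Yield ((row, column), value) for start_row <= row < end_row, in key order."""
        lo = bisect.bisect_left(self._keys, (start_row,))
        hi = bisect.bisect_left(self._keys, (end_row,))
        for key in self._keys[lo:hi]:
            yield key, self._values[key]
```

There are no joins here; anything you want to read together has to be reachable by a single key or a contiguous key range, which is the shift in thinking the blog discussions kept coming back to.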

For example, the Datastore limits your result sets to 1,000 rows per query. In addition, the concept of relations is gone. To many of us, that would seem to make the Datastore basically unusable.
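One way to live with a per-query cap like this is to fetch in batches and resume each query after the last key already seen. The sketch below is not Datastore API code; `fetch` is a hypothetical stand-in for a single capped query, run against a plain sorted Python list.

```python
import bisect

MAX_ROWS_PER_QUERY = 1000  # the per-query cap mentioned in the article


def fetch(sorted_keys, after_key=None, limit=MAX_ROWS_PER_QUERY):
    """Stand-in for one capped query: up to `limit` keys greater than after_key."""
    limit = min(limit, MAX_ROWS_PER_QUERY)
    start = 0 if after_key is None else bisect.bisect_right(sorted_keys, after_key)
    return sorted_keys[start:start + limit]


def fetch_all(sorted_keys):
    """Collect every key by issuing capped queries, resuming after the last key seen."""
    results = []
    last_key = None
    while True:
        batch = fetch(sorted_keys, after_key=last_key)
        if not batch:
            break
        results.extend(batch)
        last_key = batch[-1]
    return results
```

In a Web application you would rarely call something like `fetch_all` in one request; more often each page view issues one capped query and carries the last key forward to the next page.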

However, the approach isn't to start with a relational design and come up with workarounds, but rather to redesign from scratch under the new approach. When you do that, you end up with a database system that is, some bloggers have said, "blindingly fast." One such blog post also suggests that writing data isn't as fast as it is in relational databases.

That may be true, but does it matter? When you write data to a database through a Web application, you often don't have to sit and wait for the results. For example, if you are composing an e-mail in a site such as Gmail, you can click the send button. Google can immediately take you to the next page, showing the mail as sent. Behind the scenes, Google may not have finished saving the data to your sent box. Either way, you can continue with your next task. (Readers of this article will certainly think of specific examples where you do need to immediately find out the results of a database write, but I'm talking about general-purpose Web applications used by the masses.)
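The pattern I'm describing, acknowledging a write right away and finishing the save afterward, can be sketched as below. This is purely illustrative: I don't know how Gmail or the Datastore actually handles writes, and the class `WriteBehindStore` and its methods are hypothetical.

```python
from collections import deque


class WriteBehindStore:
    """Toy store that acknowledges writes immediately and persists them later."""

    def __init__(self):
        self._pending = deque()  # writes accepted but not yet persisted
        self._saved = {}         # stand-in for the slow durable store

    def write(self, key, value):
        """Queue the write and return right away so the user can move on."""
        self._pending.append((key, value))
        return "accepted"

    def flush(self):
        """Persist queued writes; a real system would run this in the background."""
        count = 0
        while self._pending:
            key, value = self._pending.popleft()
            self._saved[key] = value
            count += 1
        return count

    def read(self, key):
        """Read only what has been persisted so far."""
        return self._saved.get(key)
```

The catch is visible in `read`: until `flush` runs, a reader does not see the new value, which is exactly the kind of case where you would need to wait for the write after all.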

When you're reading a database, on the other hand, you typically don't want to wait. Type something into Google and notice how fast you get your response. It's pretty much instant. Then think of the millions of other queries that were likely taking place at the same time yours was, and it's really impressive. If there's a tradeoff between write speed and read speed, read speed should win from a usability standpoint.

Final thoughts

What, then, are my final thoughts about Google App Engine? I saw some really cool applications, and I saw some that weren't at all impressive. None of the applications demonstrated any massive power. However, there are some things to consider here. For one, the applications are all running on Google's servers together; they aren't hosted individually. That means while many of them might not have a lot of activity at any given instant, all the applications combined may together have a great deal of activity. They all seemed just as fast as any other good sites out there.

Combine this with the claim that Google's Datastore is incredibly fast and the fact that these applications are running on Google's tried-and-proven servers, and I would conclude that even though there might not be many really powerful applications written for Google App Engine yet, as more developers start using GAE, we are sure to see some. (And those existing ones that I found may become immensely popular.) I would guess these applications will, quite likely, perform very well compared with non-GAE apps like Facebook and Twitter.

My conclusion, then, is that Google App Engine is a totally viable platform for large Web-based applications.

Senior Editor Jeff Cogswell can be reached at jeffrey.cogswell@ZiffDavisEnterprise.com.
