Friday, November 27, 2009

Google App Engine - a first timer's experience

I discovered Google App Engine by accident several months ago when I first looked into building a robot for Google Wave. It was very much a bookmark-and-move-on kind of an introduction.

I eventually did get back to the bookmark and explored GAE some more and have become a huge fan. For starters, it is very much in the spirit of our mission at Sharendipity - providing tools that make it easier for everyone to create custom web applications.

App Engine still requires app creators to know how to program, but it provides an awesome infrastructure for deploying and scaling applications on the web. Without spending a penny, developers get all sorts of goodies including...



  • A data store for easy database creation

  • Built-in user management using standard Google accounts

  • Built-in logging

  • Cron jobs to manage scheduled tasks

  • Task queues to schedule and manage autonomous jobs

  • An application dashboard for analytics and viewing of application data


With a (free) daily quota of 1.3M requests per application, App Engine is a great way to start a new product. As your product grows, you can move into billable services to increase your quotas.


My Experiment


I needed to find an application to build that met the following criteria...



  • Limited amount of new programming since my time is overbooked already.

  • Enough complexity that I could explore App Engine features beyond the "Hello World" tutorial.


So I decided to port an existing service that I had built in grad school - an email distribution list for the Astronomy Picture of the Day (APOD). Previously, this was being hosted using my alumni account at the University of Wisconsin, Madison.

The APOD email service proved to work great because it fit both criteria. There was very little new programming to do since I'd already built it once. And it let me explore several elements of programming within App Engine including...



  • The use of webapp - their web application framework for templating and handling requests.

  • The creation of taskqueue tasks to throttle outbound emails.

  • The use of the datastore to manage email subscribers.

  • The use of cron jobs to schedule the daily APOD emails.
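
As a rough illustration of the throttling idea, here is a minimal sketch in plain Python (not the actual APOD code). It assigns each subscriber a delay so that no more than a fixed number of emails go out per minute. The function name and the example addresses are hypothetical, and the default of 8 per minute is simply the mail quota mentioned later in this post. On App Engine, each delay would presumably be handed to the task queue as the task's countdown.

```python
def countdown_schedule(recipients, per_minute=8):
    """Return (recipient, delay_seconds) pairs, at most per_minute per batch.

    Hypothetical helper: batch 0 goes out immediately, batch 1 after
    60 seconds, batch 2 after 120 seconds, and so on.
    """
    if per_minute <= 0:
        raise ValueError("per_minute must be positive")
    return [
        (recipient, (index // per_minute) * 60)
        for index, recipient in enumerate(recipients)
    ]


if __name__ == "__main__":
    subscribers = ["user%d@example.com" % i for i in range(20)]
    for address, delay in countdown_schedule(subscribers):
        print(address, delay)
```

Computing the delays up front keeps the cron job itself short: it only enqueues tasks, and each task sends a single email.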


The Hangups


The two challenges up front were learning Python plus the App Engine environment (including the APIs for the various services I needed). But the documentation for both is so thorough that it rarely held me up.

The quirks that actually caused friction were:



  1. The subtleties of the App Engine platform itself that are learned through trial and error.

  2. The non-deterministic nature of its performance.


This latter issue is the one thing that should give pause to anyone deciding to build a business on top of the platform. However, I tend to be optimistic about this and assume it will improve as the platform matures.

In the meantime, however, I found myself working around poor performance in App Engine itself rather than optimizing my own code. The code is too simple to be slow! One of the overriding quotas for App Engine is the per-minute CPU quota, and each request has somewhat less than 30 seconds to complete. While you wouldn't want to take anywhere near that long for a web request, the limit becomes a little constraining for non-web requests like cron jobs and taskqueue jobs.

All of the jobs in the APOD application are small and constrained: parse HTML, send an email, or loop through a list of email addresses. Yet the time these jobs take to execute varies wildly from day to day.
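
One way to cope with a loop whose running time is unpredictable is to give it a time budget and hand back whatever is left over. The sketch below is an assumption about how that could look, not the code the APOD service runs. The function name, the 20-second default, and the injectable `clock` parameter are all hypothetical choices for illustration.

```python
import time


def process_within_budget(items, handle, budget_seconds=20.0, clock=time.time):
    """Handle items in order until the budget is spent.

    Returns the list of items that were NOT processed, so the caller
    can re-queue them as a fresh job. `items` must support slicing.
    """
    deadline = clock() + budget_seconds
    for index, item in enumerate(items):
        # Check the clock before each item rather than after, so a slow
        # item never starts once the budget is already gone.
        if clock() >= deadline:
            return list(items[index:])
        handle(item)
    return []
```

The budget should sit comfortably below the platform's request deadline, since a single slow item can still overrun it.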

When the execution time exceeds the quota, you need to be prepared to manage the exception everywhere. When it happens in a taskqueue job, it can be particularly annoying since the task will re-queue itself - even if the meat of the job has already been completed.
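
A common defense against a re-queued task repeating its work is to make the task idempotent: record that the real work finished, and skip it on a retry. Here is a minimal sketch under stated assumptions. `run_once` is a hypothetical name, and the in-memory set stands in for whatever persistent store (for example, the datastore) a real application would use.

```python
def run_once(task_id, job, completed):
    """Run job() unless task_id is already recorded in `completed`.

    Returns True if the job ran, False if it was skipped as a repeat.
    `completed` is any set-like object; here it is a plain set.
    """
    if task_id in completed:
        return False
    job()
    # Record completion right after the real work, so a retry caused
    # by a later failure in the same request skips the work.
    completed.add(task_id)
    return True


if __name__ == "__main__":
    sent = []
    done = set()
    key = "apod-2009-11-27:user@example.com"  # hypothetical task id
    run_once(key, lambda: sent.append(key), done)
    run_once(key, lambda: sent.append(key), done)  # simulated re-queue
    print(len(sent))
```

There is still a small window between finishing the job and recording it, so this reduces duplicate work rather than eliminating it.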

After I initially deployed the app, it felt a lot like I was patching holes in a boat that was already in the water. I added more instrumentation and caught more exceptions until I had mitigated all of the problems.

The most glaring problem appears to be in the Mail package: sending email frequently raises DeadlineExceededError exceptions. Remote calls in a throttled environment like this should always be asynchronous.

It appears that they've done just this with remote HTTP requests. However, one of the subtle problems I had was the intermittent failure of urlfetch() calls. I've seen as many as 20% of these calls fail with DownloadError exceptions. As a result, I've built in my own retry mechanism wherever urlfetch is used.
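
A retry mechanism of this kind can be quite small. The sketch below is a generic version, not the code from the APOD service. `fetch_with_retry` and `FetchError` are hypothetical names, with `FetchError` standing in for urlfetch's DownloadError so the example runs outside App Engine. The attempt count and delay are arbitrary defaults.

```python
import time


class FetchError(Exception):
    """Stand-in for a transient failure such as urlfetch's DownloadError."""


def fetch_with_retry(fetch, url, attempts=3, delay_seconds=0.5, sleep=time.sleep):
    """Call fetch(url), retrying on FetchError up to `attempts` times.

    The last failure is re-raised so the caller can still log it.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except FetchError:
            if attempt == attempts:
                raise
            # Brief pause before trying again; keep the total well
            # under the request deadline.
            sleep(delay_seconds)
```

Passing the fetch function in as an argument keeps the wrapper testable without any network access. If each call independently failed 20% of the time, three attempts would all fail less than 1% of the time (0.2 cubed is 0.008).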


What's Missing


App Engine is awesome in its overall breadth and ease of use. But if I had to come up with a wish list, it would be the following...



  • An asynchronous Mail package

  • Better SDK tools for testing and simulating cron jobs and Mail actions.

  • As high as some of the quotas are, the email rate quota is too low (only 8 emails/minute). There is likely a very real concern about spam bots, but perhaps there could be an authorization process so legitimate applications could get higher quotas.


App Engine is a great way to quickly explore new web application ideas. With an easy-to-use SDK, push-button deployment, and a wide array of built-in services, there has never been a better time to be a programmer.

Interested in Astronomy? Sign yourself up to receive the APOD picture each day - http://apodemail.appspot.com!

3 comments:

  1. Thanks for the feedback, Greg! Are you coming to .astronomy next week, by any chance?

  2. @nick i'm not but it looks like a great conference.

  3. not sure when you looked at the mail quota but if you turn on billing it will do more than 8 a minute and it is still free as long as you don't go over 2000 a day
