Monday, September 5, 2011

App Engine's place as a developer playground

The Google App Engine developer community is a hot mess this week over the new pricing plan for the platform. And for good reason. Many developers are seeing their hosting expenses going up by as much as 500%.


If you're looking for a post that is trashing the App Engine team, you can move along. You won't find it here. These guys are smart and considerate. If you spend any time interacting with them on StackOverflow, over email or in person at Google I/O, you understand this. In fact, just by using the platform for a project, you can appreciate their outlook and passion for their product and users. That's not to say they don't have room to improve. But enough with the negativity already!


Effects of new pricing on my projects


There have been a lot of people posting about their apps and revealing the effects of the new pricing on them. I wanted to do the same as a reference point. Note that my use of App Engine has primarily been for personal projects. Some have web front-ends, some have SMS interfaces, some are just based on background tasks and others come and go while I experiment with ideas or calendar events. I still think most of these experiments are well suited for App Engine, but I need to take a hard look at the more successful apps to figure out a long-term strategy because they are not scaling well with the new pricing plan.


I'll share two examples - both philanthropic projects - comparing the effects of the new pricing.


Astronomy Picture of the Day


I originally wrote this app in Perl as a grad student, and it was hosted at the University of Wisconsin. I decided to port the application as the vehicle for learning App Engine and Python, so it was the first app I ever wrote on the platform. It's primarily a background app. Every afternoon it runs a job that scrapes the contents of the APOD site, packages it into an email and sends it off to all subscribers. There's a simple web frontend that lets anyone sign up. There are currently 1900+ subscribers.


The app is free to run on the platform today and will cost $0.19/day - or roughly $0.04 per user per year - after the price changes. 100% of that cost can be attributed to the use of the Mail API.
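
As a rough sanity check on those figures, here is a minimal sketch of the arithmetic. It assumes exactly 1,900 subscribers (the post only says "1900+") and a 365-day year; the function names are made up for illustration. With those assumptions the per-subscriber figure comes out a little under four cents a year, and a larger subscriber count would pull it lower.

```python
# Figures quoted in the post. SUBSCRIBERS is an assumption: the post says
# "1900+", so 1,900 is a floor rather than a measured value.
DAILY_COST_USD = 0.19
SUBSCRIBERS = 1900
DAYS_PER_YEAR = 365


def yearly_cost(daily_cost):
    """Scale a daily cost to a 365-day year."""
    return daily_cost * DAYS_PER_YEAR


def cost_per_subscriber_per_year(daily_cost, subscribers):
    """Spread the yearly cost evenly across all subscribers."""
    return yearly_cost(daily_cost) / subscribers


if __name__ == "__main__":
    print("yearly cost: $%.2f" % yearly_cost(DAILY_COST_USD))
    print("per subscriber per year: $%.4f"
          % cost_per_subscriber_per_year(DAILY_COST_USD, SUBSCRIBERS))
```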


My only complaint about this app is that the change seems extreme. Going from 2,000 free emails to 100 feels like an attempt to curb the spamming community. And for charity projects like this one, the good net citizens are the losers with this change.


SMSMyBus


This app was originally built to provide a better interface for the Madison Metro bus service. It provides real-time arrival times for buses via SMS, chat, email and phone. But then it blossomed into a full-featured API for the Metro for other developers.


The app costs $0.01/day to operate today (excluding the SMS interface). It is estimated to cost $6.79/day after the price change. $2,478/year. Yah. That's a whopping 67,800% increase. Shebang.
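
For anyone who wants to check the math, this small sketch recomputes the yearly cost and the percentage increase from the two daily figures above. Nothing here comes from the App Engine billing system; the helper names are hypothetical.

```python
def annual_cost(daily_cost, days_per_year=365):
    """Yearly cost for a flat daily charge."""
    return daily_cost * days_per_year


def percent_increase(old, new):
    """Relative increase from old to new, expressed as a percentage."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    old_daily, new_daily = 0.01, 6.79  # daily costs quoted in the post
    print("yearly: $%.2f" % annual_cost(new_daily))
    print("increase: %.0f%%" % percent_increase(old_daily, new_daily))
```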


The root cause of essentially all the cost can be attributed to the main API call that returns arrival times at a particular bus stop - getarrivals - and some of the clients call this repeatedly (like every two minutes). It is also where the confusion starts for me with respect to the new pricing.
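
To get a feel for how quickly polling adds up, here is a back-of-the-envelope sketch. The two-minute interval is from the paragraph above; the client counts are made-up inputs, not measurements from SMSMyBus.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def requests_per_day(poll_interval_seconds, clients=1):
    """Number of getarrivals-style calls per day for clients polling at a fixed interval."""
    return (SECONDS_PER_DAY // poll_interval_seconds) * clients


if __name__ == "__main__":
    # One client polling every two minutes, then a hypothetical fleet of 50.
    print(requests_per_day(120))
    print(requests_per_day(120, clients=50))
```

Every one of those requests needs an instance to serve it, which is one reason a polling-heavy API might look very different under instance-based billing than under CPU-based billing.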


Frontend instance hours


Frontend instance hours are projected to be $5.68/day, 84% of the bill. This represents the platform's transition from billing for CPU usage to billing for the time instances spend running. I get that they need to do this. They were using the wrong resource metric for monitoring before.


But how do I go from a $0.00 cost for resource consumption to $5.68/day?!? That kind of increment just feels insane. How about $0 to $0.50? Or $0 to $1?


Datastore writes


Datastore writes are projected to be $1.00/day, about 15% of the bill. This is harder for me to resolve for a couple of reasons. First, I can't find any cost under the current pricing plan for these operations even though the app's profile is fairly consistent. So I struggle, conceptually, with how this has suddenly become an issue for the app.


Second, $1/day equates to 1M writes/day in the datastore and I simply can't figure out where all of those writes are coming from. My back-of-the-napkin math shows 40,000 writes. I'm totally baffled by this projection.
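
To put a number on the gap, this sketch uses only the equivalence stated above ($1/day for 1M writes/day) to derive an implied per-write price and then prices the 40,000-write estimate. The constant and function names are hypothetical, and the implied rate is derived from the post, not taken from a price sheet.

```python
# Implied by the post's own equivalence: $1.00 buys 1,000,000 writes.
IMPLIED_USD_PER_WRITE = 1.00 / 1_000_000


def write_cost(writes):
    """Daily cost of a given number of billed write operations at the implied rate."""
    return writes * IMPLIED_USD_PER_WRITE


def gap_factor(projected_writes, estimated_writes):
    """How many times larger the projection is than the estimate."""
    return projected_writes / estimated_writes


if __name__ == "__main__":
    print("cost of 40,000 writes: $%.2f" % write_cost(40_000))
    print("projection vs. estimate: %.0fx" % gap_factor(1_000_000, 40_000))
```

So the estimate would cost about four cents a day, and the projection is 25 times larger. One possible explanation, offered here only as a guess, is that a single entity write may be billed as several write operations once index updates are counted.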


The rest of it


The rest of the projected cost is a combination of storage and datastore read operations. I can eliminate the former if I simply store less of the data I wanted to use for analytics. It saves me money, but in the end, ignoring some of the data hurts the developers that use the API.
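
Putting the pieces together, this sketch breaks the projected $6.79/day into the two line items quoted earlier and treats whatever is left as the storage-plus-reads remainder. The split of that remainder between storage and reads is not known here, so it stays as one bucket; the function name is made up.

```python
def bill_breakdown(total, line_items):
    """Return {name: (cost, percent_of_total)}, adding an 'everything else' remainder."""
    items = dict(line_items)
    items["everything else"] = total - sum(line_items.values())
    return {
        name: (round(cost, 2), round(cost / total * 100, 1))
        for name, cost in items.items()
    }


if __name__ == "__main__":
    projected = bill_breakdown(
        6.79,  # projected daily total from the post
        {"frontend instance hours": 5.68, "datastore writes": 1.00},
    )
    for name, (cost, share) in projected.items():
        print("%-24s $%.2f  %5.1f%%" % (name, cost, share))
```

By this math, storage and reads together are only around eleven cents a day, so trimming analytics data would barely move the total.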


Optimizing


Now it's my job to go in and take another stab at optimizing the code and start with the getarrivals API call. I thought I had good habits with this so I was a little embarrassed when I found an obvious hole in the query path for route listings. There's a fairly repetitive query that was not being memcached - oops! Now fixed.
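
The fix is the usual cache-aside pattern. The sketch below is not the actual SMSMyBus code: the function, key format and 60-second lifetime are made up, and a tiny in-memory class stands in for the cache so the example runs anywhere. On App Engine the same shape should work with `google.appengine.api.memcache`, whose `get(key)` and `set(key, value, time=...)` calls play the roles of the stand-in's methods.

```python
import time


class InMemoryCache(object):
    """Stand-in for a memcache client (hypothetical, for illustration only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.time() + ttl_seconds)


def get_route_listing(cache, stop_id, run_query, ttl_seconds=60):
    """Return the route listing for a stop, hitting the datastore only on a cache miss."""
    key = "route_listing:%s" % stop_id
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = run_query(stop_id)          # the expensive, repetitive query
    cache.set(key, result, ttl_seconds)  # remember it for subsequent requests
    return result
```

Usage is just a matter of passing in the cache and a callable that runs the real query, for example `get_route_listing(cache, "1101", query_routes_for_stop)`, where `query_routes_for_stop` and the stop ID are placeholders. The short lifetime is a judgment call: route listings change rarely, but a stale entry should still age out on its own.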


The second thing I'm experimenting with is the application's instance configuration. By default, I was letting the platform's scheduler determine my load patterns and create new instances whenever necessary. But I've made two changes. First, I took the scheduler out of 'auto' mode and set the maximum number of idle instances to one. Second, I cranked up the minimum latency for the pending request queue to 250ms. In theory, each of these changes should drive the cost down because I should be using less frontend instance time throughout the day.


Let's see what happens! As I do my part with optimization, I'd like to see the App Engine team do their part and move to the middle as well. :)


What to do next


I'm guessing that the App Engine team simply priced things wrong the first time. I think the concept of a platform as a service that exploits existing Google infrastructure was a smart but geeky idea that was poorly modeled or had bad assumptions about its use/abuse. Ironically, the idea didn't scale well and they've been forced to admit that early assumptions on how to price it were just wrong.


The good part about this move... developers are forced to take a deep dive into optimization. It's something I've written about before and have been doing again since the clock started on the pricing changes. This not only makes for a better platform for Google and the projects sharing the resources, but it makes for a better net as a whole. Faster is better.


The bad part... 



  • Developers will be forced to dead pool worthy projects that don't have a business model. 

  • Developers will be forced to port apps to other platforms. That could be a painful pill to swallow when they aren't money-making projects.

  • Developers may be sacrificing analytics to avoid datastore bloat and access charges.


What the App Engine team should do about it



  • Provide better pricing structures for philanthropic and open source projects. App Engine is a great platform for these things and it provides a great playground for developers to support important projects at a low cost while also learning about a platform they can adopt for larger, commercial projects down the road. They've hinted at this but will they do it? - http://code.google.com/appengine/kb/postpreviewpricing.html#special_programs_...

  • Provide more runway for optimization. A couple of weeks for developers to roll up their sleeves and optimize their apps just isn't enough time.

  • Provide better analysis tools to highlight problems.

  • Take baby steps. Must they really take these giant leaps in pricing?

  • Roll out Python 2.7 to support concurrent requests in Python projects.


In the process of writing this post I found some great resources...