Saturday, September 3, 2011

Madison Transit API Homebrew

A post on my Madison Transit API homebrew is long overdue. It was an incredibly fun project, and when you see how it has enabled others, you realize how much more compelling it is than the simple SMS app I had originally created.

The API has been out for a long time, but there was very little usage early on. I'm not sure what sparked the flame, but in the last few months there has been a lot of interest and activity from developers. It dawned on me that I've yet to write about it. And that's a shame, because the API is actually the most clever piece of the whole SMSMyBus effort.

In the beginning
In the beginning, there was a very specific goal of creating an SMS app that delivered real-time arrival estimates. That was it. It was in response to my own need - as a commuter - for such an app. It was also going to be an entry into the Twilio developer contest that was run the week they released the SMS API publicly.

I knew going in that the Metro didn't provide a formal API or even an XML feed, but I'd done enough HTML screen scraping before to know how to overcome it. Little did I know how tedious that would become with the Metro data. So what started out as a simple texting application turned into a day of cursing the poorly formatted HTML I found on my screen. And every minute of the way I could be heard asking the same question, "Why doesn't the Metro have an API?!?"

When I finished, I swore that I would never let anyone else suffer the pain that I endured, so I decided to abstract the work I did for the texting app and provide access to my data so other developers could get to the transit data through a standard web service interface.

Why it's clever
The ugly work of screen scraping was now done and buried in the implementation of the SMS application. But by creating a couple of other web handlers, I could easily make the same scheduling data accessible via a web service call.
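To make the idea concrete, here is a minimal sketch of hiding a scraper behind a handler that returns JSON. The HTML snippet, the field names, and the function names are all invented for illustration; this is not the Metro's real markup or the SMSMyBus code, and it uses Python's standard-library parser to stay self-contained.

```python
import json
from html.parser import HTMLParser

# Invented sample markup; the Metro's real pages are not reproduced here.
SAMPLE_HTML = """
<table>
  <tr><td>4</td><td>8:15 AM</td></tr>
  <tr><td>6</td><td>8:22 AM</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td>, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def scrape_arrivals(html):
    """The ugly part: turn table rows into plain dictionaries."""
    parser = CellCollector()
    parser.feed(html)
    return [{"route": r[0], "time": r[1]} for r in parser.rows if len(r) == 2]


def arrivals_handler(html):
    """The clean part: what a web handler might send back to a caller."""
    return json.dumps({"arrivals": scrape_arrivals(html)})


print(arrivals_handler(SAMPLE_HTML))
```

The caller only ever sees the JSON, so the scraping can be as messy as it needs to be without leaking into the interface.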

For example, a single web service request will return real-time arrival estimates for routes traveling through the stop at Main and Carroll on the square. That's the way every developer wants to query data from a web service.
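A request like that might be assembled as in the sketch below. The base URL, the parameter names (key, stopID, routeID), and the stop ID are placeholders I made up for illustration; check the API's own documentation for the real ones.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; not the API's real address.
BASE_URL = "https://api.example.invalid/v1"


def build_getarrivals_url(stop_id, route_id=None, key="YOUR_KEY"):
    """Build a getarrivals request URL, optionally filtered by route.

    The parameter names here are assumptions, not the documented ones.
    """
    params = {"key": key, "stopID": stop_id}
    if route_id is not None:
        params["routeID"] = route_id
    return f"{BASE_URL}/getarrivals?{urlencode(params)}"


# "1101" is an invented stop ID used only to show the shape of the URL.
print(build_getarrivals_url("1101"))
print(build_getarrivals_url("1101", route_id="4"))
```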

With a little more scraping, I could also find the physical location of every stop (latitude and longitude) so developers could query the location of stops by a stop ID and could also query all nearby stops based on a lat/long coordinate. In the end, there was a pretty impressive looking transit API - simple to use and understand for new developers. And it didn't matter if it was ugly in its technique of screen scraping inside the implementation. It had a clean interface for the developers.
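A nearby-stops query can be sketched with the haversine formula once every stop has a latitude and longitude. This is one common way to do it, not necessarily how the API does it; the stop records and coordinates below are invented sample data.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_stops(stops, lat, lon, radius_m):
    """Return the stops within radius_m of (lat, lon), nearest first."""
    hits = []
    for stop in stops:
        d = haversine_m(lat, lon, stop["lat"], stop["lon"])
        if d <= radius_m:
            hits.append((d, stop))
    hits.sort(key=lambda pair: pair[0])
    return [stop for _, stop in hits]


# Invented stops: "A" is about a block away, "B" a couple of kilometers off.
STOPS = [
    {"id": "B", "lat": 43.0900, "lon": -89.3600},
    {"id": "A", "lat": 43.0740, "lon": -89.3850},
]

print(nearby_stops(STOPS, 43.0747, -89.3842, radius_m=500))
```

A linear scan like this is fine for a few thousand stops; a larger system would probably want a spatial index instead.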

Supported API methods include...
getarrivals - Get real-time arrival estimates for any stop ID with the option of filtering by route ID.
getroutes - Get a list of every route in the Metro system.
getstops - Get the details for every stop a route travels through.
getnearbystops - Get the details for all stops within a given distance of a latitude/longitude.
getstoplocation - Get the location details for a stop given a stop ID.
getvehicles (not implemented) - Get details for a particular vehicle on the road.
getservicebulletins (not implemented) - Get a list of service bulletins in the Metro system.

Why I Built It
As I noted, a big reason for building the API was simply to save the next person the trouble of building up a new set of screen scraping tricks to get the data. But I also knew that most people who have attempted this have stopped at the schedule data. I don't know of anything that has gone the extra step of getting the route and stop location data as well.

More importantly, I wanted to find a way to contribute to the larger mission of opening up a public dataset to make it more accessible. Open data systems are important for lots of reasons, but most importantly, they allow communities to operate more efficiently. And nowhere is that more evident than in transit. Every city, state, and national government in the world should be on a path towards open data right now.

Unfortunately, Madison has been slow to find its course. My dream was that by creating the API and recruiting some more developers to use it, we'd have enough applications to take back to the city to say, "Look! If you open more data, developers will do great things!"

I also did it for the intellectual challenge of building an API. I've been an API consumer for a long time, but had never constructed one myself. It's an exercise I encourage any API consumer to go through. You'll have a whole new perspective on what it takes to build, document and support an API.

Sample Applications
In theory, all of the original interfaces - SMS, Google chat, email, and phone - could have been re-implemented using this API, but I didn't go back and do that. But lots of new apps have been built. For example, the bus kiosk displays hanging in a few local businesses use the API to visualize arrival estimates for nearby stops.

Larry Walker used the API to build a Chumby app and also used it to show arrival estimates on a small LED display driven by an Arduino. He also built a mobile browser app to more easily access arrival estimates. And I used the API to build a Google gadget so you can get scheduling data in the sidebar of Gmail. The attached gallery shows off some of these examples.

Go get started building your own Madison transit app!


  1. A great read Greg! I feel your pain for HTML scraping, but the product is fantastic! Did you send your work to Madison Metro? I feel like they should have hired you to implement the official API!

  2. I would gladly hand this over to the Metro, but they don't seem to be interested.

  3. Awesome work Greg! Not just on the technical front, but on the civic improvement front.

  4. Great story Greg, thanks for sharing. I wanted to write up a couple stories on this, but would like to talk to you more first. Can you email me at

  5. what were you using to scrape the site with? Great post by the way.

  6. @todd I use BeautifulSoup -