The case for Task Queues on Google App Engine: Pinging remote APIs

I’ve spent the past few months building YapMe, and our first MVP was released on the App Store a few days ago! The app aims to bridge the gap between photos and videos, letting users take pictures with up to 25 seconds of ambient sound, in a single click.

To build it, I decided to rely fully on the Google Cloud Platform: App Engine, Datastore, Endpoints, and more. I’ll blog about the overall experience later, but here’s a quick post about a particular topic: Task Queues.

Gathering user metrics

As with every new product, metrics matter. To gather them, we use various APIs and toolkits: Crashlytics, Google Analytics, and Intercom.

While Crashlytics and Google Analytics calls are made directly from the device, Intercom calls are made from the back-end. So, for instance, when adding a new followee, instead of doing

- (iOS) /POST add_followee to YapMe
 -- (YapMe back-end) User.add_followee(other)
 -- (YapMe back-end) 200 OK
- (iOS) /POST add_followee to Intercom
 -- (Intercom back-end) set "add followee" metric
 -- (Intercom back-end) 200 OK

Or

- (Android) /POST add_followee to YapMe
 -- (YapMe back-end) User.add_followee(other)
 -- (YapMe back-end) 200 OK
- (Android) /POST add_followee to Intercom
 -- (Intercom back-end) set "add followee" metric
 -- (Intercom back-end) 200 OK

We simply do

- (Android | iOS) /POST add_followee to YapMe
 -- (YapMe back-end) User.add_followee(other)
 -- (YapMe back-end) /POST add_followee to Intercom
 ---- (Intercom back-end) set "add followee" metric
 ---- (Intercom back-end) 200 OK
 -- (YapMe back-end) 200 OK

Here are a few reasons for this:

  • Unlike our Crashlytics or GA metrics, which relate to the app itself (e.g. session length), our Intercom metrics relate to actions on database entities (e.g. creating a new yap, or following a user). As those actions are recorded in the back-end, it makes sense to gather the metrics at the same time;
  • Some metrics are conditional, and those conditions are evaluated on the back-end (e.g. “has the media already been shared by the user?”). Pushing metrics from the app would require another round of back-and-forth between the device and the API;
  • We’ll eventually have multiple clients (iOS, Web, Android), so handling the metrics on the back-end saves us from implementing them in every client. This is especially useful when updates are required: they can be done on the back-end without pushing new app releases.
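
To make the second point concrete, here is a minimal sketch of a back-end handler that performs the action and reports the metric in the same place. All names here (User, handle_add_followee, send_metric) and the in-memory storage are hypothetical and only serve the illustration; this is not YapMe's actual code.

```python
# Hypothetical sketch: names and data structures are illustrative only.


class User(object):
    def __init__(self, user_id):
        self.user_id = user_id
        self.followees = set()

    def add_followee(self, other):
        """Follow `other`; return True only if the relationship is new."""
        if other.user_id in self.followees:
            return False
        self.followees.add(other.user_id)
        return True


def handle_add_followee(user, other, send_metric):
    """Body of the API call: perform the action, then report the metric.

    `send_metric` is any callable taking (user_id, event_name). It stands in
    for whatever forwards the event to the analytics provider.
    """
    is_new = user.add_followee(other)
    # The condition is evaluated on the back-end, so the client needs no
    # extra round trip to decide whether the metric should be sent.
    if is_new:
        send_metric(user.user_id, "add followee")
    return 200
```

Because the client only ever calls the one API endpoint, changing the condition or the event name is a back-end deployment rather than an app release.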

Pinging remote endpoints with Task Queues

I initially deployed those back-end metrics with a simple urlfetch (the GAE API for remote requests), and was bugged by some queries that were more time-consuming than expected. Using the new Cloud Trace tool, I noticed that the Intercom queries were taking a while on the back-end, as seen in the trace below. It tracks a single API call on our back-end, in which the two urlfetch.Fetch() calls are the ones made to the Intercom API.

Using Cloud Trace to debug remote API calls

There are a few ways to handle this and make sure the main API call continues without waiting for a reply from Intercom (I don’t really care whether the call is a success or not; we’re OK losing a metric if something goes wrong):

  • Use async urlfetch requests. Yet this keeps the connection open, while I just want a simple ping and don’t need to handle the query result;
  • Use a Python thread. In this case, the task is threaded (so the main API call can exit), but it runs on the same instance as the one that initiated the thread, consuming resources there;
  • Use a Task Queue. The Intercom query is pushed to a separate push queue that is processed immediately and auto-scales, delegating the work to a new module in our case.
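
As a rough illustration of the third option, here is a minimal sketch of the enqueueing side, using the App Engine taskqueue.add() call. The queue name, worker URL, and module name are assumptions made up for this example, as is the enqueue_metric helper itself. The add_task parameter exists only so the sketch can be exercised outside App Engine.

```python
import logging


def enqueue_metric(user_id, event_name, add_task=None):
    """Push a metric onto a push queue and return immediately.

    Returns True if the task was enqueued, False otherwise. A failure is
    only logged, since losing a metric is acceptable here.
    """
    if add_task is None:
        # Imported lazily so the sketch can be read and tested off App Engine.
        from google.appengine.api import taskqueue
        add_task = taskqueue.add
    try:
        add_task(
            url="/tasks/metric",    # hypothetical worker route
            queue_name="metrics",   # hypothetical queue declared in queue.yaml
            target="metrics",       # hypothetical module that runs the worker
            params={"user_id": user_id, "event": event_name},
        )
        return True
    except Exception:
        # The main API call must not fail because a metric could not be queued.
        logging.exception("Could not enqueue metric %r", event_name)
        return False
```

The main API call only pays for the enqueue operation; the remote request itself happens later, on whichever instance picks up the task.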

Using a Task Queue gives the following trace result. The API call takes less than 10% of the original time and delegates all the processing and resources to another module, so the main module and its instances are not overloaded by simple pinging tasks.

Pushing remote API calls in a Task Queue

Note that we’re using the same approach to implement push notifications, which are now available in our new release. In both cases, pushing data into the queue and handling it is straightforward, as described in Google’s push queues tutorial. Note that push queues are executed on App Engine, which means you cannot do advanced processing (such as image processing). For that, we rely on pull queues. More about this later.
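
For the handling side, a push-queue worker is just a request handler that receives the task parameters. Below is a minimal sketch of what its body could look like, assuming a placeholder endpoint URL and payload shape (the real Intercom URL, authentication, and field names are deliberately not shown). The process_metric_task name and the fetch parameter are hypothetical, the latter being there so the sketch can run outside App Engine.

```python
import json
import logging

# Placeholder endpoint, not the real Intercom API.
EVENTS_URL = "https://analytics.example.invalid/events"


def process_metric_task(params, fetch=None):
    """Forward one queued event to the remote API and return an HTTP status.

    Always returns 200: a push queue retries a task whose handler answers
    with a non-2xx status, and here a lost metric is preferable to retries.
    """
    if fetch is None:
        # Imported lazily so the sketch can be read and tested off App Engine.
        from google.appengine.api import urlfetch
        fetch = urlfetch.fetch
    payload = json.dumps({
        "user_id": params["user_id"],
        "event_name": params["event"],
    })
    try:
        result = fetch(
            url=EVENTS_URL,
            payload=payload,
            method="POST",
            headers={"Content-Type": "application/json"},
            deadline=10,  # seconds; an arbitrary value for this sketch
        )
        if result.status_code >= 400:
            logging.warning("Metric rejected with status %s", result.status_code)
    except Exception:
        logging.exception("Metric call failed; dropping the event")
    return 200
```

In a real handler, params would come from the POST body of the task request, and the returned status would be set on the response.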
