Export and structure your musical activity with schema.org

Following my recent post on schema.org and personalisation on the Web, I wrote a music actions exporter for various services, including Facebook, Deezer, and last.fm. Available at http://music-actions.appspot.com, it’s mostly a proof-of-concept, but it showcases the ability to uniformly export and structure your data (in this case, music listening actions), whichever service you initially used. Does that ring a bell?

As the previous post focused on why it matters, I’ll cover technical aspects of the exporter here, including the role of JSON-LD for representing content on the Web.

One model to rule them all

The Music Actions exporter is not rocket science. Basically, it translates application-specific JSON data into another JSON representation, open and with shared semantics, using JSON-LD. But that’s also where the power lies: it would take most platforms only a few engineering hours to expose their actions with schema.org if they already expose them through a public API or through user profile pages (think RDFa or microdata). And they would probably enjoy the same benefits as when publishing factual data with schema.org.

Moreover, it will make life easier for developers: understanding a single model / semantics and learning a common set of tools will be enough to get and use data from multiple sources, as opposed to handling multiple APIs as is currently the case – meaning, eventually, more exposure for the service. This is the grand Semantic Web promise, and I’m glad to see it more alive than ever.

In particular, let’s consider the music vertical: interoperable taste profiles, shared playlists, portable collections, death-to-cold-start… you name it, it could finally be done. The promise has been here for a while, many have tried, and it obviously reminds me of some earlier work I did circa 2008 (during and post-Ph.D.), including an initiative with Yves Raimond from the BBC using FOAF, SIOC, MO and more.

Coming back to the exporter, here’s an excerpt of my recent Facebook music.listens activity (mostly gathered from Spotify here) exported as JSON-LD, with a longer feed here.

{
  "@context": {
    "@vocab": "http://schema.org/",
    "agent_of": {
      "@reverse": "http://schema.org/agent"
    }
  },
  "@id": "http://facebook.com/alexandre.passant",
  "url": "http://facebook.com/alexandre.passant",
  "name": "Alexandre Passant",
  "@type": "Person",
  "agent_of": [{
    "@type": "ListenAction",
    "object": {
      "@id": "http://open.spotify.com/track/1B930FbwpwrJKKEQOhXunI",
      "url": "http://open.spotify.com/track/1B930FbwpwrJKKEQOhXunI",
      "@type": "MusicRecording",
      "name": "Represent (Rocked Out Mix)",
      "audio": "http://open.spotify.com/track/1B930FbwpwrJKKEQOhXunI",
      "byArtist": [{
        "@id": "http://open.spotify.com/artist/3jOstUTkEu2JkjvRdBA5Gu",
        "url": "http://open.spotify.com/artist/3jOstUTkEu2JkjvRdBA5Gu",
        "@type": "MusicGroup",
        "name": "Weezer"
      }],
      "inAlbum": [{
        "@id": "http://open.spotify.com/album/0s56sFx1BJMyE8GGskfYJX",
        "url": "http://open.spotify.com/album/0s56sFx1BJMyE8GGskfYJX",
        "@type": "MusicAlbum",
        "name": "Hurley"
      }]
    }
  }]
}

For every service, it returns the most recent tracks listened to (as ListenAction), including – when available – additional data about artists and albums. In the case of Deezer and last.fm, that information is already in the history feed, while for Facebook, this requires additional calls to the Graph API, querying individual song entities in their data graph.
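To make the translation step concrete, here is a minimal Python sketch. It is not the exporter's actual code: the input field names (`url`, `title`, `artist`, `album`) and the helpers `to_listen_action` and `export_user` are hypothetical, and the context relies on JSON-LD's standard `@vocab` keyword to map terms to schema.org.

```python
def to_listen_action(track):
    """Map one service-specific track record (hypothetical shape) to a ListenAction."""
    recording = {
        "@id": track["url"],
        "url": track["url"],
        "@type": "MusicRecording",
        "name": track["title"],
    }
    # Artist and album data are optional: not every service feed includes them.
    if "artist" in track:
        recording["byArtist"] = [{"@type": "MusicGroup", "name": track["artist"]}]
    if "album" in track:
        recording["inAlbum"] = [{"@type": "MusicAlbum", "name": track["album"]}]
    return {"@type": "ListenAction", "object": recording}


def export_user(user_url, user_name, tracks):
    """Build a user-centric JSON-LD document from a list of track records."""
    return {
        "@context": {
            "@vocab": "http://schema.org/",
            "agent_of": {"@reverse": "http://schema.org/agent"},
        },
        "@id": user_url,
        "@type": "Person",
        "name": user_name,
        "agent_of": [to_listen_action(track) for track in tracks],
    }


# Made-up input standing in for a service's history feed.
doc = export_user(
    "http://example.org/users/alice",
    "Alice",
    [{"url": "http://example.org/track/1", "title": "Some Song", "artist": "Weezer"}],
)
```

Whatever the source service, only the per-service mapping in `to_listen_action` would change; the output shape stays the same.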

Using Google Cloud Endpoints as an API layer

Since the exporter works as a simple API, I’ve implemented it using Google Cloud Endpoints. As part of Google’s Cloud offering, it greatly facilitates the process of building Web-based APIs. No need to build a full – albeit lightweight – application with routes / handlers (webapp2, etc.): document the API patterns (Request and Response messages), define the application logic, and let the infrastructure manage everything.

It also automatically provides a web-based front-end to test the API, along with other advantages of the Google App Engine infrastructure, such as Web-based log management, so you can trace production errors without logging in to a remote box.

GAE Endpoints API Explorer

The only issue is that it can’t directly return JSON-LD, since it encapsulates everything in the following response.

{
  "kind": "musicactions#resourcesItem",
  "etag": "\"_oj1ynXDYJ3PHpeV8owlekNCPi4/NH17nWS3hMc3GSHWziswWp2pTFk\"",
  "data": "some action data"
}

Thus, if you use the exporter, you’ll need to parse the response, extract the data string value, and then parse that string as JSON to get the “real” JSON-LD data. That’s not a big deal, as you probably won’t link to the API URL anyway since it contains your private authentication tokens. But it’s worth keeping in mind for some projects.
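The two-step parsing can be sketched in a few lines of Python. The helper name `extract_jsonld` and the sample response are made up for illustration (no real API call is made); only the `kind` / `etag` / `data` envelope comes from the response shown above.

```python
import json


def extract_jsonld(response_body):
    """Unwrap the Endpoints envelope and parse the embedded JSON-LD string."""
    envelope = json.loads(response_body)
    # The "data" field holds the JSON-LD document serialised as a string,
    # so it has to be parsed a second time.
    return json.loads(envelope["data"])


# A hand-made response standing in for a real API call.
sample = json.dumps({
    "kind": "musicactions#resourcesItem",
    "etag": "\"example-etag\"",
    "data": json.dumps({"@context": "http://schema.org", "@type": "ListenAction"}),
})

document = extract_jsonld(sample)
print(document["@type"])
```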

JSON-LD and the beauty of RDF

Last but not least: the use of JSON-LD, augmenting JSON with the concept of “Linked Data”, i.e. “meanings, not strings”.

Let’s look at the representation of 2 ListenAction instances for the same user (using their Facebook IDs in this example). The JSON-LD serialisation is as follows. I’m using the @graph keyword to represent two statements about distinct objects (as those are 2 different ListenAction instances) in the same document, but I could have used multiple contexts.

{
  "@context": "http://schema.org",
  "@graph": [{
    "@type": "ListenAction",
    "agent": {
      "@id": "http://graph.facebook.com/607513040",
      "name": "Alexandre Passant",
      "@type": "Person"
    },
    "object": {
      "@id": "http://graph.facebook.com/10150500879645722",
      "name": "My Name Is Jonas",
      "@type": "MusicRecording"
    }
  }, {
    "@type": "ListenAction",
    "agent": {
      "@id": "http://graph.facebook.com/607513040",
      "name": "Alexandre Passant",
      "@type": "Person"
    },
    "object": {
      "@id": "http://graph.facebook.com/10150142973310868",
      "name": "Buddy Holly",
      "@type": "MusicRecording"
    }
  }]
}

Below is the corresponding graph representation, with 2 nodes for the same agent (i.e. the user committing the action).

Representing ListeningActions with JSON-LD

Yet, an interesting aspect of JSON-LD is its relation with RDF – the Resource Description Framework and its graph model especially suited for the Web. As JSON-LD uses @ids as common node identifiers, a.k.a. URIs, those 2 agents are actually the same, and so the graph looks like:

Merging agents with JSON-LD
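The merge can be illustrated with a simplified Python sketch that indexes nodes by their @id. This is not a JSON-LD processor (a real one would, among other things, also label the blank nodes); `merge_nodes` is a hypothetical helper that only shows why two descriptions sharing an @id end up as a single node.

```python
def merge_nodes(graph):
    """Index every node that has an @id, merging properties of repeated ids."""
    nodes = {}

    def visit(value):
        if isinstance(value, list):
            for item in value:
                visit(item)
        elif isinstance(value, dict):
            node_id = value.get("@id")
            if node_id is not None:
                merged = nodes.setdefault(node_id, {})
                # Copy literal properties only; nested nodes are visited below.
                for key, prop in value.items():
                    if not isinstance(prop, (dict, list)):
                        merged[key] = prop
            for prop in value.values():
                visit(prop)

    visit(graph)
    return nodes


agent = {"@id": "http://graph.facebook.com/607513040",
         "name": "Alexandre Passant", "@type": "Person"}
graph = [
    {"@type": "ListenAction", "agent": dict(agent),
     "object": {"@id": "http://graph.facebook.com/10150500879645722",
                "name": "My Name Is Jonas", "@type": "MusicRecording"}},
    {"@type": "ListenAction", "agent": dict(agent),
     "object": {"@id": "http://graph.facebook.com/10150142973310868",
                "name": "Buddy Holly", "@type": "MusicRecording"}},
]

nodes = merge_nodes(graph)
print(len(nodes))  # 3: one Person and two MusicRecording nodes
```

The two ListenAction instances carry no @id, so they stay out of the index: they are the blank nodes of the graph.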

Finally, an interesting property of RDF / JSON-LD graphs is their directed edges. Thus, instead of writing the previous statement from an Action-centric perspective, with un-identified action instances (a.k.a. blank nodes), we can write it from a User-centric perspective using an inverse property (“reverse” in the JSON-LD world), as follows.

Using inverse properties in JSON-LD

This leads to the following JSON-LD document, thanks to the definition of an additional reverse property in the context. IMO, this makes the document easier to understand, since it’s now user-centric: the user / Person is the core element of the document, with edges from it to the actions it contributes to.

{
  "@context": {
    "@vocab": "http://schema.org/",
    "agent_of": {
      "@reverse": "http://schema.org/agent"
    }
  },
  "@id": "http://graph.facebook.com/607513040",
  "name": "Alexandre Passant",
  "@type": "Person",
  "agent_of": [{
    "@type": "ListenAction",
    "object": {
      "@id": "http://graph.facebook.com/10150500879645722",
      "name": "My Name Is Jonas",
      "@type": "MusicRecording"
    }
  }, {
    "@type": "ListenAction",
    "object": {
      "@id": "http://graph.facebook.com/10150142973310868",
      "name": "Buddy Holly",
      "@type": "MusicRecording"
    }
  }]
}
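Going from the action-centric form to the user-centric one is a mechanical rewrite, sketched below in Python. The function `to_user_centric` is hypothetical and assumes every action in the @graph shares the same agent @id; the `@vocab` entry in the context is the standard JSON-LD way to map terms to schema.org.

```python
def to_user_centric(document):
    """Rewrite an action-centric @graph (single agent) as a user-centric document."""
    actions = document["@graph"]
    agent = dict(actions[0]["agent"])  # copy so the input is left untouched
    if any(action["agent"]["@id"] != agent["@id"] for action in actions):
        raise ValueError("all actions must share the same agent @id")
    result = {
        "@context": {
            "@vocab": "http://schema.org/",
            "agent_of": {"@reverse": "http://schema.org/agent"},
        },
    }
    result.update(agent)
    # Each action keeps everything except its agent, which is now the root node.
    result["agent_of"] = [
        {key: value for key, value in action.items() if key != "agent"}
        for action in actions
    ]
    return result


person = {"@id": "http://graph.facebook.com/607513040",
          "name": "Alexandre Passant", "@type": "Person"}
action_centric = {
    "@context": "http://schema.org",
    "@graph": [
        {"@type": "ListenAction", "agent": person,
         "object": {"@id": "http://graph.facebook.com/10150500879645722",
                    "name": "My Name Is Jonas", "@type": "MusicRecording"}},
        {"@type": "ListenAction", "agent": person,
         "object": {"@id": "http://graph.facebook.com/10150142973310868",
                    "name": "Buddy Holly", "@type": "MusicRecording"}},
    ],
}

user_centric = to_user_centric(action_centric)
```

Both documents describe the same graph; only the node chosen as the root of the JSON tree differs.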

From shared actions to shared entities

While being (for now) a proof of concept, the exporter is a first step towards a common integration of musical actions on the Web. Of course, the same pattern / method could be applied to any other vertical. But, more interestingly, we can hope that services will directly publish their actions using schema.org, as they’ve been doing for other facts – for instance artist concert data, now enriching Google’s search results through their Knowledge Graph.

In addition, an interesting next step would be to use common object identifiers across services, in order to not only share a common semantics about actions, but also about the objects used in those actions. This could be achieved by referring to open knowledge bases such as Freebase, or using vertical-specific ones such as our new seevl API in the music area. Oh, and there will be more to come about seevl and actions in the near future. Interested? Let’s connect.
