The new schema.org actions: What they mean for personalisation on the Web

The schema.org initiative just announced the release of a new action vocabulary. As their blog post emphasises:

The Web is not just about static descriptions of entities. It is about taking action on these entities.

Whether they’re online or offline, publishing those actions in a machine-readable format follows TimBL’s “Weaving the Web” vision of the Web as a social machine.

It’s even more relevant when the online and the offline world become one, whether it’s through apps (4square, Uber, etc.) or via sensors and wearable tech (mobile phones, Glass, etc.). A particular aspect I’m interested in is how those actions can help to personalise the Web.

The rise of dynamic content and structured data on the Web

This is not the first time actions (at least online ones) have been used on the Web: think of Activity Streams, Web Intents, as well as SIOC-Actions, which I worked on with Pierre-Antoine Champin a few years ago.

Yet, considering the recent advances on structured Web data (schema.org, Google’s Knowledge Graph, Facebook OpenGraph, Twitter cards…), this addition is a timely move. Everyone can now publish their actions using a shared vocabulary, meaning that apps and services can consume them openly, pending the correct credentials and privacy settings. And that’s a big move for personalisation.

Personalising content from distributed data

Let’s consider my musical activity. Right now, I can plug my services into Facebook and use the Graph API to retrieve my listening history. Or query APIs such as the Deezer one. Or check my Twitter and Instagram feeds to remember some of the records I’ve put on my turntable. Yet, if all of them published actions using the new ListenAction type, I could use a single query engine to get the data from those different endpoints.

Deezer could describe actions using the following JSON-LD, and Spotify with RDFa, but it doesn’t really matter – as both would agree on shared semantics through a single vocabulary.

<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "ListenAction",
  "agent": {
    "@type": "Person",
    "name": "Alex"
  },
  "object": {
    "@type": "MusicGroup",
    "name": "The Clash"
  },
  "instrument": {
    "@type": "WebApplication",
    "name": "Deezer",
    "url": "http://deezer.com"
  }
}
</script>

Ultimately, that means that every service could gather data from different sources to meaningfully extract information about myself, and deliver a personalised experience as soon as I log in.
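To make the idea a bit more concrete, here is a minimal sketch of such a consumer. It assumes each service's actions have already been fetched and parsed into Python dictionaries (an RDFa source would need converting first). The Spotify payload and the `listens_per_artist` helper are made up for illustration; nothing here is an existing API.

```python
from collections import Counter

# Hypothetical payloads: what two services might publish for the same user.
deezer_actions = [
    {
        "@context": "http://schema.org",
        "@type": "ListenAction",
        "agent": {"@type": "Person", "name": "Alex"},
        "object": {"@type": "MusicGroup", "name": "The Clash"},
        "instrument": {"@type": "WebApplication", "name": "Deezer"},
    },
]
spotify_actions = [
    {
        "@context": "http://schema.org",
        "@type": "ListenAction",
        "agent": {"@type": "Person", "name": "Alex"},
        "object": {"@type": "MusicGroup", "name": "The Clash"},
        "instrument": {"@type": "WebApplication", "name": "Spotify"},
    },
    {
        "@context": "http://schema.org",
        "@type": "ListenAction",
        "agent": {"@type": "Person", "name": "Alex"},
        "object": {"@type": "MusicGroup", "name": "The Specials"},
        "instrument": {"@type": "WebApplication", "name": "Spotify"},
    },
]


def listens_per_artist(*sources):
    """Count ListenAction entries per artist name across all sources."""
    counts = Counter()
    for actions in sources:
        for action in actions:
            # Ignore anything that is not a listening action.
            if action.get("@type") != "ListenAction":
                continue
            name = action.get("object", {}).get("name")
            if name:
                counts[name] += 1
    return counts


profile = listens_per_artist(deezer_actions, spotify_actions)
print(profile.most_common())
```

Because both sources share the same vocabulary, the counting code never needs to know which service an action came from.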

You might think that Facebook enables this already with the Graph API. Indeed, but the data needs to be in Facebook. This is not always the case, either because the seed services haven’t implemented (or have removed) the proper connectors, or because you didn’t allow them to share your actions.

In this new configuration, I could decide, for every service I log in to, which sources it can access. Logging in to a music platform? Let it access my Deezer and Spotify profiles, where some schema.org Actions can be found. Booking a restaurant? Check my OpenTable ones. From there, those services can quickly build my profile and start personalising my online experience.

In addition, websites could decide to use background knowledge to enrich one’s profile, using vertical databases, e.g. Factual for geolocation data or our recently relaunched seevl API for music meta-data, combined with advanced heuristics such as time decay, actions-objects granularity and more to enhance the profiling capabilities (if you’re interested in the topic, check the slides of Fabrizio Orlandi’s Ph.D. viva).
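As an illustration of the time-decay heuristic, here is one possible sketch using an exponential half-life. It relies on the `endTime` property that schema.org defines for actions, and assumes it holds an ISO 8601 timestamp with an explicit offset. The 30-day half-life, the dates, and the function names are arbitrary choices for the example, not something taken from the work cited above.

```python
from datetime import datetime, timezone


def decayed_weight(action, now, half_life_days=30.0):
    """Return 1.0 for an action ending now, 0.5 after one half-life, and so on."""
    ended = datetime.fromisoformat(action["endTime"])
    age_days = (now - ended).total_seconds() / 86400
    # Clamp negative ages (timestamps in the future) to zero.
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def interest_scores(actions, now, half_life_days=30.0):
    """Sum the decayed weights of all actions, grouped by object name."""
    scores = {}
    for action in actions:
        name = action["object"]["name"]
        weight = decayed_weight(action, now, half_life_days)
        scores[name] = scores.get(name, 0.0) + weight
    return scores


# Made-up sample data: one listen today, one listen 30 days ago.
now = datetime(2014, 5, 1, tzinfo=timezone.utc)
actions = [
    {
        "@type": "ListenAction",
        "object": {"@type": "MusicGroup", "name": "The Clash"},
        "endTime": "2014-05-01T00:00:00+00:00",
    },
    {
        "@type": "ListenAction",
        "object": {"@type": "MusicGroup", "name": "The Specials"},
        "endTime": "2014-04-01T00:00:00+00:00",
    },
]
scores = interest_scores(actions, now)
print(scores)
```

With this weighting, a band I listened to last month counts for half as much as one I listened to today, so the profile drifts along with my tastes.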

Privacy matters

This way of personalising content could also have important privacy implications. By selecting which sources a service can access, I implicitly block access to data that is non-relevant or too private for that particular service – as opposed to granting access to all my content.

Going further, we can imagine a privacy-control matrix where I can select not only the sources, but also the action types to be used, keeping my data safe and avoiding freakomendations. I could provide my 4square eating actions (restaurants I’ve checked in to) to a food website, but offer my musical background (concerts I’ve been to) to a music app, keeping both separate.
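Such a matrix could be as simple as a lookup from each consuming service to the (source, action type) pairs I have allowed. The sketch below is purely hypothetical: the consumer names, the policy itself, and the choice of reading the source from `instrument.name` are assumptions for the example. `ListenAction` and `CheckInAction` are existing schema.org types.

```python
# Hypothetical policy: consumer -> set of (source, action type) pairs I allow.
PRIVACY_MATRIX = {
    "food-website": {("Foursquare", "CheckInAction")},
    "music-app": {("Deezer", "ListenAction")},
}


def allowed_actions(consumer, actions, matrix=PRIVACY_MATRIX):
    """Keep only the actions whose (source, type) pair the consumer may see."""
    allowed = matrix.get(consumer, set())
    return [
        action
        for action in actions
        if (action.get("instrument", {}).get("name"), action.get("@type")) in allowed
    ]


# Made-up sample data mixing two sources and two action types.
my_actions = [
    {
        "@type": "CheckInAction",
        "instrument": {"@type": "WebApplication", "name": "Foursquare"},
        "object": {"@type": "Restaurant", "name": "Some restaurant"},
    },
    {
        "@type": "ListenAction",
        "instrument": {"@type": "WebApplication", "name": "Deezer"},
        "object": {"@type": "MusicGroup", "name": "The Clash"},
    },
]

for_food = allowed_actions("food-website", my_actions)
for_music = allowed_actions("music-app", my_actions)
for_unknown = allowed_actions("some-other-site", my_actions)
print(len(for_food), len(for_music), len(for_unknown))
```

A service that is not in the matrix gets nothing by default, which matches the idea of implicitly blocking access rather than granting it to everything.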

Of course, websites should be smart enough to know which action they require, doing a source/action pre-selection for me. This could ultimately solve some of the trust issues often discussed when talking about personalisation, as Facebook’s Sam Lessin addressed in his keynote on the future of travel.

What’s next?

As you can see, I’m particularly interested in what’s going to happen with this new schema.org update, from both the publishers’ and the consumers’ points of view.

It will also be interesting to see how mappings could emerge between it and the Facebook Graph API, adding another level of interoperability in this quest to make the Web a social space.

11 thoughts on “The new schema.org actions: What they mean for personalisation on the Web”

  1. This specification looks great, Alexandre, thank you for your explanation and analysis! Indeed, it really looks like a distributed version of the Graph API, released by Facebook a few years ago. Creating interoperable formats is great, but (1) do you know which services (especially in the music domain) are exposing / intending to expose data using this format?

    Also, I must admit that I’m a bit disappointed because I thought at first that the action vocabulary was made to *trigger* actions on virtually any service (API) that implements it. For example: I would send “I want to bookmark this place” to a server (in a standardised/generic format), and that server would propose to bookmark it on all the location services that I use (e.g. Foursquare, Yelp, Google Maps…), assuming that I authorised access to my accounts, of course. In short, this would be a distributed alternative to Facebook’s OpenGraph. (2) Do you know any way to achieve this?

    1. Hey Adrien, glad you enjoyed it!

      (1) I’m not aware of any (yet), but since most music services already provide users’ listening history through their API, that shouldn’t be too complex. I’ve actually spent a few hours over the week-end building an API that translates FB and Deezer listening activity into schema.org actions, and hope to deploy this today.

      (2) Yes, you can do that. If you look at https://developers.google.com/gmail/actions/reference/one-click-action you’ll see how you can implement such behavior in Gmail (if you’ve seen those “Listen now” buttons in Spotify e-mails, that’s probably how it’s done). You’ll have to write the application logic of the handler (e.g. pushing the bookmark), but you can at least define the trigger using schema.org and JSON-LD. I can already imagine the use-case you have in mind :-)

      1. So, if I understand well, Gmail can push Actions (the same type as the ones you presented in your article, even though they were past actions) to a service. I see that the implied HTTP request is sent by Google’s server instead of the user’s web client, which is substantially different from simply having a direct hyperlink in the body of the email. Besides that, I still see an explicit reference to a specific service to implement that action in Google’s examples. What I had in mind was a “listen to this track in whatever player I’m using” action, rather than a “listen now in Spotify” action. And I’m not sure that schema.org’s Actions are suited for that kind of flow.

        PS: Kudos for developing a translation API for Deezer and FB!
        PS2, just so you know: When trying to post my reply, I got an error saying that WordPress could not connect to my G+ profile or something. => I switched to my Twitter account.

      2. So you’d like the browser to identify which tool to use, rather than going to a remote endpoint to do that task?

        In this music use-case, it might be possible to do something by running Playdar (http://www.playdar.org/) locally. Or maybe sending the query to a local webpage that redirects to whatever player you use, so that the endpoint is just a wrapper to identify which tool is the best to do the job.

        You can probably do that for any kind of action. http://webintents.org/ might also be helpful here.
