Google I/O talk on Semantic Annotations of YouTube videos, featuring our own seevl

Enhancing the Freebase/YouTube API mappings… using Freebase and YouTube

The YouTube Data API v3 is one of those things you’ll definitely fall in love with if you’re into real-world Semantic Web applications, a.k.a. “Things, not words”. With its integration with Freebase (the core of Google’s Knowledge Graph), it’s a concrete and practical showcase of the Web as a distributed database of things and relations, and not only keywords and links between pages.

YouTube Data API v3 with Freebase mappings: the good, the bad, and the ugly

While relatively simple to use, it provides advanced features that let developers build data-driven applications. On the one hand, it lets you search for videos by Freebase entities, as you can try in a recent demo from YouTube themselves. On the other hand, it returns which entities are used or described in a video.
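
To make these two directions concrete, here is a minimal sketch that only builds the request URLs and does not send them. It assumes the topicId parameter of the search endpoint and the topicDetails part of the videos endpoint, as the Data API v3 documents them at the time of writing. API_KEY is a placeholder, and the helper names search_url and topics_url are made up for this illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.googleapis.com/youtube/v3"
API_KEY = "YOUR_API_KEY"  # placeholder, not a real key


def search_url(topic_id, max_results=5):
    """Build a URL that searches for videos about a Freebase topic (mid)."""
    params = [
        ("part", "snippet"),
        ("type", "video"),
        ("topicId", topic_id),  # assumed: Freebase mid, e.g. "/m/022tqm"
        ("maxResults", max_results),
        ("key", API_KEY),
    ]
    return BASE_URL + "/search?" + urlencode(params)


def topics_url(video_id):
    """Build a URL that asks which Freebase topics are attached to a video."""
    params = [("part", "topicDetails"), ("id", video_id), ("key", API_KEY)]
    return BASE_URL + "/videos?" + urlencode(params)


print(search_url("/m/022tqm"))
print(topics_url("ebBjGp7QOGc"))
```

The first URL goes from an entity to videos, the second from a video to its entities (the topicIds mentioned below). Fetching them requires a valid API key.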

Yet identifying topics from videos is a difficult task. If you’re not convinced (and interested in all things Machine Learning related), check out the following Google I/O talk from last year.

Google I/O talk on Semantic Annotations of YouTube videos, featuring our own seevl

While the API generally delivers correct information, it sometimes takes a bit of work to use its results automatically in a music-related context (to be exact, the issues might be in the underlying data rather than in the API itself):

  • In some cases, it provides multiple artists, which is often correct (e.g. Blondie and Debbie Harry) but makes it difficult to find the main one, as the API delivers them all at the same level (topicIds).
  • In others, it returns empty results, as for this Nirvana video (recently deleted, maybe as part of the YouTube music limbo?).
  • Finally, when an awesome band like Weezer decides to cover Coldplay, both bands are returned by the API, even though only one of them is playing.

We tackled these issues when building our former seevl for YouTube plug-in. The plug-in is no longer available, as we’ve moved away from consumer-facing products to refocus on a B2B, turn-key music discovery solution, so I’ve decided to open source the underlying library, which finds who’s playing what (yes, that’s music only) in any YouTube video.

Introducing youplay – who’s and what’s playing in a YouTube music video

The result is youplay, available on PyPI and GitHub, an MIT-licensed Python library that works as an enhancement on top of the YouTube Data API v3 to automatically identify who’s and what’s playing in a music video. It uses various heuristics and data look-ups to find the correct artists when several are returned (unless they’re all playing in the video, like this RHCP + Snoop Dogg version of Scar Tissue), to filter out ambiguous ones, or to find the correct artist and track when the API doesn’t deliver anything.

Here’s an example:

#!/usr/bin/env python
import youplay

# Several artists play on this track: join all their names.
(artists, tracks) = youplay.extract('0UjsXo9l6I8')
print('%s - %s' % (', '.join([artist.name for artist in artists]), tracks[0].name))

# A single artist: take the first match.
(artists, tracks) = youplay.extract('c-_vFlDBB8A')
print('%s - %s' % (artists[0].name, tracks[0].name))

will print

(env)marvin-7:youplay alex$ python sample.py
Jay-Z, Alicia Keys - Empire State of Mind
Dropkick Murphys - Worker's Song

The tool is also packaged with a command-line script that returns JSON data, for easy integration into non-Python apps.

(env)marvin-7:youplay alex$ ./bin/youplay ebBjGp7QOGc
{
  "tracks": [
    {
      "mid": "/m/0dt1kzp", 
      "name": "For My Family"
    }
  ], 
  "artists": [
    {
      "mid": "/m/022tqm", 
      "name": "Agnostic Front"
    }
  ]
}
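
Consuming that output takes only a few lines in any language with a JSON parser. As a rough sketch, here it is in Python for consistency with the rest of this post. The JSON string is the output shown above, and the fallback strings for empty lists are my own assumption, not something the script guarantees.

```python
import json

# Output of `./bin/youplay ebBjGp7QOGc`, copied from the example above.
raw = """
{
  "tracks": [{"mid": "/m/0dt1kzp", "name": "For My Family"}],
  "artists": [{"mid": "/m/022tqm", "name": "Agnostic Front"}]
}
"""


def summarize(payload):
    """Turn the script's JSON output into an 'Artist - Track' string."""
    data = json.loads(payload)
    artists = ", ".join(a["name"] for a in data.get("artists", []))
    tracks = data.get("tracks", [])
    # Assumption: either list may be empty, so fall back to a label.
    title = tracks[0]["name"] if tracks else "(unknown track)"
    return "%s - %s" % (artists or "(unknown artist)", title)


print(summarize(raw))  # Agnostic Front - For My Family
```

In a real integration, the string would come from the script’s standard output rather than from a literal.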

With a little help from my friends

The fun part? All the look-ups (if any) use the Freebase and YouTube APIs themselves, such as:

  • Finding the top tracks of an artist in Freebase and matching them against the video title when the original API call returns only artist names;
  • Identifying whether a song has been recorded by multiple artists;
  • Looking up related YouTube videos to identify the topic they all have in common, and guessing the artist of a video with no API results.
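
To give an idea of what the first and last heuristics can look like, here is a simplified sketch. It is not the actual youplay code: the function names, the normalization rule and the 50% threshold are assumptions chosen for illustration, and the inputs are plain lists standing in for what the Freebase and YouTube calls would return.

```python
from collections import Counter


def normalize(text):
    """Lowercase and keep only letters, digits and single spaces."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return " ".join(cleaned.split())


def match_track(video_title, top_tracks):
    """Return the longest top track whose name appears in the video title."""
    title = " %s " % normalize(video_title)
    hits = [t for t in top_tracks
            if normalize(t) and " %s " % normalize(t) in title]
    return max(hits, key=len) if hits else None


def common_topic(related_topic_ids, min_share=0.5):
    """Return the topic attached to at least min_share of the related videos."""
    counts = Counter()
    for topic_ids in related_topic_ids:
        counts.update(set(topic_ids))  # count each topic once per video
    if not counts:
        return None
    topic, count = counts.most_common(1)[0]
    return topic if count >= min_share * len(related_topic_ids) else None


# Hypothetical inputs: a video title and an artist's top tracks.
print(match_track("Jay-Z ft. Alicia Keys - Empire State of Mind (live)",
                  ["Empire State of Mind", "Some Other Song"]))

# Hypothetical topic ids for three related videos.
print(common_topic([["/m/022tqm", "/m/placeholder_a"],
                    ["/m/022tqm"],
                    ["/m/placeholder_b"]]))
```

The first call prints the matched track name and the second prints the topic shared by two of the three related videos. A real implementation would need more care, for instance with remixes, live versions and misspelled titles.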

Isn’t it a nice way to bridge the gap?

Even though I hope the library will be useful to other music-tech developers, I also wish that it soon becomes obsolete, as Google’s Knowledge Graph and other structured-data efforts on the Web keep growing in terms of AI, infrastructure and APIs/toolkits, making it easier every day to build data-driven applications (if only I had had this 10 years ago when I started digging into the topic!).

Oh, and I’m attending Google I/O next week, and if you’re working on similar projects, ping me and let’s have a chat!
