
Automatically generate album covers with deep learning

Two weeks ago, I wrote about how I used Clarifai’s deep learning image recognition API and Google Prediction to identify an artist’s genre from their album covers. But what about reversing it, and automatically designing album covers for a particular genre, using the insights from that previous experiment?

Inspired by another machine-learning approach that automatically writes new rap lyrics, here’s how Clarifai and Flickr helped me create album covers for an imaginary Doom-metal band!

Read about related experiments in my #MusicTech e-book

Learning about a genre’s iconography

In my previous experiment, I parsed about 300 album covers from different genres (K-pop, Doom-metal, Punk-rock) using the Clarifai image recognition API. The API returned a set of tags for each cover, which let me identify the most representative tags for each genre (i.e. the ones that appear frequently for one genre but not for the others).
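
As a rough illustration of that ranking step, here is a minimal sketch. It is not the code from the original experiment: the scoring rule (share of a genre's covers carrying a tag, minus the highest share in any other genre) is an assumption, and `representative_tags` and the sample data are hypothetical names made up for the example.

```python
from collections import Counter


def representative_tags(tags_by_genre, genre, top_n=10):
    """Rank tags that are frequent in `genre` but rare in the other genres.

    `tags_by_genre` maps a genre name to a list of covers, each cover being
    the list of tags returned by the image recognition API.
    The scoring rule below is an assumption, not the post's exact method.
    """
    def frequencies(covers):
        # Fraction of covers in which each tag appears at least once.
        counts = Counter(tag for cover in covers for tag in set(cover))
        return {tag: n / len(covers) for tag, n in counts.items()}

    target = frequencies(tags_by_genre[genre])
    others = [frequencies(c) for g, c in tags_by_genre.items() if g != genre]
    scores = {
        tag: freq - max((o.get(tag, 0.0) for o in others), default=0.0)
        for tag, freq in target.items()
    }
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Tiny made-up sample, only to show the shape of the input.
sample = {
    "doom-metal": [["horror", "dark", "sky"], ["horror", "water", "dark"], ["fire", "horror"]],
    "k-pop": [["portrait", "dark"], ["portrait", "fashion"]],
    "punk-rock": [["illustration", "dark"], ["portrait", "text"]],
}
top_doom = representative_tags(sample, "doom-metal", top_n=2)
```

In this toy sample, "dark" shows up in every genre, so it ranks below tags such as "horror" that only appear on the doom-metal covers.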

Here are, for instance, the most representative tags for Doom-metal (compared to the two other genres in the sample):

  • horror
  • fantasy
  • water
  • smoke
  • black and white
  • history
  • fire
  • pattern
  • sky
  • scary

Using this list, it became quite easy to automatically design new doom-metal album covers!

Finding and merging the right images to design the perfect cover

The process to design the cover works as follows:

    1. Select a random tag from the previous list, and query the Flickr API to get the 500 most relevant pictures matching this tag;
    2. Select a random picture from the results, and pass it to the Clarifai API. Since tags on Flickr are manually assigned by photographers, there might be a conceptual mismatch between the tag and what the picture actually shows. By calling the API again, we can make sure that the picture contains the required tag according to the image recognition results. If not, go back to step 1;
    3. Repeat the process to get a second picture;
    4. Blend the two pictures with Pillow, add a band name and an album title (with random fonts and positions), et voilà!
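
The steps above can be sketched as follows. This is only an outline under stated assumptions, not the code from the GitHub repository: `search_photos` and `recognize_tags` are hypothetical callables standing in for the Flickr and Clarifai clients, `max_tries` is an added safety limit, and the default Pillow font replaces the random fonts mentioned in step 4. The query parameters are those of Flickr's `flickr.photos.search` REST method; whether the original code searches by `tags` is an assumption.

```python
import random

DOOM_TAGS = ["horror", "fantasy", "water", "smoke", "black and white",
             "history", "fire", "pattern", "sky", "scary"]


def flickr_search_params(api_key, tag):
    """Query parameters for flickr.photos.search (step 1).

    Meant to be sent to https://api.flickr.com/services/rest/.
    """
    return {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "tags": tag,
        "sort": "relevance",
        "per_page": 500,  # maximum page size accepted by Flickr
        "format": "json",
        "nojsoncallback": 1,
    }


def pick_verified_picture(tags, search_photos, recognize_tags,
                          rng=random, max_tries=50):
    """Steps 1-2: draw a tag and a photo, keep the photo only if the
    image recognition step confirms the tag; otherwise start over.

    search_photos(tag) -> list of photos (hypothetical Flickr wrapper)
    recognize_tags(photo) -> iterable of tags (hypothetical Clarifai wrapper)
    """
    for _ in range(max_tries):
        tag = rng.choice(tags)
        photos = search_photos(tag)
        if not photos:
            continue
        photo = rng.choice(photos)
        if tag in recognize_tags(photo):
            return tag, photo
    raise RuntimeError("no verified picture found")


def blend_cover(path_a, path_b, band, title,
                size=(600, 600), alpha=0.5, rng=random):
    """Step 4: blend two local image files and draw the band name and title."""
    # Imported here so the selection logic above runs without Pillow installed.
    from PIL import Image, ImageDraw, ImageFont

    # Image.blend requires both images to share the same mode and size.
    first = Image.open(path_a).convert("RGB").resize(size)
    second = Image.open(path_b).convert("RGB").resize(size)
    cover = Image.blend(first, second, alpha)

    draw = ImageDraw.Draw(cover)
    font = ImageFont.load_default()  # stand-in for a randomly chosen font
    for text in (band, title):
        position = (rng.randrange(size[0] // 2), rng.randrange(size[1] - 20))
        draw.text(position, text, fill=(255, 255, 255), font=font)
    return cover
```

Calling `pick_verified_picture` twice (step 3) gives the two pictures; once they are downloaded, something like `blend_cover("a.jpg", "b.jpg", "Laceration", "Louder than Death").save("cover.jpg")` would produce the final image, with the file names here being placeholders.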

So, does that work?

Well, actually, the results are quite fun, apart from a few oddities. Fine-tuning would be required to improve them, but I designed this in just a few hours. Here are a few samples, from the imaginary album “Louder than Death” by an imaginary band called “Laceration” (I found out later that a few “metal” bands use this name).

A first one, not too dark, but scary…
Then: ‘sky’, ‘water’, ‘dark’ – a good combo of the expected doom-metal tags…
Now with a little bit of fire…
Finally, another scary one, probably my favorite, even though the title does not fit!

If you want to try it on your own, get the source on GitHub. You’ll just need to set up API keys for Clarifai and Flickr.

