RDF nodes as Python dictionaries

I was recently looking for some code to wrap RDF nodes as Python objects / dictionaries - something like ActiveRDF in Ruby - in order to manipulate RDF nodes easily, and then use the Jinja template engine to render SW documents in HTML.

I didn’t find anything[1], so I wrote a tiny module that does the job, using Redland. By finding all (SPO) statements with the given node as subject, it creates a Python dictionary with predicates as keys and lists of object nodes as values (a list is used even if there’s only one node, to keep the structure generic). It then recurses into each object, avoiding infinite loops.
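The approach described above can be sketched as follows. This is not the module itself: it is a minimal illustration that works on plain (subject, predicate, object) tuples instead of Redland statements, treats any object that also appears as a subject as a resource node, and stores the node URI under a hypothetical "uri" key.

```python
def node_to_dict(triples, subject, _seen=None):
    """Build {predicate: [objects]} for `subject`, recursing into object nodes.

    `triples` is a list of (subject, predicate, object) tuples; this stands in
    for the RDF model and is an assumption of this sketch.
    """
    seen = set() if _seen is None else _seen
    seen.add(subject)  # remember visited nodes to avoid infinite loops
    subjects = {s for s, _, _ in triples}
    result = {"uri": subject}  # hypothetical key holding the node URI
    for s, p, o in triples:
        if s != subject:
            continue
        if o in subjects and o not in seen:
            value = node_to_dict(triples, o, seen)  # expand the object node
        else:
            value = o  # literal, or a node that was already visited
        # Always use a list, even for a single value.
        result.setdefault(p, []).append(value)
    return result


triples = [
    ("ex:proj", "doap_name", "Demo"),
    ("ex:proj", "doap_maintainer", "ex:me"),
    ("ex:me", "foaf_name", "Alex"),
    ("ex:me", "foaf_made", "ex:proj"),  # cycle back to the project
]
node = node_to_dict(triples, "ex:proj")
print(node["doap_maintainer"][0]["foaf_name"][0])
```

A node that has already been visited is kept as its bare URI rather than expanded again, which is one simple way to break cycles.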

Literals are stored in three ways in the dictionary:

It will also save the URI of resource nodes.

For example, once a DOAP file has been converted to a Python dict, I can use something like node['doap_description'][0]['strings_fr'][0] to get the description of the project in French.

Finally, I just have to pass the dictionary to a Jinja template to get my DOAP profile rendered as HTML. The result can be seen there.

The module, template and conversion script are available here.

Notes

[1] Edit 22/11: Actually there’s Sparta that seems to do what I needed

Adapting SPARQL queries to a given language

I’m currently refactoring my homepage and want it to be generated from my FOAF profile; at the moment it consists of FOAF + XSLT rendering XHTML. I don’t want to use RDFHomepage, as I want something simple that involves no RDF files other than my FOAF profile (even if the publications + FOAF mapping from RDFHomepage is quite nice), and I want to add other features such as RDF in HTML and RSS feed parsing, as Got wrote in his example.

So, I’m now retrieving data using librdf and SPARQL. I first ran the following SPARQL query, using OPTIONAL for each property because I may remove some properties from my profile and don’t want to change the query, and because I also want to test it on other profiles that may have less information.

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX bio: <http://purl.org/vocab/bio/0.1/>
    SELECT ?foaf_name ?foaf_givenname ?foaf_family_name ?foaf_nick
      ?foaf_surname ?foaf_mbox ?bio_olb ?bio_keywords
    WHERE {
      ?_x foaf:primaryTopic ?node .
      OPTIONAL {?node foaf:name ?foaf_name } .
      OPTIONAL {?node foaf:givenname ?foaf_givenname } .
      OPTIONAL {?node foaf:family_name ?foaf_family_name } .
      OPTIONAL {?node foaf:nick ?foaf_nick } .
      OPTIONAL {?node foaf:surname ?foaf_surname } .
      OPTIONAL {?node foaf:mbox ?foaf_mbox } .
      OPTIONAL {?node bio:olb ?bio_olb } .
      OPTIONAL {?node bio:keywords ?bio_keywords }
    }

And ... I got 8 different results! Indeed, I use both French and English to describe some properties in my profile, e.g.

    <bio:keywords xml:lang="en">
      Social software, Semantic Web, weblogs, RDF, OWL, ontologies
    </bio:keywords>
    <bio:keywords xml:lang="fr">
      Logiciels sociaux, Web Sémantique, weblogs, RDF, OWL, ontologies
    </bio:keywords>

so the query retrieves every combination of property values and languages.
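To see where the number 8 can come from, here is a small sketch of how independent OPTIONAL matches multiply. The bindings are hypothetical: the post does not say which properties were bilingual, so three properties with two language variants each are assumed purely for illustration.

```python
from itertools import product

# Hypothetical bindings per variable; which properties were actually
# bilingual in the profile is an assumption of this sketch.
bindings = {
    "foaf_nick": ["nick"],                        # no xml:lang
    "foaf_name": ["name@en", "name@fr"],
    "bio_olb": ["olb@en", "olb@fr"],
    "bio_keywords": ["keywords@en", "keywords@fr"],
}

# Each result row picks one value per variable, so the number of rows is
# the product of the number of values bound to each variable.
rows = list(product(*bindings.values()))
print(len(rows))  # 1 * 2 * 2 * 2 = 8
```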

Instead of parsing the results to keep only the values for a given language, I use language matching in the query. I’ve added a FILTER to each node so that it is fetched only if it doesn’t have any xml:lang or if its language matches the "favourite language" for this query (to make the query totally clean, I think I should add an isLiteral test first).

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX bio: <http://purl.org/vocab/bio/0.1/>
    SELECT ?foaf_name ?foaf_givenname ?foaf_family_name ?foaf_nick
      ?foaf_surname ?foaf_mbox ?bio_olb ?bio_keywords
    WHERE {
      ?_x foaf:primaryTopic ?node .
      OPTIONAL {?node foaf:name ?foaf_name .
        FILTER (lang(?foaf_name) = "" || langMatches(lang(?foaf_name), "FR"))
      } .
      OPTIONAL {?node foaf:givenname ?foaf_givenname .
        FILTER (lang(?foaf_givenname) = "" || langMatches(lang(?foaf_givenname), "FR"))
      } .
      OPTIONAL {?node foaf:family_name ?foaf_family_name .
        FILTER (lang(?foaf_family_name) = "" || langMatches(lang(?foaf_family_name), "FR"))
      } .
      OPTIONAL {?node foaf:nick ?foaf_nick .
        FILTER (lang(?foaf_nick) = "" || langMatches(lang(?foaf_nick), "FR"))
      } .
      OPTIONAL {?node foaf:surname ?foaf_surname .
        FILTER (lang(?foaf_surname) = "" || langMatches(lang(?foaf_surname), "FR"))
      } .
      OPTIONAL {?node foaf:mbox ?foaf_mbox .
        FILTER (lang(?foaf_mbox) = "" || langMatches(lang(?foaf_mbox), "FR"))
      } .
      OPTIONAL {?node bio:olb ?bio_olb .
        FILTER (lang(?bio_olb) = "" || langMatches(lang(?bio_olb), "FR"))
      } .
      OPTIONAL {?node bio:keywords ?bio_keywords .
        FILTER (lang(?bio_keywords) = "" || langMatches(lang(?bio_keywords), "FR"))
      } .
    }

Here it is: I now get only one result, which matches only French items when a language is specified.
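The test performed by each FILTER can be mimicked in plain Python, which may help when checking which literals would pass. This is only a sketch of the basic language-range matching that langMatches performs (case-insensitive, exact tag or prefix followed by "-", with "*" matching any non-empty tag); the function names are made up for this example.

```python
def lang_matches(tag, lang_range):
    """Rough equivalent of SPARQL langMatches(tag, lang_range)."""
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        return tag != ""  # "*" matches any literal that has a language tag
    return tag == lang_range or tag.startswith(lang_range + "-")


def keep_literal(tag, wanted):
    """Mirror of: FILTER (lang(?x) = "" || langMatches(lang(?x), wanted))."""
    return tag == "" or lang_matches(tag, wanted)


for tag in ["", "fr", "fr-CA", "en"]:
    print(repr(tag), keep_literal(tag, "FR"))
```

Literals without a language tag and those tagged "fr" or "fr-CA" are kept, while "en" literals are dropped.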

Actually, the SPARQL query is created using this Python snippet:

    prefixes = {
        'foaf': 'http://xmlns.com/foaf/0.1/',
        'bio': 'http://purl.org/vocab/bio/0.1/',
    }
    properties = [
        'foaf:name',
        'foaf:givenname',
        'foaf:family_name',
        'foaf:nick',
        'foaf:surname',
        'foaf:mbox',
        'bio:olb',
        'bio:keywords',
    ]

    # Turn a prefixed name such as 'foaf:name' into a variable name ('foaf_name').
    def c2u(name):
        return name.replace(':', '_')

    lang = 'FR'
    # One PREFIX declaration per namespace.
    prefix_decls = " ".join("PREFIX %s: <%s>" % (key, value) for key, value in prefixes.items())
    # One SELECT variable per property.
    select = " ".join("?%s" % c2u(value) for value in properties)
    # One OPTIONAL block per property, filtered on the favourite language.
    conds = " ".join(
        "OPTIONAL {?node %s ?%s . FILTER (lang(?%s) = '' || langMatches(lang(?%s), '%s')) }"
        % (value, c2u(value), c2u(value), c2u(value), lang)
        for value in properties)
    query = "%s SELECT %s WHERE {?_x foaf:primaryTopic ?node . %s}" % (prefix_decls, select, conds)
    print(query)

Lots of things to do at the moment, but I hope that the whole script and its template will be ready for the weekend, at the same time as a new release of the SIOC plugin for DotClear.

Edit 2006-09-09 @ 12h: Removed foaf:dateOfBirth from example as it doesn’t exist, see comments.

RSS Timeline update

I’ve just noticed and fixed a bug in my RSS2Timeline service.

I used entry.content[0].value to get a feed item’s content, while feedparser offers entry.description as one of its universal terms (which means you can use the same code for any RSS or Atom feed).
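The fix can be sketched roughly like this. It is not the service’s actual code: the helper name is made up, and the entry is assumed to behave like a dictionary (as feedparser entries do), so that plain dicts can stand in for parsed entries here.

```python
def entry_text(entry):
    """Return the text of a feed entry, preferring the universal 'description'."""
    text = entry.get("description")
    if text:
        return text
    # Fall back to the first content block when no description is available.
    content = entry.get("content") or []
    return content[0].get("value", "") if content else ""


# Plain dicts standing in for parsed feed entries (assumption of this sketch).
rss_entry = {"description": "An RSS item"}
atom_entry = {"content": [{"value": "An Atom entry"}]}
print(entry_text(rss_entry))
print(entry_text(atom_entry))
```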

Now, the service should handle any feed without error.

Timeline for RSS feeds

When playing with Timeline, I thought it could be a nice interface for RSS feeds, especially for weblogs or planets.

So, I wrote an "RSS to Timeline" service that takes any RSS/Atom feed as input and translates it into the JSON format that Timeline expects. Just use the correct URL as the data source for your Timeline, and you’ll get it!

Eg:

As you can see, I’ve also set up a demo service where you can see your feed in action. Everything is described in detail here.

Regarding the implementation, the script is written in Python, using feedparser and mod_python. I first started in PHP with MagpieRSS, but it doesn’t provide universal methods to access feed/item information, so the way to access content depends on the feed format. With feedparser, however, methods and properties are the same whatever your feed is (RSS 0.9, RSS 1.0, Atom ...), which is really useful for writing universal aggregators / translators.

It was also the first time I used mod_python, and I must say the Publisher handler is also very easy to use, with a templating system and interaction between the interface and the script.
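The translation step of such a service could look like the sketch below. It is not the actual script: entries are assumed to be dictionary-like with feedparser-style keys (title, link, description, updated_parsed), and the output layout (an "events" list with start, title, description and link, plus "dateTimeFormat") is what I understand Timeline’s JSON event source to accept, so treat it as an assumption.

```python
import json
import time


def entries_to_timeline(entries):
    """Translate feed entries into a Timeline-style JSON document."""
    events = []
    for entry in entries:
        parsed = entry.get("updated_parsed")  # UTC time tuple, if the entry is dated
        if parsed is None:
            continue  # an event without a date cannot be placed on a timeline
        events.append({
            "start": time.strftime("%Y-%m-%dT%H:%M:%SZ", parsed),
            "title": entry.get("title", ""),
            "description": entry.get("description", ""),
            "link": entry.get("link", ""),
        })
    return json.dumps({"dateTimeFormat": "iso8601", "events": events})


# Plain dicts standing in for parsed feed entries (assumption of this sketch).
sample = [
    {"title": "Hello", "link": "http://example.org/1",
     "description": "First post", "updated_parsed": time.gmtime(0)},
    {"title": "Undated entry"},
]
print(entries_to_timeline(sample))
```

Undated entries are skipped here; a real service might instead fall back to another date field.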