Adapting SPARQL queries to a given language

I’m currently refactoring my homepage and want it to be generated from my FOAF profile; at the moment it consists of FOAF + XSLT rendering XHTML. I don’t want to use RDFHomepage, as I want something simple that involves no RDF files other than my FOAF profile (even if the publications + FOAF mapping from RDFHomepage is quite nice). I also want to add other features such as RDF in HTML and RSS feed parsing, as Got wrote in his example.

So, I’m now retrieving data using librdf and SPARQL. I first ran the following SPARQL query, using OPTIONAL for each property: I may remove some properties from my profile and don’t want to have to change the query, and I also want to test it on other profiles that may contain less information.

  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX bio: <http://purl.org/vocab/bio/0.1/>
  SELECT ?foaf_name ?foaf_givenname ?foaf_family_name ?foaf_nick
    ?foaf_surname ?foaf_mbox ?bio_olb ?bio_keywords
  WHERE {
    ?_x foaf:primaryTopic ?node .
    OPTIONAL {?node foaf:name ?foaf_name } .
    OPTIONAL {?node foaf:givenname ?foaf_givenname } .
    OPTIONAL {?node foaf:family_name ?foaf_family_name } .
    OPTIONAL {?node foaf:nick ?foaf_nick } .
    OPTIONAL {?node foaf:surname ?foaf_surname } .
    OPTIONAL {?node foaf:mbox ?foaf_mbox } .
    OPTIONAL {?node bio:olb ?bio_olb } .
    OPTIONAL {?node bio:keywords ?bio_keywords }
  }

And I got 8 different results! The reason is that I use both French and English to describe some properties in my profile, e.g.:

  <bio:keywords xml:lang="en">
    Social software, Semantic Web, weblogs, RDF, OWL, ontologies
  </bio:keywords>
  <bio:keywords xml:lang="fr">
    Logiciels sociaux, Web Sémantique, weblogs, RDF, OWL, ontologies
  </bio:keywords>

so the query returns every combination of property values across languages.
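To see where a number like 8 can come from, here is a minimal Python sketch of the cross product that independent OPTIONAL patterns produce. The profile data is hypothetical: which properties are actually bilingual in the real profile is an assumption, and three bilingual properties is simply one arrangement that yields 2 × 2 × 2 = 8 rows.

```python
from itertools import product

# Hypothetical profile: variable name -> list of (value, language tag) pairs.
# Three properties carrying both an "en" and a "fr" value is an assumption
# chosen for illustration, not a description of the real FOAF file.
profile = {
    "foaf_name": [("Alice Example", "")],
    "foaf_nick": [("alice", "en"), ("alice_fr", "fr")],
    "bio_olb": [("Researcher", "en"), ("Chercheuse", "fr")],
    "bio_keywords": [("Semantic Web", "en"), ("Web Sémantique", "fr")],
}

# Each OPTIONAL binds its variable independently of the others, so the
# result set contains one row per combination of available values.
rows = list(product(*profile.values()))

print(len(rows))  # 1 * 2 * 2 * 2 = 8
```

Each additional bilingual property doubles the number of rows, which is why post-processing the results quickly becomes unattractive.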

Instead of post-processing the results to keep only the values for a given language, I use language matching in the query itself. I’ve added a FILTER inside each OPTIONAL pattern so that a value is kept only if it has no xml:lang at all or if its language matches the “favourite language” for this query (to make the query fully clean, I think I should also add an isLiteral test before calling lang()).

  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX bio: <http://purl.org/vocab/bio/0.1/>
  SELECT ?foaf_name ?foaf_givenname ?foaf_family_name ?foaf_nick
    ?foaf_surname ?foaf_mbox ?bio_olb ?bio_keywords
  WHERE {
    ?_x foaf:primaryTopic ?node .
    OPTIONAL {?node foaf:name ?foaf_name .
      FILTER (lang(?foaf_name) = "" || langMatches(lang(?foaf_name), "FR"))
    } .
    OPTIONAL {?node foaf:givenname ?foaf_givenname .
      FILTER (lang(?foaf_givenname) = "" || langMatches(lang(?foaf_givenname), "FR"))
    } .
    OPTIONAL {?node foaf:family_name ?foaf_family_name .
      FILTER (lang(?foaf_family_name) = "" || langMatches(lang(?foaf_family_name), "FR"))
    } .
    OPTIONAL {?node foaf:nick ?foaf_nick .
      FILTER (lang(?foaf_nick) = "" || langMatches(lang(?foaf_nick), "FR"))
    } .
    OPTIONAL {?node foaf:surname ?foaf_surname .
      FILTER (lang(?foaf_surname) = "" || langMatches(lang(?foaf_surname), "FR"))
    } .
    OPTIONAL {?node foaf:mbox ?foaf_mbox .
      FILTER (lang(?foaf_mbox) = "" || langMatches(lang(?foaf_mbox), "FR"))
    } .
    OPTIONAL {?node bio:olb ?bio_olb .
      FILTER (lang(?bio_olb) = "" || langMatches(lang(?bio_olb), "FR"))
    } .
    OPTIONAL {?node bio:keywords ?bio_keywords .
      FILTER (lang(?bio_keywords) = "" || langMatches(lang(?bio_keywords), "FR"))
    } .
  }

There it is: I now get only one result, which contains only the French values wherever a language is specified.
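For readers who want to check the filter logic outside a SPARQL engine, the following Python sketch mimics the condition `lang(?x) = "" || langMatches(lang(?x), "FR")`. The helper names `lang_matches` and `keep` are made up for this illustration, and the matching is a simplified reading of the basic filtering scheme that langMatches relies on (case-insensitive, prefix match on whole subtags), not a full implementation.

```python
def lang_matches(tag, lang_range):
    """Simplified basic language-range matching (illustrative helper)."""
    if lang_range == "*":
        # "*" matches any literal that carries a language tag.
        return tag != ""
    tag, lang_range = tag.lower(), lang_range.lower()
    # Exact match, or the range is a prefix ending on a subtag boundary.
    return tag == lang_range or tag.startswith(lang_range + "-")


def keep(tag, preferred):
    """Mirror of: FILTER (lang(?x) = "" || langMatches(lang(?x), preferred))."""
    return tag == "" or lang_matches(tag, preferred)


# (value, language tag) pairs; the untagged nickname is a made-up example.
values = [
    ("Social software, Semantic Web", "en"),
    ("Logiciels sociaux, Web Sémantique", "fr"),
    ("alex", ""),
]

kept = [value for value, tag in values if keep(tag, "FR")]
print(kept)
```

The untagged value survives alongside the French one, which is what lets properties such as a nickname or a mailbox through even though they carry no language.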

The query above is actually generated by this Python snippet:

  # Namespace prefixes used in the query
  prefixes = {
      'foaf': 'http://xmlns.com/foaf/0.1/',
      'bio': 'http://purl.org/vocab/bio/0.1/',
  }

  # Properties to fetch; each one gets its own OPTIONAL block
  properties = [
      'foaf:name',
      'foaf:givenname',
      'foaf:family_name',
      'foaf:nick',
      'foaf:surname',
      'foaf:mbox',
      'bio:olb',
      'bio:keywords',
  ]

  def c2u(name):
      # Turn a prefixed name (foaf:name) into a variable name (foaf_name)
      return name.replace(':', '_')

  lang = 'FR'  # favourite language for this query

  prefix_decls = " ".join("PREFIX %s: <%s>" % (key, value) for key, value in prefixes.items())
  select = " ".join("?%s" % c2u(value) for value in properties)
  conds = " ".join(
      "OPTIONAL {?node %s ?%s . FILTER (lang(?%s) = '' || langMatches(lang(?%s), '%s')) }"
      % (value, c2u(value), c2u(value), c2u(value), lang)
      for value in properties)
  query = "%s SELECT %s WHERE {?_x foaf:primaryTopic ?node . %s}" % (prefix_decls, select, conds)
  print(query)
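Since the point is to adapt the query to a given language, the same generation logic can be wrapped in a function that takes the language as a parameter. This is only a sketch: `build_query` and the `LANG_TAG` check are names introduced here, and the regular expression is a loose shape test for tags like "fr" or "fr-CA", not a complete language-tag validator. Its purpose is to keep quotes and braces from leaking into the generated query.

```python
import re

# Loose shape check for a language tag (assumption: good enough to reject
# stray quotes or braces; it does not validate against any registry).
LANG_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def c2u(name):
    # Turn a prefixed name (foaf:name) into a variable name (foaf_name).
    return name.replace(":", "_")


def build_query(prefixes, properties, lang):
    """Build the OPTIONAL/FILTER query for one preferred language."""
    if not LANG_TAG.fullmatch(lang):
        raise ValueError("not a plausible language tag: %r" % (lang,))
    # Sort the prefixes so the generated text is stable across runs.
    prefix_decls = " ".join(
        "PREFIX %s: <%s>" % item for item in sorted(prefixes.items()))
    select = " ".join("?" + c2u(p) for p in properties)
    conds = " ".join(
        "OPTIONAL {?node %s ?%s . "
        "FILTER (lang(?%s) = '' || langMatches(lang(?%s), '%s')) }"
        % (p, c2u(p), c2u(p), c2u(p), lang)
        for p in properties)
    return "%s SELECT %s WHERE {?_x foaf:primaryTopic ?node . %s}" % (
        prefix_decls, select, conds)


prefixes = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "bio": "http://purl.org/vocab/bio/0.1/",
}

# One query per language, e.g. for a French and an English page.
for lang in ("fr", "en"):
    print(build_query(prefixes, ["foaf:name", "bio:keywords"], lang))
```

Generating one query per language keeps the template code identical for every version of the page; only the `lang` argument changes.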

There are lots of things to do at the moment, but I hope that the whole script and its template will be ready by the weekend, at the same time as a new release of the SIOC plugin for DotClear.

Edit 2006-09-09 @ 12h: Removed foaf:dateOfBirth from example as it doesn’t exist, see comments.
