Linked Data

SPARQL + pubsubhubbub = sparqlPuSH

There has been a lot of discussion recently about dynamics and notification in the Semantic Web realm, including various vocabularies for describing changes and approaches for notifying clients about them, as Leigh recently blogged.
Last month, while visiting Kno.e.sis, Pablo and I worked on an approach that uses pubsubhubbub for RDF change notification, which I'm happy to announce today.

The result is sparqlPuSH, an interface that can be plugged onto any SPARQL endpoint and that broadcasts notifications to clients interested in what's happening in the store, using the pubsubhubbub protocol. At a glance, anyone can register a particular query with the RDF store (e.g. list all microblog posts, or list any changes made by X, using the Changesets vocabulary), and the results are provided in an RSS / Atom feed that is then synced using pubsubhubbub: each time new data matching the registered query is added to the store, the store itself notifies the interested parties of the update.
Practically, this means that you can be notified in real time of any change happening in a SPARQL endpoint.
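
To make the subscription step a bit more concrete, here is a minimal sketch of the form-encoded request a client would send to a hub in order to follow one of these feeds. The parameter names (hub.mode, hub.topic, hub.callback, hub.verify) are the ones defined in version 0.3 of the pubsubhubbub spec; the feed and callback URLs are made up for illustration, and none of this is taken from the sparqlPuSH source.

```python
from urllib.parse import urlencode, parse_qs


def subscription_body(topic_url, callback_url):
    """Build the form-encoded body of a pubsubhubbub subscription request."""
    return urlencode({
        "hub.mode": "subscribe",       # ask the hub to start a subscription
        "hub.topic": topic_url,        # the feed holding the query results
        "hub.callback": callback_url,  # where the hub will POST new entries
        "hub.verify": "async",         # let the hub verify the callback later
    })


# Hypothetical URLs, for illustration only.
FEED = "http://example.org/sparqlpush/feed/microblog-posts"
CALLBACK = "http://example.org/client/callback"

body = subscription_body(FEED, CALLBACK)
print(body)
```

The client would POST this body to the hub advertised in the feed; the hub then pushes each new feed entry to the callback URL.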

The following video describes how the approach works and shows a related use case, and you can download the source at http://code.google.com/p/sparqlpush/.
It can be used as an interface on top of any SPARQL endpoint and also comes with an ARC2 interface (if you're using a different endpoint, the interactions happen via HTTP, which requires that your endpoint provides SPARQL query results in JSON).

We believe that a push system like this for RDF notification can change a lot about how RDF data is managed and made sense of, in real time. In addition, we hope that such an approach can be generalised not only to SPARQL endpoints but to resources themselves, so that a resource can ping a pubsubhubbub hub when it changes, with the notifications then being broadcast to interested parties.

Observing the Growth of the LOD Cloud

While Bob DuCharme reminded us a few months ago how much the famous LOD cloud has grown since its beginning, I recently rendered the following graph that I'd like to share (click on it for full-size image).

Growth of the LOD cloud

It was drawn based on the cloud history and the original ESWC2007 poster. Hopefully, such a graph can be generated automatically once we get a machine-readable description of that cloud, e.g. using voiD.

Using LOD in your webpages: SPARCool module for Drupal and JavaScript library

I've just written a Drupal module for SPARCool, which lets you embed SPARCool results in Drupal nodes, so that you can reuse any data from the LOD cloud to enhance your webpages !

It works by using the following pattern in your pages: [ sparcool | predicate | URI ] (without any spaces), which is translated into a JSONP call to the SPARCool service, et voilà ! You can also add a ;l=xx parameter after the predicate if you want to limit the query to a particular language.

For instance, I can include some band members in this blog post, from DBpedia:

[sparcool|dbp:currentMembers|http://dbpedia.org/resource/Beastie_Boys]

As well as the abstract of a paper, from the Semantic Web Dog Food:

[sparcool|swrc:abstract|http://data.semanticweb.org/workshop/scripting/2008/paper/12]

Or the list of my last.fm contacts, from DBTune:

[sparcool|foaf:knows|http://dbtune.org/last-fm/terraces]
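
As a rough illustration of that translation step, the sketch below finds such patterns in a piece of text and turns each one into a SPARCool URL, following the http://sparcool.net/format/predicate[;l=lang]/URI scheme. It is written in Python rather than in the module's own PHP / JavaScript, the function names are hypothetical, and the default format letter "j" (JSON) is an assumption, since the module itself relies on a JSONP callback.

```python
import re

# Matches [sparcool|predicate|URI], with an optional ;l=xx after the predicate.
PATTERN = re.compile(
    r"\[sparcool\|([^|;\]]+)(?:;l=([A-Za-z-]+))?\|([^\]\s]+)\]"
)


def sparcool_url(predicate, uri, lang=None, fmt="j"):
    """Build a SPARCool URL; fmt="j" (JSON) is an assumed default."""
    lang_part = f";l={lang}" if lang else ""
    return f"http://sparcool.net/{fmt}/{predicate}{lang_part}/{uri}"


def find_sparcool_urls(text):
    """Return one SPARCool URL per [sparcool|...|...] pattern found in text."""
    return [
        sparcool_url(predicate, uri, lang or None)
        for predicate, lang, uri in PATTERN.findall(text)
    ]


post = "Members: [sparcool|dbp:currentMembers|http://dbpedia.org/resource/Beastie_Boys]"
for url in find_sparcool_urls(post):
    print(url)
```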

The Drupal project is hosted at http://drupal.org/project/sparcool, and you can get the source via CVS. In addition, I also wrote a tiny JavaScript function (which is used in this module, and requires jQuery if you want to use it separately) that can help to include such results in any page. Finally, SPARCool now relies on prefix.cc to fetch the prefixes.

Have fun with Linked Open Data !

NB: Since the results are included via AJAX when the page is displayed, you shouldn't see them in your aggregator or on planet websites.

SemTech slides

I've just uploaded the slides of the two SemTech tutorials I was involved in today. They are embedded below, but you can also access them directly on slideshare. It was a busy day (actually, I've been in the same meeting room since 7:30 am !) but a really interesting one, and I guess (and hope) the tutorials were well received, with interesting feedback and questions from the audience. As said previously and during both presentations, if you have any questions, feel free to drop me an e-mail or come and have a chat if you want to discuss some of these topics in more detail.

CommonTag - An easy-to-use vocabulary for Semantic Tagging

I'm happy to announce CommonTag, a new RDFS vocabulary for Semantic Tagging, designed to bridge the gap between free-text tagging and Linked Data. Similar to what I did in the past with MOAT, CommonTag lets you create links between your tags (as simple keywords) and the concepts they represent, defined as URIs of Semantic Web resources from public knowledge bases such as Freebase or DBpedia.

What is especially relevant with regard to CommonTag is that the vocabulary aims to be simple to understand and easily accessible, with an easy RDFa annotation process for end-users and Web developers. At the same time, it features mappings to existing tagging vocabularies (the Tag Ontology, MOAT, SCOT, SIOC and SKOS) for those who want to go further or use their existing applications with this new model.

But most interestingly, as one can see when browsing the website, a key feature is that CommonTag is not an isolated initiative but is supported by various companies involved in the Semantic Web and the Social Web -- and especially in both ! -- namely (for the initial nucleus and in alphabetical order, I hope it will grow soon !) AdaptiveBlue, DERI (NUI Galway), Faviki, Freebase, Yahoo, Zemanta and ZigTag - and I must add that it was a great experience to design this vocabulary together !

CommonTag is already supported in various applications, as you can see on the website and in the following picture, from Zemanta to index your blog posts to Sindice to build applications on top of it. And there is more to come soon, stay tuned ;-)

Inconsistencies in the LOD cloud

I was just trying to figure out how many inconsistent statements were caused by the use of owl:sameAs in the LOD cloud, by running some queries on the LOD SPARQL endpoint powered by OpenLink Virtuoso.
I first ran a simple owl:sameAs / owl:disjointWith query, which unfortunately timed out:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?a ?b WHERE {
  ?a a ?c1 .
  ?b a ?c2 .
  ?c1 owl:disjointWith ?c2 .
  ?a owl:sameAs ?b .
}
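
To spell out what this query is looking for, here is a small in-memory sketch of the same check in Python. The triples are made up for illustration (the Person / Document disjointness follows what I say about FOAF below), and unlike the query above the sketch treats owl:disjointWith as symmetric, so it also catches pairs where the disjointness is only stated in the other direction.

```python
RDF_TYPE = "rdf:type"
SAME_AS = "owl:sameAs"
DISJOINT = "owl:disjointWith"


def inconsistent_pairs(triples):
    """Return (a, b) pairs linked by owl:sameAs whose types are disjoint."""
    # Collect the classes of each resource.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    # Collect disjointness axioms, in both directions.
    disjoint = {(s, o) for s, p, o in triples if p == DISJOINT}
    disjoint |= {(c2, c1) for c1, c2 in disjoint}

    pairs = set()
    for a, p, b in triples:
        if p != SAME_AS:
            continue
        for c1 in types.get(a, ()):
            for c2 in types.get(b, ()):
                if (c1, c2) in disjoint:
                    pairs.add((a, b))
    return sorted(pairs)


# Made-up sample data, for illustration only.
SAMPLE = [
    ("foaf:Person", DISJOINT, "foaf:Document"),
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice-page", RDF_TYPE, "foaf:Document"),
    ("ex:alice", SAME_AS, "ex:alice-page"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:robert", RDF_TYPE, "foaf:Person"),
    ("ex:bob", SAME_AS, "ex:robert"),
]

print(inconsistent_pairs(SAMPLE))
```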

I then restricted the experiment to foaf:Person and foaf:Document, and found about 20 resources instantiated with both classes, which is obviously inconsistent since they are disjoint in FOAF.

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?a ?b WHERE {
  ?a a foaf:Person .
  ?b a foaf:Document .
  ?a owl:sameAs ?b .
}

(query results here or in .png)

Going further, I wanted to identify where these owl:sameAs statements come from, i.e. exporters or people themselves, and while most of them are generated by RDF-aware applications, some are in personal FOAF files (my previous profile is here, shame on me !).

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?a ?b ?g WHERE {
  ?a a foaf:Person .
  ?b a foaf:Document .
  GRAPH ?g { ?a owl:sameAs ?b . }
}

(query results here)

While this is only a small number of inconsistent statements compared to the number of foaf:Person / foaf:Document instances in the cloud, this is imho a simple warning sign that we should consider alternatives to owl:sameAs, such as UMBEL isLike or AKT's Consistent Reference Services.

I'm also wondering if - apart from the SAOR work presented last year at the ISWC Semantic Web Challenge - there are other attempts to check the consistency of the LOD cloud, using Pellet or other reasoners. Any hints ?

Introducing SPARCool

As WWW2009 is starting tomorrow, with a tutorial on the Web of Data in the morning and the LDOW2009 workshop the whole day, I'm happy to introduce SPARCool.

SPARCool is a simple webservice (à-la Triplr) that lets you run basic SPARQL queries on any URI that follows some of the Linked Data principles (i.e. being dereferenceable and returning RDF information about the entity) thanks to a simple URL pattern: http://sparcool.net/format/predicate[;l=lang]/URI. For instance, as described on the website, http://sparcool.net/j/dbp:abstract;l=en/http://dbpedia.org/resource/Semantic_Web will return (in JSON) answers for the following query:


PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?value
FROM <http://dbpedia.org/resource/Semantic_Web>
WHERE {
<http://dbpedia.org/resource/Semantic_Web> dbp:abstract ?value .
FILTER (lang(?value) = "en") .
}
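
Since the source code isn't out yet, here is only a guess at how such an expansion could look, sketched in Python rather than in the actual PHP. The function name and its parameters are hypothetical; the prefix is passed in together with its namespace, and the language filter is only added when a ;l=lang part is present.

```python
def build_query(uri, prefix, namespace, local_name, lang=None):
    """Expand the parts of a SPARCool-style URL into a SPARQL SELECT query."""
    lines = [
        f"PREFIX {prefix}: <{namespace}>",
        "SELECT ?value",
        f"FROM <{uri}>",
        "WHERE {",
        f"  <{uri}> {prefix}:{local_name} ?value .",
    ]
    if lang:
        # Only keep literals tagged with the requested language.
        lines.append(f'  FILTER (lang(?value) = "{lang}")')
    lines.append("}")
    return "\n".join(lines)


query = build_query(
    uri="http://dbpedia.org/resource/Semantic_Web",
    prefix="dbp",
    namespace="http://dbpedia.org/property/",
    local_name="abstract",
    lang="en",
)
print(query)
```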

You can get the results in various formats, one of them being HTML (so that results can be included in any webpage), and you can also be redirected to the first answer of a query, as in http://sparcool.net/r/foaf:img/http://dbpedia.org/resource/Ramones, which is useful for images and hyperlinks. SPARCool is powered by roqet wrapped in a PHP script (the source code should be released soon).

If you're attending our (i.e. Michael's and my) "Hello Open World" tutorial tomorrow morning, this is the kind of application you should be able to write after it. And BTW, if you're in Madrid and want to have a chat, I'll be here tomorrow and Thursday (for the SemSearch workshop) as well as, obviously, at the LOD gathering tomorrow evening.
