SDoW2008: Program and proceedings

The program of SDoW2008 has just been published on the workshop website. In addition to the 7 full papers, 2 short papers and 2 demos, there will be two keynotes: the first one by Peter Mika on “Semantic Search and the Social Web” and the second one by Harry Halpin entitled “Beyond Walled Gardens: Open Standards for the Social Web”. Attending TPAC this week, I can tell that the second one is really a hot topic: there were some discussions yesterday in the SWIG group meeting, two related lightning talks today, and there is an upcoming “Workshop on the Future of Social Networking” led by the W3C Mobile Web Initiative.

The SDoW proceedings were also officially published today on CEUR-WS, as volume 405:
http://CEUR-WS.org/Vol-405. If you’re attending the workshop, consider reading them beforehand, as it might help the Q&A and discussions. There will also be a lightning talk session, so that any attendee will be able to present their work and ideas regarding the Social Web and the Semantic Web.

SIOC goes OWL-DL

I just sent this to sioc-dev, but I guess it’s worth a larger announcement:

We just made some changes to the SIOC Core ontology and to the related modules:

- Added OWL-DL compliance statements for SIOC Core and the Types / Access / Services modules
- Edited owl:disjointWith statements for some classes of SIOC Core
- Removed domain of sioc:note
- Removed domain of sioc:has_owner and range of sioc:owner_of
- Defined sioc:account_of as inverse property of foaf:holdsAccount
- Defined sioc:avatar as a subproperty of foaf:depiction
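
For instance, the last two changes boil down to two simple axioms, which you can check with an ASK query on any store where the updated ontology documents are loaded (a quick sketch, prefixes assumed to be declared):

ASK {
  sioc:account_of owl:inverseOf foaf:holdsAccount .
  sioc:avatar rdfs:subPropertyOf foaf:depiction .
}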

So, SIOC is now OWL-DL!
This change was motivated by the ongoing SWAN/SIOC integration project that will be introduced during the upcoming ISWC tutorial on Semantic Web for Health Care and Life Sciences.

The SIOC Core Ontology Specification has been updated according to the changes.

The other good news regarding SIOC is that Yahoo! SearchMonkey now supports (and recommends!) it in its developer documentation. Moreover, in case you have not already read it, John published Tales from the SIOC-o-sphere #8 about two weeks ago.

More generally, if you want to join the SIOC community by developing new applications or APIs, or if you need some help implementing SIOC in your existing tools, feel free to drop by #sioc on irc.freenode.net or ask on the sioc-dev mailing list.

Where are all the Semantic Web presentations?

A follow-up to my previous LODr introduction post and, as you might guess from the title, one more way to show the value of RDF-based applications in general. Or more precisely, open-RDF-based and LOD-compliant ones:

  • By open-RDF-based, I mean using RDF but also publishing it, either through a SPARQL endpoint, Semantic Sitemaps or RDFa. What I want to focus on here is that using RDF ‘inside-only’ doesn’t make your service part of the Semantic / Linked Data Web, even if it can indeed be considered Semantic-Web based – I hope that’s clear enough. If you don’t expose anything, you can have the best RDF infrastructure you want, you will still be a Web-of-Documents application. (Some related thoughts here);
  • Regarding LOD-compliance, I refer to reusing or interlinking existing resources in order to benefit from them within your application. The value of your service can then reside not only in the data you provide, but in the way you interact with – and reuse – other data sources. Moreover, this can also apply to corporate environments.

What I want to stress in this post is how such applications can become components of a general infrastructure (the Semantic Web itself) that will provide new services to end-users. In particular, LODr lets users interlink popular Web 2.0 content with Semantic Web resources, and such interlinking can then be used for data discovery. For instance, the following query will retrieve all SlideShare presentations related to the Semantic Web, i.e. linked to a resource that is itself linked to the Semantic Web category in DBpedia. This query involves various vocabularies, such as SIOC (to retrieve the item), FOAF (its author), the Tag Ontology (its tags) and MOAT (tag meanings), with a DBpedia URI as an entry point to find related topics.

PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX dct:  <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX tags: <http://www.holygoat.co.uk/owl/redwood/0.1/tags/>
PREFIX moat: <http://moat-project.org/ns#>

SELECT DISTINCT ?item ?author ?date ?tag ?meaning
WHERE {
  ?item a sioc:Item ;
    dct:created ?date ;
    sioc:has_space <http://slideshare.net> ;
    foaf:maker ?author .
  [] a tags:RestrictedTagging ;
    tags:taggedResource ?item ;
    tags:taggedWithTag [ tags:name ?tag ] ;
    moat:tagMeaning ?meaning .
  ?meaning ?p <http://dbpedia.org/resource/Category:Semantic_Web> .
}
ORDER BY DESC(?date)
LIMIT 5

You can browse the answer here, formatted in HTML.

Of course, the URIs that you can use in LODr, and with MOAT in general, are not restricted to DBpedia ones. You can use URIs identifying some of your friends, conferences you attended, etc. Consequently, those URIs can be used in query patterns, as can other interlinked URIs. For instance, the following query will retrieve all pictures from Flickr linked to an event that happened in Tenerife; in that case it will use the ESWC2008 URI, going through some GeoNames data:

# (same PREFIX declarations as in the previous query)
SELECT DISTINCT ?item ?author ?date ?tag ?meaning
WHERE {
  ?item a sioc:Item ;
    dct:created ?date ;
    sioc:has_space <http://flickr.com> ;
    foaf:maker ?author .
  [] a tags:RestrictedTagging ;
    tags:taggedResource ?item ;
    tags:taggedWithTag [ tags:name ?tag ] ;
    moat:tagMeaning ?meaning .
  ?meaning foaf:based_near <http://sws.geonames.org/2522437/> .
}
ORDER BY DESC(?date)
LIMIT 5

Answer

Finally, while all those queries involve the lodr.info endpoint, each LODr instance comes with its own triplestore (and related endpoint), so that one can add some more RDF to it for advanced mash-ups. And as it also provides RDFa and Semantic Sitemap support, Semantic Web crawlers and indexes such as SWSE or Sindice can also consume the data and then deliver it when you look for a particular URI.
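
As a side note, querying such an endpoint from PHP is straightforward with ARC2’s remote store feature; a minimal sketch, where the endpoint URL is of course illustrative:

include_once('arc/ARC2.php');

// point ARC2 at a (hypothetical) LODr instance endpoint
$config = array('remote_store_endpoint' => 'http://example.org/lodr/sparql');
$store = ARC2::getRemoteStore($config);

$q = '
  PREFIX sioc: <http://rdfs.org/sioc/ns#>
  SELECT DISTINCT ?item WHERE { ?item a sioc:Item } LIMIT 10
';
// "rows" returns the bindings as a simple PHP array
foreach ($store->query($q, 'rows') as $row) {
  echo $row['item'] . "\n";
}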

Say hello to lodr.info

In one of my recent posts, I mentioned LODr, a semantic-tagging application based on MOAT. While I started it a few months ago, it’s finally online now. I put the code in SVN last Friday and tweeted about it, but did not make any official announcement yet, so here it is. I certainly should have released it earlier, but as the source code involves lots of classes, I wanted to be sure of the architecture.

So, what is it about?

LODr aims to apply the MOAT principles (in a few words: link your tags to concept URIs – people URIs, MusicBrainz artists, DBpedia resources … – share those relationships within a community, and then tag content with those URIs) to existing Web 2.0 content. So you can “re-tag” your existing Flickr pics, SlideShare presentations, etc., using those principles and make your social data enter the LOD cloud. I think the emphasis on existing is important here, as LODr lets you keep your Web 2.0 habits by using your favourite tools, but provides a separate service to semantically enrich the content. I don’t want to go into too much detail here, but in brief, some interesting points regarding the application are:

  • While tag / URI relationships are shared within the LODr community in a central RDF base (following the MOAT architecture principles), LODr is a personal application, so that you just need to install the software on your webserver to enjoy it. Moreover, as it’s local, you can re-use your data immediately for any mash-up;
  • LODr is completely RDF-based. It might be a bit geeky, but as some were recently wondering where all the RDF-based applications are, here’s one. And of course RDF-based means using standard vocabularies, such as SIOC, FOAF, DC, the Tag Ontology and of course MOAT. The RDF backend is powered by ARC2, so you can enjoy a SPARQL endpoint for your data (see the sketch after this list). Last but not least, each item page features RDFa, using the previous vocabularies, even if you decide not to use MOAT for a particular item (so that any Web 2.0 item you aggregate is RDFa-ized);
  • Aggregated data will provide you with a complete tagcloud for your social activity (which might be SCOT-ed in the next updates), as seen here. Each tag link leads to a list of items rendered using Exhibit, and you can restrict it by source (i.e. the service the item comes from) or creation date. And if a tag has been assigned a URI, you’ll get a link to browse the related items using a similar interface;
  • When browsing all items tagged with a particular URI, you’ll get suggestions of related URIs. Related because of co-occurrence, as usual in tag-based applications, but also because they’re directly interlinked, or because they share a common property. To avoid information overload, only the URIs you used to re-tag some of your items will be shown;
  • The application can be easily extended. LODr uses wrappers to retrieve your data, and each wrapper is only a few lines of code (e.g. 24 lines for the Flickr one). At the moment, wrappers use RSS to retrieve data and the feeds are automatically discovered from the user’s FOAF profile – data portability rocks! Yet, the architecture also allows authenticated wrappers (to use service APIs) as well as SIOC exports for those tools;
  • As the MOAT process is more time-consuming than simple tagging (since you must define tag / URI relationships, at least the first time, as automated tagging is possible afterwards), the URIs can be displayed as labels when you need to choose which one is relevant for your tag (using the inference capabilities described here, as not all resources have a direct rdfs:label property). When you need a new URI, the application relies on the Sindice search widget, as done in the Drupal MOAT module. The system then checks if the new URI is valid, but I’ll blog about that particular point later;
  • Finally, in addition to the previous features, LODr can be used to discover all the community content. This feature is not provided by the local application, but by LODr.info, which aggregates your RDF data when you re-tag it, in order to provide search capabilities. You can then directly list all items linked to a particular URI. Want to find content related to the Forbidden City? Or to SPARQL? And to make it even more enjoyable, I added a Ubiquity command so that from any Wikipedia page (more services will be supported soon), you can get the list of all related items (going through DBpedia in order to find the concept URI from a document page). While it provides a really straightforward way to discover related Web 2.0 content when browsing the Web, I also hope it can convince people of the value of the complete process.

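Regarding the SPARQL endpoint mentioned in the list above, ARC2 makes exposing one almost trivial; here is a minimal sketch with illustrative database settings (not the actual LODr setup script):

include_once('arc/ARC2.php');

$config = array(
  // illustrative MySQL credentials
  'db_host' => 'localhost',
  'db_name' => 'lodr',
  'db_user' => 'user',
  'db_pwd'  => 'pass',
  'store_name' => 'lodr_store',
  // expose read-only features on a public endpoint
  'endpoint_features' => array('select', 'construct', 'ask', 'describe'),
);

$ep = ARC2::getStoreEndpoint($config);
if (!$ep->isSetUp()) {
  $ep->setUp();
}
$ep->go(); // handles the incoming HTTP request and serialises the results
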
So, you can simply download the code from the website and install it. For those who just want to have a look, you can check my LODr instance (while you won’t be able to edit anything, you can check the display interfaces). As there might be some bugs and I’m still adding features, please consider using the SVN version instead of the tgz. And then, enjoy the power of Linked Data for your Web 2.0 content ;-)

Lightweight subPropertyOf / subClassOf inference with ARC2

As a regular user of the ARC2 framework, I really enjoy the way it eases the development of Semantic Web applications. In particular, its SPARQL capabilities offer an intuitive way to write / get / update graphs and triples in the backend triple store.
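
For instance, inserting and querying triples is a matter of a few lines (a minimal sketch; the store settings and data are illustrative):

include_once('arc/ARC2.php');

$config = array(
  // illustrative MySQL credentials
  'db_host' => 'localhost',
  'db_name' => 'sw',
  'db_user' => 'user',
  'db_pwd'  => 'pass',
  'store_name' => 'demo',
);
$store = ARC2::getStore($config);
if (!$store->isSetUp()) { $store->setUp(); }

// a SPARQL+ INSERT, then a plain SELECT, straight from PHP
$store->query('
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  INSERT INTO <http://example.org/g> {
    <http://example.org/me> foaf:name "Alex" .
  }
');
$rows = $store->query('
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  SELECT ?name WHERE { ?s foaf:name ?name }
', 'rows');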

Unfortunately, while ARC2 provides resource consolidation based on IFPs or on some pre-defined properties, it does not feature lightweight RDFS entailment based on subPropertyOf and subClassOf subsumption. As Benjamin pointed out on IRC, such inference can be done using a combination of ARC2 triggers and the SPARQL+ INSERT / CONSTRUCT clauses. I just created two triggers that do the job, providing lightweight inferencing for subproperties and subclasses in ARC2, using the SPARQL query that follows (in this case, for the properties):

INSERT INTO <$graph> CONSTRUCT {
  ?s ?top ?o .
} WHERE {
  GRAPH <$graph> {
    ?s ?prop ?o .
  }
  ?prop rdfs:subPropertyOf ?top .
}
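
The subClassOf trigger follows exactly the same pattern; a sketch of its query (the actual trigger is linked at the end of this post):

INSERT INTO <$graph> CONSTRUCT {
  ?s a ?top .
} WHERE {
  GRAPH <$graph> {
    ?s a ?class .
  }
  ?class rdfs:subClassOf ?top .
}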

The triggers are, in my case, launched after each LOAD action, but they can also be used in combination with the INSERT clause, by simply editing the store parameters:

$config = $arc_config + array(
  'store_triggers' => array(
    'insert' => array('graphTimestamp'),
    'load' => array('subPropertyInference'),
  ),
);

As you can see, the query is limited to a particular $graph (both for selecting and inserting). As this $graph variable corresponds to the URI of the graph that has just been loaded into the store, this avoids recomputing the triples over the whole store each time a new graph is added. Moreover, the new statements also belong to the original graph. You might want to change this according to your inference policy, but I think that for such lightweight inference patterns (which do not involve other graphs), it makes sense to store the additional statements in the original graph.

Regarding the inference pattern itself, instead of manually defining the properties that must be taken into account, this query retrieves all the properties that have been defined as subproperties of any other in order to automatically infer the ‘top property’ relationship. While this is certainly better than manually adding property / subproperty lists, especially for maintenance purposes, it requires that the underlying models (e.g. FOAF if you want to deal with the rdfs:label / foaf:name subsumption) be loaded into the triple store, which you can do when setting it up, e.g.:

$default_vocabs = array(
  'http://xmlns.com/foaf/spec/index.rdf',
  'http://www.geonames.org/ontology/ontology_v2.0_Full.rdf',
);
// Setup the store
$this->store->setUp();
// Load ontologies so that we can infer subproperties later
foreach($default_vocabs as $vocab) {
  $graph = LODrTools::get_datagraph($vocab);
  $this->store->query("LOAD <$vocab> INTO <$graph>");
}

Then, you can benefit from this lightweight inference engine when querying data from your store: for instance, a query involving “?s rdfs:label ?o” will also retrieve “?s foaf:name ?o” statements.
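
For example, once the FOAF spec has been loaded as above and the trigger has run, the following pattern will also match resources that only carry a foaf:name (a quick sketch):

// foaf:name is declared as a subproperty of rdfs:label in the FOAF spec,
// so the inferred triples make it visible through the rdfs:label pattern
$rows = $this->store->query('
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  SELECT ?s ?label WHERE { ?s rdfs:label ?label } LIMIT 10
', 'rows');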

Finally, one important trick to consider when LOAD-ing data in ARC2 is that when using LOAD <URI> on dereferenceable URIs, the graph name will be the URI itself, which is confusing, especially if you want to define statements about the graph (e.g. provenance or creation date, as in this trigger). A simple solution is to define an arbitrary graph URI based on the resource URI itself, and then run LOAD <URI> INTO <GRAPH>, as done in the previous snippet of code, which solves the problem and lets you assign statements to the graph, and not to the resource URI itself.
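
The get_datagraph helper used in the earlier snippet just has to derive a dedicated, stable graph URI from the resource URI; a minimal sketch of what it can look like (the base URI is illustrative, not necessarily the actual LODr code):

class LODrTools {
  // Derive a dedicated graph URI from a resource URI, so that graph-level
  // statements (provenance, timestamps, ...) are not attached to the
  // resource URI itself
  public static function get_datagraph($uri) {
    return 'http://example.org/graphs/' . md5($uri);
  }
}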

Links to the triggers: