rdf

A proposal for Semantic OMB

From what I read on Twitter, it seems there's a bit of confusion regarding SMOB. Indeed, while SMOB provides a framework for Open and Semantic Microblogging, it does not define a new protocol, but simply uses SPARQL/Update over HTTP to exchange information between hubs (posting / removing notices and following / followers). Hence, this is not something that competes against OMB, the OpenMicroBlogging specification.

Actually, OMB is something we had planned to look at for a long time, as briefly discussed when Status.net / OMB was presented in the W3C Social Web XG telco. I've finally taken the time to analyse the full spec and check how it compares with the distributed microblogging implementation of SMOB, and more generally with the vision of Semantic Web / Linked Data (SW/LD) microblogging services.

So here is a proposal for "Semantic OMB" (on Status.net wiki) that describes how the current OMB protocol fits with the previous idea. In particular, it aligns the terminology with existing classes / properties from well-known ontologies, and discusses how some current parts of the spec should be updated. It also discusses how OMB operations can be mapped to SPARQL/Update queries, based on the ones that currently happen in SMOB for cross-hub synchronisation.

As you can see when browsing it, besides the terminology mappings, most of the spec is already compliant and only a few points need to be discussed, in order to:

  • enable a better "distributed-ness" by keeping profiles owned by their users and not necessarily creating remote accounts;
  • make some mandatory elements optional, as they are contained in the data that is exchanged between services thanks to the Linked Data principles.

Thanks to these small updates, it could provide a protocol enabling SW/LD systems to be designed based on the OMB protocol, while having a sufficient abstraction level to comply with OMB systems using other technologies for data modeling and exchange. I'd be more than happy to see such features in an upcoming OMB release, and hopefully see deeper links between OMB and SW/LD efforts, as both aim to achieve the same goal of openness and interoperability. Comments and feedback are welcome on the related thread on the OMB mailing-list.

A simple PHP library for 4Store

I was recently playing with 4store, the new RDF-store engine by Steve Harris / Garlik, after having used 3store for a few years in a previous project.
As I don't want to use the HTTP server right now, but need to manage data input / query in PHP, I wrote a tiny lib that you can get here. It provides methods to import and delete graphs and to run SPARQL queries, and can optionally output results with the requested content-type (XML, JSON or text). Adding graphs and querying data can then be done simply as follows:

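// Open the 'demo' knowledge base
$s = new FStore('demo');
// Import the SIOC ontology as a graph
$s->import('http://rdfs.org/sioc/ns');
// Run a SPARQL query against the store
$s->query("select ?s where { <http://rdfs.org/sioc/ns#Item> ?s ?o }");
// Delete the graph again
$s->delete('http://rdfs.org/sioc/ns');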

Social Data on the Web at ISWC2009

It was announced in the past few weeks but I hadn't really blogged about it so far. We're hosting a second edition of the Social Data on the Web (SDoW) workshop at the next ISWC2009 in Washington. Here's the call for papers (longer version here).

The 2nd Social Data on the Web workshop (SDoW2009) co-located with the 8th International Semantic Web Conference (ISWC2009) aims to bring together researchers, developers and practitioners involved in semantically-enhancing social media websites, as well as academics researching more formal aspects of these interactions between the Semantic Web and Social Web.

Since its first steps in 2001, many research issues have been tackled by the Semantic Web community such as data formalism for knowledge representation, data querying and scalability, or reasoning and inferencing. More recently, Web 2.0 offered new perspectives regarding information sharing, annotation, and social networking on the Web. It opens new research areas for the Semantic Web which has an important role to play to lead to the emergence of a Social Semantic Web that should provide novel services to end-users, combining the best of both Semantic Web and Web 2.0 worlds. To achieve this goal, various tasks and features are needed from data modeling and lightweight ontologies, to knowledge and social networks portability as well as ways to interlink data between Social Media websites, leveraging proprietary data silos to a Giant Global Graph.

Following the successful SDoW2008 workshop at ISWC2008, SDoW2009 aims to bring together Semantic Web experts and Web 2.0 practitioners and users to discuss the application of semantic technologies to data from the Social Web.

The workshop welcomes submissions of short and full papers as well as demos of applications combining Semantic Web and Social Web technologies - all due by the 10th of August.

First Public Working Draft of SPARQL New Features and Rationale

I'm pleased to announce that the W3C SPARQL Working Group has just published the First Public Working Draft of SPARQL New Features and Rationale, i.e. description and motivations of the new features to be included in the next version of SPARQL.


The W3C SPARQL Working Group has published the First Public Working Draft of SPARQL New Features and Rationale. This document provides an overview of the main new features of SPARQL and their rationale. This is an update to SPARQL adding several new features that have been agreed by the SPARQL WG. These language features were determined based on real applications and user and tool-developer experience. (Via W3C Semantic Web Activity News)

If you're implementing SPARQL engines, please note that the current syntax for each feature is only an example, and should not be considered final in any way. Comments regarding this draft are welcome by e-mail to public-rdf-dawg-comments@w3.org.

Teasing ...

Workshop homepage and CFP coming soon !

Introducing SPARCool

As WWW2009 is starting tomorrow, with a tutorial on the Web of Data in the morning and the LDOW2009 workshop the whole day, I'm happy to introduce SPARCool.

SPARCool is a simple webservice (à-la Triplr) that lets you run basic SPARQL queries on any URI that follows some of the Linked Data principles (i.e. being dereferenceable and returning RDF information about the entity) thanks to a simple URL pattern: http://sparcool.net/format/predicate[;l=lang]/URI. For instance, as described on the website, http://sparcool.net/j/dbp:abstract;l=en/http://dbpedia.org/resource/Semantic_Web will return (in JSON) answers for the following query:


PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?value
FROM <http://dbpedia.org/resource/Semantic_Web>
WHERE {
<http://dbpedia.org/resource/Semantic_Web> dbp:abstract ?value .
FILTER (lang(?value) = "en") .
}

You can get the results in various formats, one of them being HTML (so that results can be included in any webpage). You can also be redirected to the first answer of a query, as in http://sparcool.net/r/foaf:img/http://dbpedia.org/resource/Ramones, which is useful for images and hyperlinks. SPARCool is powered by roqet wrapped in a PHP script (source-code should be released soon).
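To make the URL pattern more concrete, here is a minimal sketch of how a path of the form format/predicate[;l=lang]/URI could be turned into the query shown above. It is not SPARCool's actual code (which has not been released): the function name sparcool_parse, the prefix table and the regular expression are all assumptions made for illustration.

```php
<?php
// Hypothetical prefix table; only the entries needed for the two examples.
const SPARCOOL_PREFIXES = [
    'dbp'  => 'http://dbpedia.org/property/',
    'foaf' => 'http://xmlns.com/foaf/0.1/',
];

/**
 * Split "format/prefix:local[;l=lang]/URI" into a format code and a SPARQL query.
 * Assumed behaviour, sketched from the URL pattern described in the post.
 */
function sparcool_parse(string $path): array
{
    // format / prefix:local [;l=lang] / URI (no whitespace, quotes or angle brackets)
    $pattern = '#^([a-z]+)/([a-z]+):(\w+)(?:;l=([A-Za-z-]+))?/([^\s<>"]+)$#';
    if (!preg_match($pattern, $path, $m)) {
        throw new InvalidArgumentException("Unrecognised path: $path");
    }
    [, $format, $prefix, $local, $lang, $uri] = $m;
    if (!array_key_exists($prefix, SPARCOOL_PREFIXES)) {
        throw new InvalidArgumentException("Unknown prefix: $prefix");
    }
    $ns = SPARCOOL_PREFIXES[$prefix];
    // The language filter is only added when ";l=lang" was present.
    $filter = $lang !== '' ? "  FILTER (lang(?value) = \"$lang\")\n" : '';
    $query = "PREFIX $prefix: <$ns>\n"
           . "SELECT ?value\n"
           . "FROM <$uri>\n"
           . "WHERE {\n"
           . "  <$uri> $prefix:$local ?value .\n"
           . $filter
           . "}";
    return ['format' => $format, 'query' => $query];
}
```

Calling sparcool_parse('j/dbp:abstract;l=en/http://dbpedia.org/resource/Semantic_Web') yields the format code j and a query equivalent to the one above; running that query (e.g. through roqet) and serialising the result is left out of the sketch.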

If you're attending our (i.e. Michael and myself) "Hello Open World" tutorial tomorrow morning, this is the kind of application you should be able to write after it. And BTW, if you're in Madrid and want to have a chat, I'll be here tomorrow and Thursday (for the SemSearch workshop) as well as, obviously, at the LOD gathering tomorrow evening.

Say hello to lodr.info

In one of my recent posts, I mentioned LODr, a semantic-tagging application based on MOAT. While I started it a few months ago, it's finally online now. I put the code in svn last Friday and tweeted about it, but did not make any official announcement yet, so here it is. I certainly should have released it earlier, but as the source code involves lots of classes, I wanted to be sure of the architecture.

So, what is it about ?

LODr aims to apply the MOAT principles (in a few words, link your tags to concept URIs - people URIs, Musicbrainz artists, DBpedia resources ... - , share those relationships in a community and then tag content with those URIs) to existing Web 2.0 content. So you can "re-tag" your existing Flickr pics, slideshare presentations, etc, using those principles and make your social data enter the LOD cloud. I think focusing on the word "existing" is important here, as LODr lets you keep your Web 2.0 habits by using your favourite tools, but provides a separate service to semantically enrich their content. I don't want to go into too much detail here, but in brief, some interesting points regarding the application are:

  • While tags / URIs relationships are shared within the LODr community in a central RDF-base (following the MOAT architecture principles), LODr is a personal application, so that you just need to install the software on your webserver to enjoy it. Moreover, as it's local, you can re-use your data immediately for any mash-up;
  • LODr is completely RDF-based. It might be a bit geeky, but as some were recently wondering where are all the RDF-based applications, here's one. And of course RDF-based means using standard vocabularies, such as SIOC, FOAF, DC, the Tag Ontology and of course MOAT. The RDF-backend is powered by ARC2, so you can enjoy a SPARQL endpoint for your data. Last but not least, each item page features RDFa, using the previous vocabularies, even if you decide not to use MOAT for a particular item (so that any Web 2.0 item you aggregate is RDFa-ized);
  • Aggregated data will provide you a complete tagcloud for your social activity (which might be SCOT-ed in the next updates), as seen here. Each tag link redirects to a list of items provided using Exhibit, and you can restrict by source (i.e. the service it's from) or creation date. And if a tag has been assigned a URI, you'll get a link to browse the related items using a similar interface;
  • When browsing all items tagged with a particular URI, you'll get suggestions of related URIs. Related because of co-occurrence, as usual in tag-based applications, but also because they're directly interlinked, or because they share a common property. To avoid information overload, only the URIs you used to re-tag some of your items will be shown;
  • The application can be easily extended. LODr uses wrappers to retrieve your data, and each wrapper is only a few lines of code (e.g. 24 lines for the Flickr one). At the moment, wrappers use RSS to retrieve data and the feeds are automatically discovered from the user FOAF profile - dataportability rocks ! Yet, the architecture also allows using authenticated wrappers (to use service APIs) as well as SIOC exports for those tools;
  • As the MOAT process is more time-consuming than simple tagging (since you must define tag/URI relationships, at least the first time, as tagging can be automated afterwards), the URIs can be displayed as labels when you need to choose which one is relevant for your tag (using the inference capabilities described here, as not all resources have a direct rdfs:label property). When you need a new URI, the application relies on the Sindice search widget, as done in the Drupal MOAT module. And the system then checks if the new URI is valid, but I'll blog about that particular point later;
  • Finally, in addition to the previous features, LODr can be used to discover all the community content. This feature is not provided by the local application, but by LODr.info, which aggregates your RDF data when you re-tag it to provide search capabilities. Then, you can directly list all items linked to a particular URI. Want to find content related to the Forbidden City ? Or to SPARQL ? And to make it even more enjoyable, I added a Ubiquity command so that from any Wikipedia page (more services will be supported soon), you can get the list of all related items (through DBpedia in order to find the concept URI from a document page). While it provides a really straightforward way to discover related Web 2.0 content when browsing the Web, I also hope it can convince people of the complete process.

So, you can simply download the code from the website and install it. For those who just want to have a look, you can check my LODr instance (while you won't be able to edit it, you can check the display interfaces). As there might be some bugs and I'm still adding features, please consider using the SVN version instead of the tgz. And then, enjoy the power of Linked Data for your Web 2.0 content ;-)

Lightweight subPropertyOf / subClassOf inference with ARC2

As a regular user of the ARC2 framework, I really enjoy the way it eases the development of Semantic-Web applications. In particular, its SPARQL capabilities offer an intuitive way to write / get / update graphs and triples in the backend triple-store.

Unfortunately, while ARC2 provides resource consolidation based on IFPs or using some pre-defined properties, it does not feature lightweight RDFS entailment based on subPropertyOf and subClassOf subsumption. As Benjamin pointed out on IRC, such inference can be done using a combination of ARC2 triggers and the SPARQL INSERT / CONSTRUCT clause. I just created two triggers that do the job, providing lightweight inferencing for subproperties and subclasses in ARC2, using the SPARQL query that follows (in that case, regarding the properties):

INSERT INTO <$graph> CONSTRUCT {
 ?s ?top ?o .
} WHERE {
  GRAPH <$graph> {
    ?s ?prop ?o .
  }
  ?prop rdfs:subPropertyOf ?top .
}

The triggers are, in my case, launched after each LOAD action, but can also be used in combination with the INSERT clause, by simply editing the store parameters:

$config = $arc_config + array(
  'store_triggers' => array(
    'insert' => array('graphTimestamp'),
    'load' => array('subPropertyInference'),
  ),
);

As you can see, the query is limited to a particular $graph (both in selecting and inserting). As this $graph var corresponds to the URI of the graph that has just been loaded in the store, it avoids recomputing the triples on the whole store each time a new graph is added. Moreover, the new statements also belong to the original graph. You might want to change this according to your inference policy, but I think that for such lightweight inference patterns (that do not involve other graphs), it makes sense to store additional statements in the original graph.
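Only the subPropertyOf query is shown above. As a rough sketch of what the subClassOf counterpart could look like, the same pattern can be applied to rdf:type statements. This mirrors the property query and is not necessarily the exact trigger code; the function name subclass_inference_query is hypothetical, and the PREFIX lines are added only to keep the string self-contained.

```php
<?php
/**
 * Build a SPARQL+ query that adds, for every typed resource in $graph,
 * the superclasses of its class. Sketch only, modelled on the
 * subPropertyOf query; the function name is an assumption.
 */
function subclass_inference_query(string $graph): string
{
    return <<<SPARQL
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
INSERT INTO <$graph> CONSTRUCT {
  ?s rdf:type ?top .
} WHERE {
  GRAPH <$graph> {
    ?s rdf:type ?class .
  }
  ?class rdfs:subClassOf ?top .
}
SPARQL;
}
```

As with the property version, the rdfs:subClassOf statements are matched outside the GRAPH block, so they can come from any ontology already loaded in the store, while the inferred types are written back to the graph that was just loaded.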

Regarding the inference pattern itself, instead of manually defining the properties that must be taken into account, this query retrieves all the properties that have been defined as subproperties of any others to automatically infer the 'top property' relationship. While this is certainly better than manually adding some property / subproperty lists, especially for maintenance purposes, it requires the underlying models (e.g. FOAF if you want to deal with rdfs:label / foaf:name subsumption) to be loaded in the triple store, which you can do when setting it up, e.g.:

$default_vocabs = array(
  'http://xmlns.com/foaf/spec/index.rdf',
  'http://www.geonames.org/ontology/ontology_v2.0_Full.rdf',
);
// Setup the store
$this->store->setUp();
// Load ontologies so that we can infer subproperties later
foreach($default_vocabs as $vocab) {
  $graph = LODrTools::get_datagraph($vocab);
  $this->store->query("LOAD <$vocab> INTO <$graph>");
}

Then, you can benefit from that lightweight inference engine when querying data from your store, as for instance a query related to "?s rdfs:label ?o" will retrieve "?s foaf:name ?o" statements.

Finally, one important trick to consider when LOAD-ing data in ARC2 is that when using LOAD <URI> on dereferenceable URIs, the graph name will be the URI itself, which is confusing, especially if you want to define statements about the graph (i.e. provenance, creation date - as in this trigger - ). A simple solution is to define an arbitrary GRAPH URI based on the resource URI itself, and then run LOAD <URI> INTO <GRAPH> as done in the previous snippet of code, which solves the problem and lets you assign statements to the graph, and not to the URI itself.
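One possible way to derive such a graph URI is sketched below. It is only an illustration: the body of LODrTools::get_datagraph is not shown in this post, so the helper names datagraph_uri and load_query, as well as the http://example.org/graph/ base, are assumptions.

```php
<?php
// Hypothetical base under which local graph names are minted.
const GRAPH_BASE = 'http://example.org/graph/';

/**
 * Derive a graph URI that is distinct from the resource URI itself
 * by percent-encoding the resource URI under a local base.
 */
function datagraph_uri(string $resource): string
{
    return GRAPH_BASE . rawurlencode($resource);
}

/**
 * Build the LOAD <URI> INTO <GRAPH> query for a given resource.
 */
function load_query(string $resource): string
{
    $graph = datagraph_uri($resource);
    return "LOAD <$resource> INTO <$graph>";
}
```

Because the mapping is deterministic, the graph URI for a given resource can be recomputed later, for example to attach provenance or creation-date statements to the graph rather than to the resource.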

Links to the triggers:

Social, mobile, semantic

Monday's DBpedia mobile presentation at LDOW2008 impressed me a lot. Actually, while I never worked on it, I'm really interested in ways to combine mobile applications, Semantic Web / Linked Data technologies and social networking.

SparqlPress and foaf:openid

This website now uses SparqlPress.

Morten did a lot of work to include a scutter with ARC2-integration into the plugin, and so this blog now features an RDF backend that stores some data from my website and related documents (FOAF profile, related seeAlso's) and also from people who commented here.
