owl:sameAs

Inconsistencies in the LOD cloud

I was just trying to figure out how many inconsistent statements are caused by the use of owl:sameAs in the LOD cloud, by running some queries against the LOD SPARQL endpoint powered by OpenLink Virtuoso.
I first ran a simple owl:sameAs / owl:disjointWith query, which unfortunately timed out:

SELECT DISTINCT ?a ?b WHERE {
  ?a a ?c1 .
  ?b a ?c2 .
  ?c1 owl:disjointWith ?c2 .
  ?a owl:sameAs ?b .
}
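
One possible way to narrow such a query down (a sketch only, not something run for this post) is to first list the class pairs that are declared disjoint, and then test each pair separately. The PREFIX declaration is spelled out so the query does not depend on prefixes predeclared by the endpoint, and the LIMIT value is an arbitrary assumption:

```sparql
PREFIX owl: <http://www.w3.org/2002/07/owl#>

# List declared disjoint class pairs on their own, without touching instance data.
# LIMIT 100 is an arbitrary cap (assumption) to keep the result set small.
SELECT DISTINCT ?c1 ?c2 WHERE {
  ?c1 owl:disjointWith ?c2 .
}
LIMIT 100
```

Each returned pair can then be substituted for ?c1 and ?c2 in the query above, so that every run only has to look at instances of two fixed classes.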

I then restricted the experiment to foaf:Person and foaf:Document, and found about 20 resources that end up instantiating both classes once owl:sameAs is taken into account, which is obviously inconsistent since the two classes are disjoint in FOAF.

SELECT DISTINCT ?a ?b WHERE {
  ?a a foaf:Person .
  ?b a foaf:Document .
  ?a owl:sameAs ?b .
}

(query results here or in .png)

Going further, I wanted to identify where these owl:sameAs statements come from, i.e. exporters or people themselves. While most of them are generated by RDF-aware applications, some are in personal FOAF files (my previous profile is here, shame on me!).

SELECT DISTINCT ?a ?b ?g WHERE {
  ?a a foaf:Person .
  ?b a foaf:Document .
  GRAPH ?g { ?a owl:sameAs ?b . }
}

(query results here)
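
To see at a glance which sources contribute most of these links, the same pattern could be aggregated per graph. This is only a sketch: it assumes the endpoint supports SPARQL 1.1 aggregates and subqueries, and the variable name ?links is purely illustrative.

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Count the distinct Person/Document owl:sameAs links found in each named graph.
SELECT ?g (COUNT(*) AS ?links) WHERE {
  {
    # The inner DISTINCT removes duplicate solutions before counting.
    SELECT DISTINCT ?g ?a ?b WHERE {
      ?a a foaf:Person .
      ?b a foaf:Document .
      GRAPH ?g { ?a owl:sameAs ?b . }
    }
  }
}
GROUP BY ?g
ORDER BY DESC(?links)
```

Graphs at the top of the list would be the first candidates to check for an exporter that emits owl:sameAs between a person and their document.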

While this is only a small number of inconsistent statements compared to the number of foaf:Person / foaf:Document instances in the cloud, it is imho a simple warning sign that alternatives to owl:sameAs deserve consideration, such as UMBEL's isLike or AKT's Consistent Reference Services.

I'm also wondering whether, apart from the SAOR work presented last year at the ISWC Semantic Web Challenge, there are other attempts to check the consistency of the LOD cloud, using Pellet or another reasoner. Any hints?
