Joe Raad, Wouter Beek, Frank Van Harmelen, Nathalie Pernelle and Fatiha Saïs.
Abstract: Although best practices for publishing Linked Data encourage the re-use of existing IRIs, multiple names are often used to denote the same thing. Whenever multiple names are used, owl:sameAs statements are needed to align them. Studies dating back as far as 2009 have observed multiple misuses of owl:sameAs links. As a result, the alignment of Linked Data is currently broken: many owl:sameAs links are erroneous, and some even introduce inconsistencies. In this paper, we show how network metrics, such as the community structure of the owl:sameAs graph, can be used to detect such (possibly) erroneous statements. We evaluate our method on a subset of the LOD Cloud that contains over 558M owl:sameAs statements.
Keywords: Linked Open Data; Identity; owl:sameAs; Communities
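
To make the idea in the abstract concrete, the following is a minimal sketch of how community structure in an owl:sameAs graph might be used to flag possibly erroneous links. It is not the method evaluated in the paper: it substitutes a simple shared-neighbour heuristic for a real community detection algorithm, and all function names and the toy identifiers (`a1`, `b1`, ...) are hypothetical. The assumption is that terms denoting the same thing tend to be densely interlinked, so a link whose endpoints share no neighbour and fall in different communities deserves a closer look.

```python
from collections import defaultdict


def build_graph(pairs):
    """Undirected adjacency sets from (subject, object) owl:sameAs pairs."""
    adj = defaultdict(set)
    for s, o in pairs:
        if s != o:  # reflexive statements carry no alignment information
            adj[s].add(o)
            adj[o].add(s)
    return adj


def components(adj, keep_edge):
    """Connected components, following only edges accepted by keep_edge."""
    seen, result = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in adj[node]:
                if nxt not in seen and keep_edge(node, nxt):
                    seen.add(nxt)
                    stack.append(nxt)
        result.append(comp)
    return result


def suspicious_links(pairs):
    """Return links whose endpoints land in different communities.

    Assumption: a 'community' here is a group of terms connected through
    links whose endpoints share at least one neighbour. This is a crude
    stand-in for a proper community detection algorithm.
    """
    adj = build_graph(pairs)

    def is_strong(u, v):
        return len(adj[u] & adj[v]) > 0

    community_of = {}
    for index, comp in enumerate(components(adj, is_strong)):
        for node in comp:
            community_of[node] = index

    flagged = set()
    for u in adj:
        for v in adj[u]:
            if u < v and community_of[u] != community_of[v]:
                flagged.add((u, v))
    return sorted(flagged)


# Toy data: two fully interlinked groups joined by a single link.
example_pairs = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
    ("a3", "b1"),
]
print(suspicious_links(example_pairs))  # [('a3', 'b1')]
```

One known limitation of this heuristic is that an isolated pair of terms joined by a single link is also flagged, because its endpoints share no neighbour; a realistic approach would need to rank links rather than apply a binary cut.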