Detecting Erroneous Identity Links on the Web using Network Metrics

Fred Farr Forum October 11, 2018 11:00 - 11:20

Joe Raad, Wouter Beek, Frank Van Harmelen, Nathalie Pernelle and Fatiha Saïs.  

Abstract: Although best practices for publishing Linked Data encourage the re-use of existing IRIs, multiple names are often used to denote the same thing. Whenever multiple names are used, owl:sameAs statements are needed to align them. Studies dating back as far as 2009 have observed multiple misuses of owl:sameAs links. As a result, the alignment of Linked Data is currently broken: many owl:sameAs links are erroneous, and some even introduce inconsistencies. In this paper, we show how network metrics, such as the community structure of the owl:sameAs graph, can be used to detect such (possibly) erroneous statements. We evaluate our method on a subset of the LOD Cloud that contains over 558M owl:sameAs statements.
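
To make the idea of using community structure concrete, here is a minimal sketch in Python. It is not the method evaluated in the paper: the abstract does not name a community detection algorithm, so the sketch swaps in a simple neighbourhood-overlap heuristic as a stand-in. The function names, the `ex:` identifiers, and the 0.5 threshold are all illustrative assumptions. Each owl:sameAs statement is treated as an undirected edge, weakly embedded edges are cut to obtain communities, and any link whose endpoints end up in different communities is reported as possibly erroneous.

```python
from collections import defaultdict


def build_graph(same_as_links):
    """Treat every owl:sameAs statement (s, o) as an undirected edge."""
    neighbours = defaultdict(set)
    for s, o in same_as_links:
        if s != o:  # reflexive links carry no structural information
            neighbours[s].add(o)
            neighbours[o].add(s)
    return neighbours


def overlap(neighbours, u, v):
    """Jaccard similarity of the closed neighbourhoods of u and v."""
    nu = neighbours[u] | {u}
    nv = neighbours[v] | {v}
    return len(nu & nv) / len(nu | nv)


def communities(neighbours, threshold):
    """Assumed stand-in for community detection: connected components
    of the graph restricted to edges with overlap >= threshold."""
    community_of = {}
    for start in sorted(neighbours):
        if start in community_of:
            continue
        community_of[start] = start
        stack = [start]
        while stack:
            u = stack.pop()
            for v in neighbours[u]:
                if v not in community_of and overlap(neighbours, u, v) >= threshold:
                    community_of[v] = start
                    stack.append(v)
    return community_of


def suspicious_links(same_as_links, threshold=0.5):
    """Return the links whose endpoints fall into different communities."""
    neighbours = build_graph(same_as_links)
    community_of = communities(neighbours, threshold)
    return [
        (s, o)
        for s, o in same_as_links
        if s != o and community_of[s] != community_of[o]
    ]


# Hypothetical data: two tightly linked groups of names joined by one link.
links = [
    ("ex:a1", "ex:a2"), ("ex:a2", "ex:a3"), ("ex:a1", "ex:a3"),
    ("ex:b1", "ex:b2"), ("ex:b2", "ex:b3"), ("ex:b1", "ex:b3"),
    ("ex:a3", "ex:b1"),
]
print(suspicious_links(links))
```

In this toy graph the only link reported is the one joining the two groups. A link flagged this way is only a candidate for review, in line with the "(possibly) erroneous" wording of the abstract; at the scale of hundreds of millions of statements, a dedicated graph library and a proper community detection algorithm would be needed.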

Keywords:  Linked Open Data;  Identity;  owl:sameAs;  Communities