Bash Datalog: Answering Datalog Queries with Unix Shell Commands

Fred Farr Forum, October 10, 2018, 11:40 - 12:00

Thomas Rebele, Thomas Pellissier Tanon and Fabian M. Suchanek.  

Abstract: Dealing with large tabular datasets often requires extensive preprocessing. This preprocessing happens only once, so loading and indexing the data in a database or triple store may be overkill. In this paper, we present an approach that allows preprocessing large tabular data in Datalog without indexing the data. The Datalog query is translated to Unix Bash and can be executed in a shell. Our experiments show that, for the use case of data preprocessing, our approach is competitive with state-of-the-art systems in terms of scalability and speed, while requiring only a Bash shell and a Unix-compatible operating system.

Keywords: preprocessing; querying; RDF dump; Datalog; knowledge base; Unix shell
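
To give a rough idea of what answering a Datalog query with shell commands can look like, here is a minimal hand-written sketch. It is an illustration only and not the output of the translator described in the abstract; the rule, the sample facts, and the function names `parent_facts` and `grandparent` are assumptions made for this example. The rule `grandparent(X, Z) :- parent(X, Y), parent(Y, Z).` is evaluated as a self-join over tab-separated facts using the standard `sort` and `join` utilities.

```shell
#!/usr/bin/env bash
# Hypothetical illustration: a hand-written evaluation of one Datalog rule,
# not the code generated by the system described in the abstract.
set -euo pipefail
export LC_ALL=C   # make sort and join agree on the collation order

# Facts parent(X, Y), one tab-separated fact per line (made-up sample data).
parent_facts() {
  printf 'alice\tbob\nbob\tcarol\nbob\tdave\n'
}

# Rule: grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
# The shared variable Y becomes the join key: column 2 of the first copy
# of the relation is matched against column 1 of the second copy.
grandparent() {
  join -t $'\t' -1 2 -2 1 -o 1.1,2.2 \
    <(parent_facts | sort -t $'\t' -k2,2) \
    <(parent_facts | sort -t $'\t' -k1,1) \
  | sort -u   # Datalog relations are sets, so drop duplicate answers
}

grandparent
```

Both inputs are sorted on their join column first because `join` expects sorted input; no index or database load is needed, which matches the one-off preprocessing use case the abstract describes.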