KADE: Aligning Knowledge Base and Document Embedding Models using Regularized Multi-Task Learning

Matthias Baumgartner, Wen Zhang, Bibek Paudel, Daniele Dell'Aglio, Abraham Bernstein and Huajun Chen.

Abstract: Knowledge Bases (KBs) and textual documents contain rich and complementary information about real-world objects, as well as relations among them. While text documents describe entities in free form, KBs organize such information in a structured way. This makes the two forms of representation hard to compare and integrate, limiting the possibility of using them jointly to improve predictive and analytical tasks. In this article, we study this problem and propose KADE, a solution based on regularized multi-task learning of KB and document embeddings. KADE can potentially incorporate any KB and document embedding learning method. Our experiments on multiple datasets and methods show that KADE effectively aligns document and entity embeddings while maintaining the characteristics of the embedding models.

Keywords: Knowledge Graphs; Deep Semantics; Embeddings; Machine Learning; Natural Language Text Processing
