Generalizing link prediction for information extraction
Thesis posted on 2022-03-28, 11:07, authored by Sonit Singh
Information Extraction (IE) is the task of extracting entities and the relationships that hold between them from text, in a form that can be stored in a database called a Knowledge Base (KB) or Knowledge Graph (KG). Link prediction, also called Knowledge Base Completion, is the task of predicting missing links in order to make a KG more complete. While most IE and link prediction models have focused on binary relationships, real-world relationships are often n-ary (n > 2). Recently, IE algorithms have been proposed that can extract relationships of arbitrary arity, but as far as we know there is no corresponding work on link prediction involving relationships of arbitrary arity. In this thesis, we introduce the task of n-ary link prediction, proposing two models for representing n-ary relationships and two methods for training them. We also provide a new dataset (based on Wikidata) for training and evaluating the proposed approaches, and we propose a modification to the standard evaluation criteria to overcome the high computational cost of evaluating on large-scale KBs. Evaluation in terms of Mean Rank, Hits@10 and classification accuracy on the tuple dataset shows that our proposed approaches can generalize link prediction to tuples of arbitrary arity.
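For readers unfamiliar with the ranking metrics named above, the following is a minimal sketch of how Mean Rank and Hits@k are commonly computed in link prediction. It shows the standard definitions only, not the modified evaluation criteria proposed in the thesis, which the abstract does not specify. The function names, the "higher score means more plausible" convention, and the pessimistic tie-breaking are assumptions made for illustration.

```python
from typing import Sequence


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """Return the 1-based rank of the true candidate among all candidates.

    Assumption: a higher score means a more plausible candidate.
    Ties are counted against the true candidate (pessimistic ranking).
    """
    true_score = scores[true_index]
    better_or_equal = sum(
        1 for i, s in enumerate(scores) if i != true_index and s >= true_score
    )
    return 1 + better_or_equal


def mean_rank(ranks: Sequence[int]) -> float:
    """Average rank of the true candidates; lower is better."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of test cases whose true candidate is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks for four test tuples, each with one slot to be predicted.
example_ranks = [1, 3, 12, 50]
example_mean_rank = mean_rank(example_ranks)
example_hits_at_10 = hits_at_k(example_ranks, k=10)
```

Because these metrics depend only on the rank of the correct entity, the same computation applies whether the slot being filled belongs to a binary triple or to a tuple of higher arity.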