Probabilistic models of relational implication

Holt, Xavier Ricketts

doi:10.25949/19433813.v1

thesis

posted on 2022-03-28, 13:29 authored by Xavier Ricketts Holt

Knowledge bases and relational data form a powerful ontological framework for representing world knowledge. Relational data in its most basic form is a static collection of known facts. However, by learning to infer and deduce additional information and structure, we can massively increase the expressiveness, generality, and usefulness of the underlying data. One common form of inferential reasoning in knowledge bases is implication discovery: by learning when one relation implies another, we can implicitly extend our knowledge representation. There are several existing models for relational implication; however, we argue that they are sufficiently motivated but not entirely principled. To this end, we define a formal probabilistic model of relational implication. By using estimators based on the empirical distribution of our dataset, we demonstrate that our model outperforms existing approaches. While previous work achieves a best score of 0.7812 AUC on an evaluation dataset, our ProbE model improves this to 0.7915. Furthermore, we demonstrate that our model can be improved substantially through the use of link prediction models and dense latent representations of the underlying arguments and relations. This variant, denoted ProbL, improves the state of the art on our evaluation dataset to 0.8143.

In addition to developing a new framework and providing novel scores of relational implication, we provide two pragmatic resources to assist future research. First, we motivate and develop an improved crowd framework for constructing labelled datasets of relational implication. Using this, we reannotate and make public a dataset comprising 17,848 instances of labelled relational implication. We demonstrate that precision (as evaluated by expert consensus with the crowd labels) on the resulting dataset improves from 53% to 95%. Second, we argue that current implementations of link prediction models are not sufficiently scalable or parameterisable. We provide a highly optimised and parallelised framework for the development and hyperparameter tuning of link prediction models, along with an implementation of a number of existing approaches.

Notes

Theoretical thesis. Bibliography: pages 50-51

Awarding Institution

Macquarie University

Degree Type

Thesis MRes

Degree

MRes, Macquarie University, Faculty of Science, Department of Computing

Department, Centre or School

Department of Computing

Year of Award

2018

Principal Supervisor

Mark Johnson

Rights

Copyright Xavier Ricketts Holt 2018. Copyright disclaimer: http://mq.edu.au/library/copyright

Language

English

Extent

1 online resource (ix, 51 pages)

Former Identifiers

mq:71366 http://hdl.handle.net/1959.14/1273637

Keywords

Linguistics; Knowledge acquisition (Expert systems); natural language processing; knowledge base completion; Linguistics -- Data processing; relational implication

Licence

In Copyright
