Reinforcement learning for query-based multi-document extractive summarisation

thesis
posted on 28.03.2022, 12:38 by Christopher Rhys Jones
Text summarisation helps to manage the growth of digitally stored textual information by allowing users to learn key information from short summaries. This research project focuses on query-based multi-document extractive summarisation, which constructs a summary from sentences extracted directly from multiple source documents in response to a user query. Much of the past research in extractive summarisation is based on supervised machine learning approaches, which require converting target human summaries into explicit annotations of the input sentences. In contrast, our research focuses on reinforcement learning, which can incorporate the target human summaries directly into the learning process. We explore the impact of several key aspects of reinforcement learning. First, we compare several variants of the Proximal Policy Optimization (PPO) approach with baseline reinforcement learning approaches. Second, we investigate pretraining our policy using supervised approaches. We report our results on data provided by the BioASQ Challenge. We observe that PPO penalises changes to the policy, as described in the literature. However, we find no significant improvement in summarisation quality from using either PPO or pretraining.
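
As an illustration of what "penalises changes to the policy" can mean, the following is a minimal sketch of the clipped surrogate objective commonly associated with PPO. It is not taken from the thesis: the abstract does not say which PPO variants were compared, and the function name clipped_surrogate, its parameters, and the default eps=0.2 are assumptions made for this example only.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantage, eps=0.2):
    """Clipped surrogate objective for one sampled action (to be maximised).

    new_logp, old_logp: log-probabilities of the action under the updated
    and the previous policy. advantage: estimated advantage of the action.
    eps: clipping range (0.2 is an assumed, commonly used default).
    """
    # Probability ratio between the updated and the previous policy.
    ratio = math.exp(new_logp - old_logp)
    # Restrict the ratio to the interval [1 - eps, 1 + eps].
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    # Taking the minimum removes any gain from moving the ratio
    # outside the clipping range in the direction the advantage favours.
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    # Hypothetical case: the action's probability rises from 0.5 to 0.9.
    print(clipped_surrogate(math.log(0.9), math.log(0.5), advantage=1.0))
```

With a positive advantage, the objective is capped at (1 + eps) times the advantage, so raising the probability of the action beyond that ratio earns nothing further. With a negative advantage, the unclipped term is kept, so a harmful increase in probability is still penalised in full.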

Table of Contents

1 Introduction -- 2 Literature Review -- 3 Approach -- 4 Proximal Policy Optimization -- 5 Pre Training -- 6 Results and Discussion -- 7 Conclusion.

Notes

Theoretical thesis. Bibliography: pages 55-64

Awarding Institution

Macquarie University

Degree Type

Thesis MRes

Degree

MRes, Macquarie University, Faculty of Science and Engineering, Department of Computing

Department, Centre or School

Department of Computing

Year of Award

2019

Principal Supervisor

Diego Mollá-Aliod

Rights

Copyright Christopher Rhys Jones 2019. Copyright disclaimer: http://mq.edu.au/library/copyright

Language

English

Extent

1 online resource (xi, 64 pages) illustrations

Former Identifiers

mq:72010 http://hdl.handle.net/1959.14/1280499