Supervised machine learning for extractive query based summarisation of biomedical data

posted on 29.03.2022, 03:36 by Mandeep Kaur
Automation of text summarisation is a pressing need due to the plethora of textual information available online. Motivated by the success of machine learning in this domain, this research explores several supervised machine learning approaches for extracting summaries in response to queries. The first objective of this research is to compare the quality of classification and regression approaches for query-based multi-document extractive summarisation. To enable the comparison, we use a common extractive summarisation framework that attempts to identify salient sentences by scoring them on a common set of features. Our experiments are performed on biomedical data provided by the BioASQ challenges. The second objective is to address the important issue of converting the sample summaries available in the training data into annotations that can be used to train statistical classifiers for extractive summarisation. We conduct different trials of data annotation and assess their impact on the results. For the specific dataset used in this research, our investigations show that the classification scheme performed better than the regression scheme, and a comparison of the annotation techniques shows that annotation with a threshold of 0.1 outperforms the other techniques.
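To make the annotation step described in the abstract concrete, the following is a minimal sketch of how sample summaries might be turned into sentence-level training targets. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not say which score the 0.1 threshold is applied to, so a simple unigram-overlap F1 stands in for it here, and the function names `unigram_f1` and `annotate` as well as the `>=` comparison are hypothetical.

```python
from collections import Counter


def tokens(text):
    """Lowercase whitespace tokenisation (an assumption; no stemming or stopword removal)."""
    return text.lower().split()


def unigram_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate sentence and a reference summary.

    This is a stand-in similarity score chosen for illustration only.
    """
    cand, ref = Counter(tokens(candidate)), Counter(tokens(reference))
    overlap = sum((cand & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def annotate(sentences, reference, threshold=0.1):
    """Attach a real-valued score and a binary label to each sentence.

    The score could serve as a regression target; the label (score >= threshold)
    could serve as a classification target. Both uses are assumptions.
    """
    annotated = []
    for sentence in sentences:
        score = unigram_f1(sentence, reference)
        annotated.append(
            {"sentence": sentence, "score": score, "label": int(score >= threshold)}
        )
    return annotated


if __name__ == "__main__":
    reference_summary = "aspirin reduces fever and pain"
    candidates = ["aspirin reduces fever", "the weather is nice"]
    for row in annotate(candidates, reference_summary):
        print(row)
```

In this sketch the same scoring function yields both kinds of target, which mirrors the abstract's idea of comparing classification and regression within one common framework; the actual features, scores, and learners used in the thesis are not given on this page.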


Table of Contents

1. Introduction -- 2. Literature review -- 3. Summarisation corpus and evaluation -- 4. Research methods -- 5. Experimental results and discussions -- 6. Conclusion and future work -- References.


Bibliography: pages 48-53. Empirical thesis.

Awarding Institution

Macquarie University

Degree Type

Thesis MRes


MRes, Macquarie University, Faculty of Science and Engineering, Department of Computing

Department, Centre or School

Department of Computing

Year of Award


Principal Supervisor

Diego Mollá-Aliod


Copyright Mandeep Kaur 2018. Copyright disclaimer: http://mq.edu.au/library/copyright




1 online resource (53 pages) graphs, tables

Former Identifiers

mq:70673 http://hdl.handle.net/1959.14/1266590
