Macquarie University

Sequential Transfer Learning for text summarisation

Posted on 2022-08-26, 01:40, authored by Dima Galat

The number of research publications increases steadily every year, and programmatically identifying valuable information is a challenging but essential task. The large datasets required for training state-of-the-art abstractive summarisation models are hard to find in niche domains, and most research publications report summarisation results only on the same news corpora. Adapting pre-trained summarisation models to a different domain overcomes the limitations posed by data availability and computational costs. We explore how a second round of fine-tuning on parallel data can make these models useful for practical applications. We select a biomedical summarisation task without abundant data, one which has not benefited from such models in the past. We review possible approaches to this task and explore several ways of adapting a model trained on a different dataset to biomedical data. Our empirical evidence suggests that a model fine-tuned on summarisation data can be fine-tuned again for a niche domain: using only a few hundred training samples yields a 27.3% improvement while taking only minutes to train. This approach can be used with other models and domains, thereby helping the adoption of the latest research in the real world.


Table of Contents

1. Introduction -- 2. Background -- 3. Task and techniques -- 4. Implementation -- 5. Conclusion -- References -- A. Appendix


A thesis submitted for the degree of Master of Research

Awarding Institution

Macquarie University

Degree Type

Thesis MRes


Thesis (MRes), Macquarie University, Faculty of Science and Engineering, 2020

Department, Centre or School

Department of Computing

Year of Award

2020
Principal Supervisor

Diego Molla-Aliod


Copyright: The Author




76 pages