Sequential Transfer Learning for Text Summarisation
The number of research publications keeps growing every year, and programmatically identifying valuable information in them is a challenging but essential task. The large datasets required to train state-of-the-art abstractive summarisation models are hard to find in niche domains, and most research publications report summarisation results only on the same news corpora. Adapting pre-trained summarisation models to a different domain overcomes the limitations posed by data availability and computational cost. We explore how a second round of fine-tuning on parallel data can make these models usable for practical applications. We select a biomedical summarisation task that lacks abundant data and has not benefited from such models in the past. We review possible approaches to this task and explore several ways of adapting a model trained on a different dataset to biomedical data. Our empirical evidence suggests that a model fine-tuned on summarisation data can be fine-tuned again for a niche domain: using only a few hundred training samples yields a 27.3% improvement, all while taking only minutes to train. This approach can be applied to other models and domains, helping bring the latest research into real-world use.
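The abstract does not specify an implementation, but the core idea of sequential transfer learning — pretrain on abundant source-domain data, then briefly fine-tune the same weights on a small target-domain set — can be illustrated with a deliberately tiny sketch. The one-parameter linear model, the synthetic "news" and "biomedical" datasets, and all learning-rate and epoch choices below are illustrative assumptions, not the paper's actual setup:

```python
# Toy illustration of sequential (two-stage) transfer learning.
# Stage 1: pretrain on a large "source" dataset (analogue: news corpora).
# Stage 2: fine-tune the same parameter on a few "target" samples
#          (analogue: a few hundred biomedical summaries).
# All data and hyperparameters are invented for this sketch.

def mse_loss(w, data):
    """Mean squared error of the one-parameter model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def train(w, data, lr, epochs):
    """Full-batch gradient descent on the MSE loss."""
    for _ in range(epochs):
        grad = 2 * sum(x * (w * x - y) for x, y in data) / len(data)
        w -= lr * grad
    return w

# Source domain: plentiful data generated by y = 2.0 * x.
source = [(i / 10, 2.0 * (i / 10)) for i in range(1, 101)]
# Target domain: only a handful of samples from a shifted rule y = 2.5 * x.
target = [(float(x), 2.5 * x) for x in range(1, 6)]

# Stage 1: pretrain from scratch on the source domain.
w_pretrained = train(0.0, source, lr=0.01, epochs=10)

# Stage 2: a short second round of fine-tuning on the scarce target data.
w_finetuned = train(w_pretrained, target, lr=0.01, epochs=3)

# Baseline: the same short training budget on target data, but from scratch.
w_scratch = train(0.0, target, lr=0.01, epochs=3)

print(f"pretrained w: {w_pretrained:.3f}")          # near 2.0
print(f"fine-tuned w: {w_finetuned:.3f}")           # pulled toward 2.5
print(f"target loss (fine-tuned): {mse_loss(w_finetuned, target):.3f}")
print(f"target loss (scratch):    {mse_loss(w_scratch, target):.3f}")
```

Because the fine-tuned model starts near the source-domain optimum, a few cheap gradient steps on the small target set leave it far closer to the target rule than the same budget spent training from scratch — the same intuition the abstract applies to summarisation models.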