Gradual unfreezing transformer-based language models for biomedical question answering

Khanna, Urvashi

doi:10.25949/19428323.v1

thesis

posted on 2022-03-28, 02:39 authored by Urvashi Khanna

Pretrained transformer-based language models have achieved state-of-the-art results on various Natural Language Processing (NLP) tasks. These models can be fine-tuned on a range of downstream tasks with minimal modifications. However, fine-tuning a language model may result in catastrophic forgetting, and the model tends to overfit on smaller training datasets. Gradually unfreezing the pretrained weights is therefore a possible approach to avoid catastrophic forgetting of the knowledge learnt from the source task. Multi-task fine-tuning adds an intermediate fine-tuning step on a high-resource dataset, which yields good results for low-resource tasks. In this project, we investigate the strategies of multi-task fine-tuning and gradual unfreezing on DistilBERT, which have not yet been applied to the biomedical domain. First, we explore whether DistilBERT improves accuracy on a low-resource dataset, BioASQ, with question answering (QA) as our NLP use case. Second, we investigate the effect that gradual unfreezing has on the performance of DistilBERT. We observe that, despite being 40% smaller and having no domain-specific pretraining, DistilBERT achieves results comparable to those of a larger model, BERT, on the smaller BioASQ dataset. However, we also observe that gradually unfreezing DistilBERT has no significant impact on the results of our QA task compared with standard non-gradual fine-tuning.
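
The idea of gradual unfreezing mentioned in the abstract can be illustrated with a small scheduling sketch. The abstract does not state the schedule used in the thesis, so everything here is an assumption for illustration: the function name `trainable_layers`, the `layers_per_epoch` parameter, the top-down order (last encoder layer first), and the layer count of 6 (the size of the base DistilBERT encoder). The task-specific QA head is assumed to be trainable throughout and is not part of the schedule.

```python
from typing import List


def trainable_layers(num_layers: int, epoch: int, layers_per_epoch: int = 1) -> List[int]:
    """Return the indices of encoder layers that are trainable at `epoch`.

    Hypothetical top-down schedule: at epoch 0 only the last
    `layers_per_epoch` layers are trainable; each later epoch unfreezes
    `layers_per_epoch` more layers, moving towards the input. Layers not
    in the returned list stay frozen (their weights are not updated).
    """
    if num_layers <= 0 or layers_per_epoch <= 0 or epoch < 0:
        raise ValueError("num_layers and layers_per_epoch must be positive, epoch non-negative")
    # Number of layers unfrozen so far, capped at the total number of layers.
    n_unfrozen = min(num_layers, (epoch + 1) * layers_per_epoch)
    return list(range(num_layers - n_unfrozen, num_layers))


if __name__ == "__main__":
    NUM_LAYERS = 6  # assumption: base DistilBERT encoder depth
    for epoch in range(NUM_LAYERS):
        print(f"epoch {epoch}: trainable layers {trainable_layers(NUM_LAYERS, epoch)}")
```

In a PyTorch training loop, a schedule like this would typically be applied at the start of each epoch by setting `requires_grad` to `True` on the parameters of the listed layers and `False` on the rest. Standard non-gradual fine-tuning corresponds to every layer being trainable from the first epoch.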

Notes

Bibliography: pages 49-57. Theoretical thesis.

Awarding Institution

Macquarie University

Degree Type

Thesis MRes

Degree

MRes, Macquarie University, Faculty of Science and Engineering, Department of Computing

Department, Centre or School

Department of Computing

Year of Award

2021

Principal Supervisor

Diego Mollá Aliod

Rights

Copyright Urvashi Khanna 2021. Copyright disclaimer: http://mq.edu.au/library/copyright

Language

English

Extent

1 online resource (xi, 61 pages) illustrations

Former Identifiers

mq:72044 http://hdl.handle.net/1959.14/1280832

Keywords

Natural language processing (Computer science); BioASQ; BERT; DistilBERT; Transfer learning; Gradual unfreezing

Licence

In Copyright
