Posted on 2024-11-06, 05:50, authored by Allan Francia
Emotion classification of text data has numerous potential benefits in applications ranging from assessing the mental health of a community to understanding customer feedback on products and services. The recent development of transformers for language modelling has raised benchmark results across many natural language processing tasks. Alongside these performance improvements, transformer models also present several challenges, such as the availability of pretraining data, model size (number of parameters), training time and computing power requirements. This research project investigates whether there are benefits in pretraining the BERT transformer language model on emotion-specific domain data from the social media platform Vent, comparing the pretrained models against fine-tuning the original model alone. In doing so, it also examines how much pretraining data is needed for the BERT model to produce reasonable results, and how performance scales with the size of the pretraining data within a constrained computing budget. The project also benchmarks its emotion classification results against the GoEmotions multilabel classification project. Although the pretrained models did not outperform the original BERT models on either the multi-class or the multilabel task, the results were close.
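The abstract contrasts domain-adaptive pretraining of BERT with fine-tuning alone for emotion classification. As a rough illustration of that general workflow (not the thesis's actual code or data), the sketch below uses the Hugging Face Transformers library; the placeholder corpus, model names, output paths and the 28-label assumption (matching the GoEmotions taxonomy) are illustrative assumptions only.

```python
# Hedged sketch: continued masked-language-model pretraining of BERT on
# domain text, followed by a multilabel classification head. The corpus,
# hyperparameters and label count are placeholders, not the thesis's setup.
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# --- Step 1: domain-adaptive pretraining with masked language modelling ---
domain_texts = [  # placeholder stand-in for an emotion-rich social media corpus
    "i feel completely drained today",
    "so excited about the results",
]
mlm_encodings = tokenizer(domain_texts, truncation=True, padding=True, max_length=128)

class TextDataset(torch.utils.data.Dataset):
    """Wraps tokenizer output so the Trainer can iterate over examples."""
    def __init__(self, encodings):
        self.encodings = encodings
    def __len__(self):
        return len(self.encodings["input_ids"])
    def __getitem__(self, idx):
        return {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}

mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="bert-domain", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=TextDataset(mlm_encodings),
    data_collator=collator,
).train()
mlm_model.save_pretrained("bert-domain")
tokenizer.save_pretrained("bert-domain")

# --- Step 2: reuse the domain-pretrained encoder for multilabel classification ---
num_emotions = 28  # assumption: size of the GoEmotions label set
clf = AutoModelForSequenceClassification.from_pretrained(
    "bert-domain",
    num_labels=num_emotions,
    problem_type="multi_label_classification",  # sigmoid + BCE loss per label
)

# Independent per-emotion probabilities for one example (head still untrained here).
batch = tokenizer(["thanks, this made my day"], return_tensors="pt")
probs = torch.sigmoid(clf(**batch).logits)
```

Fine-tuning the classification head on labelled emotion data would follow the same Trainer pattern as Step 1, with multi-hot float label vectors; the "fine-tuning only" baseline simply starts Step 2 from "bert-base-uncased" instead of the domain-pretrained checkpoint.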
Table of Contents
1 Introduction -- 2 Literature review -- 3 Research methods -- 4 Results and discussion -- 5 Conclusion and future work -- Bibliography
Awarding Institution
Macquarie University
Degree Type
Thesis MRes
Degree
Master of Research
Department, Centre or School
School of Computing
Year of Award
2024
Principal Supervisor
Diego Molla-Aliod
Additional Supervisor 1
Cecile Paris
Rights
Copyright: The Author
Copyright disclaimer: https://www.mq.edu.au/copyright-disclaimer