Macquarie University

Natural Language Processing Models in Clinical Medicine: development, validation and implementation of automated text analysis in electronic medical records

posted on 2024-03-06, 05:34 authored by David Fraile Navarro

Health professionals and General Practitioners commonly use clinical free text in the form of progress notes, referral letters or discharge summaries, among others. Given the growing complexity of electronic health record systems, clinical documentation tasks have become a burden for many doctors, who must spend a considerable amount of time processing and editing clinical documents, either themselves or with the help of medical scribes. Natural Language Processing offers an alternative that could allow the automation of several clinical documentation tasks, including clinical named entity recognition, relation extraction and summarisation.

In this thesis, we have several aims. The first is to understand the existing NLP technologies and their ability to perform information extraction tasks in the health field. In our systematic review, we observed the development of the field over the last decade with the advent of deep learning methodologies and, in particular, transformer architectures, including their generative variants. However, little of this research has been translated into solutions implemented in practice.

Second, we aimed to understand why this misalignment between literature and implementation existed. To this end, we conducted in-depth semi-structured interviews with General Practitioners about the role of text automation and AI in their practice. We found that clinicians favoured Doctor-AI collaboration. This would allow them to perform documentation tasks more efficiently, augment their workflow and reduce the documentation burden. With further advancements, doctors also envisaged moving beyond screen and keyboard as newer technologies would allow them to pursue a more direct engagement with patients. However, several concerns need to be addressed for this technology to be trusted.

Third, we tested these assumptions with an interactive prototype system that exhibited various degrees of automation across several clinical documentation tasks, and analysed the clinicians’ feedback on their preferred mode of interaction. Doctors overwhelmingly preferred some degree of automation (mostly moderate) to no automation, across all the documentation scenarios. Concerns remained, however, around ensuring proper testing and accuracy of the system, legal issues, and the potential for automation bias to emerge.

Fourth, a key task clinicians requested the automated systems to perform consisted of converting doctor-patient conversations into succinct clinical notes. Transformer architectures have shown promising capabilities in abstractive text summarisation. We tested these assumptions by fine-tuning state-of-the-art transformer models to produce summaries of clinical conversations. We found that it was practical to fine-tune models previously trained for general dialogue summarisation with a limited number of clinical dialogues.

Lastly, we further tested these fine-tuned transformer-based systems on the long dialogue summarisation task, covering entire clinical consultations. In addition, we tested recently released large language models (such as GPT-3.5), assessing them with both classical and novel benchmarks, as well as with human expert evaluators. We found that summaries produced by large language models were judged by novel benchmarking techniques and by experts to be on par with human-generated summaries when evaluated across several quality metrics.

In conclusion, Natural Language Processing, and its newest advancements in the form of large language models, allow for automation of clinical documentation tasks, including summarisation of patient-doctor conversations. While clinicians are eager to use automation to improve their work, how these tools are integrated into clinical workflows and how they are applied in a safe and privacy-preserving manner require further research.


Table of Contents

1. Introduction -- 2. Clinical named entity recognition and relation extraction using natural language processing of medical free text: A systematic review -- 3. Collaboration, not confrontation: Understanding General Practitioners’ attitudes towards natural language and text automation in clinical practice -- 4. Designing a text-automation prototype for General Practice: Lessons from a user study prototype survey -- 5. Clinical Dialogue Abstractive Summarisation -- 6. Discussion -- A. Appendix

Awarding Institution

Macquarie University

Degree Type

Thesis PhD


Doctor of Philosophy

Department, Centre or School

Australian Institute of Health Innovation

Year of Award


Principal Supervisor

Shlomo Berkovsky

Additional Supervisor 1

Mark Dras

Additional Supervisor 2

Anthony Nguyen


Copyright: The Author



Former Identifiers

AMIS ID: 277977
