Disfluency detection using a noisy channel model and deep neural language model

thesis
posted on 28.03.2022, 16:02 by Paria Jamshid Lou
Although speech recognition technology has improved considerably in recent years, current systems still output a plain sequence of words with no information about the location of disfluencies. Such information is necessary for improving the readability of speech transcripts: transcripts containing many disfluencies are difficult to understand, so removing disfluent words makes them more readable. Moreover, many applications, including dialogue systems, take spontaneous speech as input. Such systems are usually trained on fluent, clean corpora, so disfluent input degrades their performance. This thesis introduces a model for automatic disfluency detection in spontaneous speech transcripts, called the LSTM Noisy Channel Model. The model uses a Noisy Channel Model (NCM) to find "rough copies" that are likely to indicate disfluencies and to generate n-best candidate disfluency analyses. The underlying fluent sentence of each candidate analysis is then scored using a Long Short-Term Memory (LSTM) language model. The LSTM language model scores, along with other features, are used in a reranker to identify the most plausible analysis. We show that using LSTM language model scores as features to rerank the analyses generated by an NCM improves the state of the art in disfluency detection.
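
The pipeline described in the abstract (n-best candidate analyses, a language model score for each underlying fluent sentence, then reranking) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Candidate` class, the `rerank` function, the weights and the example scores are all hypothetical, `toy_lm_log_prob` is a hand-written stand-in for a trained LSTM language model, and a fixed linear combination of two scores stands in for the trained reranker and its other features.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Candidate:
    """One candidate disfluency analysis of a transcript (hypothetical structure)."""
    words: List[str]        # transcript tokens
    disfluent: List[bool]   # True where this analysis marks a word as disfluent
    ncm_log_prob: float     # assumed score from the noisy channel model


def fluent_words(c: Candidate) -> List[str]:
    """Return the underlying fluent sentence: the words not marked disfluent."""
    return [w for w, d in zip(c.words, c.disfluent) if not d]


def toy_lm_log_prob(words: Sequence[str]) -> float:
    """Stand-in for an LSTM language model score (assumption, not a real LM).

    Charges a fixed cost per word and an extra cost for immediate repeats.
    """
    repeats = sum(1 for a, b in zip(words, words[1:]) if a == b)
    return -1.0 * len(words) - 5.0 * repeats


def rerank(
    candidates: Sequence[Candidate],
    lm_log_prob: Callable[[Sequence[str]], float],
    w_ncm: float = 1.0,
    w_lm: float = 1.0,
) -> Candidate:
    """Pick the candidate with the highest weighted sum of NCM and LM scores."""
    def score(c: Candidate) -> float:
        return w_ncm * c.ncm_log_prob + w_lm * lm_log_prob(fluent_words(c))
    return max(candidates, key=score)


words = ["a", "flight", "to", "boston", "uh", "to", "denver"]
candidates = [
    # Analysis 1: nothing is disfluent.
    Candidate(words, [False] * 7, ncm_log_prob=-1.0),
    # Analysis 2: "to boston uh" is a disfluency repaired by "to denver".
    Candidate(words, [False, False, True, True, True, False, False], ncm_log_prob=-2.0),
]

best = rerank(candidates, toy_lm_log_prob)
print(" ".join(fluent_words(best)))
```

In this toy setting the second analysis is selected because its fluent sentence receives a better stand-in language model score, even though its assumed NCM score is lower.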

Table of Contents

1. Introduction -- 2. Literature review -- 3. LSTM noisy channel model -- 4. Experiments and results -- 5. Summary and conclusions.

Notes

Bibliography: pages 43-46. Theoretical thesis.

Awarding Institution

Macquarie University

Degree Type

Thesis MRes

Degree

MRes, Macquarie University, Faculty of Science and Engineering, Department of Computing

Department, Centre or School

Department of Computing

Year of Award

2017

Principal Supervisor

Mark Johnson

Rights

Copyright Paria Jamshid Lou 2017. Copyright disclaimer: http://mq.edu.au/library/copyright

Language

English

Extent

1 online resource (ix, 46 pages)

Former Identifiers

mq:70779 http://hdl.handle.net/1959.14/1267651