Disfluency detection using a noisy channel model and deep neural language model

Lou, Paria Jamshid

doi:10.25949/19435736.v1

thesis

posted on 2022-03-28, 16:02 authored by Paria Jamshid Lou

Although speech recognition technology has improved considerably in recent years, current systems still output only a sequence of words, with no information about the location of disfluencies. Such information is necessary for improving the readability of speech transcripts: transcripts containing many disfluencies are difficult to understand, so removing disfluent words makes them more readable. Moreover, many applications, including dialogue systems, take spontaneous speech as input. Such systems are usually trained on fluent, clean corpora, so disfluent input degrades their performance. This thesis introduces a model for automatic disfluency detection in spontaneous speech transcripts, called the LSTM Noisy Channel Model. The model uses a Noisy Channel Model (NCM) to find "rough copies" that are likely to indicate disfluencies and to generate n-best candidate disfluency analyses. The underlying fluent sentence of each candidate analysis is then scored using a Long Short-Term Memory (LSTM) language model. The LSTM language model scores, along with other features, are used in a reranker to identify the most plausible analysis. We show that using LSTM language model scores as features to rerank the analyses generated by an NCM improves the state of the art in disfluency detection.
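The pipeline described in the abstract (candidate analyses from a noisy channel model, each scored by a language model on its underlying fluent sentence, then reranked) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the thesis implementation: the add-one smoothed bigram model `BigramLM` stands in for the LSTM language model, the candidate analyses and their channel log-probabilities are made up by hand, and the fixed-weight linear combination in `rerank` stands in for a trained reranker that would use further features.

```python
import math
from collections import Counter


class BigramLM:
    """Toy add-one smoothed bigram LM (hypothetical stand-in for the LSTM LM)."""

    def __init__(self, sentences):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        self.vocab = set()
        for sentence in sentences:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def log_prob(self, words):
        """Log-probability of a word sequence, including the end-of-sentence token."""
        tokens = ["<s>"] + list(words) + ["</s>"]
        v = len(self.vocab) + 1  # one extra slot for unseen words
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + v
            total += math.log(numerator / denominator)
        return total


def fluent_words(words, tags):
    """Keep only words tagged fluent ("F"); drop words tagged disfluent ("E")."""
    return [w for w, t in zip(words, tags) if t == "F"]


def rerank(words, candidates, lm, w_channel=1.0, w_lm=1.0):
    """Return the candidate with the best combined score.

    Each candidate is (tags, channel_log_prob). The weights are fixed by hand
    here; a real reranker would learn them together with other features.
    """
    def score(candidate):
        tags, channel_log_prob = candidate
        lm_log_prob = lm.log_prob(fluent_words(words, tags))
        return w_channel * channel_log_prob + w_lm * lm_log_prob

    return max(candidates, key=score)


# Tiny fluent "training corpus" for the stand-in language model.
lm = BigramLM([
    "i want a flight to denver",
    "i want a flight to boston",
    "show me a flight to denver",
])

words = "i want a flight to boston uh to denver".split()

# Hand-made n-best list; the channel log-probabilities are invented.
candidates = [
    (["F"] * 9, -1.0),                                       # nothing disfluent
    (["F", "F", "F", "F", "F", "F", "E", "F", "F"], -1.5),   # only "uh"
    (["F", "F", "F", "F", "E", "E", "E", "F", "F"], -2.0),   # "to boston uh"
]

best = rerank(words, candidates, lm)
print(" ".join(fluent_words(words, best[0])))
```

In this toy setting the analysis that marks "to boston uh" as disfluent wins, because its underlying fluent sentence is far more probable under the language model even though its assumed channel score is the lowest. Note that unnormalised log-probabilities favour shorter sentences, which is one reason a practical reranker combines the language model score with other features.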

Notes

Bibliography: pages 43-46. Theoretical thesis.

Awarding Institution

Macquarie University

Degree Type

Thesis MRes

Degree

MRes, Macquarie University, Faculty of Science and Engineering, Department of Computing

Department, Centre or School

Department of Computing

Year of Award

2017

Principal Supervisor

Mark Johnson

Rights

Copyright Paria Jamshid Lou 2017. Copyright disclaimer: http://mq.edu.au/library/copyright

Language

English

Extent

1 online resource (ix, 46 pages)

Former Identifiers

mq:70779 http://hdl.handle.net/1959.14/1267651

Keywords

disfluency, speech transcript, deep learning, LSTM, noisy channel model, automatic speech recognition, language modelling

Licence

In Copyright
