Macquarie University

A Markov chain Monte Carlo algorithm for change-point detection in nanopore sequencing data

thesis
posted on 2022-10-14, 02:27 authored by Sophia Shen

Understanding the genetic makeup of organisms is an important goal in bioinformatics. DNA sequencing, the process of determining the order of the nucleotide bases in DNA, can now be performed quickly and cheaply with commercially available devices no bigger than a USB stick. The latest DNA sequencers use nanopore technologies to capture long, repetitive DNA structures with great success; however, the reported reading accuracy still needs improving. One main source of error arises during basecalling, when the raw nanopore signals output by the sequencers are translated into genetic code. The difficulty of basecalling is that the nanopore signals must not only be segmented but also be grouped into four types, each representing a genetic code. In this thesis, we propose a novel algorithm using change-point detection methods and Markov chain Monte Carlo (MCMC) sampling techniques. We use real and simulated data to demonstrate the effectiveness of the proposed algorithm and compare it with other change-point detection packages.


Table of Contents

1. Introduction -- 2. Methods -- 3. Results -- 4. Discussion and Future Direction -- A. Appendix -- References

Notes

A thesis submitted to Macquarie University for the Degree of Master of Research

Awarding Institution

Macquarie University

Degree Type

Thesis MRes

Degree

Thesis (MRes), Macquarie University, Faculty of Science and Engineering, 2021

Department, Centre or School

Department of Mathematics and Statistics

Year of Award

2021

Principal Supervisor

Georgy Sofronov

Rights

Copyright: Sophia Shen. Copyright disclaimer: https://www.mq.edu.au/copyright-disclaimer

Language

English

Extent

66 pages
