Macquarie University

A Markov chain Monte Carlo algorithm for change-point detection in nanopore sequencing data

posted on 2022-10-14, 02:27 authored by Sophia Shen

Understanding the genetic makeup of organisms is a central goal in bioinformatics. DNA sequencing, the process of determining the order of the nucleotide bases in DNA, can now be performed quickly and cheaply with commercially available devices no bigger than a USB stick. The latest DNA sequencers use nanopore technologies to capture long, repetitive DNA structures with great success; however, the reported reading accuracy needs improving. One main source of error arises during the basecalling process, when the raw nanopore signals output by the sequencers are translated into genetic codes. The difficulty of basecalling is that the nanopore signals must not only be segmented but also be grouped into four types, each representing a genetic code. In this thesis, we propose a novel algorithm using change-point detection methods and Markov chain Monte Carlo (MCMC) sampling techniques. We use real and simulated data to demonstrate the effectiveness of the proposed algorithm and compare it with other change-point detection packages.
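
To give a rough sense of the segmentation problem mentioned in the abstract, the sketch below locates a single change point in a piecewise-constant signal by exhaustive least-squares search. It is only an illustration under simplifying assumptions and is not the MCMC algorithm proposed in the thesis; the function names and the toy signal are hypothetical and do not come from the thesis.

```python
from typing import List, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations of `values` from their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_single_change_point(signal: List[float]) -> Tuple[int, float]:
    """Return (k, cost) for the split signal[:k] / signal[k:] that
    minimises the total within-segment sum of squared errors."""
    if len(signal) < 2:
        raise ValueError("need at least two samples to place a change point")
    best_k, best_cost = 1, float("inf")
    for k in range(1, len(signal)):
        cost = sse(signal[:k]) + sse(signal[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost


# Hypothetical toy signal (not real nanopore data): the level sits near 1.0
# for five samples and then jumps to about 3.0.
toy_signal = [1.0, 1.1, 0.9, 1.0, 1.05, 3.0, 2.9, 3.1, 3.0, 2.95]
change_index, cost = best_single_change_point(toy_signal)
print(change_index)
```

A real nanopore signal contains many change points of unknown number, and the resulting segments must also be assigned to one of four types, which is why the abstract turns to sampling-based methods instead of an exhaustive search like this one.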


Table of Contents

1. Introduction -- 2. Methods -- 3. Results -- 4. Discussion and Future Direction -- A. Appendix -- References


A thesis submitted to Macquarie University for the Degree of Master of Research

Awarding Institution

Macquarie University

Degree Type

Thesis MRes


Thesis (MRes), Macquarie University, Faculty of Science and Engineering, 2021

Department, Centre or School

Department of Mathematics and Statistics

Year of Award


Principal Supervisor

Georgy Sofronov


Copyright: Sophia Shen




66 pages
