Denoising Piecewise Constant Nanopore Signals

Adrian Vidal, Emanuele Viterbo

Research output: Contribution to journal › Article › Research › peer-review

1 Citation (Scopus)

Abstract

Nanopore sequencing signals can be described as indirect noisy observations that reflect the instantaneous conductance of the nanopore channel as an analyte DNA molecule translocates through the pore in real time, with δ nucleotides (δ-mers) blocking the pore at any instant. The sequence of overlapping δ-mers along the ssDNA molecule is thus indirectly observed as a sequence of conductance levels (i.e., a signature) that is used to characterize its DNA sequence. In this paper, we denoise piecewise constant nanopore signals drawn from the same Gaussian-output, left-to-right hidden Markov model (HMM) and recover the unknown signature that is used to parameterize the HMM. We place a Gaussian prior on the signature and use importance sampling to approximate the minimum mean-square error (MMSE) estimate of the signature given the signals. To circumvent the difficulty of sampling from the true posterior, we construct a proposal distribution from which the joint segmentation of the observed signals can be efficiently sampled in O(Mn^2 k) time, where M is the number of signals, n is the average duration of each signal, and k is the length of the signature. Finally, we evaluate the performance of the algorithm using both simulated and experimental nanopore signals generated by Oxford Nanopore Technologies’ (ONT) R10.4.1 nanopore. The proposed method can be effective in constructing accurate δ-mer tables used to fully characterize all the 4^δ states of any nanopore sequencer.
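The signal model in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It simulates piecewise constant signals from a Gaussian-output, left-to-right HMM and then computes the conjugate Gaussian posterior mean of each level under the assumption that the true segmentation is known. The paper instead treats the segmentation as unknown and samples it from a proposal distribution. The function names, the geometric dwell-time model (one fixed self-transition probability, `stay_prob`), and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def simulate_signal(signature, stay_prob=0.9, noise_std=0.1, rng=None):
    """Draw one piecewise constant signal from a left-to-right Gaussian HMM.

    Assumption: each state has the same self-transition probability, so the
    dwell time in every state is geometric with success prob 1 - stay_prob.
    """
    rng = np.random.default_rng() if rng is None else rng
    states = []
    for state in range(len(signature)):
        dwell = rng.geometric(1.0 - stay_prob)  # number of samples, >= 1
        states.extend([state] * dwell)
    states = np.asarray(states)
    levels = np.asarray(signature, dtype=float)[states]
    signal = levels + rng.normal(0.0, noise_std, size=states.size)
    return signal, states

def mmse_levels_given_segmentation(signals, segmentations, k,
                                   prior_mean=0.0, prior_std=1.0,
                                   noise_std=0.1):
    """Posterior mean of each level for an i.i.d. Gaussian prior on the
    signature, Gaussian noise, and a KNOWN segmentation (oracle case)."""
    counts = np.zeros(k)
    sums = np.zeros(k)
    for y, s in zip(signals, segmentations):
        counts += np.bincount(s, minlength=k)             # samples per level
        sums += np.bincount(s, weights=y, minlength=k)    # sum of samples per level
    prior_prec = 1.0 / prior_std**2
    noise_prec = 1.0 / noise_std**2
    return (prior_prec * prior_mean + noise_prec * sums) / (
        prior_prec + noise_prec * counts)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signature = [0.5, -0.3, 0.8, 0.1]   # hypothetical conductance levels
    draws = [simulate_signal(signature, rng=rng) for _ in range(50)]
    signals = [y for y, _ in draws]
    segmentations = [s for _, s in draws]
    print(mmse_levels_given_segmentation(signals, segmentations, k=4))
```

With many signals the oracle estimate approaches the true signature. The hard part that the paper addresses is that the segmentations are not observed and must be inferred jointly across the M signals.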

Original language: English
Pages (from-to): 1993-2007
Number of pages: 15
Journal: IEEE Transactions on Signal Processing
Volume: 73
Publication status: Published - 15 May 2025

Keywords

  • hidden Markov models
  • importance sampling
  • minimum mean-square error estimation
  • Nanopore sequencers
