
Reconstruction of microbial haplotypes by integration of statistical and physical linkage in scaffolding

  • Chen Cao
  • Jingni He
  • Lauren Mak
  • Deshan Perera
  • Devin Kwok
  • Jia Wang
  • Minghao Li
  • Tobias Mourier
  • Stefan Gavriliuc
  • Matthew Greenberg
  • A. Sorana Morrissy
  • Laura K. Sycuro
  • Guang Yang
  • Daniel C. Jeffares
  • Quan Long

Research output: Contribution to journal, Article, Research, peer-review

Abstract

DNA sequencing technologies provide unprecedented opportunities to analyze within-host evolution of microorganism populations. Often, within-host populations are analyzed via pooled sequencing of the population, which contains multiple individuals or “haplotypes.” However, current next-generation sequencing instruments, in conjunction with single-molecule barcoded linked-reads, cannot distinguish long haplotypes directly. Computational reconstruction of haplotypes from pooled sequencing has been attempted in virology, bacterial genomics, metagenomics, and human genetics, using algorithms based on either cross-host genetic sharing or within-host genomic reads. Here, we describe PoolHapX, a flexible computational approach that integrates information from both genetic sharing and genomic sequencing. We demonstrated that PoolHapX outperforms state-of-the-art tools tailored to specific organismal systems, and is robust to within-host evolution. Importantly, together with barcoded linked-reads, PoolHapX can infer whole-chromosome-scale haplotypes from 50 pools each containing 12 different haplotypes. By analyzing real data, we uncovered dynamic variations in the evolutionary processes of within-patient HIV populations previously unobserved in single position-based analysis.

Original language: English
Pages (from-to): 2660-2672
Number of pages: 13
Journal: Molecular Biology and Evolution
Volume: 38
Issue number: 6
Publication status: Published - Jun 2021
Externally published: Yes

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Haplotype reconstruction
  • Linkage disequilibrium
  • Regularization
  • Within-host evolution
