Contamination source modeling with SCRuB improves cancer phenotype prediction from microbiome data

George I. Austin, Heekuk Park, Yoli Meydan, Dwayne Seeram, Tanya Sezin, Yue Clare Lou, Brian A. Firek, Michael J. Morowitz, Jillian F. Banfield, Angela M. Christiano, Itsik Pe’er, Anne Catrin Uhlemann, Liat Shenhav, Tal Korem

Research output: Contribution to journal › Article › Research › peer-review

42 Citations (Scopus)

Abstract

Sequencing-based approaches for the analysis of microbial communities are susceptible to contamination, which could mask biological signals or generate artifactual ones. Methods for in silico decontamination using controls are routinely used, but do not make optimal use of information shared across samples and cannot handle taxa that only partially originate in contamination or leakage of biological material into controls. Here we present Source tracking for Contamination Removal in microBiomes (SCRuB), a probabilistic in silico decontamination method that incorporates shared information across multiple samples and controls to precisely identify and remove contamination. We validate the accuracy of SCRuB in multiple data-driven simulations and experiments, including induced contamination, and demonstrate that it outperforms state-of-the-art methods by an average of 15–20 times. We showcase the robustness of SCRuB across multiple ecosystems, data types and sequencing depths. Demonstrating its applicability to microbiome research, SCRuB facilitates improved predictions of host phenotypes, most notably the prediction of treatment response in melanoma patients using decontaminated tumor microbiome data.

Original language: English
Pages (from-to): 1820-1828
Number of pages: 9
Journal: Nature Biotechnology
Volume: 41
Issue number: 12
Publication status: Published - Dec 2023
Externally published: Yes
