TY - JOUR
T1 - Contamination source modeling with SCRuB improves cancer phenotype prediction from microbiome data
AU - Austin, George I.
AU - Park, Heekuk
AU - Meydan, Yoli
AU - Seeram, Dwayne
AU - Sezin, Tanya
AU - Lou, Yue Clare
AU - Firek, Brian A.
AU - Morowitz, Michael J.
AU - Banfield, Jillian F.
AU - Christiano, Angela M.
AU - Pe’er, Itsik
AU - Uhlemann, Anne-Catrin
AU - Shenhav, Liat
AU - Korem, Tal
N1 - Funding Information:
We thank members of the Korem group for useful discussions. We are grateful to G. D. Poore, C. Martino, R. Knight, R. Straussman and I. Livyatan for assistance with analyzing and interpreting data from their studies, and to R. Straussman and I. Livyatan for helpful comments on the paper. In general, we thank all authors and participants involved in the generation of all data used in this study. The study was supported by the Center for Studies in Physics and Biology at Rockefeller University (L.S.), the Program for Mathematical Genomics at Columbia University (T.K.), the CIFAR Azrieli Global Scholarship in the Humans & the Microbiome Program (T.K.), R01HD106017 (T.K.) and R01CA245894 (A.-C.U.).
Publisher Copyright:
© 2023, The Author(s), under exclusive licence to Springer Nature America, Inc.
PY - 2023/12
Y1 - 2023/12
N2 - Sequencing-based approaches for the analysis of microbial communities are susceptible to contamination, which could mask biological signals or generate artifactual ones. Methods for in silico decontamination using controls are routinely used, but do not make optimal use of information shared across samples and cannot handle taxa that only partially originate in contamination or leakage of biological material into controls. Here we present Source tracking for Contamination Removal in microBiomes (SCRuB), a probabilistic in silico decontamination method that incorporates shared information across multiple samples and controls to precisely identify and remove contamination. We validate the accuracy of SCRuB in multiple data-driven simulations and experiments, including induced contamination, and demonstrate that it outperforms state-of-the-art methods by an average of 15–20 times. We showcase the robustness of SCRuB across multiple ecosystems, data types and sequencing depths. Demonstrating its applicability to microbiome research, SCRuB facilitates improved predictions of host phenotypes, most notably the prediction of treatment response in melanoma patients using decontaminated tumor microbiome data.
UR - https://www.scopus.com/pages/publications/85150027954
U2 - 10.1038/s41587-023-01696-w
DO - 10.1038/s41587-023-01696-w
M3 - Article
C2 - 36928429
AN - SCOPUS:85150027954
SN - 1087-0156
VL - 41
SP - 1820
EP - 1828
JO - Nature Biotechnology
JF - Nature Biotechnology
IS - 12
ER -