TY - JOUR
T1 - Overcoming false-positive gene-category enrichment in the analysis of spatially resolved transcriptomic brain atlas data
AU - Fulcher, Ben D.
AU - Arnatkeviciute, Aurina
AU - Fornito, Alex
N1 - Funding Information:
The authors would like to thank Stuart Oldham for assistance in processing human structural connectivity data, and Oliver Cliff for thoughtful discussion and feedback on the manuscript. We thank Colorgorical (ref. 123) for assisting with the generation of color palettes. A.F. was supported by the Sylvia and Charles Viertel Charitable Foundation.
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/5/11
Y1 - 2021/5/11
N2 - Transcriptomic atlases have improved our understanding of the correlations between gene-expression patterns and spatially varying properties of brain structure and function. Gene-category enrichment analysis (GCEA) is a common method to identify functional gene categories that drive these associations, using gene-to-category annotation systems like the Gene Ontology (GO). Here, we show that applying standard GCEA methodology to spatial transcriptomic data is affected by substantial false-positive bias, with GO categories displaying an over 500-fold average inflation of false-positive associations with random neural phenotypes in mouse and human. The estimated false-positive rate of a GO category is associated with its rate of being reported as significantly enriched in the literature, suggesting that published reports are affected by this false-positive bias. We show that within-category gene–gene coexpression and spatial autocorrelation are key drivers of the false-positive bias and introduce flexible ensemble-based null models that can account for these effects, made available as a software toolbox.
AB - Transcriptomic atlases have improved our understanding of the correlations between gene-expression patterns and spatially varying properties of brain structure and function. Gene-category enrichment analysis (GCEA) is a common method to identify functional gene categories that drive these associations, using gene-to-category annotation systems like the Gene Ontology (GO). Here, we show that applying standard GCEA methodology to spatial transcriptomic data is affected by substantial false-positive bias, with GO categories displaying an over 500-fold average inflation of false-positive associations with random neural phenotypes in mouse and human. The estimated false-positive rate of a GO category is associated with its rate of being reported as significantly enriched in the literature, suggesting that published reports are affected by this false-positive bias. We show that within-category gene–gene coexpression and spatial autocorrelation are key drivers of the false-positive bias and introduce flexible ensemble-based null models that can account for these effects, made available as a software toolbox.
UR - http://www.scopus.com/inward/record.url?scp=85105771782&partnerID=8YFLogxK
U2 - 10.1038/s41467-021-22862-1
DO - 10.1038/s41467-021-22862-1
M3 - Article
C2 - 33976144
AN - SCOPUS:85105771782
SN - 2041-1723
VL - 12
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 2669
ER -