Interpretable prediction models for widespread m6A RNA modification across cell lines and tissues

Ying Zhang, Zhikang Wang, Yiwen Zhang, Shanshan Li, Yuming Guo, Jiangning Song, Dong-Jun Yu

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Motivation: RNA N6-methyladenosine (m6A) in Homo sapiens plays vital roles in a variety of biological functions. Precise identification of m6A modifications is thus essential to elucidating their biological functions and underlying molecular-level mechanisms. Currently available high-throughput single-nucleotide-resolution m6A modification data have considerably accelerated the identification of RNA modification sites through the development of data-driven computational methods. Nevertheless, existing methods are limited in their coverage of cell lines with single-nucleotide-resolution data and offer poor model interpretability, which restricts their applicability.

Results: In this study, we present CLSM6A, a set of deep learning-based models designed for predicting single-nucleotide-resolution m6A RNA modification sites across eight different cell lines and three tissues. In extensive benchmarking experiments on well-curated datasets, CLSM6A outperforms current state-of-the-art methods. Furthermore, CLSM6A can interpret its prediction decision-making process by extracting critical motifs activated by filters and pinpointing the positions of greatest importance in both forward and backward propagation. CLSM6A exhibits better portability on similar cross-cell line/tissue datasets, reveals a strong association between highly activated motifs and high-impact motifs, and demonstrates the complementary attributes of different interpretation strategies.

Original language: English
Article number: btad709
Number of pages: 8
Journal: Bioinformatics
Volume: 39
Issue number: 12
DOIs
Publication status: Published - 1 Dec 2023
