TripletCell: a deep metric learning framework for accurate annotation of cell types at the single-cell level

Yan Liu, Guo Wei, Long-Chen Shen, Robin B. Gasser, Jiangning Song, Dijun Chen, Dong-Jun Yu

Research output: Contribution to journalArticleResearchpeer-review

4 Citations (Scopus)

Abstract

Single-cell RNA sequencing (scRNA-seq) has significantly accelerated the experimental characterization of distinct cell lineages and types in complex tissues and organisms. Cell-type annotation is of great importance in most of the scRNA-seq analysis pipelines. However, manual cell-type annotation heavily relies on the quality of scRNA-seq data and marker genes, and therefore can be laborious and time-consuming. Furthermore, the heterogeneity of scRNA-seq datasets poses another challenge for accurate cell-type annotation, such as the batch effect induced by different scRNA-seq protocols and samples. To overcome these limitations, here we propose a novel pipeline, termed TripletCell, for cross-species, cross-protocol and cross-sample cell-type annotation. We developed a cell embedding and dimension-reduction module for the feature extraction (FE) in TripletCell, namely TripletCell-FE, to leverage the deep metric learning-based algorithm for the relationships between the reference gene expression matrix and the query cells. Our experimental studies on 21 datasets (covering nine scRNA-seq protocols, two species and three tissues) demonstrate that TripletCell outperformed state-of-the-art approaches for cell-type annotation. More importantly, regardless of protocols or species, TripletCell can deliver outstanding and robust performance in annotating different types of cells. TripletCell is freely available at https://github.com/liuyan3056/TripletCell. We believe that TripletCell is a reliable computational tool for accurately annotating various cell types using scRNA-seq data and will be instrumental in assisting the generation of novel biological hypotheses in cell biology.

Original languageEnglish
Pages (from-to)1-11
Number of pages11
JournalBriefings in Bioinformatics
Volume24
Issue number3
DOIs
Publication statusPublished - May 2023

Keywords

  • batch effect
  • deep metric learning
  • identify novel cell types
  • scRNA-seq data

Cite this