TY - JOUR
T1 - DTFLOW
T2 - Inference and Visualization of Single-cell Pseudotime Trajectory Using Diffusion Propagation
AU - Wei, Jiangyong
AU - Zhou, Tianshou
AU - Zhang, Xinan
AU - Tian, Tianhai
N1 - Funding Information:
This work was supported by the National Natural Science Foundation of China (Grant Nos. 11571368, 11931019, 11775314, and 11871238) and the Fundamental Research Funds for the Central Universities, China (Grant No. 2662019QD031).
Publisher Copyright:
© 2021 The Authors
Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.
PY - 2021/4
Y1 - 2021/4
AB - One of the major challenges in single-cell data analysis is determining cellular developmental trajectories. Although many studies have been conducted in recent years, more effective methods are still needed to infer developmental processes accurately. This work devises a new method, named DTFLOW, for determining pseudo-temporal trajectories with multiple branches. DTFLOW consists of two major steps: a new method called Bhattacharyya kernel feature decomposition (BKFD) to reduce the data dimensions, and a novel approach named Reverse Searching on k-nearest neighbor graph (RSKG) to identify the multi-branching processes of cellular differentiation. In BKFD, we first establish a stationary distribution for each cell, based on the random walk with restart algorithm, to represent the transition of cellular developmental states, and then propose a new distance metric for calculating the pseudotime of single cells by introducing the Bhattacharyya kernel matrix. The effectiveness of DTFLOW is rigorously examined using four single-cell datasets. We compare the efficiency of DTFLOW with that of published state-of-the-art methods. Simulation results suggest that DTFLOW has superior accuracy and strong robustness for constructing pseudotime trajectories. The Python source code of DTFLOW can be freely accessed at https://github.com/statway/DTFLOW.
KW - Bhattacharyya kernel
KW - Manifold learning
KW - Pseudotime trajectory
KW - Single-cell heterogeneity
UR - http://www.scopus.com/inward/record.url?scp=85115137607&partnerID=8YFLogxK
U2 - 10.1016/j.gpb.2020.08.003
DO - 10.1016/j.gpb.2020.08.003
M3 - Article
C2 - 33662626
AN - SCOPUS:85115137607
SN - 1672-0229
VL - 19
SP - 306
EP - 318
JO - Genomics Proteomics and Bioinformatics
JF - Genomics Proteomics and Bioinformatics
IS - 2
ER -