TY - JOUR
T1 - Performing protein fold recognition by exploiting a stack convolutional neural network with the attention mechanism
AU - Han, Ke
AU - Liu, Yan
AU - Xu, Jian
AU - Song, Jiangning
AU - Yu, Dong Jun
N1 - Funding Information:
This work was financially supported by the National Natural Science Foundation of China (62072243, 61772273, and 61872186), Natural Science Foundation of Jiangsu (No. BK20201304), Foundation of National Defense Key Laboratory of Science and Technology (JZX7Y202001SY000901), National Health and Medical Research Council of Australia (NHMRC) (1127948, 1144652), Australian Research Council (ARC) (LP110200333 and DP120104460), US National Institutes of Health (R01 AI111965) and a Major Inter-Disciplinary Research (IDR) project awarded by Monash University.
Publisher Copyright:
© 2022 Elsevier Inc.
PY - 2022/8/15
Y1 - 2022/8/15
N2 - Protein fold recognition is a critical step in protein structure and function prediction, and aims to ascertain the most likely fold type of the query protein. As a typical pattern recognition problem, designing a powerful feature extractor and metric function to extract relevant and representative fold-specific features from protein sequences is the key to improving protein fold recognition. In this study, we propose an effective sequence-based approach, called RattnetFold, to identify protein fold types. The basic concept of RattnetFold is to employ a stack convolutional neural network with the attention mechanism that acts as a feature extractor to extract fold-specific features from protein residue-residue contact maps. Moreover, based on the fold-specific features, we leverage metric learning to project fold-specific features into a subspace where similar proteins are closer together and name this approach RattnetFoldPro. Benchmarking experiments illustrate that RattnetFold and RattnetFoldPro enable the convolutional neural networks to efficiently learn the underlying subtle patterns in residue-residue contact maps, thereby improving the performance of protein fold recognition. An online web server of RattnetFold and the benchmark datasets are freely available at http://csbio.njust.edu.cn/bioinf/rattnetfold/.
KW - Attention mechanism
KW - Bioinformatics
KW - Convolutional neural network
KW - Protein fold recognition
KW - Residual learning
UR - http://www.scopus.com/inward/record.url?scp=85129511432&partnerID=8YFLogxK
U2 - 10.1016/j.ab.2022.114695
DO - 10.1016/j.ab.2022.114695
M3 - Article
C2 - 35487269
AN - SCOPUS:85129511432
VL - 651
JO - Analytical Biochemistry
JF - Analytical Biochemistry
SN - 0003-2697
M1 - 114695
ER -