Conotoxins are small disulfide-rich peptides that are invaluable as channel-targeted peptides and act on neuronal receptors. They show promise as potent pharmaceuticals for the treatment of Alzheimer's disease, Parkinson's disease, and epilepsy. Accurate and fast prediction of the conotoxin superfamily is very helpful for understanding a conotoxin's biological and pharmacological functions, especially in the post-genomic era. In the present study, we have developed a novel approach called PredCSF that predicts the conotoxin superfamily directly from the amino acid sequence by fusing different kinds of sequence-derived features and using modified one-versus-rest SVMs. The input features to the PredCSF classifiers are composed of physicochemical properties, evolutionary information, predicted secondary structure, and amino acid composition; the most important features are further screened by random forest feature selection to improve prediction performance. The prediction results show that PredCSF achieves an overall accuracy of 90.65% on a benchmark dataset constructed from the most recent database, which consists of 4 main conotoxin superfamilies and 1 non-conotoxin class. Systematic experiments also show that combining different features helps enhance prediction power when dealing with complex biological problems. PredCSF is expected to be a powerful tool for in silico identification of novel conotoxins and is freely available for academic use at http://www.csbio.sjtu.edu.cn/bioinf/PredCSF.
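
As an illustration of one of the feature types named above, the following minimal sketch computes an amino acid composition vector and concatenates feature blocks into a single fused vector. It is not the PredCSF implementation: the function names `amino_acid_composition` and `fuse_features`, the residue ordering, and the example sequence are all assumptions made for illustration, and the other feature types (physicochemical properties, evolutionary information, predicted secondary structure) are not computed here.

```python
from collections import Counter

# The 20 standard amino acids in a fixed (alphabetical one-letter) order.
# This ordering is an assumption; any fixed order works as long as it is
# applied consistently to every sequence.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue."""
    seq = sequence.upper()
    # Count only standard residues; other symbols (e.g. 'X') are ignored.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def fuse_features(*blocks):
    """Concatenate several per-sequence feature blocks into one flat vector."""
    fused = []
    for block in blocks:
        fused.extend(block)
    return fused


if __name__ == "__main__":
    # Hypothetical cysteine-rich peptide, used only to exercise the code.
    example = "GCCSDPRCAWRC"
    composition = amino_acid_composition(example)
    # A second block (e.g. physicochemical descriptors) would be appended
    # here; the sequence length stands in as a placeholder feature.
    vector = fuse_features(composition, [float(len(example))])
    print(len(vector))
```

A fused vector of this kind would then be passed to a feature-selection step and a classifier; those stages are omitted here because the abstract does not specify their parameters.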
|Pages (from-to)||261 - 267|
|Number of pages||7|
|Journal||Protein and Peptide Letters|
|Publication status||Published - 2011|