TY - JOUR
T1 - POSSUM
T2 - A bioinformatics toolkit for generating numerical sequence feature descriptors based on PSSM profiles
AU - Wang, Jiawei
AU - Yang, Bingjiao
AU - Revote, Jerico
AU - Leier, Andre
AU - Marquez-Lago, Tatiana T.
AU - Webb, Geoffrey
AU - Song, Jiangning
AU - Chou, Kuo-Chen
AU - Lithgow, Trevor
PY - 2017/1/1
Y1 - 2017/1/1
N2 - Evolutionary information in the form of a Position-Specific Scoring Matrix (PSSM) is a widely used and highly informative representation of protein sequences. Accordingly, PSSM-based feature descriptors have been successfully applied to improve the performance of various predictors of protein attributes. Even though a number of algorithms have been proposed in previous studies, there is currently no universal web server or toolkit available for generating this wide variety of descriptors. Here, we present POSSUM (Position-Specific Scoring matrix-based feature generator for machine learning), a versatile toolkit with an online web server that can generate 21 types of PSSM-based feature descriptors, thereby addressing a crucial need for bioinformaticians and computational biologists. We envisage that this comprehensive toolkit will be widely used as a powerful tool to facilitate feature extraction, selection, and benchmarking of machine learning-based models, thereby contributing to a more effective analysis and modeling pipeline for bioinformatics research. © The Author 2017. Published by Oxford University Press. All rights reserved.
AB - Evolutionary information in the form of a Position-Specific Scoring Matrix (PSSM) is a widely used and highly informative representation of protein sequences. Accordingly, PSSM-based feature descriptors have been successfully applied to improve the performance of various predictors of protein attributes. Even though a number of algorithms have been proposed in previous studies, there is currently no universal web server or toolkit available for generating this wide variety of descriptors. Here, we present POSSUM (Position-Specific Scoring matrix-based feature generator for machine learning), a versatile toolkit with an online web server that can generate 21 types of PSSM-based feature descriptors, thereby addressing a crucial need for bioinformaticians and computational biologists. We envisage that this comprehensive toolkit will be widely used as a powerful tool to facilitate feature extraction, selection, and benchmarking of machine learning-based models, thereby contributing to a more effective analysis and modeling pipeline for bioinformatics research. © The Author 2017. Published by Oxford University Press. All rights reserved.
UR - http://www.scopus.com/inward/record.url?scp=85027310385&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btx302
DO - 10.1093/bioinformatics/btx302
M3 - Article
C2 - 28903538
AN - SCOPUS:85027310385
VL - 33
SP - 2756
EP - 2758
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 17
ER -