POSSUM: A bioinformatics toolkit for generating numerical sequence feature descriptors based on PSSM profiles

Jiawei Wang, Bingjiao Yang, Jerico Revote, Andre Leier, Tatiana T. Marquez-Lago, Geoffrey Webb, Jiangning Song, Kuo-Chen Chou, Trevor Lithgow

Research output: Contribution to journal › Article › Research › peer-review

54 Citations (Scopus)

Abstract

Evolutionary information in the form of a Position-Specific Scoring Matrix (PSSM) is a widely used and highly informative representation of protein sequences. Accordingly, PSSM-based feature descriptors have been successfully applied to improve the performance of various predictors of protein attributes. Even though a number of algorithms have been proposed in previous studies, there is currently no universal web server or toolkit available for generating this wide variety of descriptors. Here, we present POSSUM (Position-Specific Scoring matrix-based feature generator for machine learning), a versatile toolkit with an online web server that can generate 21 types of PSSM-based feature descriptors, thereby addressing a crucial need for bioinformaticians and computational biologists. We envisage that this comprehensive toolkit will be widely used as a powerful tool to facilitate feature extraction, selection, and benchmarking of machine learning-based models, thereby contributing to a more effective analysis and modeling pipeline for bioinformatics research. © The Author 2017. Published by Oxford University Press. All rights reserved.

Original language: English
Pages (from-to): 2756-2758
Number of pages: 3
Journal: Bioinformatics
Volume: 33
Issue number: 17
DOIs: 10.1093/bioinformatics/btx302
Publication status: Published - 1 Jan 2017

Cite this

Wang, Jiawei; Yang, Bingjiao; Revote, Jerico; Leier, Andre; Marquez-Lago, Tatiana T.; Webb, Geoffrey; Song, Jiangning; Chou, Kuo-Chen; Lithgow, Trevor. / POSSUM: A bioinformatics toolkit for generating numerical sequence feature descriptors based on PSSM profiles. In: Bioinformatics. 2017; Vol. 33, No. 17. pp. 2756-2758.
@article{4c2524e7873c455b8eccfc071490a92d,
title = "POSSUM: A bioinformatics toolkit for generating numerical sequence feature descriptors based on PSSM profiles",
abstract = "Evolutionary information in the form of a Position-Specific Scoring Matrix (PSSM) is a widely used and highly informative representation of protein sequences. Accordingly, PSSM-based feature descriptors have been successfully applied to improve the performance of various predictors of protein attributes. Even though a number of algorithms have been proposed in previous studies, there is currently no universal web server or toolkit available for generating this wide variety of descriptors. Here, we present POSSUM (Position-Specific Scoring matrix-based feature generator for machine learning), a versatile toolkit with an online web server that can generate 21 types of PSSM-based feature descriptors, thereby addressing a crucial need for bioinformaticians and computational biologists. We envisage that this comprehensive toolkit will be widely used as a powerful tool to facilitate feature extraction, selection, and benchmarking of machine learning-based models, thereby contributing to a more effective analysis and modeling pipeline for bioinformatics research. {\copyright} The Author 2017. Published by Oxford University Press. All rights reserved.",
author = "Jiawei Wang and Bingjiao Yang and Jerico Revote and Andre Leier and Marquez-Lago, {Tatiana T.} and Geoffrey Webb and Jiangning Song and Kuo-Chen Chou and Trevor Lithgow",
year = "2017",
month = "1",
day = "1",
doi = "10.1093/bioinformatics/btx302",
language = "English",
volume = "33",
pages = "2756--2758",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press, USA",
number = "17",

}

POSSUM: A bioinformatics toolkit for generating numerical sequence feature descriptors based on PSSM profiles. / Wang, Jiawei; Yang, Bingjiao; Revote, Jerico; Leier, Andre; Marquez-Lago, Tatiana T.; Webb, Geoffrey; Song, Jiangning; Chou, Kuo-Chen; Lithgow, Trevor.

In: Bioinformatics, Vol. 33, No. 17, 01.01.2017, p. 2756-2758.

TY - JOUR

T1 - POSSUM

T2 - A bioinformatics toolkit for generating numerical sequence feature descriptors based on PSSM profiles

AU - Wang, Jiawei

AU - Yang, Bingjiao

AU - Revote, Jerico

AU - Leier, Andre

AU - Marquez-Lago, Tatiana T.

AU - Webb, Geoffrey

AU - Song, Jiangning

AU - Chou, Kuo-Chen

AU - Lithgow, Trevor

PY - 2017/1/1

Y1 - 2017/1/1

N2 - Evolutionary information in the form of a Position-Specific Scoring Matrix (PSSM) is a widely used and highly informative representation of protein sequences. Accordingly, PSSM-based feature descriptors have been successfully applied to improve the performance of various predictors of protein attributes. Even though a number of algorithms have been proposed in previous studies, there is currently no universal web server or toolkit available for generating this wide variety of descriptors. Here, we present POSSUM (Position-Specific Scoring matrix-based feature generator for machine learning), a versatile toolkit with an online web server that can generate 21 types of PSSM-based feature descriptors, thereby addressing a crucial need for bioinformaticians and computational biologists. We envisage that this comprehensive toolkit will be widely used as a powerful tool to facilitate feature extraction, selection, and benchmarking of machine learning-based models, thereby contributing to a more effective analysis and modeling pipeline for bioinformatics research. © The Author 2017. Published by Oxford University Press. All rights reserved.

AB - Evolutionary information in the form of a Position-Specific Scoring Matrix (PSSM) is a widely used and highly informative representation of protein sequences. Accordingly, PSSM-based feature descriptors have been successfully applied to improve the performance of various predictors of protein attributes. Even though a number of algorithms have been proposed in previous studies, there is currently no universal web server or toolkit available for generating this wide variety of descriptors. Here, we present POSSUM (Position-Specific Scoring matrix-based feature generator for machine learning), a versatile toolkit with an online web server that can generate 21 types of PSSM-based feature descriptors, thereby addressing a crucial need for bioinformaticians and computational biologists. We envisage that this comprehensive toolkit will be widely used as a powerful tool to facilitate feature extraction, selection, and benchmarking of machine learning-based models, thereby contributing to a more effective analysis and modeling pipeline for bioinformatics research. © The Author 2017. Published by Oxford University Press. All rights reserved.

UR - http://www.scopus.com/inward/record.url?scp=85027310385&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btx302

DO - 10.1093/bioinformatics/btx302

M3 - Article

C2 - 28903538

AN - SCOPUS:85027310385

VL - 33

SP - 2756

EP - 2758

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 17

ER -