iProt-Sub: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites

Jiangning Song, Yanan Wang, Fuyi Li, Tatsuya Akutsu, Neil D Rawlings, Geoffrey I. Webb, Kuo-Chen Chou

Research output: Contribution to journal › Review Article › Research › peer-review

68 Citations (Scopus)

Abstract

Regulation of proteolysis plays a critical role in a myriad of important cellular processes. The key to better understanding the mechanisms that control this process is to identify the specific substrates that each protease targets. To address this, we have developed iProt-Sub, a powerful bioinformatics tool for the accurate prediction of protease-specific substrates and their cleavage sites. Importantly, iProt-Sub represents a significantly advanced version of its successful predecessor, PROSPER. It provides optimized cleavage site prediction models with better prediction performance and coverage for more species-specific proteases (4 major protease families and 38 different proteases). iProt-Sub integrates heterogeneous sequence and structural features and uses a two-step feature selection procedure to further remove redundant and irrelevant features in an effort to improve the cleavage site prediction accuracy. Features used by iProt-Sub are encoded by 11 different sequence encoding schemes, including local amino acid sequence profile, secondary structure, solvent accessibility and native disorder, which will allow a more accurate representation of the protease specificity of approximately 38 proteases and training of the prediction models. Benchmarking experiments using cross-validation and independent tests showed that iProt-Sub is able to achieve a better performance than several existing generic tools. We anticipate that iProt-Sub will be a powerful tool for proteome-wide prediction of protease-specific substrates and their cleavage sites, and will facilitate hypothesis-driven functional interrogation of protease-specific substrate cleavage and proteolytic events.
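The abstract mentions a two-step feature selection procedure that removes redundant and irrelevant features, but does not say which algorithms are used. As a minimal sketch only, and not the authors' method, the general idea can be illustrated with a simple correlation filter. The thresholds, the function names `pearson` and `two_step_select`, and the toy feature names are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def two_step_select(features, labels, min_relevance=0.3, max_redundancy=0.9):
    """Illustrative two-step filter (thresholds are hypothetical).

    features: dict mapping feature name -> list of values, one per sample.
    labels:   list of 0/1 class labels (1 = cleavage site, 0 = non-site).

    Step 1 drops features weakly correlated with the label (irrelevant).
    Step 2 walks the survivors from most to least relevant and drops any
    feature that is highly correlated with one already kept (redundant).
    """
    relevance = {name: abs(pearson(col, labels)) for name, col in features.items()}
    relevant = [name for name in features if relevance[name] >= min_relevance]
    relevant.sort(key=lambda name: relevance[name], reverse=True)

    kept = []
    for name in relevant:
        if all(abs(pearson(features[name], features[k])) <= max_redundancy for k in kept):
            kept.append(name)
    return kept


# Toy data with hypothetical feature names; not taken from the paper.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "profile_score": [0.9, 0.8, 0.7, 0.2, 0.1, 0.3],
    "profile_score_copy": [1.8, 1.6, 1.4, 0.4, 0.2, 0.6],  # scaled duplicate
    "accessibility": [0.6, 0.9, 0.7, 0.4, 0.5, 0.1],
    "noise": [0.5, 0.1, 0.9, 0.5, 0.1, 0.9],  # uncorrelated with the label
}
selected = two_step_select(features, labels)
print(selected)
```

On this toy data the uncorrelated `noise` column is dropped in step 1, and only one of the two perfectly correlated profile columns survives step 2.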

Original language: English
Pages (from-to): 638-658
Number of pages: 21
Journal: Briefings in Bioinformatics
Volume: 20
Issue number: 2
DOI: 10.1093/bib/bby028
Publication status: Published - 25 Mar 2019

Keywords

  • protease
  • substrate
  • cleavage site
  • sequence analysis
  • machine learning
  • five-step rule

Cite this

Song, Jiangning; Wang, Yanan; Li, Fuyi; Akutsu, Tatsuya; Rawlings, Neil D; Webb, Geoffrey I.; Chou, Kuo-Chen. iProt-Sub: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. In: Briefings in Bioinformatics. 2019; Vol. 20, No. 2. pp. 638-658.
@article{2edeb79c153441df9dfbe504571a93a7,
title = "{iProt-Sub}: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites",
abstract = "Regulation of proteolysis plays a critical role in a myriad of important cellular processes. The key to better understanding the mechanisms that control this process is to identify the specific substrates that each protease targets. To address this, we have developed iProt-Sub, a powerful bioinformatics tool for the accurate prediction of protease-specific substrates and their cleavage sites. Importantly, iProt-Sub represents a significantly advanced version of its successful predecessor, PROSPER. It provides optimized cleavage site prediction models with better prediction performance and coverage for more species-specific proteases (4 major protease families and 38 different proteases). iProt-Sub integrates heterogeneous sequence and structural features and uses a two-step feature selection procedure to further remove redundant and irrelevant features in an effort to improve the cleavage site prediction accuracy. Features used by iProt-Sub are encoded by 11 different sequence encoding schemes, including local amino acid sequence profile, secondary structure, solvent accessibility and native disorder, which will allow a more accurate representation of the protease specificity of approximately 38 proteases and training of the prediction models. Benchmarking experiments using cross-validation and independent tests showed that iProt-Sub is able to achieve a better performance than several existing generic tools. We anticipate that iProt-Sub will be a powerful tool for proteome-wide prediction of protease-specific substrates and their cleavage sites, and will facilitate hypothesis-driven functional interrogation of protease-specific substrate cleavage and proteolytic events.",
keywords = "protease, substrate, cleavage site, sequence analysis, machine learning, five-step rule",
author = "Jiangning Song and Yanan Wang and Fuyi Li and Tatsuya Akutsu and Rawlings, {Neil D} and Webb, {Geoffrey I.} and Kuo-Chen Chou",
year = "2019",
month = "3",
day = "25",
doi = "10.1093/bib/bby028",
language = "English",
volume = "20",
pages = "638--658",
journal = "Briefings in Bioinformatics",
issn = "1467-5463",
publisher = "Oxford University Press",
number = "2",
}

TY - JOUR

T1 - iProt-Sub

T2 - a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites

AU - Song, Jiangning

AU - Wang, Yanan

AU - Li, Fuyi

AU - Akutsu, Tatsuya

AU - Rawlings, Neil D

AU - Webb, Geoffrey I.

AU - Chou, Kuo-Chen

PY - 2019/3/25

Y1 - 2019/3/25

AB - Regulation of proteolysis plays a critical role in a myriad of important cellular processes. The key to better understanding the mechanisms that control this process is to identify the specific substrates that each protease targets. To address this, we have developed iProt-Sub, a powerful bioinformatics tool for the accurate prediction of protease-specific substrates and their cleavage sites. Importantly, iProt-Sub represents a significantly advanced version of its successful predecessor, PROSPER. It provides optimized cleavage site prediction models with better prediction performance and coverage for more species-specific proteases (4 major protease families and 38 different proteases). iProt-Sub integrates heterogeneous sequence and structural features and uses a two-step feature selection procedure to further remove redundant and irrelevant features in an effort to improve the cleavage site prediction accuracy. Features used by iProt-Sub are encoded by 11 different sequence encoding schemes, including local amino acid sequence profile, secondary structure, solvent accessibility and native disorder, which will allow a more accurate representation of the protease specificity of approximately 38 proteases and training of the prediction models. Benchmarking experiments using cross-validation and independent tests showed that iProt-Sub is able to achieve a better performance than several existing generic tools. We anticipate that iProt-Sub will be a powerful tool for proteome-wide prediction of protease-specific substrates and their cleavage sites, and will facilitate hypothesis-driven functional interrogation of protease-specific substrate cleavage and proteolytic events.

KW - protease

KW - substrate

KW - cleavage site

KW - sequence analysis

KW - machine learning

KW - five-step rule

UR - http://www.scopus.com/inward/record.url?scp=85067539718&partnerID=8YFLogxK

U2 - 10.1093/bib/bby028

DO - 10.1093/bib/bby028

M3 - Review Article

C2 - 29897410

AN - SCOPUS:85067539718

VL - 20

SP - 638

EP - 658

JO - Briefings in Bioinformatics

JF - Briefings in Bioinformatics

SN - 1467-5463

IS - 2

ER -