Knowledge-Transfer learning for prediction of matrix metalloprotease substrate-cleavage sites

Ya'nan Wang, Jiangning Song, Tatiana Marquez-Lago, André Leier, Chen Li, Trevor Lithgow, Geoffrey I. Webb, Hong-Bin Shen

Research output: Contribution to journal › Article › Research › peer-review

8 Citations (Scopus)

Abstract

Matrix metalloproteases (MMPs) are an important family of proteases that play crucial roles in key cellular and disease processes. MMPs therefore constitute important targets for drug design, development and delivery. Advanced proteomic technologies have identified type-specific target substrates; however, the complete repertoire of MMP substrates remains uncharacterized. Computational prediction of substrate-cleavage sites associated with MMPs is a challenging problem, especially for MMPs with few experimentally verified cleavage sites, such as MMP-2, -3, -7, and -8. To fill this gap, we propose a new knowledge-transfer computational framework that uses the hidden shared knowledge from some MMP types to enhance predictions of other, distinct target substrate-cleavage sites. Our computational framework uses support vector machines combined with transfer learning and feature selection. To demonstrate the value of the model, we extracted a variety of substrate sequence-derived features and compared the performance of our method using both 5-fold cross-validation and independent tests. The results show that our transfer-learning-based method performs robustly, at a level at least comparable to traditional feature-selection methods, for prediction of MMP-2, -3, -7, -8, -9, and -12 substrate-cleavage sites on independent tests. The results also demonstrate that our proposed computational framework provides a useful alternative for the characterization of sequence-level determinants of MMP-substrate specificity.

Original language: English
Article number: 5755
Journal: Scientific Reports
Volume: 7
Issue number: 1
DOI: 10.1038/s41598-017-06219-7
Publication status: Published - 1 Dec 2017

Cite this

@article{013df525c65c410997f373d57bc2734b,
title = "Knowledge-Transfer learning for prediction of matrix metalloprotease substrate-cleavage sites",
author = "Ya'nan Wang and Jiangning Song and Tatiana Marquez-Lago and Andr{\'e} Leier and Chen Li and Trevor Lithgow and Webb, {Geoffrey I.} and Hong-Bin Shen",
year = "2017",
month = "12",
day = "1",
doi = "10.1038/s41598-017-06219-7",
language = "English",
volume = "7",
journal = "Scientific Reports",
issn = "2045-2322",
publisher = "Nature Publishing Group",
number = "1",
pages = "5755",
}

Knowledge-Transfer learning for prediction of matrix metalloprotease substrate-cleavage sites. / Wang, Ya'nan; Song, Jiangning; Marquez-Lago, Tatiana; Leier, André; Li, Chen; Lithgow, Trevor; Webb, Geoffrey I.; Shen, Hong-Bin.

In: Scientific Reports, Vol. 7, No. 1, 5755, 01.12.2017.

TY - JOUR

T1 - Knowledge-Transfer learning for prediction of matrix metalloprotease substrate-cleavage sites

AU - Wang, Ya'nan

AU - Song, Jiangning

AU - Marquez-Lago, Tatiana

AU - Leier, André

AU - Li, Chen

AU - Lithgow, Trevor

AU - Webb, Geoffrey I.

AU - Shen, Hong-Bin

PY - 2017/12/1

Y1 - 2017/12/1

UR - http://www.scopus.com/inward/record.url?scp=85024909398&partnerID=8YFLogxK

U2 - 10.1038/s41598-017-06219-7

DO - 10.1038/s41598-017-06219-7

M3 - Article

AN - SCOPUS:85024909398

VL - 7

JO - Scientific Reports

JF - Scientific Reports

SN - 2045-2322

IS - 1

M1 - 5755

ER -