Survey of predictors of propensity for protein production and crystallization with application to predict resolution of crystal structures

Jianzhao Gao, Zhonghua Wu, Gang Hu, Kui Wang, Jiangning Song, Andrzej Joachimiak, Lukasz Kurgan

Research output: Contribution to journal › Review Article › Research › peer-review

5 Citations (Scopus)

Abstract

Selecting suitable targets for X-ray crystallography would greatly benefit the biological research community. Several computational models have been proposed to predict, from protein sequences, the propensity for successful protein production and diffraction-quality crystallization. We reviewed a comprehensive collection of 22 such predictors that were developed in the last decade. We found that almost all of these models are easily accessible as webservers and/or standalone software, and we demonstrated that some of them are widely used by the research community. We empirically evaluated and compared the predictive performance of seven representative methods. The analysis suggests that these methods produce quite accurate propensities for diffraction-quality crystallization. We also summarized the results of the first study of the relation between these predicted propensities and the resolution of crystallizable proteins. We found that the propensities predicted by several methods are significantly higher for proteins with high-resolution structures than for those with low-resolution structures. Moreover, we tested a new meta-predictor, MetaXXC, which averages the propensities generated by the three most accurate predictors of diffraction-quality crystallization. MetaXXC generates putative resolution values that correlate modestly with the experimental resolutions, and it yields a lower mean absolute error than any of the seven considered methods. We conclude that protein sequences can be used to predict fairly accurately whether their corresponding protein structures can be solved using X-ray crystallography. We also find that sequences can be used to predict reasonably well the resolution of the resulting protein crystals.
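
The averaging step and the mean-absolute-error comparison mentioned in the abstract can be illustrated with a minimal Python sketch. The function names, protein labels, and all numbers below are hypothetical, and the abstract does not say how MetaXXC maps averaged propensities to resolution values, so that step is not modelled here.

```python
from statistics import mean


def meta_propensity(scores):
    """Average the propensities that several predictors give one protein."""
    if not scores:
        raise ValueError("need at least one predictor score")
    return mean(scores)


def mean_absolute_error(predicted, observed):
    """Mean absolute difference between predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(p - o) for p, o in zip(predicted, observed))


# Hypothetical propensities from three predictors (not real data).
protein_scores = {
    "protein_A": [0.9, 0.6, 0.75],
    "protein_B": [0.2, 0.4, 0.3],
}
meta = {name: meta_propensity(s) for name, s in protein_scores.items()}

# Hypothetical predicted vs. experimental resolutions, in angstroms.
predicted_resolution = [2.0, 2.5, 3.0]
experimental_resolution = [1.5, 2.5, 3.5]
mae = mean_absolute_error(predicted_resolution, experimental_resolution)

for name, score in meta.items():
    print(f"{name}: averaged propensity = {score:.2f}")
print(f"MAE = {mae:.3f} angstroms")
```

A plain unweighted mean is used because the abstract describes MetaXXC as averaging the three propensities; any weighting or calibration would be an additional assumption.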

Original language: English
Pages (from-to): 200-210
Number of pages: 11
Journal: Current Protein and Peptide Science
Volume: 19
Issue number: 2
DOI: 10.2174/1389203718666170921114437
Publication status: Published - 2018

Keywords

  • Diffraction quality crystallization
  • Meta prediction
  • Prediction
  • Protein production
  • Protein structure
  • Resolution of protein crystals
  • X-ray crystallography

Cite this

Gao, Jianzhao ; Wu, Zhonghua ; Hu, Gang ; Wang, Kui ; Song, Jiangning ; Joachimiak, Andrzej ; Kurgan, Lukasz. / Survey of predictors of propensity for protein production and crystallization with application to predict resolution of crystal structures. In: Current Protein and Peptide Science. 2018 ; Vol. 19, No. 2. pp. 200-210.
@article{30a3dc48af1e45daac2d5007c598fdc8,
title = "Survey of predictors of propensity for protein production and crystallization with application to predict resolution of crystal structures",
abstract = "Selection of proper targets for the X-ray crystallography will benefit biological research community immensely. Several computational models were proposed to predict propensity of successful protein production and diffraction quality crystallization from protein sequences. We reviewed a comprehensive collection of 22 such predictors that were developed in the last decade. We found that almost all of these models are easily accessible as webservers and/or standalone software and we demonstrated that some of them are widely used by the research community. We empirically evaluated and compared the predictive performance of seven representative methods. The analysis suggests that these methods produce quite accurate propensities for the diffraction-quality crystallization. We also summarized results of the first study of the relation between these predictive propensities and the resolution of the crystallizable proteins. We found that the propensities predicted by several methods are significantly higher for proteins that have high resolution structures compared to those with the low resolution structures. Moreover, we tested a new meta-predictor, MetaXXC, which averages the propensities generated by the three most accurate predictors of the diffraction-quality crystallization. MetaXXC generates putative values of resolution that have modest levels of correlation with the experimental resolutions and it offers the lowest mean absolute error when compared to the seven considered methods. We conclude that protein sequences can be used to fairly accurately predict whether their corresponding protein structures can be solved using X-ray crystallography. Moreover, we also ascertain that sequences can be used to reasonably well predict the resolution of the resulting protein crystals.",
keywords = "Diffraction quality crystallization, Meta prediction, Prediction, Protein production, Protein structure, Resolution of protein crystals, X-ray crystallography",
author = "Jianzhao Gao and Zhonghua Wu and Gang Hu and Kui Wang and Jiangning Song and Andrzej Joachimiak and Lukasz Kurgan",
year = "2018",
doi = "10.2174/1389203718666170921114437",
language = "English",
volume = "19",
pages = "200--210",
journal = "Current Protein and Peptide Science",
issn = "1389-2037",
publisher = "Bentham Science Publishers",
number = "2"
}

Survey of predictors of propensity for protein production and crystallization with application to predict resolution of crystal structures. / Gao, Jianzhao; Wu, Zhonghua; Hu, Gang; Wang, Kui; Song, Jiangning; Joachimiak, Andrzej; Kurgan, Lukasz.

In: Current Protein and Peptide Science, Vol. 19, No. 2, 2018, p. 200-210.

TY - JOUR
T1 - Survey of predictors of propensity for protein production and crystallization with application to predict resolution of crystal structures
AU - Gao, Jianzhao
AU - Wu, Zhonghua
AU - Hu, Gang
AU - Wang, Kui
AU - Song, Jiangning
AU - Joachimiak, Andrzej
AU - Kurgan, Lukasz
PY - 2018
Y1 - 2018

N2 - Selection of proper targets for the X-ray crystallography will benefit biological research community immensely. Several computational models were proposed to predict propensity of successful protein production and diffraction quality crystallization from protein sequences. We reviewed a comprehensive collection of 22 such predictors that were developed in the last decade. We found that almost all of these models are easily accessible as webservers and/or standalone software and we demonstrated that some of them are widely used by the research community. We empirically evaluated and compared the predictive performance of seven representative methods. The analysis suggests that these methods produce quite accurate propensities for the diffraction-quality crystallization. We also summarized results of the first study of the relation between these predictive propensities and the resolution of the crystallizable proteins. We found that the propensities predicted by several methods are significantly higher for proteins that have high resolution structures compared to those with the low resolution structures. Moreover, we tested a new meta-predictor, MetaXXC, which averages the propensities generated by the three most accurate predictors of the diffraction-quality crystallization. MetaXXC generates putative values of resolution that have modest levels of correlation with the experimental resolutions and it offers the lowest mean absolute error when compared to the seven considered methods. We conclude that protein sequences can be used to fairly accurately predict whether their corresponding protein structures can be solved using X-ray crystallography. Moreover, we also ascertain that sequences can be used to reasonably well predict the resolution of the resulting protein crystals.

AB - Selection of proper targets for the X-ray crystallography will benefit biological research community immensely. Several computational models were proposed to predict propensity of successful protein production and diffraction quality crystallization from protein sequences. We reviewed a comprehensive collection of 22 such predictors that were developed in the last decade. We found that almost all of these models are easily accessible as webservers and/or standalone software and we demonstrated that some of them are widely used by the research community. We empirically evaluated and compared the predictive performance of seven representative methods. The analysis suggests that these methods produce quite accurate propensities for the diffraction-quality crystallization. We also summarized results of the first study of the relation between these predictive propensities and the resolution of the crystallizable proteins. We found that the propensities predicted by several methods are significantly higher for proteins that have high resolution structures compared to those with the low resolution structures. Moreover, we tested a new meta-predictor, MetaXXC, which averages the propensities generated by the three most accurate predictors of the diffraction-quality crystallization. MetaXXC generates putative values of resolution that have modest levels of correlation with the experimental resolutions and it offers the lowest mean absolute error when compared to the seven considered methods. We conclude that protein sequences can be used to fairly accurately predict whether their corresponding protein structures can be solved using X-ray crystallography. Moreover, we also ascertain that sequences can be used to reasonably well predict the resolution of the resulting protein crystals.

KW - Diffraction quality crystallization
KW - Meta prediction
KW - Prediction
KW - Protein production
KW - Protein structure
KW - Resolution of protein crystals
KW - X-ray crystallography
UR - http://www.scopus.com/inward/record.url?scp=85042674327&partnerID=8YFLogxK
U2 - 10.2174/1389203718666170921114437
DO - 10.2174/1389203718666170921114437
M3 - Review Article
VL - 19
SP - 200
EP - 210
JO - Current Protein and Peptide Science
JF - Current Protein and Peptide Science
SN - 1389-2037
IS - 2
ER -