Crysalis: An integrated server for computational analysis and design of protein crystallization

Huilin Wang, Liubin Feng, Ziding Zhang, Geoffrey I. Webb, Donghai Lin, Jiangning Song

Research output: Contribution to journal › Article › Research › peer-review

40 Citations (Scopus)

Abstract

The failure of multi-step experimental procedures to yield diffraction-quality crystals is a major bottleneck in protein structure determination. Accordingly, several bioinformatics methods have been successfully developed and employed to select crystallizable proteins. Unfortunately, the majority of existing in silico methods only allow the prediction of crystallization propensity, seldom enabling the computational design of protein mutants that can be targeted to enhance protein crystallizability. Here, we present Crysalis, an integrated crystallization analysis tool that builds on support-vector regression (SVR) models to facilitate computational protein crystallization prediction, analysis, and design. More specifically, the functionality of this new tool includes: (1) rapid selection of target crystallizable proteins at the proteome level, (2) identification of site non-optimality for protein crystallization and systematic analysis of all potential single-point mutations that might enhance protein crystallization propensity, and (3) annotation of target proteins based on predicted structural properties. We applied the design mode of Crysalis to identify site non-optimality for protein crystallization on a proteome scale, focusing on proteins currently classified as non-crystallizable. Our results revealed that site non-optimality is based on biases related to residues, predicted structures, physicochemical properties, and sequence loci, a finding that provides an in-depth understanding of the features influencing protein crystallization.
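The systematic single-point mutation analysis mentioned in item (2) can be illustrated with a minimal sketch. It is not the Crysalis implementation: the scoring function `toy_propensity` is a hypothetical placeholder chosen arbitrarily for illustration, standing in for a trained SVR model, and the names `scan_single_point_mutations` and `AMINO_ACIDS` are likewise assumptions. The sketch only shows the general shape of such a scan: enumerate every substitution at every position, score each mutant, and rank the mutants by the change relative to the wild type.

```python
from typing import Callable, List, Tuple

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_propensity(seq: str) -> float:
    """Hypothetical placeholder score, NOT the Crysalis SVR model.

    Returns the fraction of residues that are neither lysine (K) nor
    glutamate (E). The choice is arbitrary and only makes the scan runnable.
    """
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(1 for aa in seq if aa not in "KE") / len(seq)


def scan_single_point_mutations(
    seq: str, score: Callable[[str], float]
) -> List[Tuple[int, str, str, float]]:
    """Score every single-point mutant of `seq` against the wild type.

    Returns (position, wild_type, mutant, delta) tuples, with 1-based
    positions, sorted by descending delta (largest predicted gain first).
    """
    base = score(seq)
    results = []
    for i, wild in enumerate(seq):
        for mut in AMINO_ACIDS:
            if mut == wild:
                continue  # skip the identity substitution
            mutant_seq = seq[:i] + mut + seq[i + 1:]
            results.append((i + 1, wild, mut, score(mutant_seq) - base))
    # Stable sort: ties keep their enumeration order.
    results.sort(key=lambda r: r[3], reverse=True)
    return results


if __name__ == "__main__":
    ranked = scan_single_point_mutations("MKEA", toy_propensity)
    for pos, wild, mut, delta in ranked[:5]:
        print(f"{wild}{pos}{mut}: delta = {delta:+.2f}")
```

Passing the scorer in as a callable keeps the enumeration independent of the model, so a real regression model's prediction function could be substituted for the placeholder without changing the scan itself.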

Original language: English
Article number: 21383
Number of pages: 14
Journal: Scientific Reports
Volume: 6
Publication status: Published - 24 Feb 2016

Keywords

  • machine learning
  • protein function predictions
  • protein sequence analyses
  • proteome informatics
