TY - JOUR
T1 - Taxonomic landscape of the dark proteomes
T2 - whole-proteome scale interplay between structural darkness, intrinsic disorder, and crystallization propensity
AU - Hu, Gang
AU - Wang, Kui
AU - Song, Jiangning
AU - Uversky, Vladimir N.
AU - Kurgan, Lukasz
PY - 2018/11/1
Y1 - 2018/11/1
N2 - Growth rate of the protein sequence universe dramatically exceeds the speed of expansion for the protein structure universe, generating an immense dark proteome that includes proteins with unknown structure. A whole-proteome scale analysis of 5.4 million proteins from 987 proteomes in the three domains of life and viruses to systematically dissect an interplay between structural coverage, degree of putative intrinsic disorder, and predicted propensity for structure determination is performed. It has been found that Archaean and Bacterial proteomes have relatively high structural coverage and low amounts of disorder, whereas Eukaryotic and Viral proteomes are characterized by a broad spread of structural coverage and higher disorder levels. The analysis reveals that dark proteomes (i.e., proteomes containing high fractions of proteins with unknown structure) have significantly elevated amounts of intrinsic disorder and are predicted to be difficult to solve structurally. Although the majority of dark proteomes are of viral origin, many dark viral proteomes have at least modest crystallization propensity and only a handful of them are enriched in intrinsic disorder. The disorder, structural coverage, and propensity for structural determination are mapped onto a novel proteome-level sequence similarity network to analyze the interplay of these characteristics in the taxonomic landscape.
AB - Growth rate of the protein sequence universe dramatically exceeds the speed of expansion for the protein structure universe, generating an immense dark proteome that includes proteins with unknown structure. A whole-proteome scale analysis of 5.4 million proteins from 987 proteomes in the three domains of life and viruses to systematically dissect an interplay between structural coverage, degree of putative intrinsic disorder, and predicted propensity for structure determination is performed. It has been found that Archaean and Bacterial proteomes have relatively high structural coverage and low amounts of disorder, whereas Eukaryotic and Viral proteomes are characterized by a broad spread of structural coverage and higher disorder levels. The analysis reveals that dark proteomes (i.e., proteomes containing high fractions of proteins with unknown structure) have significantly elevated amounts of intrinsic disorder and are predicted to be difficult to solve structurally. Although the majority of dark proteomes are of viral origin, many dark viral proteomes have at least modest crystallization propensity and only a handful of them are enriched in intrinsic disorder. The disorder, structural coverage, and propensity for structural determination are mapped onto a novel proteome-level sequence similarity network to analyze the interplay of these characteristics in the taxonomic landscape.
KW - dark proteomes
KW - intrinsic disorder
KW - protein universe
KW - structural darkness
KW - X-ray crystallography
UR - http://www.scopus.com/inward/record.url?scp=85054631517&partnerID=8YFLogxK
U2 - 10.1002/pmic.201800243
DO - 10.1002/pmic.201800243
M3 - Article
C2 - 30198635
AN - SCOPUS:85054631517
SN - 1615-9853
VL - 18
JO - Proteomics
JF - Proteomics
IS - 21-22
M1 - 1800243
ER -