TY - JOUR
T1 - Data-driven large-scale genomic analysis reveals an intricate phylogenetic and functional landscape in J-domain proteins
AU - Malinverni, Duccio
AU - Zamuner, Stefano
AU - Rebeaud, Mathieu E.
AU - Barducci, Alessandro
A2 - Nillegoda, Nadinath B.
A2 - De Los Rios, Paolo
N1 - Funding Information:
ACKNOWLEDGMENTS. P.D.L.R. and D.M. thank the Swiss National Science Foundation for financial support under grant number 200020_163042. D.M. thanks the Swiss National Science Foundation for financial support under grant number P2ELP3_181910. D.M. thanks ASLAC for support on this project. N.B.N. thanks the National Health and Medical Research Council of Australia for Investigator Grant APP1197021 and the Monash University Faculty of Medicine, Nursing and Health Sciences for a Recruitment Grant, with funding from the State Government of Victoria and the Australian Government.
Publisher Copyright:
Copyright © 2023 the Author(s).
PY - 2023/8/8
Y1 - 2023/8/8
N2 - The 70-kD heat shock protein (Hsp70) chaperone system is a central hub of the proteostasis network that helps maintain protein homeostasis in all organisms. The recruitment of Hsp70 to perform different and specific cellular functions is regulated by the J-domain protein (JDP) co-chaperone family carrying the small namesake J-domain, required to interact with and drive the ATPase cycle of Hsp70s. Besides the J-domain, prokaryotic and eukaryotic JDPs display a staggering diversity in domain architecture, function, and cellular localization. Very little is known about the overall JDP family, despite its essential role in cellular proteostasis and development and its link to a broad range of human diseases. In this work, we leverage the exponentially increasing number of JDP gene sequences identified across all kingdoms owing to the advancements in sequencing technology and provide a broad overview of the JDP repertoire. Using an automated classification scheme based on artificial neural networks (ANNs), we demonstrate that the sequences of J-domains carry sufficient discriminatory information to reliably recover the phylogeny, localization, and domain composition of the corresponding full-length JDP. By harnessing the interpretability of the ANNs, we find that many of the discriminatory sequence positions match residues that form the interaction interface between the J-domain and Hsp70. This reveals that key residues within the J-domains have coevolved with their obligatory Hsp70 partners to build chaperone circuits for specific functions in cells.
KW - artificial neural networks
KW - Hsp40 co-chaperones
KW - J-domain proteins
KW - large-scale data analysis
KW - protein homeostasis
UR - http://www.scopus.com/inward/record.url?scp=85167844891&partnerID=8YFLogxK
U2 - 10.1073/pnas.2218217120
DO - 10.1073/pnas.2218217120
M3 - Article
C2 - 37523524
AN - SCOPUS:85167844891
SN - 0027-8424
VL - 120
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 32
M1 - e2218217120
ER -