TY - JOUR
T1 - Membrane Permeating Macrocycles
T2 - Design Guidelines from Machine Learning
AU - Williams-Noonan, Billy J.
AU - Speer, Melissa N.
AU - Le, Tu C.
AU - Sadek, Maiada M.
AU - Thompson, Philip E.
AU - Norton, Raymond S.
AU - Yuriev, Elizabeth
AU - Barlow, Nicholas
AU - Chalmers, David K.
AU - Yarovsky, Irene
N1 - Funding Information:
I.Y. acknowledges funding from the Australian Research Council under the Discovery Project scheme (grant DP190102290), computational resources provided by the National Computational Infrastructure of Australia (NCMAS, grant e87), and the HPC-GPGPU facility hosted at the University of Melbourne established with the assistance of the ARC LIEF Grant (LE170100200). R.S.N. and P.E.T. acknowledge funding from the National Health and Medical Research Council of Australia (grant APP1099428).
Publisher Copyright:
© 2022 American Chemical Society.
PY - 2022/9/30
Y1 - 2022/9/30
AB - The ability to predict cell-permeable candidate molecules has great potential to assist drug discovery projects. Large molecules that lie beyond the Rule of Five (bRo5) are increasingly important as drug candidates and tool molecules for chemical biology. However, such large molecules usually do not cross cell membranes and cannot access intracellular targets or be developed as orally bioavailable drugs. Here, we describe a random forest (RF) machine learning model for the prediction of passive membrane permeation rates developed using a set of over 1000 bRo5 macrocyclic compounds. The model is based on easily calculated chemical features/descriptors as independent variables. Our random forest (RF) model substantially outperforms a multiple linear regression model based on the same features and achieves better performance metrics than previously reported models using the same underlying data. These features include: (1) polar surface area in water, (2) the octanol-water partitioning coefficient, (3) the number of hydrogen-bond donors, (4) the sum of the topological distances between nitrogen atoms, (5) the sum of the topological distances between nitrogen and oxygen atoms, and (6) the multiple molecular path count of order 2. The last three features represent molecular flexibility, the ability of the molecule to adopt different conformations in the aqueous and membrane interior phases, and the molecular "chameleonicity." Guided by the model, we propose design guidelines for membrane-permeating macrocycles. It is anticipated that this model will be useful in guiding the design of large, bioactive molecules for medicinal chemistry and chemical biology applications.
UR - http://www.scopus.com/inward/record.url?scp=85139437464&partnerID=8YFLogxK
U2 - 10.1021/acs.jcim.2c00809
DO - 10.1021/acs.jcim.2c00809
M3 - Article
C2 - 36178379
AN - SCOPUS:85139437464
SN - 1549-9596
VL - 62
SP - 4605
EP - 4619
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 19
ER -