TY - JOUR
T1 - Inclusion of More Physics Leads to Less Data
T2 - Learning the Interaction Energy as a Function of Electron Deformation Density with Limited Training Data
AU - Low, Kaycee
AU - Coote, Michelle L.
AU - Izgorodina, Ekaterina I.
N1 - Funding Information:
The authors acknowledge a generous allocation of computational resources from the Monash eResearch Centre and the National Computational Infrastructure. M.L.C. gratefully acknowledges an Australian Research Council (ARC) Laureate Fellowship (FL170100041). K.L. was supported through an Australian Government Research Training Program scholarship.
Publisher Copyright:
© 2022 American Chemical Society. All rights reserved.
PY - 2022/3/8
Y1 - 2022/3/8
N2 - Machine learning (ML) approaches to predicting quantum mechanical (QM) properties have made great strides toward achieving the computational chemist's holy grail of structure-based property prediction. In contrast to direct ML methods, which encode a molecule with only structural information, QM descriptors incorporate electronic information; in this work, we show that they improve ML predictions of dimer interaction energy in both accuracy and data efficiency. We present the electron deformation density interaction energy machine learning (EDDIE-ML) model, which predicts the interaction energy as a function of Hartree-Fock electron deformation density. We compare its performance with leading direct ML schemes and modern DFT methods for the prediction of interaction energies for dimers of varying charge type, size, and intermolecular separation. Under a low-data regime, EDDIE-ML outperforms the direct ML schemes and is the only model readily transferable to larger, more complex systems including base pair trimers and porous cages. The underlying physical connection between the density and interaction energy enables EDDIE-ML to reach an accuracy comparable to modern DFT functionals with fewer training data points than other ML methods.
AB - Machine learning (ML) approaches to predicting quantum mechanical (QM) properties have made great strides toward achieving the computational chemist's holy grail of structure-based property prediction. In contrast to direct ML methods, which encode a molecule with only structural information, QM descriptors incorporate electronic information; in this work, we show that they improve ML predictions of dimer interaction energy in both accuracy and data efficiency. We present the electron deformation density interaction energy machine learning (EDDIE-ML) model, which predicts the interaction energy as a function of Hartree-Fock electron deformation density. We compare its performance with leading direct ML schemes and modern DFT methods for the prediction of interaction energies for dimers of varying charge type, size, and intermolecular separation. Under a low-data regime, EDDIE-ML outperforms the direct ML schemes and is the only model readily transferable to larger, more complex systems including base pair trimers and porous cages. The underlying physical connection between the density and interaction energy enables EDDIE-ML to reach an accuracy comparable to modern DFT functionals with fewer training data points than other ML methods.
UR - http://www.scopus.com/inward/record.url?scp=85125397641&partnerID=8YFLogxK
U2 - 10.1021/acs.jctc.1c01264
DO - 10.1021/acs.jctc.1c01264
M3 - Article
C2 - 35175045
AN - SCOPUS:85125397641
SN - 1549-9618
VL - 18
SP - 1607
EP - 1618
JO - Journal of Chemical Theory and Computation
JF - Journal of Chemical Theory and Computation
IS - 3
ER -