Accurate computational prediction of melting points and aqueous solubilities of organic compounds would be very useful but is notoriously difficult. Predicting the lattice energies of compounds is key to understanding and predicting their melting behavior and ultimately their solubility behavior. We report robust, predictive, quantitative structure-property relationship (QSPR) models for enthalpies of sublimation, crystal lattice energies, and melting points for a very large and structurally diverse set of small organic compounds. Sparse Bayesian feature selection and machine learning methods were employed to select the most relevant molecular descriptors for the model and to generate parsimonious quantitative models. The final enthalpy of sublimation model is a four-parameter multilinear equation that has an r² value of 0.96 and an average absolute error of 7.9 ± 0.3 kJ mol⁻¹. The melting point model can predict this property with a standard error of 45 ± 1 K and an r² value of 0.79. Given the size and diversity of the training data, these conceptually transparent and accurate models can be used to predict sublimation enthalpy, lattice energy, and melting points of organic compounds in general.
|Number of pages||7|
|Journal||Journal of Chemical Information and Modeling|
|Publication status||Published - 2013|