Abstract
We consider the problem of ligand affinity prediction as a regression task, typically with few labelled examples, many unlabelled instances, and multiple views on the data. In chemoinformatics, the prediction of binding affinities for protein ligands is an important but challenging task. As protein-ligand bonds trigger biochemical reactions, their characterisation is a crucial step in the process of drug discovery and design. However, the practical determination of ligand affinities is very expensive, whereas unlabelled compounds are available in abundance. Additionally, many different vectorial representations for compounds (molecular fingerprints) exist that cover different sets of features. For this task, we propose a co-regularisation approach, which extracts information from unlabelled examples by ensuring that individual models trained on different fingerprints make similar predictions. We extend support vector regression in the same way as the existing co-regularised least squares regression (CoRLSR) and obtain a co-regularised support vector regression (CoSVR). We empirically evaluate the performance of CoSVR on various protein-ligand datasets. We show that CoSVR outperforms CoRLSR as well as existing state-of-the-art approaches that do not take unlabelled molecules into account. Additionally, we provide a theoretical bound on the Rademacher complexity for CoSVR.
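The co-regularisation idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's CoSVR: it assumes linear models, a squared loss (closer in spirit to CoRLSR than to the ε-insensitive loss of support vector regression), and exactly two views. The function names, the parameters `nu` and `lam`, and the synthetic data are all hypothetical and chosen only for illustration. The objective minimised is the sum of the squared errors of both views on the labelled examples, a ridge penalty weighted by `nu`, and a disagreement penalty weighted by `lam` on the unlabelled examples.

```python
import numpy as np


def fit_coreg_ridge(X1, X2, y, U1, U2, nu=1.0, lam=1.0):
    """Two-view co-regularised ridge regression (illustrative sketch).

    Minimises
        ||X1 w1 - y||^2 + ||X2 w2 - y||^2
        + nu * (||w1||^2 + ||w2||^2)
        + lam * ||U1 w1 - U2 w2||^2
    by solving the block linear system given by the zero-gradient condition.
    X1, X2: labelled examples in view 1 and view 2; U1, U2: unlabelled ones.
    """
    d1, d2 = X1.shape[1], X2.shape[1]
    A11 = X1.T @ X1 + nu * np.eye(d1) + lam * U1.T @ U1
    A22 = X2.T @ X2 + nu * np.eye(d2) + lam * U2.T @ U2
    A12 = -lam * U1.T @ U2  # coupling between the two views
    A = np.block([[A11, A12], [A12.T, A22]])
    b = np.concatenate([X1.T @ y, X2.T @ y])
    w = np.linalg.solve(A, b)
    return w[:d1], w[d1:]


def predict(w1, w2, Z1, Z2):
    """Average the predictions of the two view-specific models."""
    return 0.5 * (Z1 @ w1 + Z2 @ w2)


# Synthetic stand-in for two fingerprint views of the same compounds
# (assumed data, not taken from the paper).
rng = np.random.default_rng(0)
n_lab, n_unlab, d1, d2 = 10, 200, 5, 8
latent = rng.normal(size=(n_lab + n_unlab, 3))
V1 = latent @ rng.normal(size=(3, d1)) + 0.1 * rng.normal(size=(n_lab + n_unlab, d1))
V2 = latent @ rng.normal(size=(3, d2)) + 0.1 * rng.normal(size=(n_lab + n_unlab, d2))
target = latent @ np.array([1.0, -2.0, 0.5])

X1, U1 = V1[:n_lab], V1[n_lab:]
X2, U2 = V2[:n_lab], V2[n_lab:]
y = target[:n_lab] + 0.1 * rng.normal(size=n_lab)

w1, w2 = fit_coreg_ridge(X1, X2, y, U1, U2, nu=1.0, lam=1.0)
disagreement = np.mean((U1 @ w1 - U2 @ w2) ** 2)
```

Setting `lam=0` decouples the two views into independent ridge regressions, while larger values of `lam` pull the view-specific predictions on the unlabelled examples towards each other.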
Original language | English |
---|---|
Title of host publication | Proceedings - 16th IEEE International Conference on Data Mining Workshops, ICDMW 2016 |
Editors | Carlotta Domeniconi, Francesco Gullo |
Place of Publication | Piscataway NJ USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 261-268 |
Number of pages | 8 |
ISBN (Electronic) | 9781509054725, 9781509059102 |
ISBN (Print) | 9781509059119 |
Publication status | Published - 2016 |
Externally published | Yes |
Event | IEEE International Conference on Data Mining Workshops 2016, Barcelona, Spain. Duration: 12 Dec 2016 → 15 Dec 2016. Conference number: 16th. https://icdm2016.eurecat.cat/ |
Conference
Conference | IEEE International Conference on Data Mining Workshops 2016 |
---|---|
Abbreviated title | ICDMW 2016 |
Country/Territory | Spain |
City | Barcelona |
Period | 12/12/16 → 15/12/16 |
Internet address | https://icdm2016.eurecat.cat/ |
Keywords
- Co-regularisation
- Kernel methods
- Ligand affinity prediction
- Multi-view
- Support vector regression