### Abstract

Graph classification has drawn great interest in recent years due to the increasing number of applications involving objects with complex structural relationships. To date, all existing graph classification algorithms assume, explicitly or implicitly, that misclassifying instances of different classes incurs an equal cost (or risk), which is often not the case in real-life applications, where misclassifying certain classes of samples, such as diseased patients, incurs a much higher cost than others. Although cost-sensitive learning has been extensively studied, existing methods are based on data with an instance-feature representation. Graphs, however, have no readily available features for learning, and the feature space of graph data is potentially infinite and must be carefully explored in order to favor classes with higher costs. In this paper, we propose CogBoost, a fast cost-sensitive graph classification algorithm that aims to minimize misclassification costs (instead of errors) and achieve fast learning on large-scale graph data sets. To minimize misclassification costs, CogBoost iteratively selects the most discriminative subgraph by taking the costs of different classes into account, and then solves a linear program in each iteration using an optimal loss function derived from the Bayes decision rule. In addition, a cutting plane algorithm is derived to speed up the solution of the linear programs for fast learning on large-scale data sets. Experiments and comparisons on real-world large graph data sets demonstrate the effectiveness and efficiency of our algorithm.
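The core idea of minimizing misclassification cost rather than error can be illustrated with the Bayes decision rule under asymmetric costs: when a false negative (e.g., missing a diseased patient) is costlier than a false positive, the optimal decision threshold on the predicted class probability shifts below 0.5. The sketch below is illustrative only; the cost names and functions are assumptions for exposition, not the paper's implementation.

```python
def bayes_optimal_threshold(c_fn: float, c_fp: float) -> float:
    """Probability threshold minimizing expected misclassification cost.

    Predict positive iff P(y=1|x) > c_fp / (c_fp + c_fn): at that point
    the expected cost of predicting positive, c_fp * (1 - p), drops below
    the expected cost of predicting negative, c_fn * p.
    """
    return c_fp / (c_fp + c_fn)


def expected_cost(p_pos: float, predict_pos: bool, c_fn: float, c_fp: float) -> float:
    """Expected cost of one prediction given P(y=1|x) = p_pos."""
    return c_fp * (1.0 - p_pos) if predict_pos else c_fn * p_pos


# With a false negative 5x as costly as a false positive, the threshold
# falls to 1/6, so a borderline case with p = 0.3 is flagged positive.
t = bayes_optimal_threshold(c_fn=5.0, c_fp=1.0)
p = 0.3
decision = p > t  # True: predicting positive has lower expected cost
```

Under equal costs (c_fn = c_fp) the threshold reduces to the familiar 0.5, recovering plain error minimization.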

| Original language | English |
| --- | --- |
| Article number | 7006795 |
| Pages (from-to) | 2933-2946 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Knowledge and Data Engineering |
| Volume | 27 |
| Issue number | 11 |
| DOIs | https://doi.org/10.1109/TKDE.2015.2391115 |
| Publication status | Published - Nov 2015 |
| Externally published | Yes |

### Keywords

- boosting
- cost-sensitive learning
- cutting plane algorithm
- graph classification
- large scale graphs
- subgraphs

### Cite this

Pan, S., Wu, J., & Zhu, X. (2015). CogBoost: boosting for fast cost-sensitive graph classification. *IEEE Transactions on Knowledge and Data Engineering*, *27*(11), 2933-2946. [7006795]. https://doi.org/10.1109/TKDE.2015.2391115

Research output: Contribution to journal › Article › Research › peer-review

TY - JOUR

T1 - CogBoost

T2 - boosting for fast cost-sensitive graph classification

AU - Pan, Shirui

AU - Wu, Jia

AU - Zhu, Xingquan

PY - 2015/11

Y1 - 2015/11

AB - Graph classification has drawn great interest in recent years due to the increasing number of applications involving objects with complex structural relationships. To date, all existing graph classification algorithms assume, explicitly or implicitly, that misclassifying instances of different classes incurs an equal cost (or risk), which is often not the case in real-life applications, where misclassifying certain classes of samples, such as diseased patients, incurs a much higher cost than others. Although cost-sensitive learning has been extensively studied, existing methods are based on data with an instance-feature representation. Graphs, however, have no readily available features for learning, and the feature space of graph data is potentially infinite and must be carefully explored in order to favor classes with higher costs. In this paper, we propose CogBoost, a fast cost-sensitive graph classification algorithm that aims to minimize misclassification costs (instead of errors) and achieve fast learning on large-scale graph data sets. To minimize misclassification costs, CogBoost iteratively selects the most discriminative subgraph by taking the costs of different classes into account, and then solves a linear program in each iteration using an optimal loss function derived from the Bayes decision rule. In addition, a cutting plane algorithm is derived to speed up the solution of the linear programs for fast learning on large-scale data sets. Experiments and comparisons on real-world large graph data sets demonstrate the effectiveness and efficiency of our algorithm.

KW - boosting

KW - cost-sensitive learning

KW - cutting plane algorithm

KW - Graph classification

KW - large scale graphs

KW - subgraphs

UR - http://www.scopus.com/inward/record.url?scp=84937812859&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2015.2391115

DO - 10.1109/TKDE.2015.2391115

M3 - Article

VL - 27

SP - 2933

EP - 2946

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 11

M1 - 7006795

ER -