CogBoost: boosting for fast cost-sensitive graph classification

Shirui Pan, Jia Wu, Xingquan Zhu

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Graph classification has drawn great interest in recent years due to the increasing number of applications involving objects with complex structural relationships. To date, all existing graph classification algorithms assume, explicitly or implicitly, that misclassifying instances of different classes incurs an equal amount of cost (or risk), which is often not the case in real-life applications, where misclassifying certain classes of samples, such as diseased patients, carries a much higher cost than misclassifying others. Although cost-sensitive learning has been extensively studied, existing methods are based on data with an instance-feature representation. Graphs, however, do not have readily available features for learning, and the feature space of graph data is potentially infinite and needs to be carefully explored in order to favor classes with higher costs. In this paper, we propose CogBoost, a fast cost-sensitive graph classification algorithm that aims to minimize misclassification costs (instead of errors) and achieve fast learning on large-scale graph data sets. To minimize misclassification costs, CogBoost iteratively selects the most discriminative subgraph by taking the costs of different classes into account, and then solves a linear programming problem in each iteration using a loss function that is optimal under the Bayes decision rule. In addition, a cutting plane algorithm is derived to speed up the solution of the linear programs for fast learning on large-scale data sets. Experiments and comparisons on real-world large graph data sets demonstrate the effectiveness and efficiency of our algorithm.
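The abstract's main algorithmic idea, selecting discriminative subgraphs while weighting classes by their misclassification costs, can be illustrated with a simplified sketch. The code below is a hypothetical toy version, not the authors' implementation: it assumes the candidate subgraphs have already been mined into a binary occurrence matrix, and it replaces CogBoost's per-iteration linear program and cutting plane solver with a simple cost-weighted boosting update. Function names, the cost dictionary, and the toy data are illustrative assumptions.

```python
# Hypothetical sketch only (not the authors' code): a cost-weighted boosting
# loop over pre-mined subgraph indicator features. CogBoost itself mines
# discriminative subgraphs on the fly and solves a linear program per
# iteration with a cutting plane speed-up; this toy version only shows how
# per-class misclassification costs can steer weak-learner selection and
# instance re-weighting.
import numpy as np

def train_cost_sensitive_boost(X, y, costs, n_rounds=20):
    """X: (n, d) 0/1 matrix, X[i, j] = 1 if candidate subgraph g_j occurs in graph i.
    y: labels in {-1, +1}.  costs: misclassification cost per class, e.g. {1: 5.0, -1: 1.0}."""
    n, d = X.shape
    cost_vec = np.array([costs[label] for label in y], dtype=float)
    w = cost_vec / cost_vec.sum()              # start with cost-weighted instance weights
    model = []                                 # list of (alpha, feature index, sign)
    for _ in range(n_rounds):
        # Pick the subgraph feature whose one-feature stump has the lowest
        # cost-weighted error (a stand-in for discriminative subgraph selection).
        best_err, best_j, best_sign = np.inf, 0, 1
        for j in range(d):
            for sign in (1, -1):
                pred = sign * (2 * X[:, j] - 1)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best_j, best_sign = err, j, sign
        err = np.clip(best_err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # AdaBoost-style weak-learner weight
        pred = best_sign * (2 * X[:, best_j] - 1)
        # Cost-sensitive re-weighting: mistakes on the expensive class grow faster,
        # so later rounds focus on the high-cost class rather than on raw error.
        w = cost_vec * w * np.exp(-alpha * y * pred)
        w /= w.sum()
        model.append((alpha, best_j, best_sign))
    return model

def predict(model, X):
    score = sum(alpha * sign * (2 * X[:, j] - 1) for alpha, j, sign in model)
    return np.sign(score)

# Toy usage: 6 graphs described by 4 candidate subgraphs; misclassifying the
# positive class (e.g. diseased patients) is taken to be 5x as costly.
X = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1],
              [1, 0, 0, 1],
              [0, 1, 0, 0]])
y = np.array([1, 1, -1, -1, 1, -1])
model = train_cost_sensitive_boost(X, y, costs={1: 5.0, -1: 1.0})
print(predict(model, X))
```

The cost-weighted instance weights play the same role as the Bayes decision rule under asymmetric costs, which shifts the decision threshold toward the class whose misclassification is more expensive; the paper's loss function and linear-programming formulation realize this idea exactly, whereas the sketch above only approximates it.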

Original language: English
Article number: 7006795
Pages (from-to): 2933-2946
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 27
Issue number: 11
DOI: 10.1109/TKDE.2015.2391115
Publication status: Published - Nov 2015
Externally published: Yes

Keywords

  • boosting
  • cost-sensitive learning
  • cutting plane algorithm
  • graph classification
  • large scale graphs
  • subgraphs

Cite this

CogBoost: boosting for fast cost-sensitive graph classification. / Pan, Shirui; Wu, Jia; Zhu, Xingquan.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 27, No. 11, 7006795, 11.2015, p. 2933-2946.
