Hashing for adaptive real-time graph stream classification with concept drifts

Lianhua Chi, Bin Li, Xingquan Zhu, Shirui Pan, Ling Chen

Research output: Contribution to journalArticleResearchpeer-review

Abstract

Many applications involve processing networked streaming data in a timely manner. Graph stream classification aims to learn a classification model from a stream of graphs with only one-pass of data, requiring real-time processing in training and prediction. This is a nontrivial task, as many existing methods require multipass of the graph stream to extract subgraph structures as features for graph classification which does not simultaneously satisfy 'one-pass' and 'real-time' requirements. In this paper, we propose an adaptive real-time graph stream classification method to address this challenge. We partition the unbounded graph stream data into consecutive graph chunks, each consisting of a fixed number of graphs and delivering a corresponding chunk-level classifier. We employ a random hashing function to compress the original node set of graphs in each chunk for fast feature detection when training chunk-level classifiers. Furthermore, a differential hashing strategy is applied to map unlimited increasing features (i.e., cliques) into a fixed-size feature space which is then used as a feature vector for stochastic learning. Finally, the chunk-level classifiers are weighted in an ensemble learning model for graph classification. The proposed method substantially speeds up the graph feature extraction and avoids unbounded graph feature growth. Moreover, it effectively offsets concept drifts in graph stream classification. Experiments on real-world and synthetic graph streams demonstrate that our method significantly outperforms existing methods in both classification accuracy and learning efficiency.

Original languageEnglish
Pages (from-to)1591-1604
Number of pages14
JournalIEEE Transactions on Cybernetics
Volume48
Issue number5
DOIs
Publication statusPublished - May 2018
Externally publishedYes

Keywords

  • Cliques
  • concept drifts
  • graph stream classification
  • hashing

Cite this

Chi, Lianhua ; Li, Bin ; Zhu, Xingquan ; Pan, Shirui ; Chen, Ling. / Hashing for adaptive real-time graph stream classification with concept drifts. In: IEEE Transactions on Cybernetics. 2018 ; Vol. 48, No. 5. pp. 1591-1604.
@article{18f408ed63c545fe9cf87f71e8289ba1,
title = "Hashing for adaptive real-time graph stream classification with concept drifts",
abstract = "Many applications involve processing networked streaming data in a timely manner. Graph stream classification aims to learn a classification model from a stream of graphs with only one-pass of data, requiring real-time processing in training and prediction. This is a nontrivial task, as many existing methods require multipass of the graph stream to extract subgraph structures as features for graph classification which does not simultaneously satisfy 'one-pass' and 'real-time' requirements. In this paper, we propose an adaptive real-time graph stream classification method to address this challenge. We partition the unbounded graph stream data into consecutive graph chunks, each consisting of a fixed number of graphs and delivering a corresponding chunk-level classifier. We employ a random hashing function to compress the original node set of graphs in each chunk for fast feature detection when training chunk-level classifiers. Furthermore, a differential hashing strategy is applied to map unlimited increasing features (i.e., cliques) into a fixed-size feature space which is then used as a feature vector for stochastic learning. Finally, the chunk-level classifiers are weighted in an ensemble learning model for graph classification. The proposed method substantially speeds up the graph feature extraction and avoids unbounded graph feature growth. Moreover, it effectively offsets concept drifts in graph stream classification. Experiments on real-world and synthetic graph streams demonstrate that our method significantly outperforms existing methods in both classification accuracy and learning efficiency.",
keywords = "Cliques, concept drifts, graph stream classification, hashing",
author = "Lianhua Chi and Bin Li and Xingquan Zhu and Shirui Pan and Ling Chen",
year = "2018",
month = "5",
doi = "10.1109/TCYB.2017.2708979",
language = "English",
volume = "48",
pages = "1591--1604",
journal = "IEEE Transactions on Cybernetics",
issn = "2168-2267",
number = "5",

}

Hashing for adaptive real-time graph stream classification with concept drifts. / Chi, Lianhua; Li, Bin; Zhu, Xingquan; Pan, Shirui; Chen, Ling.

In: IEEE Transactions on Cybernetics, Vol. 48, No. 5, 05.2018, p. 1591-1604.

Research output: Contribution to journalArticleResearchpeer-review

TY - JOUR

T1 - Hashing for adaptive real-time graph stream classification with concept drifts

AU - Chi, Lianhua

AU - Li, Bin

AU - Zhu, Xingquan

AU - Pan, Shirui

AU - Chen, Ling

PY - 2018/5

Y1 - 2018/5

N2 - Many applications involve processing networked streaming data in a timely manner. Graph stream classification aims to learn a classification model from a stream of graphs with only one-pass of data, requiring real-time processing in training and prediction. This is a nontrivial task, as many existing methods require multipass of the graph stream to extract subgraph structures as features for graph classification which does not simultaneously satisfy 'one-pass' and 'real-time' requirements. In this paper, we propose an adaptive real-time graph stream classification method to address this challenge. We partition the unbounded graph stream data into consecutive graph chunks, each consisting of a fixed number of graphs and delivering a corresponding chunk-level classifier. We employ a random hashing function to compress the original node set of graphs in each chunk for fast feature detection when training chunk-level classifiers. Furthermore, a differential hashing strategy is applied to map unlimited increasing features (i.e., cliques) into a fixed-size feature space which is then used as a feature vector for stochastic learning. Finally, the chunk-level classifiers are weighted in an ensemble learning model for graph classification. The proposed method substantially speeds up the graph feature extraction and avoids unbounded graph feature growth. Moreover, it effectively offsets concept drifts in graph stream classification. Experiments on real-world and synthetic graph streams demonstrate that our method significantly outperforms existing methods in both classification accuracy and learning efficiency.

AB - Many applications involve processing networked streaming data in a timely manner. Graph stream classification aims to learn a classification model from a stream of graphs with only one-pass of data, requiring real-time processing in training and prediction. This is a nontrivial task, as many existing methods require multipass of the graph stream to extract subgraph structures as features for graph classification which does not simultaneously satisfy 'one-pass' and 'real-time' requirements. In this paper, we propose an adaptive real-time graph stream classification method to address this challenge. We partition the unbounded graph stream data into consecutive graph chunks, each consisting of a fixed number of graphs and delivering a corresponding chunk-level classifier. We employ a random hashing function to compress the original node set of graphs in each chunk for fast feature detection when training chunk-level classifiers. Furthermore, a differential hashing strategy is applied to map unlimited increasing features (i.e., cliques) into a fixed-size feature space which is then used as a feature vector for stochastic learning. Finally, the chunk-level classifiers are weighted in an ensemble learning model for graph classification. The proposed method substantially speeds up the graph feature extraction and avoids unbounded graph feature growth. Moreover, it effectively offsets concept drifts in graph stream classification. Experiments on real-world and synthetic graph streams demonstrate that our method significantly outperforms existing methods in both classification accuracy and learning efficiency.

KW - Cliques

KW - concept drifts

KW - graph stream classification

KW - hashing

UR - http://www.scopus.com/inward/record.url?scp=85028696143&partnerID=8YFLogxK

U2 - 10.1109/TCYB.2017.2708979

DO - 10.1109/TCYB.2017.2708979

M3 - Article

VL - 48

SP - 1591

EP - 1604

JO - IEEE Transactions on Cybernetics

JF - IEEE Transactions on Cybernetics

SN - 2168-2267

IS - 5

ER -