CFOND: consensus factorization for co-clustering networked data

Ting Guo, Shirui Pan, Xingquan Zhu, Chengqi Zhang

Research output: Contribution to journal › Article › Research › peer-review

11 Citations (Scopus)

Abstract

Networked data are common in domains where instances are characterized by both feature values and inter-dependency relationships. Finding cluster structures for networked instances and discovering representative features for each cluster is a special co-clustering task that is useful for many real-world applications, such as automatically categorizing scientific publications and finding representative keywords for each cluster. Although co-clustering is commonly used to find clusters of both instances and features, all existing methods focus on instance-feature values and do not leverage the valuable topology relationships between instances to boost co-clustering performance. In this paper, we propose CFOND, a consensus-factorization-based framework for co-clustering networked data. We argue that feature values and linkages provide useful information from different perspectives, yet they are not always consistent and therefore need to be carefully aligned for the best clustering results. We advocate a consensus factorization principle, which simultaneously factorizes information from three aspects: network topology structures, instance-feature content relationships, and feature-feature correlations. The consensus factorization ensures that the final cluster structures are consistent across the three aspects with minimum error. CFOND has a sound theoretical basis and proven convergence, and its performance is validated on real-world networks.
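
To make the idea of factorizing three information sources around shared cluster factors more concrete, the sketch below shows a generic joint nonnegative matrix factorization. It is not the CFOND objective or update rule, which this record does not give. The function name `joint_nmf`, the weights `alpha` and `beta`, the single cluster count `k`, and the toy matrices are all assumptions made for illustration. The sketch minimizes ||X - G Fᵀ||² + alpha·||A - G Hᵀ||² + beta·||C - F Qᵀ||² with standard multiplicative updates, where X is the instance-feature matrix, A the adjacency matrix, and C a feature-feature correlation matrix.

```python
import numpy as np

def joint_nmf(X, A, C, k, alpha=1.0, beta=1.0, n_iter=200, eps=1e-9, seed=0):
    """Illustrative shared-factor NMF (an assumption, not the CFOND algorithm).

    X: (n, d) nonnegative instance-feature matrix
    A: (n, n) nonnegative adjacency matrix
    C: (d, d) nonnegative feature-feature correlation matrix
    Returns instance factor G (n, k), feature factor F (d, k), loss history.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    G = rng.random((n, k))  # instance-cluster memberships, shared by X and A
    F = rng.random((d, k))  # feature-cluster memberships, shared by X and C
    H = rng.random((n, k))  # auxiliary factor for the topology term
    Q = rng.random((d, k))  # auxiliary factor for the correlation term

    def loss():
        return (np.linalg.norm(X - G @ F.T) ** 2
                + alpha * np.linalg.norm(A - G @ H.T) ** 2
                + beta * np.linalg.norm(C - F @ Q.T) ** 2)

    history = [loss()]
    for _ in range(n_iter):
        # Multiplicative updates: ratio of the negative to the positive part
        # of each gradient, which keeps every factor nonnegative.
        G *= (X @ F + alpha * (A @ H)) / (G @ (F.T @ F + alpha * (H.T @ H)) + eps)
        F *= (X.T @ G + beta * (C @ Q)) / (F @ (G.T @ G + beta * (Q.T @ Q)) + eps)
        H *= (A.T @ G) / (H @ (G.T @ G) + eps)
        Q *= (C.T @ F) / (Q @ (F.T @ F) + eps)
        history.append(loss())
    return G, F, history

# Toy data (hypothetical): two groups of instances, each using its own features.
X = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
C = X.T @ X  # simple feature co-occurrence used as a stand-in for correlations

G, F, history = joint_nmf(X, A, C, k=2)
instance_clusters = G.argmax(axis=1)  # hard instance assignment from G
feature_clusters = F.argmax(axis=1)   # hard feature assignment from F
```

Because G appears in both the content and topology terms, and F in both the content and correlation terms, the updates pull the instance and feature clusterings toward a solution that all three sources agree on. That is the general intuition behind a consensus-style factorization. The published method may differ in its objective, constraints, and updates.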

Original language: English
Pages (from-to): 706-719
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 31
Issue number: 4
DOIs
Publication status: Published - 1 Apr 2019
Externally published: Yes

Keywords

  • Co-clustering
  • Couplings
  • Data mining
  • Linear programming
  • Manifolds
  • Merging
  • Network topology
  • Networked data
  • Networks
  • Nonnegative Matrix Factorization
  • Topology
