Automatic clustering constraints derivation from object-oriented software using weighted complex network with graph theory analysis

Chun Yong Chong, Sai Peck Lee

Research output: Contribution to journal › Article › Research › peer-review

23 Citations (Scopus)

Abstract

Constrained clustering, or semi-supervised clustering, has received considerable attention because it can incorporate minimal supervision from domain experts, or side information, to improve the results of classic unsupervised clustering techniques. In the domain of software remodularisation, classic unsupervised software clustering techniques have proven useful for recovering a high-level abstraction of the design of poorly documented or poorly designed software systems. However, little work has integrated constrained clustering for the same purpose to help improve the modularity of software systems. Moreover, given time and budget constraints, it is laborious and unrealistic for domain experts who have prior knowledge of the software to review every software artifact and provide supervision on demand. We aim to fill this research gap by proposing an automated approach that derives clustering constraints from the implicit structure of a software system, based on graph theory analysis of that software. Evaluations conducted on 40 open-source object-oriented software systems show that the proposed approach can serve as an alternative way to derive clustering constraints when no domain experts are available, thus helping to improve the overall accuracy of clustering results.

Original language: English
Pages (from-to): 28-53
Number of pages: 26
Journal: Journal of Systems and Software
Volume: 133
Publication status: Published - Nov 2017

Keywords

  • Complex network
  • Constrained clustering
  • Graph theory
  • Software clustering
  • Software remodularisation
