An overview of clustering methods with guidelines for application in mental health research

Caroline Gao, Dominic Dwyer, Ye Zhu, Catherine L. Smith, Lan Du, Kate M. Filia, Johanna Bayer, Jana M. Menssink, Teresa Wang, Christoph Bergmeir, Stephen Wood, Sue M. Cotton

Research output: Contribution to journal › Review Article › Research › peer-review

2 Citations (Scopus)


Cluster analyses have been widely used in mental health research to decompose inter-individual heterogeneity by identifying more homogeneous subgroups of individuals. However, despite advances in new algorithms and increasing popularity, there is little guidance on model choice, analytical framework and reporting requirements. In this paper, we aimed to address this gap by introducing the philosophy, design, advantages/disadvantages and implementation of major algorithms that are particularly relevant in mental health research. Extensions of basic models, such as kernel methods, deep learning, semi-supervised clustering, and clustering ensembles, are subsequently introduced. We then discuss how to choose algorithms to address common issues, along with methods for pre-clustering data processing, clustering evaluation and validation. Importantly, we also provide general guidance on clustering workflow and reporting requirements. To facilitate the implementation of different algorithms, we provide information on R functions and libraries.

Original language: English
Article number: 115265
Number of pages: 28
Journal: Psychiatry Research
Publication status: Published - Sept 2023


  • Cluster analysis
  • Clustering
  • Machine learning
  • Mental health research
  • Unsupervised learning
