Clustering patient medical records via sparse subspace representation

Budhaditya Saha, Duc Son Pham, Dinh Phung, Svetha Venkatesh

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

4 Citations (Scopus)


The health industry faces an increasing challenge with "big data" as traditional methods fail to manage its scale and complexity. This paper examines clustering of patient records for chronic diseases to facilitate better construction of care plans. We solve this problem under the framework of subspace clustering. Our novel contribution lies in the exploitation of sparse representation to discover subspaces automatically and in a domain-specific construction of weighting matrices for patient records. We show that the new formulation is readily solved by extending existing ℓ1-regularized optimization algorithms. Using a cohort of both diabetes and stroke data, we show that our method outperforms existing benchmark clustering techniques in the literature.
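
For orientation, the generic sparse subspace clustering formulation from the wider literature can be written as below. This is a sketch under stated assumptions, not the paper's exact model: the symbols X (data matrix with one patient record per column), C (self-representation coefficients), W (a nonnegative weighting matrix), and λ (regularization weight) are illustrative names, and the way W enters the penalty is assumed rather than taken from the abstract.

```latex
% Assumed notation: X, C, W, \lambda are illustrative, not taken from the paper.
\[
\begin{aligned}
\min_{C}\quad & \tfrac{1}{2}\,\lVert X - XC \rVert_F^2
                + \lambda\,\lVert W \odot C \rVert_1
                \quad \text{s.t. } \operatorname{diag}(C) = 0, \\
A \;=\; & \lvert C \rvert + \lvert C \rvert^{\top}.
\end{aligned}
\]
```

Setting every entry of W to 1 recovers the unweighted ℓ1 penalty. In the generic approach, clusters are then obtained by applying spectral clustering to the affinity matrix A.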

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 17th Pacific-Asia Conference, PAKDD 2013, Proceedings
Number of pages: 12
Edition: PART 2
Publication status: Published - 1 Dec 2013
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2013 - Gold Coast, Australia
Duration: 14 Apr 2013 to 17 Apr 2013
Conference number: 17th (Proceedings)

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 7819 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2013
Abbreviated title: PAKDD 2013
City: Gold Coast

  • Medical data
  • Sparse representation
  • Subspace clustering
