Single factor analysis in MML mixture modelling

Russell T. Edwards, David L. Dowe

    Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

    21 Citations (Scopus)

    Abstract

    Mixture modelling concerns the unsupervised discovery of clusters within data. Most current clustering algorithms assume that variables within classes are uncorrelated. We present a method for producing and evaluating models which account for inter-attribute correlation within classes with a single Gaussian linear factor. The method used is Minimum Message Length (MML), an invariant, information-theoretic Bayesian hypothesis evaluation criterion. Our work extends and unifies that of Wallace and Boulton (1968) and Wallace and Freeman (1992), concerned respectively with MML mixture modelling and MML single factor analysis. Results on simulated data are comparable to those of Wallace and Freeman (1992), outperforming Maximum Likelihood. We include an application of mixture modelling with single factors on spectral data from the Infrared Astronomical Satellite. Our model shows fewer unnecessary classes than that produced by AutoClass (Goebel et al. 1989) due to the use of factors in modelling correlation.
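
    As a rough illustration of what a "single Gaussian linear factor" within a class implies, the sketch below computes the covariance matrix for the standard single-factor form x = mu + v * a + sigma * r, where v and r are standard normal. The function names, the notation, and the example numbers are assumptions chosen for illustration; they are not taken from the paper, and the sketch does not perform any MML estimation.

```python
# Illustrative sketch only: names, notation, and numbers are hypothetical,
# not taken from the paper. No MML estimation is performed here.

def single_factor_covariance(loadings, sigmas):
    """Covariance implied by x = mu + v * a + sigma * r with v, r ~ N(0, 1).

    Entry (i, j) is a_i * a_j, plus sigma_i ** 2 on the diagonal.
    """
    k = len(loadings)
    return [
        [loadings[i] * loadings[j] + (sigmas[i] ** 2 if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]


def correlation(cov, i, j):
    """Correlation between attributes i and j under the covariance matrix."""
    return cov[i][j] / (cov[i][i] * cov[j][j]) ** 0.5


# One hypothetical class with three attributes; the third has zero loading,
# so it stays uncorrelated with the other two.
a = [2.0, 1.0, 0.0]
sigma = [1.0, 1.0, 1.0]
cov = single_factor_covariance(a, sigma)
```

    With all loadings set to zero the matrix becomes diagonal, which corresponds to the uncorrelated-within-class assumption that the abstract attributes to most clustering algorithms.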

    Original language: English
    Title of host publication: Research and Development in Knowledge Discovery and Data Mining - 2nd Pacific-Asia Conference, PAKDD 1998, Proceedings
    Editors: Xindong Wu, Ramamohanarao Kotagiri, Kevin B. Korb
    Publisher: Springer
    Pages: 96-109
    Number of pages: 14
    ISBN (Print): 3540643834, 9783540643838
    DOIs: https://doi.org/10.1007/3-540-64383-4_9
    Publication status: Published - 1998
    Event: 2nd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 1998 - Melbourne, Australia
    Duration: 15 Apr 1998 to 17 Apr 1998

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: Springer
    Volume: 1394
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 2nd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 1998
    Country: Australia
    City: Melbourne
    Period: 15/04/98 to 17/04/98

    Keywords

    • Induction in KDD
    • Minimum message length
    • MML
    • Noise handling
    • Statistical and machine learning

    Cite this

    Edwards, R. T., & Dowe, D. L. (1998). Single factor analysis in MML mixture modelling. In X. Wu, R. Kotagiri, & K. B. Korb (Eds.), Research and Development in Knowledge Discovery and Data Mining - 2nd Pacific-Asia Conference, PAKDD 1998, Proceedings (pp. 96-109). (Lecture Notes in Computer Science; Vol. 1394). Springer. https://doi.org/10.1007/3-540-64383-4_9