Single factor analysis in MML mixture modelling

Russell T. Edwards, David L. Dowe

    Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

    26 Citations (Scopus)

    Abstract

    Mixture modelling concerns the unsupervised discovery of clusters within data. Most current clustering algorithms assume that variables within classes are uncorrelated. We present a method for producing and evaluating models which account for inter-attribute correlation within classes with a single Gaussian linear factor. The method used is Minimum Message Length (MML), an invariant, information-theoretic Bayesian hypothesis evaluation criterion. Our work extends and unifies that of Wallace and Boulton (1968) and Wallace and Freeman (1992), concerned respectively with MML mixture modelling and MML single factor analysis. Results on simulated data are comparable to those of Wallace and Freeman (1992), outperforming Maximum Likelihood. We include an application of mixture modelling with single factors on spectral data from the Infrared Astronomical Satellite. Our model shows fewer unnecessary classes than that produced by AutoClass (Goebel et al. 1989) due to the use of factors in modelling correlation.
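
    As a rough illustration of the kind of model the abstract describes, the sketch below draws data from a mixture in which each class has its own mean, one Gaussian factor and independent per-attribute noise, and computes the within-class covariance such a class implies. It is not the authors' MML estimator and makes no attempt at inference; the function names, parameter names and the parameterisation (a loading vector plus per-attribute standard deviations) are assumptions made for this example.

```python
import numpy as np


def sample_single_factor_mixture(n, weights, means, loadings, sigmas, seed=0):
    """Draw n points from a mixture whose classes each have one Gaussian factor.

    Hypothetical helper for illustration only. Row k of means, loadings and
    sigmas holds the parameters of class k.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)        # shape (classes, attributes)
    loadings = np.asarray(loadings, dtype=float)  # shape (classes, attributes)
    sigmas = np.asarray(sigmas, dtype=float)      # shape (classes, attributes)

    labels = rng.choice(len(weights), size=n, p=weights)  # class of each datum
    scores = rng.standard_normal(n)                       # one factor score per datum
    noise = rng.standard_normal((n, means.shape[1]))      # independent attribute noise

    # datum = class mean + factor score * class loading + scaled noise
    data = means[labels] + scores[:, None] * loadings[labels] + sigmas[labels] * noise
    return data, labels


def class_covariance(loading, sigma):
    """Within-class covariance implied by one factor: a a^T + diag(sigma^2)."""
    loading = np.asarray(loading, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return np.outer(loading, loading) + np.diag(sigma ** 2)
```

    For a class with loading (1, 2) and unit noise, `class_covariance([1.0, 2.0], [1.0, 1.0])` has off-diagonal entries of 2. That non-zero off-diagonal term is the inter-attribute correlation that a model assuming uncorrelated variables within classes cannot represent with a single class.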

    Original language: English
    Title of host publication: Research and Development in Knowledge Discovery and Data Mining - 2nd Pacific-Asia Conference, PAKDD 1998, Proceedings
    Editors: Xindong Wu, Ramamohanarao Kotagiri, Kevin B. Korb
    Publisher: Springer
    Pages: 96-109
    Number of pages: 14
    ISBN (Print): 3540643834, 9783540643838
    Publication status: Published - 1998
    Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining 1998 - Melbourne, Australia
    Duration: 15 Apr 1998 to 17 Apr 1998
    Conference number: 2nd
    https://link.springer.com/book/10.1007/3-540-64383-4 (Proceedings)

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: Springer
    Volume: 1394
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining 1998
    Abbreviated title: PAKDD 1998
    Country/Territory: Australia
    City: Melbourne
    Period: 15/04/98 to 17/04/98

    Keywords

    • Induction in KDD
    • Minimum message length
    • MML
    • Noise handling
    • Statistical and machine learning
