Parallel fuzzy c-means clustering for large data sets

Terence Kwok, Kate A Smith, Sebastian Lozano, David Taniar

    Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

    83 Citations (Scopus)

    Abstract

    The parallel fuzzy c-means (PFCM) algorithm for clustering large data sets is proposed in this paper. The proposed algorithm is designed to run on parallel computers of the Single Program Multiple Data (SPMD) model type with the Message Passing Interface (MPI). A comparison is made between PFCM and an existing parallel k-means (PKM) algorithm in terms of their parallelisation capability and scalability. In an implementation of PFCM to cluster a large data set from an insurance company, the proposed algorithm is demonstrated to have almost ideal speedups as well as an excellent scaleup with respect to the size of the data sets.
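
The abstract does not spell out the update rules, so the following is only a minimal sketch of how a data-parallel fuzzy c-means iteration is commonly organised, not the authors' implementation. It assumes the standard fuzzy c-means membership and centre updates with fuzzifier `m`, and it imitates the SPMD pattern in a single process: each "worker" computes partial sums over its own slice of the data, and `allreduce_sum` stands in for an MPI all-reduce. The names `partial_sums`, `allreduce_sum`, and `fcm`, and the synthetic two-cluster data, are hypothetical and chosen for illustration.

```python
import random


def partial_sums(chunk, centers, m=2.0):
    """One worker's share: per-cluster sums of u^m * x and of u^m."""
    c, dim = len(centers), len(centers[0])
    num = [[0.0] * dim for _ in range(c)]
    den = [0.0] * c
    exp = 2.0 / (m - 1.0)
    for x in chunk:
        # Euclidean distance to every centre, clamped to avoid division by zero.
        d = [max(sum((a - b) ** 2 for a, b in zip(x, v)) ** 0.5, 1e-12)
             for v in centers]
        for i in range(c):
            # Standard fuzzy c-means membership of point x in cluster i.
            u = 1.0 / sum((d[i] / d[j]) ** exp for j in range(c))
            w = u ** m
            den[i] += w
            for t in range(dim):
                num[i][t] += w * x[t]
    return num, den


def allreduce_sum(parts):
    """Stand-in for an MPI all-reduce: element-wise sum over all workers."""
    c, dim = len(parts[0][1]), len(parts[0][0][0])
    num = [[sum(p[0][i][t] for p in parts) for t in range(dim)]
           for i in range(c)]
    den = [sum(p[1][i] for p in parts) for i in range(c)]
    return num, den


def fcm(data, centers, workers=1, m=2.0, iters=50, tol=1e-9):
    """Iterate centre updates; the data is split once across the workers."""
    chunks = [data[w::workers] for w in range(workers)]
    for _ in range(iters):
        # In a real SPMD program each chunk would live on its own process.
        num, den = allreduce_sum([partial_sums(ch, centers, m) for ch in chunks])
        new = [[n / den[i] for n in num[i]] for i in range(len(centers))]
        shift = max(abs(a - b) for v, w in zip(new, centers) for a, b in zip(v, w))
        centers = new
        if shift < tol:
            break
    return centers


# Hypothetical demo data: two well-separated 2-D blobs.
random.seed(0)
data = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(100)]
data += [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(100)]
init = [[1.0, 1.0], [4.0, 4.0]]

serial = fcm(data, init, workers=1)
parallel = fcm(data, init, workers=4)
```

Because only the per-cluster partial sums cross worker boundaries, the amount of data exchanged per iteration depends on the number of clusters and dimensions rather than on the number of points. In this sketch the one-worker and four-worker runs should agree up to floating-point rounding.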
    Original language: English
    Title of host publication: Euro-Par 2002 Parallel Processing
    Subtitle of host publication: 8th International Euro-Par Conference Paderborn, Germany, August 27-30, 2002 Proceedings
    Editors: Burkhard Monien, Rainer Feldmann
    Place of Publication: Berlin, Germany
    Publisher: Springer
    Pages: 365-374
    Number of pages: 10
    ISBN (Print): 3540440496
    Publication status: Published - 2002
    Event: International European Conference on Parallel Processing 2002 - Paderborn, Germany
    Duration: 27 Aug 2002 to 30 Aug 2002
    Conference number: 8th
    https://link.springer.com/book/10.1007%2F3-540-45706-2 (Proceedings)

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: Springer
    Volume: 2400
    ISSN (Print): 0302-9743

    Conference

    Conference: International European Conference on Parallel Processing 2002
    Abbreviated title: Euro-Par 2002
    Country/Territory: Germany
    City: Paderborn
    Period: 27/08/02 to 30/08/02