Distributed pattern recognition within DHT-based networks for imbalanced datasets

Amiza Amir, Bala Srinivasan, Asad I. Khan

    Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

    Abstract

    This paper studies the accuracy of a fully distributed pattern recognition algorithm, the P2P-GN, on imbalanced datasets, a problem that most available distributed algorithms neglect. A major distinction of the P2P-GN from other approaches is that it forms a single global classifier instead of building many local classifiers (one at every site). Fine-grained components of the classifier are distributed across the network using a Distributed Hash Table (DHT), which provides efficient linking to these components and keeps the system fully distributed. Our experimental results show that, despite imbalanced data distribution, the P2P-GN produces highly accurate results compared to other distributed algorithms (Ivote-DPV, ensemble ID3, ensemble k-NN, and ensemble naive Bayes).
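    The idea of placing small classifier components on peers through a DHT can be illustrated with a minimal, single-process sketch. This is not the P2P-GN implementation: the class `ToyDHT`, the peer names, the `component/<i>` key scheme, and the per-class count payloads are all hypothetical, and the placement rule is ordinary successor-based consistent hashing in the style of Chord-like DHTs. It only shows how a key determines which peer holds a component, so that any peer can find it without a central index.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a point on a 160-bit hash ring using SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class ToyDHT:
    """Single-process stand-in for a DHT (hypothetical, for illustration only).

    Each key is owned by the first node clockwise from the key's ring position.
    """

    def __init__(self, node_names):
        self._ring = sorted((ring_position(n), n) for n in node_names)
        self._points = [p for p, _ in self._ring]
        self._store = {name: {} for name in node_names}

    def owner(self, key: str) -> str:
        # First node at or after the key's position; wrap around past the end.
        idx = bisect.bisect_left(self._points, ring_position(key)) % len(self._ring)
        return self._ring[idx][1]

    def put(self, key: str, value) -> None:
        self._store[self.owner(key)][key] = value

    def get(self, key: str):
        return self._store[self.owner(key)].get(key)

    def stored_items(self) -> int:
        return sum(len(bucket) for bucket in self._store.values())


peers = ["peer-a", "peer-b", "peer-c", "peer-d"]
dht = ToyDHT(peers)

# Hypothetical payloads: per-class counts held by one small classifier component.
for i in range(8):
    dht.put(f"component/{i}", {"class_0": i, "class_1": 8 - i})

# Any peer that knows the key can locate and read the component.
print(dht.owner("component/3"), dht.get("component/3"))
```

    In this sketch the lookup cost is a local binary search; a real DHT would route the request across peers, typically in a logarithmic number of hops.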

    Original language: English
    Title of host publication: 1st IEEE International Conference on Computer Communication and the Internet (ICCCI 2016)
    Subtitle of host publication: 13 Oct - 15 Oct, 2016, Wuhan, China [Proceedings]
    Editors: Rod Kennedy
    Place of Publication: Piscataway, NJ
    Publisher: IEEE, Institute of Electrical and Electronics Engineers
    Pages: 10-13
    Number of pages: 4
    ISBN (Electronic): 9781467385152, 9781467385138
    ISBN (Print): 9781467385145
    Publication status: Published - 8 Dec 2016
    Event: IEEE International Conference on Computer Communication and the Internet (ICCCI 2016) - Central China Normal University, Wuhan, China
    Duration: 13 Oct 2016 → 15 Oct 2016
    Conference number: 1st
    http://www.iccci.org/

    Conference

    Conference: IEEE International Conference on Computer Communication and the Internet (ICCCI 2016)
    Abbreviated title: ICCCI 2016
    Country: China
    City: Wuhan
    Period: 13/10/16 → 15/10/16

    Keywords

    • Distributed classification
    • Distributed machine learning
    • Efficient distributed algorithm
    • P2P-based classification
