A study on the effect of class distribution using cost-sensitive learning

Kai Ming Ting

    Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

    7 Citations (Scopus)

    Abstract

    This paper investigates the effect of class distribution on the predictive performance of classification models using cost-sensitive learning, rather than the sampling approach employed previously by a similar study. The predictive performance is measured using the cost space representation, which is a dual to the ROC representation. This study shows that distributions which range between the natural distribution and the balanced distribution can also produce the best models, contrary to the finding of the previous study. In addition, we find that the best models are larger in size than those trained using the natural distribution. We also show two different ways to achieve the same effect of the corrected probability estimates proposed by the previous study.
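The duality between ROC space and cost space mentioned in the abstract can be illustrated with a minimal sketch. It assumes the cost space is the one used for cost curves, where a single ROC point (false positive rate, true positive rate) becomes a straight line of normalized expected cost over a probability-cost axis; the abstract itself does not give these formulas. The function names and the example ROC point below are hypothetical and are not taken from the paper.

```python
def probability_cost(p_pos, cost_fn, cost_fp):
    """Combine the positive-class prior with the two misclassification
    costs into a single value in [0, 1] (the x-axis of cost space).
    Assumed definition, not quoted from the paper."""
    pos = p_pos * cost_fn            # weight of errors on positives
    neg = (1.0 - p_pos) * cost_fp    # weight of errors on negatives
    return pos / (pos + neg)


def cost_line(fpr, tpr):
    """Map one ROC point (fpr, tpr) to its line in cost space.

    The returned function gives the normalized expected cost at a
    probability-cost value pc: FNR * pc + FPR * (1 - pc).
    """
    fnr = 1.0 - tpr

    def normalized_expected_cost(pc):
        return fnr * pc + fpr * (1.0 - pc)

    return normalized_expected_cost


# Hypothetical classifier: 20% false positives, 90% true positives.
line = cost_line(fpr=0.2, tpr=0.9)

# Hypothetical operating condition: 10% positives, equal error costs.
pc = probability_cost(p_pos=0.1, cost_fn=1.0, cost_fp=1.0)
nec = line(pc)
```

Under these assumptions the line's endpoints are the false positive rate at pc = 0 and the false negative rate at pc = 1. Changing the class distribution or the misclassification costs moves only the position along the x-axis, not the line itself. This is why such a representation is convenient for comparing models trained under different class distributions.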
    Original language: English
    Title of host publication: Discovery Science
    Subtitle of host publication: 5th International Conference, DS 2002 Lubeck, Germany, November 24-26, 2002 Proceedings
    Place of Publication: Berlin, Germany
    Publisher: Springer
    Pages: 98-112
    Number of pages: 15
    ISBN (Print): 3540001883
    Publication status: Published - 2002
    Event: International Conference on Discovery Science 2002 - Lubeck, Germany
    Duration: 24 Nov 2002 to 26 Nov 2002
    Conference number: 5th

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: Springer
    Volume: 2534
    ISSN (Print): 0302-9743

    Conference

    Conference: International Conference on Discovery Science 2002
    Abbreviated title: DS 2002
    Country/Territory: Germany
    City: Lubeck
    Period: 24/11/02 to 26/11/02
