Stochastic attribute selection committees with multiple boosting: Learning more accurate and more stable classifier committees

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Classifier learning is a key technique for KDD. Approaches to learning classifier committees, including Boosting, Bagging, Sasc, and SascB, have demonstrated great success in increasing the prediction accuracy of decision trees. Boosting and Bagging create different classifiers by modifying the distribution of the training set. Sasc adopts a different method: it generates committees by stochastic manipulation of the set of attributes considered at each node during tree induction, while keeping the distribution of the training set unchanged. SascB, a combination of Boosting and Sasc, has shown the ability to further increase, on average, the prediction accuracy of decision trees. It has been found that the performance of SascB and Boosting is more variable than that of Sasc, although SascB is more accurate than the others on average. In this paper, we present a novel method to reduce the variability of SascB and Boosting, and to further increase their average accuracy. It generates multiple committees by incorporating Bagging into SascB. As well as improving stability and average accuracy, the resulting method is amenable to parallel or distributed processing, while Boosting and SascB are not. This is an important characteristic for data mining in large datasets.
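The combination described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no pseudocode, so every detail here is an assumption made for illustration, including the binary attributes, the weighted-error split criterion, the AdaBoost.M1-style reweighting, the per-node attribute probability `p_attr`, the tree depth, and the weighted vote over all members. All function names are hypothetical.

```python
import math
import random
from collections import Counter

def weighted_majority(labels, weights):
    totals = Counter()
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

def build_tree(X, y, w, rng, p_attr=0.5, depth=2):
    """Shallow tree over 0/1 attributes; labels must not be tuples."""
    leaf = weighted_majority(y, w)
    if depth == 0 or len(set(y)) == 1:
        return leaf
    # Stochastic attribute selection (assumed form): each attribute is
    # available at this node with probability p_attr.
    candidates = [a for a in range(len(X[0])) if rng.random() < p_attr]
    best = None
    for a in candidates:
        parts = {0: [], 1: []}
        for i, row in enumerate(X):
            parts[row[a]].append(i)
        if not parts[0] or not parts[1]:
            continue
        err = 0.0
        for idx in parts.values():
            maj = weighted_majority([y[i] for i in idx], [w[i] for i in idx])
            err += sum(w[i] for i in idx if y[i] != maj)
        if best is None or err < best[0]:
            best = (err, a, parts)
    if best is None:
        return leaf
    _, a, parts = best
    children = {
        v: build_tree([X[i] for i in idx], [y[i] for i in idx],
                      [w[i] for i in idx], rng, p_attr, depth - 1)
        for v, idx in parts.items()
    }
    return (a, children)

def tree_predict(tree, row):
    while isinstance(tree, tuple):
        a, children = tree
        tree = children[row[a]]
    return tree

def boosted_committee(X, y, rng, rounds=5, **tree_args):
    """Boosting over stochastic trees (AdaBoost.M1-style, an assumption)."""
    n = len(X)
    w = [1.0 / n] * n
    members = []
    for _ in range(rounds):
        tree = build_tree(X, y, w, rng, **tree_args)
        wrong = [tree_predict(tree, r) != t for r, t in zip(X, y)]
        err = sum(wi for wi, bad in zip(w, wrong) if bad)
        if err >= 0.5:
            w = [1.0 / n] * n  # discard a too-weak tree and reset the weights
            continue
        beta = max(err, 1e-10) / (1.0 - max(err, 1e-10))
        members.append((math.log(1.0 / beta), tree))
        # Down-weight correctly classified examples, then renormalise.
        w = [wi * (1.0 if bad else beta) for wi, bad in zip(w, wrong)]
        total = sum(w)
        w = [wi / total for wi in w]
    if not members:  # fallback so every subcommittee casts a vote
        members.append((1.0, build_tree(X, y, [1.0 / n] * n, rng, **tree_args)))
    return members

def multi_committee(X, y, n_sub=3, seed=0, **kw):
    """Bagging layer: one boosted subcommittee per bootstrap sample."""
    n = len(X)
    committees = []
    for k in range(n_sub):
        rng = random.Random(seed + k)  # independent stream per subcommittee
        idx = [rng.randrange(n) for _ in range(n)]
        committees.append(boosted_committee(
            [X[i] for i in idx], [y[i] for i in idx], rng, **kw))
    return committees

def predict(committees, row):
    votes = Counter()
    for members in committees:
        for alpha, tree in members:
            votes[tree_predict(tree, row)] += alpha
    return max(votes, key=votes.get)
```

Each iteration of the loop in `multi_committee` depends only on its own bootstrap sample and random stream, so in this sketch the subcommittees could be built on separate workers. The boosting rounds inside one subcommittee stay sequential, because each round needs the weights produced by the previous one.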

Original language: English
Title of host publication: Methodologies for Knowledge Discovery and Data Mining - 3rd Pacific-Asia Conference, PAKDD 1999, Proceedings
Editors: Lizhu Zhou, Ning Zhong
Publisher: Springer
Pages: 123-132
Number of pages: 10
ISBN (Print): 3540658661, 9783540658665
Publication status: Published - 1 Jan 1999
Externally published: Yes
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining 1999 - Beijing, China
Duration: 26 Apr 1999 to 28 Apr 1999
Conference number: 3rd
https://link.springer.com/book/10.1007/3-540-48912-6 (Proceedings)

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1574
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining 1999
Abbreviated title: PAKDD 1999
Country/Territory: China
City: Beijing
Period: 26/04/99 to 28/04/99
