TY - GEN
T1 - Stochastic attribute selection committees
AU - Zheng, Zijian
AU - Webb, Geoffrey I.
PY - 1998/1/1
Y1 - 1998/1/1
N2 - Classifier committee learning methods generate multiple classifiers to form a committee by repeated application of a single base learning algorithm. The committee members vote to decide the final classification. Two such methods, Bagging and Boosting, have shown great success with decision tree learning. They create different classifiers by modifying the distribution of the training set. This paper studies a different approach: Stochastic Attribute Selection Committee learning of decision trees. It generates classifier committees by stochastically modifying the set of attributes while keeping the distribution of the training set unchanged. An empirical evaluation of a variant of this method, namely SASC, in a representative collection of natural domains shows that the SASC method can significantly reduce the error rate of decision tree learning. On average, SASC is more accurate than Bagging and less accurate than Boosting, although a one-tailed sign-test fails to show that these differences are significant at a level of 0.05. In addition, it is found that, like Bagging, SASC is more stable than Boosting in the sense that it less frequently obtains significantly higher error rates than C4.5 and, when error is raised, produces smaller error rate increases. Moreover, like Bagging, SASC is amenable to parallel and distributed processing while Boosting is not.
AB - Classifier committee learning methods generate multiple classifiers to form a committee by repeated application of a single base learning algorithm. The committee members vote to decide the final classification. Two such methods, Bagging and Boosting, have shown great success with decision tree learning. They create different classifiers by modifying the distribution of the training set. This paper studies a different approach: Stochastic Attribute Selection Committee learning of decision trees. It generates classifier committees by stochastically modifying the set of attributes while keeping the distribution of the training set unchanged. An empirical evaluation of a variant of this method, namely SASC, in a representative collection of natural domains shows that the SASC method can significantly reduce the error rate of decision tree learning. On average, SASC is more accurate than Bagging and less accurate than Boosting, although a one-tailed sign-test fails to show that these differences are significant at a level of 0.05. In addition, it is found that, like Bagging, SASC is more stable than Boosting in the sense that it less frequently obtains significantly higher error rates than C4.5 and, when error is raised, produces smaller error rate increases. Moreover, like Bagging, SASC is amenable to parallel and distributed processing while Boosting is not.
UR - http://www.scopus.com/inward/record.url?scp=80052751840&partnerID=8YFLogxK
M3 - Conference Paper
AN - SCOPUS:80052751840
SN - 3540651381
SN - 9783540651383
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 321
EP - 332
BT - Advanced Topics in Artificial Intelligence - 11th Australian Joint Conference on Artificial Intelligence, AI 1998, Selected Papers
A2 - Antoniou, Grigoris
A2 - Slaney, John
PB - Springer
T2 - 11th Australian Joint Conference on Artificial Intelligence, AI 98
Y2 - 13 July 1998 through 17 July 1998
ER -