Naïve Bayes is one of the most popular data mining algorithms. Its efficiency comes from the assumption of attribute independence, although this assumption is often violated in real-world data sets. Many efforts have been made to mitigate the assumption, among which attribute selection is an important approach. However, conventional approaches to attribute selection in naïve Bayes suffer from heavy computational overhead. This paper proposes an efficient selective naïve Bayes algorithm that uses only a subset of the attributes to construct selective naïve Bayes models. These models are built in such a way that each one is a trivial extension of another. The most predictive selective naïve Bayes model can then be selected by incremental leave-one-out cross validation, so attribute selection reduces to efficient model selection. Empirical results demonstrate that selective naïve Bayes achieves superior classification accuracy while maintaining the simplicity and efficiency of naïve Bayes.
- Attribute selection
- Leave-one-out cross validation
- Model selection
- Naïve Bayes
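The selection scheme described in the abstract, nested naïve Bayes models grown one attribute at a time and scored by leave-one-out cross validation, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: it assumes categorical attributes, Laplace smoothing, and a pre-ranked attribute order, and all names (`selective_naive_bayes`, `loo_accuracy`, `ranked_attrs`) are hypothetical. The LOO score is computed incrementally by subtracting the held-out instance from precomputed count tables rather than retraining.

```python
from collections import defaultdict
import math

def loo_accuracy(X, y, attrs, class_counts, cond_counts, classes, values):
    """Leave-one-out accuracy of the naive Bayes model over `attrs`,
    obtained by subtracting each held-out instance from the counts."""
    n = len(y)
    correct = 0
    for i in range(n):
        best_c, best_lp = None, -math.inf
        for c in classes:
            # class prior with the held-out instance removed (Laplace-smoothed)
            cc = class_counts[c] - (1 if y[i] == c else 0)
            lp = math.log((cc + 1) / (n - 1 + len(classes)))
            for a in attrs:
                v = X[i][a]
                # conditional count minus the held-out instance's contribution
                cnt = cond_counts[(a, c)][v] - (1 if y[i] == c else 0)
                lp += math.log((cnt + 1) / (cc + len(values[a])))
            if lp > best_lp:
                best_lp, best_c = lp, c
        correct += (best_c == y[i])
    return correct / n

def selective_naive_bayes(X, y, ranked_attrs):
    """Grow nested models, each a trivial extension of the previous one,
    and keep the attribute subset with the highest LOO accuracy."""
    classes = sorted(set(y))
    class_counts = {c: sum(1 for t in y if t == c) for c in classes}
    values = {a: sorted({x[a] for x in X}) for a in ranked_attrs}
    cond_counts = {}
    for a in ranked_attrs:
        for c in classes:
            d = defaultdict(int)
            for xi, yi in zip(X, y):
                if yi == c:
                    d[xi[a]] += 1
            cond_counts[(a, c)] = d
    # baseline: the empty model (class prior only)
    best_attrs = []
    best_acc = loo_accuracy(X, y, [], class_counts, cond_counts,
                            classes, values)
    attrs = []
    for a in ranked_attrs:
        attrs = attrs + [a]  # extend the previous model by one attribute
        acc = loo_accuracy(X, y, attrs, class_counts, cond_counts,
                           classes, values)
        if acc > best_acc:
            best_acc, best_attrs = acc, list(attrs)
    return best_attrs, best_acc

# toy data: 'f1' determines the class, 'f2' is uninformative noise
X = [{'f1': 0, 'f2': 0}, {'f1': 0, 'f2': 1},
     {'f1': 0, 'f2': 0}, {'f1': 0, 'f2': 1},
     {'f1': 1, 'f2': 0}, {'f1': 1, 'f2': 1},
     {'f1': 1, 'f2': 0}, {'f1': 1, 'f2': 1}]
y = [0, 0, 0, 0, 1, 1, 1, 1]
best_attrs, best_acc = selective_naive_bayes(X, y, ['f1', 'f2'])
```

On this toy data the procedure keeps only `f1`: adding the noise attribute `f2` cannot improve the LOO score, so the smaller model wins. Because each model reuses the previous one's count tables, the whole search costs little more than a single naïve Bayes training pass plus the LOO evaluations.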