A novel selective naïve Bayes algorithm

Shenglei Chen, Geoffrey I. Webb, Linyuan Liu, Xin Ma

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Naïve Bayes is one of the most popular data mining algorithms. Its efficiency comes from the assumption of attribute independence, although this assumption may be violated in many real-world data sets. Many efforts have been made to mitigate the assumption, among which attribute selection is an important approach. However, conventional approaches to attribute selection in naïve Bayes suffer from heavy computational overhead. This paper proposes an efficient selective naïve Bayes algorithm, which uses only some of the attributes to construct selective naïve Bayes models. These models are built in such a way that each one is a trivial extension of another. The most predictive selective naïve Bayes model can be selected using measures of incremental leave-one-out cross validation. As a result, attributes can be selected by efficient model selection. Empirical results demonstrate that selective naïve Bayes achieves superior classification accuracy while maintaining simplicity and efficiency.
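
The idea of nested models scored by incremental leave-one-out cross validation can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes categorical attributes, a ranking of attributes by mutual information with the class, Laplace smoothing (`alpha`), and zero-one loss as the selection measure; none of these details are stated in the abstract. The names `mutual_information` and `select_attributes` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(column, labels):
    """Empirical mutual information between one attribute column and the class."""
    n = len(labels)
    joint = Counter(zip(column, labels))
    px, py = Counter(column), Counter(labels)
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def select_attributes(rows, labels, alpha=1.0):
    """Return (selected attribute indices, LOOCV error count of each nested model)."""
    n, d = len(rows), len(rows[0])
    classes = sorted(set(labels))
    # Assumed ranking criterion: mutual information with the class, highest first.
    order = sorted(range(d),
                   key=lambda j: -mutual_information([r[j] for r in rows], labels))
    # Count tables are built once from the full training set.
    class_count = Counter(labels)
    pair_count = [Counter((r[j], y) for r, y in zip(rows, labels)) for j in range(d)]
    n_values = [len({r[j] for r in rows}) for j in range(d)]

    # errors[k] = leave-one-out mistakes of the model using the first k ranked attributes.
    errors = [0] * (d + 1)
    for row, y in zip(rows, labels):
        # Leave the instance out by subtracting its own contribution from the counts.
        score = {}
        for c in classes:
            held = 1 if c == y else 0
            score[c] = math.log((class_count[c] - held + alpha)
                                / (n - 1 + alpha * len(classes)))
        errors[0] += max(classes, key=score.get) != y
        # Each model extends the previous one by a single attribute, so its
        # class scores are the previous scores plus one extra log term.
        for k, j in enumerate(order, start=1):
            for c in classes:
                held = 1 if c == y else 0
                num = pair_count[j][(row[j], c)] - held + alpha
                den = class_count[c] - held + alpha * n_values[j]
                score[c] += math.log(num / den)
            errors[k] += max(classes, key=score.get) != y

    # Ties go to the smaller model because min() returns the first minimum.
    best_k = min(range(d + 1), key=errors.__getitem__)
    return order[:best_k], errors
```

Because every nested model reuses the running class scores of its predecessor, a single pass over the training data evaluates all of the candidate models at once, which is the source of the efficiency the abstract refers to.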

Original language: English
Article number: 105361
Number of pages: 12
Journal: Knowledge-Based Systems
DOI: 10.1016/j.knosys.2019.105361
Publication status: Accepted/In press - 16 Dec 2019

Keywords

  • Attribute selection
  • Leave-one-out cross validation
  • Model selection
  • Naïve Bayes

Cite this

Chen, Shenglei ; Webb, Geoffrey I. ; Liu, Linyuan ; Ma, Xin. / A novel selective naïve Bayes algorithm. In: Knowledge-Based Systems. 2019.
@article{b5dbf13554d1468a9d49fac88fdf22b3,
title = "A novel selective na{\"i}ve Bayes algorithm",
abstract = "Na{\"i}ve Bayes is one of the most popular data mining algorithms. Its efficiency comes from the assumption of attribute independence, although this assumption may be violated in many real-world data sets. Many efforts have been made to mitigate the assumption, among which attribute selection is an important approach. However, conventional approaches to attribute selection in na{\"i}ve Bayes suffer from heavy computational overhead. This paper proposes an efficient selective na{\"i}ve Bayes algorithm, which uses only some of the attributes to construct selective na{\"i}ve Bayes models. These models are built in such a way that each one is a trivial extension of another. The most predictive selective na{\"i}ve Bayes model can be selected using measures of incremental leave-one-out cross validation. As a result, attributes can be selected by efficient model selection. Empirical results demonstrate that selective na{\"i}ve Bayes achieves superior classification accuracy while maintaining simplicity and efficiency.",
keywords = "Attribute selection, Leave-one-out cross validation, Model selection, Na{\"i}ve Bayes",
author = "Shenglei Chen and Webb, {Geoffrey I.} and Linyuan Liu and Xin Ma",
year = "2019",
month = "12",
day = "16",
doi = "10.1016/j.knosys.2019.105361",
language = "English",
journal = "Knowledge-Based Systems",
issn = "0950-7051",
publisher = "Elsevier",
}

TY - JOUR
T1 - A novel selective naïve Bayes algorithm
AU - Chen, Shenglei
AU - Webb, Geoffrey I.
AU - Liu, Linyuan
AU - Ma, Xin
PY - 2019/12/16
Y1 - 2019/12/16
AB - Naïve Bayes is one of the most popular data mining algorithms. Its efficiency comes from the assumption of attribute independence, although this assumption may be violated in many real-world data sets. Many efforts have been made to mitigate the assumption, among which attribute selection is an important approach. However, conventional approaches to attribute selection in naïve Bayes suffer from heavy computational overhead. This paper proposes an efficient selective naïve Bayes algorithm, which uses only some of the attributes to construct selective naïve Bayes models. These models are built in such a way that each one is a trivial extension of another. The most predictive selective naïve Bayes model can be selected using measures of incremental leave-one-out cross validation. As a result, attributes can be selected by efficient model selection. Empirical results demonstrate that selective naïve Bayes achieves superior classification accuracy while maintaining simplicity and efficiency.
KW - Attribute selection
KW - Leave-one-out cross validation
KW - Model selection
KW - Naïve Bayes
UR - http://www.scopus.com/inward/record.url?scp=85077154737&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2019.105361
DO - 10.1016/j.knosys.2019.105361
M3 - Article
AN - SCOPUS:85077154737
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
SN - 0950-7051
M1 - 105361
ER -