An evaluation of classifier-specific filter measure performance for feature selection

Cecille Freeman, Dana Kulić, Otman Basir

Research output: Contribution to journal, Article, Research, peer-review

57 Citations (Scopus)

Abstract

Feature selection is an important part of classifier design. There are many possible methods for searching and evaluating feature subsets, but little consensus on which methods are best. This paper examines a number of filter-based feature subset evaluation measures with the goal of assessing their performance with respect to specific classifiers. This work tests 16 common filter measures for use with K-nearest neighbors and support vector machine classifiers. The measures are tested on 20 real and 20 artificial data sets, which are designed to probe for specific challenges. The strengths and weaknesses of each measure are discussed with respect to the specific challenges and correlation with classifier accuracy. The results highlight several challenging problems with a number of common filter measures. The results indicate that the best filter measure is classifier-specific. K-nearest neighbors classifiers work well with subset-based RELIEF, correlation feature selection or conditional mutual information maximization, whereas Fisher's interclass separability criterion and conditional mutual information maximization work better for support vector machines. Despite the large number and variety of feature selection measures proposed in the literature, no single measure is guaranteed to outperform the others, even within a single classifier, and the overall performance of a feature selection method cannot be characterized independently of the subsequent classifier.
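As a rough illustration of what "correlation with classifier accuracy" can mean for a filter measure, the following sketch scores every one- and two-feature subset of a toy data set and compares the scores against nearest-neighbor accuracy. It is not the paper's protocol: the per-feature, two-class Fisher-style score is a simplified stand-in for Fisher's interclass separability criterion, and the toy data generator, the leave-one-out 1-NN evaluation, and all function names are assumptions made for illustration only.

```python
import random
import statistics
from itertools import combinations


def fisher_score(X, y, subset):
    """Simplified two-class Fisher-style score (an assumption, not the paper's
    exact criterion): sum over the subset's features of
    (difference of class means)^2 / (sum of class variances)."""
    score = 0.0
    for j in subset:
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        between = (statistics.mean(a) - statistics.mean(b)) ** 2
        within = statistics.pvariance(a) + statistics.pvariance(b)
        score += between / (within + 1e-12)  # small constant avoids division by zero
    return score


def loo_1nn_accuracy(X, y, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier that only
    sees the features listed in `subset`."""
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in subset)
            if d < best_dist:
                best_dist, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = statistics.mean(u), statistics.mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)


def make_toy_data(n=60, seed=0):
    """Hypothetical toy data: one informative feature, one weak feature,
    and two pure-noise features, with alternating class labels."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([
            rng.gauss(3.0 * label, 1.0),  # informative
            rng.gauss(1.0 * label, 1.0),  # weakly informative
            rng.gauss(0.0, 1.0),          # noise
            rng.gauss(0.0, 1.0),          # noise
        ])
        y.append(label)
    return X, y


X, y = make_toy_data()
subsets = [c for r in (1, 2) for c in combinations(range(4), r)]
scores = [fisher_score(X, y, s) for s in subsets]
accs = [loo_1nn_accuracy(X, y, s) for s in subsets]
r = pearson(scores, accs)
print(f"Correlation between filter score and 1-NN accuracy: {r:.3f}")
```

Swapping `loo_1nn_accuracy` for a different classifier's accuracy estimate and recomputing the correlation is the kind of comparison that would show whether a given filter measure tracks one classifier better than another.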

Original language: English
Pages (from-to): 1812-1826
Number of pages: 15
Journal: Pattern Recognition
Volume: 48
Issue number: 5
Publication status: Published - 1 May 2015
Externally published: Yes

Keywords

  • Classification
  • Feature selection
  • Filter measures
