TY - JOUR
T1 - A novel aggregate gene selection method for microarray data classification
AU - Nguyen, Thanh
AU - Khosravi, Abbas
AU - Creighton, Douglas
AU - Nahavandi, Saeid
N1 - Funding Information:
This research is supported by the Australian Research Council (discovery grant DP120102112) and the Centre for Intelligent Systems Research (CISR) at Deakin University.
Publisher Copyright:
© 2015 Elsevier B.V.
PY - 2015/8/1
Y1 - 2015/8/1
N2 - This paper introduces a novel method for gene selection based on a modification of the analytic hierarchy process (AHP). The modified AHP (MAHP) is able to deal with quantitative factors that are statistics of five individual gene ranking methods: two-sample t-test, entropy test, receiver operating characteristic curve, Wilcoxon test, and signal-to-noise ratio. The most prominent discriminant genes serve as inputs to a range of classifiers including linear discriminant analysis, k-nearest neighbors, probabilistic neural network, support vector machine, and multilayer perceptron. Gene subsets selected by MAHP are compared with those of four competing approaches: information gain, symmetrical uncertainty, Bhattacharyya distance, and ReliefF. Four benchmark microarray datasets (diffuse large B-cell lymphoma, leukemia cancer, prostate, and colon) are used for the experiments. As the number of samples in microarray datasets is limited, the leave-one-out cross-validation strategy is applied rather than traditional cross-validation. Experimental results demonstrate that the proposed MAHP significantly outperforms the competing methods in terms of both accuracy and stability. With the benefit of low computational cost, MAHP is useful for cancer diagnosis using DNA gene expression profiles in real clinical practice.
AB - This paper introduces a novel method for gene selection based on a modification of the analytic hierarchy process (AHP). The modified AHP (MAHP) is able to deal with quantitative factors that are statistics of five individual gene ranking methods: two-sample t-test, entropy test, receiver operating characteristic curve, Wilcoxon test, and signal-to-noise ratio. The most prominent discriminant genes serve as inputs to a range of classifiers including linear discriminant analysis, k-nearest neighbors, probabilistic neural network, support vector machine, and multilayer perceptron. Gene subsets selected by MAHP are compared with those of four competing approaches: information gain, symmetrical uncertainty, Bhattacharyya distance, and ReliefF. Four benchmark microarray datasets (diffuse large B-cell lymphoma, leukemia cancer, prostate, and colon) are used for the experiments. As the number of samples in microarray datasets is limited, the leave-one-out cross-validation strategy is applied rather than traditional cross-validation. Experimental results demonstrate that the proposed MAHP significantly outperforms the competing methods in terms of both accuracy and stability. With the benefit of low computational cost, MAHP is useful for cancer diagnosis using DNA gene expression profiles in real clinical practice.
KW - Analytic hierarchy process
KW - Classification
KW - Gene expression profiles
KW - Gene selection
KW - Microarray data
UR - http://www.scopus.com/inward/record.url?scp=84928923086&partnerID=8YFLogxK
U2 - 10.1016/j.patrec.2015.03.018
DO - 10.1016/j.patrec.2015.03.018
M3 - Article
AN - SCOPUS:84928923086
SN - 0167-8655
VL - 60-61
SP - 16
EP - 23
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
ER -