TY - JOUR
T1 - Improving defect prediction with deep forest
AU - Zhou, Tianchi
AU - Sun, Xiaobing
AU - Xia, Xin
AU - Li, Bin
AU - Chen, Xiang
PY - 2019/10
Y1 - 2019/10
N2 - Context: Software defect prediction is important to ensure the quality of software. Nowadays, many supervised learning techniques have been applied to identify defective instances (e.g., methods, classes, and modules). Objective: However, the performance of these supervised learning techniques is still far from satisfactory, and it is important to design more advanced techniques to improve the performance of defect prediction models. Method: We propose a new deep forest model to build the defect prediction model (DPDF). This model can identify more important defect features by using a new cascade strategy, which transforms random forest classifiers into a layer-by-layer structure. This design takes full advantage of ensemble learning and deep learning. Results: We evaluate our approach on 25 open-source projects from four public datasets (i.e., NASA, PROMISE, AEEEM and Relink). Experimental results show that our approach increases the AUC value by 5% compared with the best traditional machine learning algorithms. Conclusion: The deep strategy in DPDF is effective for software defect prediction.
AB - Context: Software defect prediction is important to ensure the quality of software. Nowadays, many supervised learning techniques have been applied to identify defective instances (e.g., methods, classes, and modules). Objective: However, the performance of these supervised learning techniques is still far from satisfactory, and it is important to design more advanced techniques to improve the performance of defect prediction models. Method: We propose a new deep forest model to build the defect prediction model (DPDF). This model can identify more important defect features by using a new cascade strategy, which transforms random forest classifiers into a layer-by-layer structure. This design takes full advantage of ensemble learning and deep learning. Results: We evaluate our approach on 25 open-source projects from four public datasets (i.e., NASA, PROMISE, AEEEM and Relink). Experimental results show that our approach increases the AUC value by 5% compared with the best traditional machine learning algorithms. Conclusion: The deep strategy in DPDF is effective for software defect prediction.
KW - Cascade strategy
KW - Deep forest
KW - Empirical evaluation
KW - Software defect prediction
UR - http://www.scopus.com/inward/record.url?scp=85068541005&partnerID=8YFLogxK
U2 - 10.1016/j.infsof.2019.07.003
DO - 10.1016/j.infsof.2019.07.003
M3 - Article
AN - SCOPUS:85068541005
SN - 0950-5849
VL - 114
SP - 204
EP - 216
JO - Information and Software Technology
JF - Information and Software Technology
ER -