Abstract
Defect prediction models are classifiers that are trained to identify defect-prone software modules. Such classifiers have configurable parameters that control their characteristics (e.g., the number of trees in a random forest classifier). Recent studies show that these classifiers may underperform due to the use of suboptimal default parameter settings. However, it is impractical to assess all of the possible settings in the parameter spaces. In this paper, we investigate the performance of defect prediction models where Caret, an automated parameter optimization technique, has been applied. Through a case study of 18 datasets from systems that span both proprietary and open source domains, we find that (1) Caret improves the AUC performance of defect prediction models by as much as 40 percentage points; (2) Caret-optimized classifiers are at least as stable as (with 35% of them being more stable than) classifiers that are trained using the default settings; and (3) Caret increases the likelihood of producing a top-performing classifier by as much as 83%. Hence, we conclude that parameter settings can indeed have a large impact on the performance of defect prediction models, suggesting that researchers should experiment with the parameters of the classification techniques. Since automated parameter optimization techniques like Caret yield substantial benefits in terms of performance improvement and stability, while incurring a manageable additional computational cost, they should be included in future defect prediction studies.
Original language | English |
---|---|
Title of host publication | Proceedings - 2016 IEEE/ACM 38th IEEE International Conference on Software Engineering Companion, ICSE 2016 |
Subtitle of host publication | 14-22 May 2016 Austin, Texas, USA |
Editors | Willem Visser, Laurie Williams |
Place of Publication | New York NY USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 321-332 |
Number of pages | 12 |
ISBN (Electronic) | 9781450339001, 9781450342056 |
Publication status | Published - 2016 |
Externally published | Yes |
Event | International Conference on Software Engineering 2016 - Renaissance Austin Hotel, Austin, United States of America; Duration: 14 May 2016 → 22 May 2016; Conference number: 38th; http://2016.icse.cs.txstate.edu/; https://ieeexplore.ieee.org/xpl/conhome/7878354/proceeding (Proceedings) |
Conference
Conference | International Conference on Software Engineering 2016 |
---|---|
Abbreviated title | ICSE 2016 |
Country/Territory | United States of America |
City | Austin |
Period | 14/05/16 → 22/05/16 |
Keywords
- Classification techniques
- Experimental design
- Parameter optimization
- Software defect prediction