Automated parameter optimization of classification techniques for defect prediction models

Chakkrit Tantithamthavorn, Shane McIntosh, Ahmed E. Hassan, Kenichi Matsumoto

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

229 Citations (Scopus)

Abstract

Defect prediction models are classifiers that are trained to identify defect-prone software modules. Such classifiers have configurable parameters that control their characteristics (e.g., the number of trees in a random forest classifier). Recent studies show that these classifiers may underperform due to the use of suboptimal default parameter settings. However, it is impractical to assess all of the possible settings in the parameter spaces. In this paper, we investigate the performance of defect prediction models where Caret, an automated parameter optimization technique, has been applied. Through a case study of 18 datasets from systems that span both proprietary and open source domains, we find that (1) Caret improves the AUC performance of defect prediction models by as much as 40 percentage points; (2) Caret-optimized classifiers are at least as stable as (with 35% of them being more stable than) classifiers that are trained using the default settings; and (3) Caret increases the likelihood of producing a top-performing classifier by as much as 83%. Hence, we conclude that parameter settings can indeed have a large impact on the performance of defect prediction models, suggesting that researchers should experiment with the parameters of the classification techniques. Since automated parameter optimization techniques like Caret yield substantial benefits in terms of performance improvement and stability, while incurring a manageable additional computational cost, they should be included in future defect prediction studies.
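The general idea behind automated parameter optimization (evaluate a set of candidate settings and keep the one with the best AUC) can be illustrated with a minimal Python sketch. This is not Caret itself, which is an R package, and it does not reproduce the paper's experimental setup. The k-nearest-neighbour classifier, the single holdout split, the synthetic two-feature data, and all function names below are assumptions chosen only to keep the example self-contained.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (defective, clean) pairs that the scores rank correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def knn_score(train, point, k):
    """Fraction of the k nearest training modules that are defective."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point)),
    )[:k]
    return sum(label for _, label in nearest) / k

def evaluate(train, test, k):
    """AUC of a k-NN classifier with parameter k on the test modules."""
    scores = [knn_score(train, features, k) for features, _ in test]
    return auc(scores, [label for _, label in test])

def optimize(train, validation, candidates):
    """Try every candidate setting and return the best one with all results."""
    results = {k: evaluate(train, validation, k) for k in candidates}
    best = max(results, key=results.get)
    return best, results

def make_data(n, rng):
    """Hypothetical modules with two metrics; defective ones score higher on average."""
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        centre = 1.0 if label else 0.0
        rows.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return rows

rng = random.Random(0)
train, validation = make_data(200, rng), make_data(100, rng)
best_k, results = optimize(train, validation, candidates=[1, 5, 15, 31])
print("AUC per candidate k:", results)
print("selected k:", best_k)
```

Comparing `results[best_k]` with the AUC of whichever setting a library uses by default gives a rough sense of how much a default can leave on the table. A more careful evaluation would repeat the comparison over many resamples rather than rely on one split.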

Original language: English
Title of host publication: Proceedings - 2016 IEEE/ACM 38th IEEE International Conference on Software Engineering Companion, ICSE 2016
Subtitle of host publication: 14-22 May 2016 Austin, Texas, USA
Editors: Willem Visser, Laurie Williams
Place of Publication: New York NY USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 321-332
Number of pages: 12
ISBN (Electronic): 9781450339001, 9781450342056
Publication status: Published - 2016
Externally published: Yes
Event: International Conference on Software Engineering 2016 - Renaissance Austin Hotel, Austin, United States of America
Duration: 14 May 2016 to 22 May 2016
Conference number: 38th
http://2016.icse.cs.txstate.edu/
https://ieeexplore.ieee.org/xpl/conhome/7878354/proceeding (Proceedings)

Conference

Conference: International Conference on Software Engineering 2016
Abbreviated title: ICSE 2016
Country/Territory: United States of America
City: Austin
Period: 14/05/16 to 22/05/16

Keywords

  • Classification techniques
  • Experimental design
  • Parameter optimization
  • Software defect prediction
