EFSPredictor: predicting configuration bugs with ensemble feature selection

Bowen Xu, David Lo, Xin Xia, Ashish Sureka, Shanping Li

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

9 Citations (Scopus)

Abstract

The configuration of a system determines its behavior, and wrong configuration settings can adversely affect the system's availability, performance, and correctness. We refer to these wrong configuration settings as configuration bugs. The importance of configuration bugs has prompted many researchers to study them, and past studies can be grouped into three categories: detection, localization, and fixing of configuration bugs. In this work, we focus on the detection of configuration bugs; in particular, we follow the line of work that tries to predict whether a bug report is caused by a wrong configuration setting. Automatically predicting whether a bug is a configuration bug can help developers reduce debugging effort. We propose a novel approach named EFSPredictor, which applies ensemble feature selection to the natural-language description of a bug report. It uses different feature selection approaches (e.g., ChiSquare, GainRatio, and Relief), which output different ranked lists of textual features. To obtain a set of representative textual features, EFSPredictor first assigns scores to the features output by each feature selection approach. Next, for each feature, EFSPredictor sums the scores it receives across the multiple ranked lists and outputs the top features (e.g., 25% of the total number of features) as the selected features. Finally, EFSPredictor builds a prediction model based on the selected features. We conduct experiments on 5 bug report datasets (i.e., accumulo, activemq, camel, flume, and wicket) containing a total of 3,203 bugs. The experimental results show that, on average across the 5 projects, EFSPredictor achieves an F1-score of 0.57, which improves on the state-of-the-art approach proposed by Xia et al. by 14%.
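
The score-summing step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how scores are assigned to ranked features, so the position-based scoring used here (the top feature in a list of n gets n points, the last gets 1) is an assumption, as are the function name `ensemble_select`, the toy feature names, and the three example rankings; none of these come from the paper.

```python
from collections import defaultdict
import math


def ensemble_select(ranked_lists, top_fraction=0.25):
    """Combine several ranked feature lists by summing per-list scores.

    Hypothetical helper: the scoring scheme is an assumption, not the
    paper's stated method.
    """
    totals = defaultdict(float)
    for ranking in ranked_lists:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            # Assumed scoring: best-ranked feature gets n points, last gets 1.
            totals[feature] += n - position
    # Sort by summed score, highest first; ties broken alphabetically.
    ordered = sorted(totals, key=lambda f: (-totals[f], f))
    # Keep the top fraction of features, but always at least one.
    keep = max(1, math.ceil(top_fraction * len(ordered)))
    return ordered[:keep]


# Toy rankings standing in for the output of three feature selectors.
chi_square = ["config", "property", "xml", "timeout",
              "null", "thread", "exception", "test"]
gain_ratio = ["property", "config", "timeout", "xml",
              "thread", "null", "test", "exception"]
relief = ["config", "xml", "property", "null",
          "timeout", "exception", "thread", "test"]

selected = ensemble_select([chi_square, gain_ratio, relief])
print(selected)
```

With eight toy features and a 25% cutoff, two features are kept. In a full pipeline, a classifier would then be trained on bug-report text restricted to the selected features.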

Original language: English
Title of host publication: Proceedings - 22nd Asia-Pacific Software Engineering Conference, APSEC 2015
Subtitle of host publication: 1–4 December 2015, New Delhi, India
Editors: Jing Sun, Y. Raghu Reddy, Arun Bahulkar, Anjaneyulu Pasala
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 206-213
Number of pages: 8
ISBN (Electronic): 9781467396448
DOIs
Publication status: Published - 2015
Externally published: Yes
Event: Asia-Pacific Software Engineering Conference 2015 - New Delhi, India
Duration: 1 Dec 2015 → 4 Dec 2015
Conference number: 22nd
https://isoft.acm.org/apsec2015/
https://ieeexplore.ieee.org/xpl/conhome/7467057/proceeding (Proceedings)

Conference

Conference: Asia-Pacific Software Engineering Conference 2015
Abbreviated title: APSEC 2015
Country: India
City: New Delhi
Period: 1/12/15 → 4/12/15
Internet address

Keywords

  • Configuration Bugs
  • Data Mining
  • Ensemble Feature Selection
