Implications of ceiling effects in defect predictors

Tim Menzies, Burak Turhan, Ayse Bener, Gregory Gay, Bojan Cukic, Yue Jiang

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

135 Citations (Scopus)


Context: There are many methods that input static code features and output a predictor for faulty code modules. These data mining methods have hit a "performance ceiling"; i.e., some inherent upper bound on the amount of information offered by, say, static code features when identifying modules which contain faults.

Objective: We seek an explanation for this ceiling effect. Perhaps static code features have "limited information content"; i.e., their information can be quickly and completely discovered by even simple learners.

Method: An initial literature review documents the ceiling effect in other work. Next, using three sub-sampling techniques (under-, over-, and micro-sampling), we look for the lower useful bound on the number of training instances.

Results: Using micro-sampling, we find that as few as 50 instances yield as much information as larger training sets.

Conclusions: We have found much evidence for the limited information hypothesis. Further progress in learning defect predictors may not come from better algorithms. Rather, we need to be improving the information content of the training data, perhaps with case-based reasoning methods.
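The three sub-sampling techniques named in the Method can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not define the techniques, so the definitions used here are assumptions (under-sampling shrinks the majority class to the minority size, over-sampling repeats minority rows up to the majority size, and micro-sampling draws m defective and m non-defective rows). The function names, the `label_of` callback, and the row layout are all hypothetical.

```python
import random


def split_by_class(rows, label_of):
    """Partition rows into (defective, non_defective) using label_of(row)."""
    defective = [r for r in rows if label_of(r)]
    clean = [r for r in rows if not label_of(r)]
    return defective, clean


def under_sample(rows, label_of, rng):
    """Assumed definition: cut the majority class down to the minority size."""
    defective, clean = split_by_class(rows, label_of)
    minority, majority = sorted([defective, clean], key=len)
    return minority + rng.sample(majority, len(minority))


def over_sample(rows, label_of, rng):
    """Assumed definition: repeat minority rows until the classes balance."""
    defective, clean = split_by_class(rows, label_of)
    minority, majority = sorted([defective, clean], key=len)
    if not minority:
        return list(rows)  # nothing to duplicate
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return majority + minority + extra


def micro_sample(rows, label_of, m, rng):
    """Assumed definition: keep m defective and m non-defective rows."""
    defective, clean = split_by_class(rows, label_of)
    if m > min(len(defective), len(clean)):
        raise ValueError("m exceeds the size of the smaller class")
    return rng.sample(defective, m) + rng.sample(clean, m)
```

For a hypothetical data set of 1000 modules with 100 defective, these sketches return 200 rows (under-sampling), 1800 rows (over-sampling), and 50 rows for micro-sampling with m = 25. Under these assumed definitions, a 50-instance training set corresponds to m = 25.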

Original language: English
Title of host publication: 30th International Conference on Software Engineering, ICSE 2008 Co-located Workshops - Proceedings of the 4th International Workshop on Predictor Models in Software Engineering, PROMISE'08
Number of pages: 8
Publication status: Published - 8 Dec 2008
Externally published: Yes
Event: International Conference on Software Engineering 2008 - Leipzig, Germany
Duration: 10 May 2008 to 18 May 2008
Conference number: 30th (Proceedings)


Conference: International Conference on Software Engineering 2008
Abbreviated title: ICSE 2008


  • Defect prediction
  • Naive bayes
  • Over-sampling
  • Under-sampling
