Constructing defect predictors and communicating the outcomes to practitioners

Taneli Taipale, Mika Qvist, Burak Turhan

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

5 Citations (Scopus)


Background: An alternative to expert-based decisions is to make data-driven decisions, and software analytics is the key enabler of this evidence-based management approach. Defect prediction is a popular application area of software analytics, but it faces serious challenges when deployed in practice.

Goal: We aim to develop and deploy a defect prediction model that guides practitioners to focus their activities on the most problematic parts of the software and improves the efficiency of the testing process.

Method: We present a pilot study in which we developed a defect prediction model and different modes of representing the data and the model outcomes, namely commit hotness ranking, error probability mapping to the source, and visualization of interactions among teams through errors. We also share the challenges and lessons learned in the process.

Result: In terms of standard performance measures, the constructed defect prediction model performs similarly to those reported in earlier studies, e.g. 80% of errors can be detected by inspecting 30% of the source. However, feedback from practitioners indicates that such performance figures are not useful for making an impact on their daily work. Pointing out the most problematic source files, and even isolating error-prone sections within files, is regarded by the practitioners as stating the obvious, though the latter is found to be helpful for activities such as refactoring. On the other hand, visualizing the interactions among teams, based on the errors introduced and fixed, turns out to be the most helpful representation, as it helps pinpoint communication-related issues within and across teams.

Conclusion: The constructed predictor can give accurate information about the most error-prone parts. Creating practical representations from this data is possible but takes effort. The error prediction research done at Elektrobit Wireless Ltd is concluded to be useful, and we will further improve the presentations made from the error prediction data.
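A figure such as "80% of errors can be detected by inspecting 30% of the source" is typically obtained by ranking source units by predicted error probability and counting the known errors that fall within an inspection budget. The following is a minimal sketch of that calculation, not the authors' implementation: the record layout, the function name `errors_found_at_effort`, the whole-file budget rule, and the sample numbers are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical record: (lines_of_code, predicted_error_probability, known_error_count)
FileRecord = Tuple[int, float, int]


def errors_found_at_effort(files: List[FileRecord], effort: float) -> float:
    """Return the fraction of known errors covered when files are inspected
    in descending order of predicted probability until the budget
    (`effort`, a fraction of total lines of code) would be exceeded."""
    total_loc = sum(loc for loc, _, _ in files)
    total_errors = sum(errs for _, _, errs in files)
    if total_loc == 0 or total_errors == 0:
        return 0.0

    budget = effort * total_loc
    inspected_loc = 0
    found = 0
    # Most suspicious files first.
    for loc, _, errs in sorted(files, key=lambda f: f[1], reverse=True):
        if inspected_loc + loc > budget:
            break  # assumption: only whole files are inspected
        inspected_loc += loc
        found += errs
    return found / total_errors


# Made-up data, chosen only to exercise the function.
example = [
    (100, 0.9, 5),
    (150, 0.8, 3),
    (350, 0.3, 1),
    (400, 0.1, 1),
]
recall = errors_found_at_effort(example, 0.30)
print(f"Errors covered at 30% inspection effort: {recall:.0%}")
```

With this sample data, the two highest-ranked files fit in the 30% budget and contain 8 of the 10 known errors. Measuring effort in lines of code rather than file count is one common choice; the abstract does not state which unit the study used.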

Original language: English
Title of host publication: 2013 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Number of pages: 6
ISBN (Print): 9781479911448
Publication status: Published - 1 Dec 2013
Externally published: Yes
Event: International Symposium on Empirical Software Engineering and Measurement 2013 - Baltimore, United States of America
Duration: 10 Oct 2013 → 11 Oct 2013


Conference: International Symposium on Empirical Software Engineering and Measurement 2013
Abbreviated title: ESEM 2013
Country/Territory: United States of America


  • data-driven decisions
  • error prediction
  • machine learning algorithms
  • prediction algorithms
  • software testing
