Constructing defect predictors and communicating the outcomes to practitioners

Taneli Taipale, Mika Qvist, Burak Turhan

Research output: Contribution to journal › Conference article › Research › peer-review

Abstract

Background: An alternative to expert-based decisions is to make data-driven decisions, and software analytics is the key enabler of this evidence-based management approach. Defect prediction is a popular application area of software analytics, albeit with serious challenges in deploying it into practice. Goal: We aim to develop and deploy a defect prediction model that guides practitioners in focusing their activities on the most problematic parts of the software and improves the efficiency of the testing process. Method: We present a pilot study in which we developed a defect prediction model and different modes of representing the data and the model outcomes, namely: commit hotness ranking, error-probability mapping onto the source, and visualization of interactions among teams through errors. We also share the challenges and lessons learned in the process. Result: In terms of standard performance measures, the constructed defect prediction model performs similarly to those reported in earlier studies, e.g. 80% of errors can be detected by inspecting 30% of the source. However, feedback from practitioners indicates that such performance figures alone do not help them in their daily work. Pointing out the most problematic source files, and even isolating error-prone sections within files, is regarded by practitioners as stating the obvious, though the latter is found helpful for activities such as refactoring. On the other hand, visualizing the interactions among teams, based on the errors introduced and fixed, turns out to be the most helpful representation, as it helps pinpoint communication-related issues within and across teams. Conclusion: The constructed predictor can give accurate information about the most error-prone parts. Creating practical representations from this data is possible, but takes effort. The error prediction research conducted at Elektrobit Wireless Ltd is concluded to be useful, and we will further improve the presentations made from the error prediction data.
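The headline figure in the abstract (80% of errors found by inspecting 30% of the source) is a standard cost-effectiveness measure for defect predictors: rank units by predicted defect probability and count how many known errors fall in the top slice of the ranking. A minimal sketch of that measure, simplified to whole files rather than lines of source, with made-up probabilities and error counts (this is an illustration, not the paper's actual model or data):

```python
# Cost-effectiveness of a defect predictor: rank files by predicted defect
# probability, "inspect" the top fraction, and report the share of all known
# errors that the inspected slice contains.

def errors_caught(predictions, inspect_fraction):
    """predictions: list of (predicted_probability, actual_error_count) per file."""
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    n_inspected = int(len(ranked) * inspect_fraction)
    total = sum(errors for _, errors in ranked)
    caught = sum(errors for _, errors in ranked[:n_inspected])
    return caught / total if total else 0.0

# Toy data: 10 files with hypothetical probabilities and error counts.
files = [(0.9, 12), (0.8, 7), (0.7, 5), (0.6, 0), (0.5, 1),
         (0.4, 0), (0.3, 2), (0.2, 0), (0.1, 0), (0.05, 1)]
print(round(errors_caught(files, 0.3), 2))  # → 0.86
```

With this toy data, inspecting the top 3 of 10 files catches 24 of 28 errors (about 86%), in the same ballpark as the result the abstract reports against 30% of the source.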

Keywords

  • data-driven decisions
  • error prediction
  • machine learning algorithms
  • prediction algorithms
  • software testing

Cite this

@article{3e926d9921e44b2fb2c76db9c4a24837,
title = "Constructing defect predictors and communicating the outcomes to practitioners",
abstract = "Background: An alternative to expert-based decisions is to make data-driven decisions, and software analytics is the key enabler of this evidence-based management approach. Defect prediction is a popular application area of software analytics, albeit with serious challenges in deploying it into practice. Goal: We aim to develop and deploy a defect prediction model that guides practitioners in focusing their activities on the most problematic parts of the software and improves the efficiency of the testing process. Method: We present a pilot study in which we developed a defect prediction model and different modes of representing the data and the model outcomes, namely: commit hotness ranking, error-probability mapping onto the source, and visualization of interactions among teams through errors. We also share the challenges and lessons learned in the process. Result: In terms of standard performance measures, the constructed defect prediction model performs similarly to those reported in earlier studies, e.g. 80{\%} of errors can be detected by inspecting 30{\%} of the source. However, feedback from practitioners indicates that such performance figures alone do not help them in their daily work. Pointing out the most problematic source files, and even isolating error-prone sections within files, is regarded by practitioners as stating the obvious, though the latter is found helpful for activities such as refactoring. On the other hand, visualizing the interactions among teams, based on the errors introduced and fixed, turns out to be the most helpful representation, as it helps pinpoint communication-related issues within and across teams. Conclusion: The constructed predictor can give accurate information about the most error-prone parts. Creating practical representations from this data is possible, but takes effort. The error prediction research conducted at Elektrobit Wireless Ltd is concluded to be useful, and we will further improve the presentations made from the error prediction data.",
keywords = "data-driven decisions, error prediction, machine learning algorithms, prediction algorithms, software testing",
author = "Taneli Taipale and Mika Qvist and Burak Turhan",
year = "2013",
month = "12",
day = "1",
doi = "10.1109/ESEM.2013.45",
language = "English",
pages = "357--362",
journal = "International Symposium on Empirical Software Engineering and Measurement",
issn = "1949-3770",

}

Constructing defect predictors and communicating the outcomes to practitioners. / Taipale, Taneli; Qvist, Mika; Turhan, Burak.

In: International Symposium on Empirical Software Engineering and Measurement, 01.12.2013, p. 357-362.


TY - JOUR

T1 - Constructing defect predictors and communicating the outcomes to practitioners

AU - Taipale, Taneli

AU - Qvist, Mika

AU - Turhan, Burak

PY - 2013/12/1

Y1 - 2013/12/1

N2 - Background: An alternative to expert-based decisions is to make data-driven decisions, and software analytics is the key enabler of this evidence-based management approach. Defect prediction is a popular application area of software analytics, albeit with serious challenges in deploying it into practice. Goal: We aim to develop and deploy a defect prediction model that guides practitioners in focusing their activities on the most problematic parts of the software and improves the efficiency of the testing process. Method: We present a pilot study in which we developed a defect prediction model and different modes of representing the data and the model outcomes, namely: commit hotness ranking, error-probability mapping onto the source, and visualization of interactions among teams through errors. We also share the challenges and lessons learned in the process. Result: In terms of standard performance measures, the constructed defect prediction model performs similarly to those reported in earlier studies, e.g. 80% of errors can be detected by inspecting 30% of the source. However, feedback from practitioners indicates that such performance figures alone do not help them in their daily work. Pointing out the most problematic source files, and even isolating error-prone sections within files, is regarded by practitioners as stating the obvious, though the latter is found helpful for activities such as refactoring. On the other hand, visualizing the interactions among teams, based on the errors introduced and fixed, turns out to be the most helpful representation, as it helps pinpoint communication-related issues within and across teams. Conclusion: The constructed predictor can give accurate information about the most error-prone parts. Creating practical representations from this data is possible, but takes effort. The error prediction research conducted at Elektrobit Wireless Ltd is concluded to be useful, and we will further improve the presentations made from the error prediction data.

KW - data-driven decisions

KW - error prediction

KW - machine learning algorithms

KW - prediction algorithms

KW - software testing

UR - http://www.scopus.com/inward/record.url?scp=84893240486&partnerID=8YFLogxK

U2 - 10.1109/ESEM.2013.45

DO - 10.1109/ESEM.2013.45

M3 - Conference article

SP - 357

EP - 362

JO - International Symposium on Empirical Software Engineering and Measurement

T2 - International Symposium on Empirical Software Engineering and Measurement

JF - International Symposium on Empirical Software Engineering and Measurement

SN - 1949-3770

M1 - 6681379

ER -