A correlation study between automated program repair and test-suite metrics

Jooyong Yi, Shin Hwei Tan, Sergey Mechtaev, Marcel Böhme, Abhik Roychoudhury

Research output: Contribution to journal › Article › Research › peer-review

3 Citations (Scopus)

Abstract

Automated program repair is increasingly gaining traction, due to its potential to greatly reduce debugging cost. The feasibility of automated program repair has been shown in a number of works, and the research focus is gradually shifting toward the quality of generated patches. One promising direction is to control the quality of generated patches by controlling the quality of the test suites used for automated program repair. In this paper, we ask the following research question: “Can traditional test-suite metrics proposed for the purpose of software testing also be used for the purpose of automated program repair?” We empirically investigate whether traditional test-suite metrics such as statement/branch coverage and mutation score are effective in controlling the reliability of generated repairs (the likelihood that repairs cause regression errors). We conduct the largest-scale experiments of this kind to date with real-world software, and for the first time perform a correlation study between various test-suite metrics and the reliability of generated repairs. Our results show that, in general, as traditional test-suite metrics increase, the reliability of repairs tends to increase. In particular, this trend is most strongly observed for statement coverage. Our results imply that the traditional test-suite metrics proposed for software testing can also be used for automated program repair to improve the reliability of repairs.

Original language: English
Pages (from-to): 2948-2979
Number of pages: 32
Journal: Empirical Software Engineering
Volume: 23
Issue number: 5
DOIs: 10.1007/s10664-017-9552-y
Publication status: Published - Oct 2018
Externally published: Yes

Keywords

  • Automated program repair
  • Correlation
  • Empirical evaluation
  • Test suite

Cite this

Yi, Jooyong ; Tan, Shin Hwei ; Mechtaev, Sergey ; Böhme, Marcel ; Roychoudhury, Abhik. / A correlation study between automated program repair and test-suite metrics. In: Empirical Software Engineering. 2018 ; Vol. 23, No. 5. pp. 2948-2979.
@article{31ce0fe8c91444498c253c2a0db8983e,
title = "A correlation study between automated program repair and test-suite metrics",
keywords = "Automated program repair, Correlation, Empirical evaluation, Test suite",
author = "Jooyong Yi and Tan, {Shin Hwei} and Sergey Mechtaev and Marcel B{\"o}hme and Abhik Roychoudhury",
year = "2018",
month = "10",
doi = "10.1007/s10664-017-9552-y",
language = "English",
volume = "23",
pages = "2948--2979",
journal = "Empirical Software Engineering",
issn = "1382-3256",
publisher = "Springer-Verlag London Ltd.",
number = "5",

}

A correlation study between automated program repair and test-suite metrics. / Yi, Jooyong; Tan, Shin Hwei; Mechtaev, Sergey; Böhme, Marcel; Roychoudhury, Abhik.

In: Empirical Software Engineering, Vol. 23, No. 5, 10.2018, p. 2948-2979.

TY - JOUR

T1 - A correlation study between automated program repair and test-suite metrics

AU - Yi, Jooyong

AU - Tan, Shin Hwei

AU - Mechtaev, Sergey

AU - Böhme, Marcel

AU - Roychoudhury, Abhik

PY - 2018/10

Y1 - 2018/10

KW - Automated program repair

KW - Correlation

KW - Empirical evaluation

KW - Test suite

UR - http://www.scopus.com/inward/record.url?scp=85030169360&partnerID=8YFLogxK

U2 - 10.1007/s10664-017-9552-y

DO - 10.1007/s10664-017-9552-y

M3 - Article

VL - 23

SP - 2948

EP - 2979

JO - Empirical Software Engineering

JF - Empirical Software Engineering

SN - 1382-3256

IS - 5

ER -