Automating change-level self-admitted technical debt determination

Meng Yan, Xin Xia, Emad Shihab, David Lo, Jianwei Yin, Xiaohu Yang

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Self-Admitted Technical Debt (SATD) refers to technical debt that is introduced intentionally. Previous studies that identify SATD at the file level in isolation cannot describe the TD context related to multiple files. Therefore, it is more beneficial to identify SATD when a change is made. We refer to this type of TD identification as “Change-level SATD Determination”; identifying SATD at the change level can help to manage and control TD by understanding the TD context through tracing the introducing changes. In this paper, we propose a change-level SATD determination model that extracts 25 features from software changes, divided into three dimensions: diffusion, history, and message. To evaluate the effectiveness of our proposed model, we perform an empirical study on 7 open source projects containing a total of 100,011 software changes. The experimental results show that our model achieves promising performance and outperforms four baselines in terms of AUC and cost-effectiveness. On average across the 7 experimental projects, our model achieves an AUC of 0.82 and a cost-effectiveness of 0.80, which is a significant improvement over the comparison baselines used. In addition, we found that “Diffusion” is the most discriminative dimension for determining TD-introducing changes.

Original language: English
Number of pages: 18
Journal: IEEE Transactions on Software Engineering
DOIs: 10.1109/TSE.2018.2831232
Publication status: Accepted/In press - 2019

Keywords

  • Change-level Determination
  • Self-admitted Technical Debt
  • Software Change
  • Technical Debt

Cite this

Yan, Meng ; Xia, Xin ; Shihab, Emad ; Lo, David ; Yin, Jianwei ; Yang, Xiaohu. / Automating change-level self-admitted technical debt determination. In: IEEE Transactions on Software Engineering. 2019.
@article{34672fadd13e43e68d14e6b20ff41362,
title = "Automating change-level self-admitted technical debt determination",
abstract = "Self-Admitted Technical Debt (SATD) refers to technical debt that is introduced intentionally. Previous studies that identify SATD at the file level in isolation cannot describe the TD context related to multiple files. Therefore, it is more beneficial to identify SATD when a change is made. We refer to this type of TD identification as “Change-level SATD Determination”; identifying SATD at the change level can help to manage and control TD by understanding the TD context through tracing the introducing changes. In this paper, we propose a change-level SATD determination model that extracts 25 features from software changes, divided into three dimensions: diffusion, history, and message. To evaluate the effectiveness of our proposed model, we perform an empirical study on 7 open source projects containing a total of 100,011 software changes. The experimental results show that our model achieves promising performance and outperforms four baselines in terms of AUC and cost-effectiveness. On average across the 7 experimental projects, our model achieves an AUC of 0.82 and a cost-effectiveness of 0.80, which is a significant improvement over the comparison baselines used. In addition, we found that “Diffusion” is the most discriminative dimension for determining TD-introducing changes.",
keywords = "Change-level Determination, Self-admitted Technical Debt, Software Change, Technical Debt",
author = "Meng Yan and Xin Xia and Emad Shihab and David Lo and Jianwei Yin and Xiaohu Yang",
year = "2019",
doi = "10.1109/TSE.2018.2831232",
language = "English",
journal = "IEEE Transactions on Software Engineering",
issn = "0098-5589",
publisher = "Publ by IEEE",

}

Automating change-level self-admitted technical debt determination. / Yan, Meng; Xia, Xin; Shihab, Emad; Lo, David; Yin, Jianwei; Yang, Xiaohu.

In: IEEE Transactions on Software Engineering, 2019.


TY - JOUR

T1 - Automating change-level self-admitted technical debt determination

AU - Yan, Meng

AU - Xia, Xin

AU - Shihab, Emad

AU - Lo, David

AU - Yin, Jianwei

AU - Yang, Xiaohu

PY - 2019

Y1 - 2019

AB - Self-Admitted Technical Debt (SATD) refers to technical debt that is introduced intentionally. Previous studies that identify SATD at the file level in isolation cannot describe the TD context related to multiple files. Therefore, it is more beneficial to identify SATD when a change is made. We refer to this type of TD identification as “Change-level SATD Determination”; identifying SATD at the change level can help to manage and control TD by understanding the TD context through tracing the introducing changes. In this paper, we propose a change-level SATD determination model that extracts 25 features from software changes, divided into three dimensions: diffusion, history, and message. To evaluate the effectiveness of our proposed model, we perform an empirical study on 7 open source projects containing a total of 100,011 software changes. The experimental results show that our model achieves promising performance and outperforms four baselines in terms of AUC and cost-effectiveness. On average across the 7 experimental projects, our model achieves an AUC of 0.82 and a cost-effectiveness of 0.80, which is a significant improvement over the comparison baselines used. In addition, we found that “Diffusion” is the most discriminative dimension for determining TD-introducing changes.

KW - Change-level Determination

KW - Self-admitted Technical Debt

KW - Software Change

KW - Technical Debt

UR - http://www.scopus.com/inward/record.url?scp=85046362904&partnerID=8YFLogxK

U2 - 10.1109/TSE.2018.2831232

DO - 10.1109/TSE.2018.2831232

M3 - Article

JO - IEEE Transactions on Software Engineering

JF - IEEE Transactions on Software Engineering

SN - 0098-5589

ER -