DeepLineDP: towards a deep learning approach for line-level defect prediction

Chanathip Pornprasit, Chakkrit Tantithamthavorn

Research output: Contribution to journal › Article › Research › peer-review

4 Citations (Scopus)

Abstract

Defect prediction has been proposed to help practitioners effectively prioritize limited Software Quality Assurance (SQA) resources on the most risky files that are likely to have post-release software defects. However, there exist two main limitations in prior studies: (1) the granularity levels of defect predictions are still coarse-grained and (2) the surrounding tokens and surrounding lines have not yet been fully utilized. In this paper, we perform a survey study to better understand how practitioners perform code inspection in the modern code review process and how they perceive line-level defect prediction. According to the responses from 36 practitioners, we found that 50% of them spent anywhere from 10 minutes to more than one hour reviewing a single file, while 64% of them still perceived the code inspection activity as challenging to extremely challenging. In addition, 64% of the respondents perceived that a line-level defect prediction tool would potentially be helpful in identifying defective lines. Motivated by the practitioners' perspective, we present DeepLineDP, a deep learning approach to automatically learn the semantic properties of the surrounding tokens and lines in order to identify defective files and defective lines. Through a case study of 32 releases of 9 software projects, we find that the risk score of code tokens varies greatly depending on their location. Our DeepLineDP is 14%-24% more accurate than other file-level defect prediction approaches; is 50%-250% more cost-effective than other line-level defect prediction approaches; and achieves a reasonable performance when transferred to other software projects. These findings confirm that the surrounding tokens and surrounding lines should be considered to identify the fine-grained locations of defective files (i.e., defective lines).

Original language: English
Pages (from-to): 84-98
Number of pages: 15
Journal: IEEE Transactions on Software Engineering
Volume: 49
Issue number: 1
Publication status: Published - 1 Jan 2023

Keywords

  • Codes
  • Deep learning
  • Explainable AI
  • Inspection
  • Line-level Defect Prediction
  • Predictive models
  • Semantics
  • Social networking (online)
  • Software
  • Software Quality Assurance
