DeepLineDP: towards a deep learning approach for line-level defect prediction

Chanathip Pornprasit, Chakkrit Tantithamthavorn

Research output: Contribution to journal › Article › Research › peer-review

82 Citations (Scopus)

Abstract

Defect prediction has been proposed to help practitioners effectively prioritize limited Software Quality Assurance (SQA) resources on the riskiest files that are likely to have post-release software defects. However, prior studies have two main limitations: (1) the granularity of defect predictions is still coarse, and (2) the surrounding tokens and surrounding lines have not yet been fully utilized. In this paper, we perform a survey study to better understand how practitioners perform code inspection in the modern code review process, and their perceptions of line-level defect prediction. According to the responses from 36 practitioners, we found that 50% of them spent from at least 10 minutes to more than one hour reviewing a single file, while 64% of them still perceived code inspection as challenging to extremely challenging. In addition, 64% of the respondents perceived that a line-level defect prediction tool would potentially be helpful in identifying defective lines. Motivated by the practitioners' perspective, we present DeepLineDP, a deep learning approach that automatically learns the semantic properties of the surrounding tokens and lines in order to identify defective files and defective lines. Through a case study of 32 releases of 9 software projects, we find that the risk score of code tokens varies greatly depending on their location. DeepLineDP is 14%-24% more accurate than other file-level defect prediction approaches; is 50%-250% more cost-effective than other line-level defect prediction approaches; and achieves reasonable performance when transferred to other software projects. These findings confirm that the surrounding tokens and surrounding lines should be considered to identify the fine-grained locations of defective files (i.e., defective lines).
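The abstract reports cost-effectiveness for line-level prediction without naming the measure. As an illustration only, the sketch below computes one measure that could be used for ranked line-level predictions: the share of truly defective lines found when a reviewer inspects only the top 20% of lines ordered by predicted risk. The function name `recall_at_top_fraction`, the 20% budget, and the example scores are assumptions made for this sketch and are not taken from the paper.

```python
from typing import Sequence


def recall_at_top_fraction(
    risk_scores: Sequence[float],
    is_defective: Sequence[bool],
    fraction: float = 0.2,
) -> float:
    """Share of defective lines found when inspecting only the top
    `fraction` of lines, ranked by descending predicted risk.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if len(risk_scores) != len(is_defective):
        raise ValueError("risk_scores and is_defective must have equal length")

    total_defective = sum(1 for flag in is_defective if flag)
    if total_defective == 0:
        return 0.0  # nothing to find, so recall is reported as 0 by convention

    # Rank line indices from highest to lowest risk (ties keep file order).
    ranked = sorted(
        range(len(risk_scores)), key=lambda i: risk_scores[i], reverse=True
    )

    # Inspection budget: number of lines a reviewer is assumed to read.
    budget = int(len(ranked) * fraction)

    found = sum(1 for i in ranked[:budget] if is_defective[i])
    return found / total_defective


# Made-up example: 10 lines, two of which (indices 0 and 7) are defective.
scores = [0.9, 0.1, 0.8, 0.2, 0.3, 0.05, 0.4, 0.6, 0.15, 0.25]
labels = [True, False, False, False, False, False, False, True, False, False]
print(recall_at_top_fraction(scores, labels))
```

A higher value means more defective lines are reached for the same inspection effort, which is the sense in which one ranking can be called more cost-effective than another.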

Original language: English
Pages (from-to): 84-98
Number of pages: 15
Journal: IEEE Transactions on Software Engineering
Volume: 49
Issue number: 1
Publication status: Published - 1 Jan 2023

Keywords

  • Codes
  • Deep learning
  • Explainable AI
  • Inspection
  • Line-level Defect Prediction
  • Predictive models
  • Semantics
  • Social networking (online)
  • Software
  • Software Quality Assurance
