JITLine: a simpler, better, faster, finer-grained Just-In-Time defect prediction

Chanathip Pornprasit, Chakkrit Kla Tantithamthavorn

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

A Just-In-Time (JIT) defect prediction model is a classifier that predicts whether a commit is defect-introducing. Recently, CC2Vec, a deep learning approach for Just-In-Time defect prediction, has been proposed. However, CC2Vec requires the whole dataset (i.e., training + testing) for model training, assuming that all unlabelled testing datasets would be available beforehand, which does not follow the key principles of Just-In-Time defect prediction. Our replication study shows that, after excluding the testing dataset from model training, the F-measure of CC2Vec decreases by 38.5% for OpenStack and 45.7% for Qt, highlighting the negative impact of excluding the testing dataset for Just-In-Time defect prediction. In addition, CC2Vec cannot perform fine-grained predictions at the line level (i.e., which lines are most risky for a given commit). In this paper, we propose JITLine, a Just-In-Time defect prediction approach for predicting defect-introducing commits and identifying lines that are associated with that defect-introducing commit (i.e., defective lines). Through a case study of 37,524 commits from OpenStack and Qt, we find that our JITLine approach is at least 26%-38% more accurate (F-measure), 17%-51% more cost-effective (PCI@20%LOC), and 70-100 times faster than the state-of-the-art approaches (i.e., CC2Vec and DeepJIT), and that the fine-grained predictions at the line level by our approach are 133%-150% more accurate (Top-10 Accuracy) than the baseline NLP approach. Therefore, our JITLine approach may help practitioners to better prioritize defect-introducing commits and better identify defective lines.
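
The concern about training on the whole dataset can be illustrated with a small sketch. It is not the code of CC2Vec, DeepJIT, or JITLine; the toy commits, the `time` and `tokens` fields, and all function names are hypothetical and chosen only for illustration. It assumes a plain bag-of-tokens representation and shows that, when the vocabulary is built from the training commits alone, tokens that occur only in later (testing) commits are unknown at prediction time.

```python
from collections import Counter


def time_ordered_split(commits, n_train):
    """Split commits so that every training commit precedes every test commit."""
    ordered = sorted(commits, key=lambda c: c["time"])
    return ordered[:n_train], ordered[n_train:]


def build_vocabulary(commits):
    """Collect the set of tokens that appear in the given commits."""
    vocab = set()
    for commit in commits:
        vocab.update(commit["tokens"])
    return vocab


def vectorize(commit, vocab):
    """Count tokens of a commit, ignoring tokens outside the fixed vocabulary."""
    return dict(Counter(tok for tok in commit["tokens"] if tok in vocab))


# Hypothetical toy data: three commits with made-up timestamps and tokens.
commits = [
    {"time": 1, "tokens": ["if", "x", "return"]},
    {"time": 2, "tokens": ["for", "x", "in"]},
    {"time": 3, "tokens": ["while", "y", "return"]},
]

train, test = time_ordered_split(commits, n_train=2)

# Vocabulary from training commits only: no information from the test set.
train_vocab = build_vocabulary(train)

# Vocabulary from the whole dataset: this is the leak described in the abstract.
leaky_vocab = build_vocabulary(commits)

test_vector = vectorize(test[0], train_vocab)
leaky_vector = vectorize(test[0], leaky_vocab)
```

In this toy setting `test_vector` contains only `return`, because `while` and `y` never occur in the training commits, whereas `leaky_vector` covers all three tokens. A model evaluated with the leaky vocabulary has already seen features of commits that would not yet exist in a just-in-time setting.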

Original language: English
Title of host publication: Proceedings - 2021 IEEE/ACM 18th International Conference on Mining Software Repositories, MSR 2021
Editors: Kelly Blincoe, Meiyappan Nagappan
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 369-379
Number of pages: 11
ISBN (Electronic): 9781728187105
ISBN (Print): 9781665429856
DOIs
Publication status: Published - 2021
Event: IEEE International Working Conference on Mining Software Repositories 2021 - Online, Madrid, Spain
Duration: 22 May 2021 to 30 May 2021
Conference number: 18th
https://ieeexplore-ieee-org.ezproxy.lib.monash.edu.au/xpl/conhome/9463061/proceeding (Proceedings)

Publication series

Name: Proceedings - 2021 IEEE/ACM 18th International Conference on Mining Software Repositories, MSR 2021
Publisher: IEEE, Institute of Electrical and Electronics Engineers
ISSN (Print): 2574-3848
ISSN (Electronic): 2574-3864

Conference

Conference: IEEE International Working Conference on Mining Software Repositories 2021
Abbreviated title: MSR 2021
Country/Territory: Spain
City: Madrid
Period: 22/05/21 to 30/05/21
Internet address

Keywords

  • Explainable AI
  • Just In Time Defect Prediction
  • Software Quality Assurance
