Lessons learned from using a deep tree-based model for software defect prediction in practice

Hoa Khanh Dam, Trang Pham, Shien Wee Ng, Truyen Tran, John Grundy, Aditya Ghose, Taeksu Kim, Chul Joo Kim

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

51 Citations (Scopus)


Defects are common in software systems and cause many problems for software users. Different methods have been developed to make early predictions about the most likely defective modules in large codebases. Most focus on designing features (e.g. complexity metrics) that correlate with potentially defective code. Those approaches, however, do not sufficiently capture the syntax and multiple levels of semantics of source code, a potentially important capability for building accurate prediction models. In this paper, we report on our experience of deploying a new deep learning tree-based defect prediction model in practice. This model is built upon the tree-structured Long Short-Term Memory network, which directly matches the Abstract Syntax Tree representation of source code. We discuss a number of lessons learned from developing the model and evaluating it on two datasets, one from open source projects contributed by our industry partner Samsung and the other from the public PROMISE repository.

Original language: English
Title of host publication: Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019
Editors: Bram Adams, Sonia Haiduc
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Number of pages: 12
ISBN (Electronic): 9781728134123
ISBN (Print): 9781728133706
Publication status: Published - 2019
Event: IEEE International Working Conference on Mining Software Repositories 2019 - Montreal, Canada
Duration: 26 May 2019 to 27 May 2019
Conference number: 16th
https://ieeexplore.ieee.org/xpl/conhome/8804710/proceeding (Proceedings)


Conference: IEEE International Working Conference on Mining Software Repositories 2019
Abbreviated title: MSR 2019


  • Deep learning
  • Defect prediction
