Weighted static code attributes for software defect prediction

Burak Turhan, Ayse Bener

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

2 Citations (Scopus)


It has recently been shown that defect predictors built on the combination of log-filtering, InfoGain attribute selection and a Naive Bayes learner outperform rule-based learners. Naive Bayes is a well-known statistical technique that assumes the 'independence' and 'equal importance' of attributes, assumptions that do not hold in many problems. This paper addresses the 'equal importance' assumption of Naive Bayes. We show that simple heuristics can assign weights to attributes according to their importance, which improves defect prediction performance. Furthermore, our proposed heuristics have linear time complexity, whereas choosing the optimal subset of attributes requires an exhaustive search in the attribute space. We compare the performance of weighted Naive Bayes and standard Naive Bayes predictors on publicly available datasets from both NASA and various small and medium enterprises (SMEs) in Turkey. Our results indicate that assigning weights to static code attributes may increase prediction performance significantly, while removing the need for feature subset selection.
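To make the idea of attribute weighting concrete, here is a minimal sketch of one common weighted Naive Bayes formulation, in which each attribute's log-likelihood is multiplied by a weight before the class scores are compared. The abstract does not specify the weighting heuristics or the likelihood model, so the Gaussian likelihood, the function names, the toy data and the weight values below are all assumptions made for illustration, not the authors' method.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate the class prior plus per-attribute mean and variance per class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (len(rows) / len(y), means, variances)
    return model


def log_gaussian(v, mean, var):
    """Log density of a normal distribution at v."""
    return -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)


def predict(model, x, weights):
    """Pick the class maximising log P(c) + sum_i w_i * log p(x_i | c)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, s2, w in zip(x, means, variances, weights):
            score += w * log_gaussian(v, m, s2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: two static code attributes per module,
# label 1 = defective, label 0 = defect-free.
X = [[1.0, 5.0], [1.2, 5.1], [0.8, 4.9], [3.0, 5.0], [3.2, 5.2], [2.8, 4.8]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)

# Placeholder weights: all ones is standard Naive Bayes; a zero weight
# drops an attribute, which is how weighting can subsume subset selection.
standard = predict(model, [3.1, 5.0], [1.0, 1.0])
weighted = predict(model, [1.0, 5.0], [1.0, 0.0])
```

Setting every weight to 1 recovers the standard predictor, and a weight of 0 removes the attribute entirely, so in this formulation feature subset selection is the special case where the weights are restricted to 0 or 1.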

Original language: English
Title of host publication: 20th International Conference on Software Engineering and Knowledge Engineering, SEKE 2008
Number of pages: 6
Publication status: Published - 1 Dec 2008
Externally published: Yes
Event: International Conference on Software Engineering and Knowledge Engineering, SEKE 2008 - San Francisco Bay, United States of America
Duration: 1 Jul 2008 to 3 Jul 2008
Conference number: 20th


Conference: International Conference on Software Engineering and Knowledge Engineering, SEKE 2008
Abbreviated title: SEKE 2008
Country: United States of America
City: San Francisco Bay


  • Complexity measures
  • Methods for SQA and V&V
  • Metrics/measurement
