Ground truth deficiencies in software engineering: when codifying the past can be counterproductive

Eray Tuzun, Hakan Erdogmus, Maria Teresa Baldassarre, Michael Felderer, Robert Feldt, Burak Turhan

Research output: Contribution to journal › Article › Research › peer-review

3 Citations (Scopus)


Many software engineering tools build and evaluate their models based on historical data to support development and process decisions. These models help us answer numerous interesting questions, but have their own caveats. In a real-life setting, the objective function of human decision-makers for a given task might be influenced by a whole host of factors that stem from their cognitive biases, subverting the ideal objective function required for an optimally functioning system. Relying on this data as ground truth may give rise to systems that end up automating software engineering decisions by mimicking past sub-optimal behaviour. We illustrate this phenomenon and suggest mitigation strategies to raise awareness.

Original language: English
Pages (from-to): 85-95
Number of pages: 13
Journal: IEEE Software
Issue number: 3
Publication status: Published - May 2022


  • Computer bugs
  • Data models
  • Machine learning
  • Software
  • Software engineering
  • Task analysis
  • Tools
