Applying information theory to software evolution

Adriano Torres, Sebastian Baltes, Christoph Treude, Markus Wagner

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Although information theory has found success in other disciplines, the literature on its applications to software evolution is limited. We are still missing artifacts that leverage the available data and tooling to measure how the information content of a project can serve as a proxy for its complexity. In this work, we explore two definitions of entropy, one structural and one textual, and apply them to the commit history of 25 open source projects. We produce evidence that the two measures are generally highly correlated. We also observe that they display weak and unstable correlations with other complexity metrics. Our preliminary investigation of outliers shows an unexpectedly high frequency of events where there is considerable change in the information content of the project, suggesting that such outliers may inform a definition of surprisal.
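
As an illustration only, a textual entropy of the kind the abstract mentions can be sketched as the Shannon entropy of a snapshot's token frequencies, computed once per commit. This is a minimal sketch under stated assumptions, not the authors' definition: the whitespace tokenization and the helper names `shannon_entropy` and `entropy_series` are hypothetical, and the paper's structural and textual measures may be defined differently.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the empirical token distribution.

    Assumption: 'textual entropy' is taken here to mean entropy over
    token frequencies; the paper may use a different definition.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty snapshot carries no information
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_series(snapshots):
    """Entropy of each snapshot in a chronological list of source texts.

    Assumption: each snapshot is the project's text at one commit,
    tokenized naively on whitespace.
    """
    return [shannon_entropy(text.split()) for text in snapshots]
```

Large jumps between consecutive values in such a series would correspond to the abrupt changes in information content that the abstract describes as outliers.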

Original language: English
Title of host publication: Proceedings - 2023 IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering, NLBSE 2023
Editors: Sebastiano Panichella, Andrea Di Sorbo
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 48-55
Number of pages: 8
ISBN (Electronic): 9798350301786
ISBN (Print): 9798350301793
Publication status: Published - 2023
Event: IEEE/ACM International Workshop on Natural Language-Based Software Engineering 2023 - Melbourne, Australia
Duration: 20 May 2023 → 20 May 2023
Conference number: 2nd
https://ieeexplore.ieee.org/xpl/conhome/10189115/proceeding (Proceedings)

Conference

Conference: IEEE/ACM International Workshop on Natural Language-Based Software Engineering 2023
Abbreviated title: NLBSE 2023
Country/Territory: Australia
City: Melbourne
Period: 20/05/23 → 20/05/23

Keywords

  • entropy
  • Information theory
  • software engineering
