Regression greybox fuzzing

Xiaogang Zhu, Marcel Böhme

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

52 Citations (Scopus)

Abstract

What you change is what you fuzz! In an empirical study of all fuzzer-generated bug reports in OSSFuzz, we found that about four in every five bugs were introduced by recent code changes. That is, 77% of 23k bugs are regressions. For a newly added project, there is usually an initial burst of new reports at 2-3 bugs per day. However, after that initial burst, and after weeding out most of the existing bugs, we still get a constant rate of 3-4 bug reports per week. The constant rate can only be explained by an increasing regression rate. Indeed, the probability that a reported bug is a regression (i.e., we could identify the bug-introducing commit) increases from 20% for the first bug to 92% after a few hundred bug reports. In this paper, we introduce regression greybox fuzzing (RGF), a fuzzing approach that focuses on code that has changed more recently or more often. However, for any active software project, it is impractical to sufficiently fuzz each code commit individually. Instead, we propose to fuzz all commits simultaneously, but to fuzz code present in more (recent) commits with higher priority. We observe that most code is never changed and is relatively old. So, we identify means to strengthen the signal from executed code-of-interest. We also extend the concept of power schedules to the bytes of a seed and introduce Ant Colony Optimization to assign more energy to those bytes which promise to generate more interesting inputs. Our large-scale fuzzing experiment demonstrates the validity of our main hypothesis and the efficiency of regression greybox fuzzing. We conducted our experiments in a reproducible manner within Fuzzbench, an extensible fuzzer evaluation platform. Our experiments involved 3+ CPU-years' worth of fuzzing campaigns and 20 bugs in 15 open-source C programs available on OSSFuzz.
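The two ideas in the abstract, prioritising code that changed more recently or more often and assigning more energy to promising seed bytes, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `block_weight`, `seed_fitness` and `BytePheromones`, the weighting formula, and the evaporation and reinforcement constants are hypothetical and are not taken from the paper, which should be consulted for the actual definitions.

```python
import random


def block_weight(age_in_commits: int, num_changes: int) -> float:
    """Hypothetical weight for one basic block (not the paper's formula).

    age_in_commits: commits since the block last changed (0 = latest commit).
    num_changes:    number of commits that touched the block.
    The weight grows with the number of changes and shrinks with age.
    """
    return (1 + num_changes) / (1 + age_in_commits)


def seed_fitness(executed_blocks) -> float:
    """Average weight over the (age, changes) pairs that a seed executes."""
    if not executed_blocks:
        return 0.0
    total = sum(block_weight(age, changes) for age, changes in executed_blocks)
    return total / len(executed_blocks)


class BytePheromones:
    """Ant-colony-style scores, one per byte of a seed (illustrative only)."""

    def __init__(self, seed_length: int, evaporation: float = 0.1,
                 floor: float = 0.01):
        self.scores = [1.0] * seed_length
        self.evaporation = evaporation  # assumed decay rate per update
        self.floor = floor              # keeps every byte selectable

    def update(self, mutated_positions, was_interesting: bool) -> None:
        # Evaporate all scores, then reinforce the bytes whose mutation
        # produced an input the fuzzer considered interesting.
        self.scores = [max(self.floor, s * (1 - self.evaporation))
                       for s in self.scores]
        if was_interesting:
            for pos in mutated_positions:
                self.scores[pos] += 1.0

    def pick_position(self, rng: random.Random) -> int:
        # Sample a byte position with probability proportional to its score.
        return rng.choices(range(len(self.scores)), weights=self.scores, k=1)[0]
```

In such a sketch, a fuzzer loop would use `seed_fitness` to decide how much energy a seed receives and `pick_position` to decide which byte to mutate next, calling `update` after each generated input.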

Original language: English
Title of host publication: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security
Editors: Hyoungshick Kim, Jin B. Hong
Place of Publication: New York NY USA
Publisher: Association for Computing Machinery (ACM)
Pages: 2169-2182
Number of pages: 14
ISBN (Electronic): 9781450384544
Publication status: Published - 2021
Event: ACM Conference on Computer and Communications Security 2021 - Online, Korea, South
Duration: 15 Nov 2021 to 19 Nov 2021
Conference number: 27th
https://dl.acm.org/doi/proceedings/10.1145/3460120 (Proceedings)

Publication series

Name: Proceedings of the ACM Conference on Computer and Communications Security
Publisher: Association for Computing Machinery (ACM)
ISSN (Print): 1543-7221

Conference

Conference: ACM Conference on Computer and Communications Security 2021
Abbreviated title: CCS 2021
Country/Territory: Korea, South
Period: 15/11/21 to 19/11/21

Keywords

  • defect prediction
  • differential testing
  • greybox fuzzing
  • regression testing
