Maximal divergence sequential auto-encoder for binary software vulnerability detection

Tue Le, Tuan Vu Nguyen, Trung Le, Dinh Phung, Paul Montague, Olivier De Vel, Lizhen Qu

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

21 Citations (Scopus)

Abstract

Due to the sharp increase in the severity of the threat posed by software vulnerabilities, detecting vulnerabilities in binary code has become an important concern in the software industry, including the embedded systems industry, and in the field of computer security. However, most work on binary code vulnerability detection has relied on handcrafted features that are manually chosen by a select few domain experts. In this paper, we attempt to alleviate this severe bottleneck in binary vulnerability detection by leveraging recent advances in deep learning representations, and we propose the Maximal Divergence Sequential Auto-Encoder. In particular, latent codes representing vulnerable and non-vulnerable binaries are encouraged to be maximally divergent while still retaining crucial information from the original binaries. We conducted extensive experiments to compare our proposed methods with the baselines, and the results indicate that our proposed methods outperform the baselines on all performance measures of interest.
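
The trade-off described in the abstract (keep the latent codes of the two classes far apart while still reconstructing the input) can be illustrated with a toy objective. This is only a sketch under stated assumptions, not the paper's formulation: the abstract does not say which divergence measure is used, so the squared distance between class-mean latent codes stands in for it here, and the names `toy_objective`, `mean_vector`, `squared_distance`, and `weight` are hypothetical.

```python
from statistics import fmean


def mean_vector(codes):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [fmean(column) for column in zip(*codes)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_objective(inputs, reconstructions, codes, labels, weight=1.0):
    """Mean reconstruction error minus a weighted class-separation term.

    Lower is better. Assumption: separation is the squared distance between
    the mean latent code of each class (label 1 = vulnerable, label 0 =
    non-vulnerable); this is a stand-in, not the divergence from the paper.
    Both classes must be present, otherwise fmean raises StatisticsError.
    """
    recon = fmean(squared_distance(x, r) for x, r in zip(inputs, reconstructions))
    vulnerable = [c for c, y in zip(codes, labels) if y == 1]
    non_vulnerable = [c for c, y in zip(codes, labels) if y == 0]
    separation = squared_distance(mean_vector(vulnerable), mean_vector(non_vulnerable))
    return recon - weight * separation


# Two toy "binaries" that are reconstructed perfectly in both scenarios.
inputs = [[1.0, 0.0], [0.0, 1.0]]
reconstructions = [[1.0, 0.0], [0.0, 1.0]]
labels = [1, 0]

close_codes = [[0.0], [0.1]]  # class codes nearly overlap
far_codes = [[0.0], [2.0]]    # class codes well separated

loss_close = toy_objective(inputs, reconstructions, close_codes, labels)
loss_far = toy_objective(inputs, reconstructions, far_codes, labels)
```

With identical reconstruction quality, the well-separated codes yield the lower objective value, which is the behaviour the abstract describes qualitatively.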

Original language: English
Title of host publication: International Conference on Learning Representations 2019
Editors: Alexander Rush
Place of Publication: La Jolla CA USA
Publisher: International Conference on Learning Representations (ICLR)
Number of pages: 15
ISBN (Print): 9783800743629
Publication status: Published - 2019
Event: International Conference on Learning Representations 2019 - Ernest N. Morial Convention Center, New Orleans, United States of America
Duration: 6 May 2019 to 9 May 2019
Conference number: 7th
https://iclr.cc/Conferences/2019
https://openreview.net/group?id=ICLR.cc/2019/Conference (Proceedings)

Conference

Conference: International Conference on Learning Representations 2019
Abbreviated title: ICLR 2019
Country/Territory: United States of America
City: New Orleans
Period: 6/05/19 to 9/05/19
Other: The International Conference on Learning Representations (ICLR) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence called representation learning, but generally referred to as deep learning.

ICLR is globally renowned for presenting and publishing cutting-edge research on all aspects of deep learning used in the fields of artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, text understanding, gaming, and robotics.

Keywords

  • Vulnerabilities Detection
  • Sequential Auto-Encoder
  • Separable Representation
