Code Action Network for binary function scope identification

Van Nguyen, Trung Le, Tue Le, Khanh Nguyen, Olivier de Vel, Paul Montague, John Grundy, Dinh Phung

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Function identification is a preliminary step in binary analysis for many applications, including malware detection, common vulnerability detection, and binary instrumentation. In this paper, we propose the Code Action Network (CAN), whose key idea is to encode the task of function scope identification as a sequence of three action states: NI (next inclusion), NE (next exclusion), and FE (function end). This encoding lets CAN efficiently and effectively tackle function scope identification, the hardest and most crucial task in function identification. A bidirectional Recurrent Neural Network is trained to match binary programs with their sequences of action states. To work out the function scopes in a binary, the binary is first fed to a trained CAN, which outputs its sequence of action states; this sequence is then decoded to recover the function scopes. We undertake extensive experiments to compare our proposed method with state-of-the-art baselines. Experimental results demonstrate that our method outperforms these baselines in predictive performance on real-world datasets, which include binaries from well-known libraries.
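The encode/decode idea in the abstract can be illustrated with a small sketch. The abstract does not spell out the exact semantics of the action states, so the code assumes one plausible reading: the binary is a sequence of items (bytes or instructions), NI consumes the next item and adds it to the current function, NE consumes the next item without adding it, and FE closes the current function without consuming anything. The function names `encode_scopes` and `decode_scopes` are hypothetical, and the sketch leaves out the bidirectional RNN that predicts the actions.

```python
from typing import Dict, List

NI, NE, FE = "NI", "NE", "FE"  # next inclusion, next exclusion, function end


def encode_scopes(n_items: int, scopes: List[List[int]]) -> List[str]:
    """Turn function scopes (sorted item indices) into an action sequence.

    Assumption: scopes are disjoint and do not interleave with each other.
    """
    owner: Dict[int, int] = {}
    for fid, scope in enumerate(scopes):
        for idx in scope:
            owner[idx] = fid
    actions: List[str] = []
    for idx in range(n_items):
        if idx in owner:
            actions.append(NI)            # item belongs to a function
            if idx == scopes[owner[idx]][-1]:
                actions.append(FE)        # last item of that function
        else:
            actions.append(NE)            # padding or data outside any function
    return actions


def decode_scopes(actions: List[str]) -> List[List[int]]:
    """Recover function scopes from an action sequence."""
    scopes: List[List[int]] = []
    current: List[int] = []
    pos = 0  # index of the next item to consume
    for action in actions:
        if action == NI:
            current.append(pos)
            pos += 1
        elif action == NE:
            pos += 1
        elif action == FE:
            if current:
                scopes.append(current)
            current = []
        else:
            raise ValueError(f"unknown action: {action!r}")
    if current:  # tolerate a missing final FE
        scopes.append(current)
    return scopes


if __name__ == "__main__":
    example = [[0, 1, 3], [5, 6]]          # item 2 is a gap inside function 0
    acts = encode_scopes(8, example)
    print(acts)
    print(decode_scopes(acts))
```

Under this reading, a function whose body is not contiguous is still represented by a single run of actions, because NE can skip items inside an open function.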

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining
Subtitle of host publication: 24th Pacific-Asia Conference, PAKDD 2020 Singapore, May 11–14, 2020 Proceedings, Part I
Editors: Hady W. Lauw, Raymond Chi-Wing Wong, Alexandros Ntoulas, Ee-Peng Lim, See-Kiong Ng, Sinno Jialin Pan
Place of Publication: Cham Switzerland
Publisher: Springer
Pages: 712-725
Number of pages: 14
ISBN (Electronic): 9783030474263
ISBN (Print): 9783030474256
DOIs: https://doi.org/10.1007/978-3-030-47426-3_55
Publication status: Published - 2020
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2020 - Singapore, Singapore
Duration: 11 May 2020 → 14 May 2020
Conference number: 24th
https://pakdd2020.org (Website)
https://link.springer.com/book/10.1007/978-3-030-47426-3 (Conference Papers)

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12084 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2020
Abbreviated title: PAKDD 2020
Country: Singapore
City: Singapore
Period: 11/05/20 → 14/05/20

Keywords

  • Cyber security
  • Deep learning
  • Function scope identification
  • Machine learning

Cite this

Nguyen, V., Le, T., Le, T., Nguyen, K., de Vel, O., Montague, P., Grundy, J., & Phung, D. (2020). Code Action Network for binary function scope identification. In H. W. Lauw, R. C-W. Wong, A. Ntoulas, E-P. Lim, S-K. Ng, & S. J. Pan (Eds.), Advances in Knowledge Discovery and Data Mining: 24th Pacific-Asia Conference, PAKDD 2020 Singapore, May 11–14, 2020 Proceedings, Part I (pp. 712-725). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12084 LNAI). Springer. https://doi.org/10.1007/978-3-030-47426-3_55