Abstract
Function identification is a preliminary step in binary analysis for many applications, including malware detection, common vulnerability detection, and binary instrumentation. In this paper, we propose the Code Action Network (CAN), whose key idea is to encode function scope identification, the hardest and most crucial task in function identification, as a sequence of three action states: NI (next inclusion), NE (next exclusion), and FE (function end), so that the task can be tackled efficiently and effectively. A bidirectional Recurrent Neural Network is trained to map binary programs to their sequences of action states. To recover the function scopes in a binary, the binary is first fed to a trained CAN, which outputs its sequence of action states; this sequence is then decoded into the function scopes. We undertake extensive experiments to compare our proposed method with state-of-the-art baselines. Experimental results demonstrate that our method outperforms these baselines in predictive performance on real-world datasets that include binaries from well-known libraries.
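The decoding step mentioned in the abstract can be illustrated with a minimal sketch. The abstract names the three action states but does not spell out their exact semantics, so the following is an assumption made for illustration only: NI reads the next item (an instruction or byte) and adds it to the current function, NE reads the next item and skips it, and FE closes the current function. The function name `decode_scopes` and the index-based representation of scopes are hypothetical and not taken from the paper.

```python
from typing import List

# Action states named in the abstract; their semantics below are assumed.
NI, NE, FE = "NI", "NE", "FE"


def decode_scopes(actions: List[str]) -> List[List[int]]:
    """Decode a sequence of action states into function scopes.

    Each scope is returned as the list of item indices it contains.
    Assumed semantics (not confirmed by the abstract):
      NI: read the next item and include it in the current function
      NE: read the next item and exclude it
      FE: close the current function and start a new one
    """
    scopes: List[List[int]] = []
    current: List[int] = []
    pos = 0  # index of the next unread item in the binary

    for action in actions:
        if action == NI:
            current.append(pos)
            pos += 1
        elif action == NE:
            pos += 1
        elif action == FE:
            # Only record a scope if at least one item was included.
            if current:
                scopes.append(current)
            current = []
        else:
            raise ValueError(f"unknown action state: {action!r}")

    # Design choice of this sketch: items included after the last FE
    # are not reported, because no function end was predicted for them.
    return scopes


if __name__ == "__main__":
    predicted = [NE, NI, NI, FE, NE, NI, NE, NI, FE]
    print(decode_scopes(predicted))
```

Under these assumed semantics, the example sequence yields two scopes: one covering items 1 and 2, and one covering items 4 and 6. In the setting described by the abstract, the sequence of action states would come from the trained bidirectional Recurrent Neural Network rather than being written by hand.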
Original language | English |
---|---|
Title of host publication | Advances in Knowledge Discovery and Data Mining |
Subtitle of host publication | 24th Pacific-Asia Conference, PAKDD 2020 Singapore, May 11–14, 2020 Proceedings, Part I |
Editors | Hady W. Lauw, Raymond Chi-Wing Wong, Alexandros Ntoulas, Ee-Peng Lim, See-Kiong Ng, Sinno Jialin Pan |
Place of Publication | Cham, Switzerland |
Publisher | Springer |
Pages | 712-725 |
Number of pages | 14 |
ISBN (Electronic) | 9783030474263 |
ISBN (Print) | 9783030474256 |
Publication status | Published - 2020 |
Event | Pacific-Asia Conference on Knowledge Discovery and Data Mining 2020 - Singapore, Singapore; Duration: 11 May 2020 → 14 May 2020; Conference number: 24th; Website: https://pakdd2020.org; Proceedings: https://link.springer.com/book/10.1007/978-3-030-47426-3 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 12084 LNAI |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | Pacific-Asia Conference on Knowledge Discovery and Data Mining 2020 |
---|---|
Abbreviated title | PAKDD 2020 |
Country/Territory | Singapore |
City | Singapore |
Period | 11/05/20 → 14/05/20 |
Keywords
- Cyber security
- Deep learning
- Function scope identification
- Machine learning