Abstract
Owing to the ubiquity of computer software, software vulnerability detection (SVD) has become an important problem in the software industry and in computer security. One of the most crucial issues in SVD is coping with the scarcity of labeled vulnerabilities in projects, since labeling code requires laborious manual work by software security experts. One possible solution is to employ deep domain adaptation (DA), which has recently witnessed enormous success in transferring learning from structural labeled to unlabeled data sources. The generative adversarial network (GAN) is a technique that attempts to bridge the gap between source and target data in the joint space, and it has emerged as a building block for deep DA approaches with state-of-the-art performance. However, deep DA approaches that use the GAN principle to close this gap are subject to the mode collapsing problem, which negatively impacts predictive performance. In this paper, we propose the Dual Generator-Discriminator Deep Code Domain Adaptation Network (Dual-GD-DDAN), which tackles transfer learning from labeled to unlabeled software projects in SVD and resolves the mode collapsing problem faced by previous approaches. Experimental results on real-world software projects show that our method outperforms state-of-the-art baselines by a wide margin.
Original language | English |
---|---|
Title of host publication | Advances in Knowledge Discovery and Data Mining |
Subtitle of host publication | 24th Pacific-Asia Conference, PAKDD 2020, Singapore, May 11–14, 2020, Proceedings, Part I |
Editors | Hady W. Lauw, Raymond Chi-Wing Wong, Alexandros Ntoulas, Ee-Peng Lim, See-Kiong Ng, Sinno Jialin Pan |
Place of Publication | Cham, Switzerland |
Publisher | Springer |
Pages | 699-711 |
Number of pages | 13 |
ISBN (Electronic) | 9783030474263 |
ISBN (Print) | 9783030474256 |
Publication status | Published - 2020 |
Event | Pacific-Asia Conference on Knowledge Discovery and Data Mining 2020, Singapore, Singapore. Duration: 11 May 2020 → 14 May 2020. Conference number: 24th. Website: https://pakdd2020.org. Proceedings: https://link.springer.com/book/10.1007/978-3-030-47426-3 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 12084 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | Pacific-Asia Conference on Knowledge Discovery and Data Mining 2020 |
---|---|
Abbreviated title | PAKDD 2020 |
Country/Territory | Singapore |
City | Singapore |
Period | 11/05/20 → 14/05/20 |
Keywords
- Cyber security
- Deep learning
- Domain adaptation
- Machine learning
- Software vulnerability detection