Abstract
Software vulnerabilities (SVs) have emerged as a prevalent and critical concern for safety-critical systems. This has spurred significant advances in AI-based methods, including machine learning and deep learning, for software vulnerability detection (SVD). Although AI-based methods have shown promising performance in SVD, their effectiveness on real-world, complex, and diverse source code datasets remains limited in practice. To address this challenge, we propose a novel framework that enhances the capability of large language models to learn and utilize semantic and syntactic relationships from source code data for SVD. As a result, our proposed SAFE approach can acquire fundamental knowledge from source code data while exploiting crucial relationships, i.e., semantic and syntactic associations, to solve the SVD problem more effectively. Extensive experiments on three challenging real-world datasets (Devign, ReVeal, and D2A) demonstrate the superiority of our approach over eight effective, state-of-the-art baselines. On average across all datasets, SAFE improves on the baseline methods by 4.79% to 11.57% in F1-measure and by 16.93% to 26.24% in Recall.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 20th ACM ASIA Conference on Computer and Communications Security |
| Editors | Dinh Tien Tuan Anh, Tong Van Van |
| Place of Publication | New York, NY, USA |
| Publisher | Association for Computing Machinery (ACM) |
| Pages | 392-406 |
| Number of pages | 15 |
| ISBN (Electronic) | 9798400714108 |
| Publication status | Published - 2025 |
| Event | ACM ASIA Conference on Computer and Communications Security 2025, Hanoi, Vietnam; Duration: 25 Aug 2025 → 29 Aug 2025; Conference number: 20th; Proceedings: https://dl.acm.org/doi/proceedings/10.1145/3708821; Website: https://asiaccs2025.hust.edu.vn/ |
Publication series
| Name | Proceedings of the ACM Conference on Computer and Communications Security |
|---|---|
| Publisher | Association for Computing Machinery (ACM) |
| ISSN (Print) | 1543-7221 |
Conference
| Conference | ACM ASIA Conference on Computer and Communications Security 2025 |
|---|---|
| Abbreviated title | ASIA CCS 2025 |
| Country/Territory | Vietnam |
| City | Hanoi |
| Period | 25 Aug 2025 → 29 Aug 2025 |
Keywords
- Deep learning
- Knowledge distillation
- Large language models
- Software vulnerability detection