SAFE: A Novel Approach For Software Vulnerability Detection from Enhancing The Capability of Large Language Models

Van Nguyen, Surya Nepal, Xingliang Yuan, Tingmin Wu, Carsten Rudolph

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Software vulnerabilities (SVs) have emerged as a prevalent and crucial concern for safety-critical systems. This has spurred significant advancements in utilizing AI-based methods, including machine learning and deep learning, for software vulnerability detection (SVD). While AI-based methods have shown promising performance in SVD, their effectiveness on real-world, complex, and diverse source code datasets remains limited in practice. To tackle this challenge, in this paper, we propose a novel framework that enhances the capability of large language models to learn and utilize semantic and syntactic relationships from source code data for SVD. As a result, our proposed SAFE approach can enable the acquisition of fundamental knowledge from source code data while adeptly utilizing crucial relationships, i.e., semantic and syntactic associations, to improve the effectiveness of solving the SVD problem. The rigorous and extensive experimental results on three real-world challenging datasets (i.e., Devign, ReVeal, and D2A) demonstrate the superiority of our approach over eight effective and state-of-the-art baselines. In summary, on average, our SAFE approach achieves higher performances from 4.79% to 11.57% for F1-measure and from 16.93% to 26.24% for Recall compared to the baseline methods across all the datasets used.

Original language: English
Title of host publication: Proceedings of the 20th ACM ASIA Conference on Computer and Communications Security
Editors: Dinh Tien Tuan Anh, Tong Van Van
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 392-406
Number of pages: 15
ISBN (Electronic): 9798400714108
Publication status: Published - 2025
Event: ACM ASIA Conference on Computer and Communications Security 2025 - Hanoi, Vietnam
Duration: 25 Aug 2025 to 29 Aug 2025
Conference number: 20th
https://dl.acm.org/doi/proceedings/10.1145/3708821 (Proceedings)
https://asiaccs2025.hust.edu.vn/ (Website)

Publication series

Name: Proceedings of the ACM Conference on Computer and Communications Security
Publisher: Association for Computing Machinery (ACM)
ISSN (Print): 1543-7221

Conference

Conference: ACM ASIA Conference on Computer and Communications Security 2025
Abbreviated title: ASIA CCS 2025
Country/Territory: Vietnam
City: Hanoi
Period: 25/08/25 to 29/08/25

Keywords

  • Deep learning
  • Knowledge distillation
  • Large language models
  • Software vulnerability detection
