ChatGPT for vulnerability detection, classification, and repair: How far are we?

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

50 Citations (Scopus)

Abstract

Large language models (LLMs) like ChatGPT (i.e., gpt-3.5-turbo and gpt-4) exhibited remarkable advancement in a range of software engineering tasks associated with source code such as code review and code generation. In this paper, we undertake a comprehensive study by instructing ChatGPT for four prevalent vulnerability tasks: function and line-level vulnerability prediction, vulnerability classification, severity estimation, and vulnerability repair. We compare ChatGPT with state-of-the-art language models designed for software vulnerability purposes. Through an empirical assessment employing extensive real-world datasets featuring over 190,000 C/C++ functions, we found that ChatGPT achieves limited performance, trailing behind other language models in vulnerability contexts by a significant margin. The experimental outcomes highlight the challenging nature of vulnerability prediction tasks, requiring domain-specific expertise. Despite ChatGPT's substantial model scale, exceeding that of source code-pre-trained language models (e.g., CodeBERT) by a factor of 14,000, the process of fine-tuning remains imperative for ChatGPT to generalize for vulnerability prediction tasks. We publish the studied dataset, experimental prompts for ChatGPT, and experimental results at https://github.com/awsm-research/ChatGPT4Vul.

Original language: English
Title of host publication: Proceedings - 2023 30th Asia-Pacific Software Engineering Conference, APSEC 2023
Editors: Eunkyoung Jee, Soojin Park
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 632-636
Number of pages: 5
ISBN (Electronic): 9798350344172
ISBN (Print): 9798350344189
DOIs
Publication status: Published - 2023
Event: Asia-Pacific Software Engineering Conference 2023 - Seoul, Korea, South
Duration: 4 Dec 2023 to 7 Dec 2023
Conference number: 30th
https://ieeexplore.ieee.org/xpl/conhome/10479191/proceeding (Proceedings)
https://conf.researchr.org/home/apsec-2023 (Website)

Conference

Conference: Asia-Pacific Software Engineering Conference 2023
Abbreviated title: APSEC 2023
Country/Territory: Korea, South
City: Seoul
Period: 4/12/23 to 7/12/23
Internet address

Keywords

  • ChatGPT
  • Cybersecurity
  • Large Language Models
  • Software Security
  • Software Vulnerability
