Categorizing and predicting invalid vulnerabilities on common vulnerabilities and exposures

Qiuyuan Chen, Lingfeng Bao, Li Li, Xin Xia, Liang Cai

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research

Abstract

To share vulnerability information across separate databases, tools, and services, newly identified vulnerabilities are regularly reported to the Common Vulnerabilities and Exposures (CVE) database. Unfortunately, not all vulnerability reports are accepted: some are rejected, and others are accepted but disputed. In this work, we refer to rejected or disputed CVEs as invalid vulnerability reports. Invalid vulnerability reports not only cause unnecessary effort to confirm the vulnerability but also harm the reputation of software vendors. In this paper, we aim to understand the root causes of invalid vulnerability reports and build a prediction model to identify them automatically. To this end, we first use card sorting to categorize invalid vulnerability reports, from which six main reasons are observed for rejected and disputed CVEs, respectively. Then, we propose a text mining approach to predict invalid vulnerability reports. Our experiments show that the proposed text mining approach achieves an AUC score of 0.87 for predicting invalid vulnerabilities. We also discuss the implications of our study: our categorization can guide new committers in avoiding these traps; some root causes of invalid CVEs can be avoided by using automatic techniques or optimizing the reviewing mechanism; and invalid vulnerability report data should not be neglected.
Original language: English
Title of host publication: Proceedings - 25th Asia-Pacific Software Engineering Conference, APSEC 2018
Subtitle of host publication: 4–7 December 2018 Nara, Japan
Editors: Hironori Washizaki, Hongyu Zhang
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 345-354
Number of pages: 10
ISBN (Electronic): 9781728119700
ISBN (Print): 9781728119717
DOIs: https://doi.org/10.1109/APSEC.2018.00049
Publication status: Published - 2018
Event: Asia-Pacific Software Engineering Conference 2018 - http://www.apsec2018.org/, Nara, Japan
Duration: 4 Dec 2018 to 7 Dec 2018
Conference number: 25th

Conference

Conference: Asia-Pacific Software Engineering Conference 2018
Abbreviated title: APSEC 2018
Country: Japan
City: Nara
Period: 4/12/18 to 7/12/18

Keywords

  • invalid CVE
  • prediction model
  • reason categorization

Cite this

Chen, Q., Bao, L., Li, L., Xia, X., & Cai, L. (2018). Categorizing and predicting invalid vulnerabilities on common vulnerabilities and exposures. In H. Washizaki & H. Zhang (Eds.), Proceedings - 25th Asia-Pacific Software Engineering Conference, APSEC 2018: 4–7 December 2018 Nara, Japan (pp. 345-354). [8719428] Piscataway NJ USA: IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/APSEC.2018.00049
Chen, Qiuyuan ; Bao, Lingfeng ; Li, Li ; Xia, Xin ; Cai, Liang. / Categorizing and predicting invalid vulnerabilities on common vulnerabilities and exposures. Proceedings - 25th Asia-Pacific Software Engineering Conference, APSEC 2018: 4–7 December 2018 Nara, Japan. editor / Hironori Washizaki ; Hongyu Zhang. Piscataway NJ USA : IEEE, Institute of Electrical and Electronics Engineers, 2018. pp. 345-354
@inproceedings{277f590c3eea4e49b9b0ae465274b2cc,
title = "Categorizing and predicting invalid vulnerabilities on common vulnerabilities and exposures",
abstract = "To share vulnerability information across separate databases, tools, and services, newly identified vulnerabilities are regularly reported to the Common Vulnerabilities and Exposures (CVE) database. Unfortunately, not all vulnerability reports are accepted: some are rejected, and others are accepted but disputed. In this work, we refer to rejected or disputed CVEs as invalid vulnerability reports. Invalid vulnerability reports not only cause unnecessary effort to confirm the vulnerability but also harm the reputation of software vendors. In this paper, we aim to understand the root causes of invalid vulnerability reports and build a prediction model to identify them automatically. To this end, we first use card sorting to categorize invalid vulnerability reports, from which six main reasons are observed for rejected and disputed CVEs, respectively. Then, we propose a text mining approach to predict invalid vulnerability reports. Our experiments show that the proposed text mining approach achieves an AUC score of 0.87 for predicting invalid vulnerabilities. We also discuss the implications of our study: our categorization can guide new committers in avoiding these traps; some root causes of invalid CVEs can be avoided by using automatic techniques or optimizing the reviewing mechanism; and invalid vulnerability report data should not be neglected.",
keywords = "invalid CVE, prediction model, reason categorization",
author = "Qiuyuan Chen and Lingfeng Bao and Li Li and Xin Xia and Liang Cai",
year = "2018",
doi = "10.1109/APSEC.2018.00049",
language = "English",
isbn = "9781728119717",
pages = "345--354",
editor = "Washizaki, Hironori and Zhang, Hongyu",
booktitle = "Proceedings - 25th Asia-Pacific Software Engineering Conference, APSEC 2018",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",
address = "Piscataway NJ USA",
}

Chen, Q, Bao, L, Li, L, Xia, X & Cai, L 2018, Categorizing and predicting invalid vulnerabilities on common vulnerabilities and exposures. in H Washizaki & H Zhang (eds), Proceedings - 25th Asia-Pacific Software Engineering Conference, APSEC 2018: 4–7 December 2018 Nara, Japan, 8719428, IEEE, Institute of Electrical and Electronics Engineers, Piscataway NJ USA, pp. 345-354, Asia-Pacific Software Engineering Conference 2018, Nara, Japan, 4/12/18. https://doi.org/10.1109/APSEC.2018.00049

Categorizing and predicting invalid vulnerabilities on common vulnerabilities and exposures. / Chen, Qiuyuan; Bao, Lingfeng; Li, Li; Xia, Xin; Cai, Liang.

Proceedings - 25th Asia-Pacific Software Engineering Conference, APSEC 2018: 4–7 December 2018 Nara, Japan. ed. / Hironori Washizaki; Hongyu Zhang. Piscataway NJ USA : IEEE, Institute of Electrical and Electronics Engineers, 2018. p. 345-354 8719428.


TY - GEN

T1 - Categorizing and predicting invalid vulnerabilities on common vulnerabilities and exposures

AU - Chen, Qiuyuan

AU - Bao, Lingfeng

AU - Li, Li

AU - Xia, Xin

AU - Cai, Liang

PY - 2018

Y1 - 2018

N2 - To share vulnerability information across separate databases, tools, and services, newly identified vulnerabilities are regularly reported to the Common Vulnerabilities and Exposures (CVE) database. Unfortunately, not all vulnerability reports are accepted: some are rejected, and others are accepted but disputed. In this work, we refer to rejected or disputed CVEs as invalid vulnerability reports. Invalid vulnerability reports not only cause unnecessary effort to confirm the vulnerability but also harm the reputation of software vendors. In this paper, we aim to understand the root causes of invalid vulnerability reports and build a prediction model to identify them automatically. To this end, we first use card sorting to categorize invalid vulnerability reports, from which six main reasons are observed for rejected and disputed CVEs, respectively. Then, we propose a text mining approach to predict invalid vulnerability reports. Our experiments show that the proposed text mining approach achieves an AUC score of 0.87 for predicting invalid vulnerabilities. We also discuss the implications of our study: our categorization can guide new committers in avoiding these traps; some root causes of invalid CVEs can be avoided by using automatic techniques or optimizing the reviewing mechanism; and invalid vulnerability report data should not be neglected.

AB - To share vulnerability information across separate databases, tools, and services, newly identified vulnerabilities are regularly reported to the Common Vulnerabilities and Exposures (CVE) database. Unfortunately, not all vulnerability reports are accepted: some are rejected, and others are accepted but disputed. In this work, we refer to rejected or disputed CVEs as invalid vulnerability reports. Invalid vulnerability reports not only cause unnecessary effort to confirm the vulnerability but also harm the reputation of software vendors. In this paper, we aim to understand the root causes of invalid vulnerability reports and build a prediction model to identify them automatically. To this end, we first use card sorting to categorize invalid vulnerability reports, from which six main reasons are observed for rejected and disputed CVEs, respectively. Then, we propose a text mining approach to predict invalid vulnerability reports. Our experiments show that the proposed text mining approach achieves an AUC score of 0.87 for predicting invalid vulnerabilities. We also discuss the implications of our study: our categorization can guide new committers in avoiding these traps; some root causes of invalid CVEs can be avoided by using automatic techniques or optimizing the reviewing mechanism; and invalid vulnerability report data should not be neglected.

KW - invalid CVE

KW - prediction model

KW - reason categorization

UR - http://www.scopus.com/inward/record.url?scp=85066781873&partnerID=8YFLogxK

U2 - 10.1109/APSEC.2018.00049

DO - 10.1109/APSEC.2018.00049

M3 - Conference Paper

SN - 9781728119717

SP - 345

EP - 354

BT - Proceedings - 25th Asia-Pacific Software Engineering Conference, APSEC 2018

A2 - Washizaki, Hironori

A2 - Zhang, Hongyu

PB - IEEE, Institute of Electrical and Electronics Engineers

CY - Piscataway NJ USA

ER -

Chen Q, Bao L, Li L, Xia X, Cai L. Categorizing and predicting invalid vulnerabilities on common vulnerabilities and exposures. In Washizaki H, Zhang H, editors, Proceedings - 25th Asia-Pacific Software Engineering Conference, APSEC 2018: 4–7 December 2018 Nara, Japan. Piscataway NJ USA: IEEE, Institute of Electrical and Electronics Engineers. 2018. p. 345-354. 8719428 https://doi.org/10.1109/APSEC.2018.00049