Abstract
Machine learning (ML)-based Android malware detection has been one of the most popular research topics in the mobile security community. An increasing number of research studies have demonstrated that machine learning is an effective and promising approach for malware detection, and some works have even claimed that their proposed models could achieve 99% detection accuracy, leaving little room for further improvement. However, numerous prior studies have suggested that unrealistic experimental designs bring substantial biases, resulting in over-optimistic performance in malware detection. Unlike previous research that examined the detection performance of ML classifiers to locate the causes, this study employs Explainable AI (XAI) approaches to explore what ML-based models learned during the training process, inspecting and interpreting why ML-based malware classifiers perform so well under unrealistic experimental settings. We discover that temporal sample inconsistency in the training dataset brings over-optimistic classification performance (up to 99% F1 score and accuracy). Importantly, our results indicate that ML models classify malware based on temporal differences between malware and benign samples, rather than on actual malicious behaviors. Our evaluation also confirms that unrealistic experimental designs lead to not only unrealistic detection performance but also poor reliability, posing a significant obstacle to real-world applications. These findings suggest that XAI approaches should be used to help practitioners and researchers better understand how AI/ML models (such as malware detectors) work, rather than focusing only on accuracy improvement.
Original language | English |
---|---|
Title of host publication | Proceedings - 2022 IEEE 33rd International Symposium on Software Reliability Engineering, ISSRE 2022 |
Editors | Naghmeh Ivaki, Siwei Zhou |
Place of Publication | Piscataway NJ USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 169-180 |
Number of pages | 12 |
ISBN (Electronic) | 9781665451321 |
ISBN (Print) | 9781665451338 |
DOIs | |
Publication status | Published - 2022 |
Event | International Symposium on Software Reliability Engineering 2022 - Charlotte, United States of America. Duration: 31 Oct 2022 → 3 Nov 2022. Conference number: 33rd. https://ieeexplore.ieee.org/xpl/conhome/9978763/proceeding (Proceedings), https://issre2022.github.io/ (Website) |
Publication series
Name | Proceedings - International Symposium on Software Reliability Engineering, ISSRE |
---|---|
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Volume | 2022-October |
ISSN (Print) | 1071-9458 |
ISSN (Electronic) | 2332-6549 |
Conference
Conference | International Symposium on Software Reliability Engineering 2022 |
---|---|
Abbreviated title | ISSRE 2022 |
Country/Territory | United States of America |
City | Charlotte |
Period | 31/10/22 → 3/11/22 |
Keywords
- Android malware
- Explainable AI
- Machine learning
Projects
- Practical and Explainable Analytics to Prevent Future Software Defects
  Australian Research Council (ARC)
  2/03/20 → 2/03/23
  Project: Research
- Enabling Compatible and Secure Mobile Apps via Automated Program Repair
  Li, L.
  Australian Research Council (ARC)
  1/03/20 → 9/09/22
  Project: Research