Macquarie University

Adversarial Machine Learning for Enhanced Malware Detection

thesis
posted on 2024-08-28, 00:44 authored by Maryam Shahpasand

Malware is executable code designed to inflict damage or abuse when run on a device. To improve the ability of target systems to recognise malware and protect themselves, a wide range of Machine Learning (ML) models has been applied. However, ML models are widely criticised for their vulnerability to Adversarial Examples (AEs): carefully crafted inputs that can mislead an ML model into producing an adversary’s desired output. In malware detection, an ML model that correctly detects a malware input can be fooled by a perturbed variant of that malware that exhibits benign-looking characteristics.

To improve the performance of ML models, a target’s classifier (which assesses whether a given input is malware or benign) can be trained using an attack model, which emulates a sophisticated attacker. In this thesis, we use well-known Android data sets, Drebin and APKPure, to conduct a comprehensive comparative study of three feature-based attack models taken from the computer vision literature. Such attacks generate variants of a malware feature vector, each representing a hypothetical variant of the actual malware. The results indicate that a Generative Adversarial Network (GAN) is the best at crafting malware that appears benign, but it has high distortion; i.e., it applies a large number of changes to the feature representation of the original malware version.
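As a rough illustration of distortion as a count of changed features, the following minimal sketch assumes binary feature vectors and uses the Hamming distance. The function name `distortion` and the example vectors are hypothetical; the thesis may define or normalise distortion differently.

```python
from typing import Sequence


def distortion(original: Sequence[int], adversarial: Sequence[int]) -> int:
    """Count the feature positions that differ between two vectors.

    Assumption: features are binary (present/absent), so the count of
    differing positions is the Hamming distance.
    """
    if len(original) != len(adversarial):
        raise ValueError("feature vectors must have the same length")
    return sum(1 for a, b in zip(original, adversarial) if a != b)


# Hypothetical feature vectors: the variant differs in two positions.
malware = [1, 0, 1, 1, 0, 0]
variant = [1, 1, 1, 1, 0, 1]
print(distortion(malware, variant))  # 2
```

Under this measure, an attack with low distortion changes only a few features of the original malware vector.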

Based on this first study, we propose a GAN-based model that minimises distortion without compromising the ability to mislead classifiers. The model's performance is examined against ML-based malware detectors with high detection accuracy, i.e., Drebin and MaMaDroid. It is shown that the proposed model is highly successful against the most accurate ML classifier while applying only slight modifications to the malware feature vector.

Lastly, an Iterative Adversarial Retraining (IAR) model is proposed for classifiers to better recognise malware. At each retraining iteration, the model uses successful and previously unseen AEs from an attack model to augment the training set. The retrained classifiers demonstrate greater robustness that extends to assessing unfamiliar malware. The results also indicate that, among the attack models used to train the classifiers, the GAN-based attack model proposed in this thesis provides the highest robustness against other attacks with the least impact on the classifier's performance on the original malware samples.
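The retraining loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: `train` and `attack` are hypothetical callables supplied by the caller, feature vectors are tuples of 0/1 values, a classifier returns 1 for malware and 0 for benign, and the stopping rule (no new successful AEs, or `max_iterations` reached) is an assumption.

```python
from typing import Callable, List, Sequence, Set, Tuple

Features = Tuple[int, ...]
Classifier = Callable[[Features], int]  # 1 = malware, 0 = benign (assumed)


def iterative_adversarial_retraining(
    train: Callable[[Sequence[Features], Sequence[Features]], Classifier],
    attack: Callable[[Classifier, Features], Features],
    benign: Sequence[Features],
    malware: Sequence[Features],
    max_iterations: int = 5,
) -> Tuple[Classifier, List[Features]]:
    """Retrain on successful, previously unseen adversarial examples."""
    seen: Set[Features] = set(malware)
    augmented: List[Features] = list(malware)
    classifier = train(benign, augmented)
    for _ in range(max_iterations):
        new_examples: List[Features] = []
        for sample in malware:
            candidate = attack(classifier, sample)
            # Keep only AEs that evade the classifier and are new.
            if classifier(candidate) == 0 and candidate not in seen:
                seen.add(candidate)
                new_examples.append(candidate)
        if not new_examples:
            break  # the attack found nothing new; stop retraining
        augmented.extend(new_examples)
        classifier = train(benign, augmented)
    return classifier, augmented


# Toy stand-ins (hypothetical) so the sketch runs end to end.
def toy_train(benign: Sequence[Features], malware: Sequence[Features]) -> Classifier:
    known = set(malware)  # memorises the malware vectors it was shown
    return lambda x: 1 if x in known else 0


def toy_attack(classifier: Classifier, sample: Features) -> Features:
    # Sets the first absent feature to present; ignores the classifier.
    for i, value in enumerate(sample):
        if value == 0:
            return sample[:i] + (1,) + sample[i + 1:]
    return sample


clf, data = iterative_adversarial_retraining(
    toy_train, toy_attack, benign=[(0, 0, 0)], malware=[(1, 0, 0)]
)
print(data)  # [(1, 0, 0), (1, 1, 0)]
```

In this toy run the first iteration finds one evading variant, the classifier is retrained on it, and the second iteration finds nothing new, so the loop stops.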

Overall, GAN-based models offer promising performance compared to other attack models in generating successful attacks and enhancing classifier robustness. These results have important applications for reliable malware detection in adversarial environments, leading to stronger protection for computers and a reduced risk of harm to them.

History

Table of Contents

Chapter 1. Introduction -- Chapter 2. Background and related work -- Chapter 3. Analysis of evasion attacks on machine learning-based malware detectors -- Chapter 4. A generative adversarial network for effective adversarial malware samples -- Chapter 5. Machine vs. machine: can adversarial defence stay ahead of adversarial attack? -- Chapter 6. Conclusion -- References -- Appendices

Awarding Institution

Macquarie University

Degree Type

Thesis PhD

Degree

Doctor of Philosophy

Department, Centre or School

Department of Computing

Year of Award

2022

Principal Supervisor

Len Hamey

Additional Supervisor 1

Mohamed Ali (Dali) Kaafar

Rights

Copyright: The Author
Copyright disclaimer: https://www.mq.edu.au/copyright-disclaimer

Language

English

Extent

183 pages
