Adversarial Machine Learning for Enhanced Malware Detection
Malware is executable code designed to inflict damage or enable abuse when run on a device. A wide range of Machine Learning (ML) models has been applied to improve the ability of target systems to recognise malware and protect themselves. However, ML models are heavily criticised for their vulnerability to Adversarial Examples (AEs): carefully crafted inputs that mislead a model into producing an adversary's desired output. In malware detection, an ML model that correctly detects a malware input can be fooled by a perturbed variant of that malware crafted to resemble benign software.
To improve the performance of ML models, a target's classifier (which assesses whether a given input is malware or benign) can be trained using an attack model that emulates a sophisticated attacker. In this thesis, we use well-known Android data sets, Drebin and APKPure, to conduct a comprehensive comparative study of three feature-based attack models taken from the computer vision literature. Such attacks generate variants of a malware feature vector, each representing a hypothetical variant of the actual malware. The results indicate that a Generative Adversarial Network (GAN) is the best at crafting malware that appears benign, but it incurs high distortion; i.e., it applies a large number of changes to the feature representation of the original malware.
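To make the feature-space setting concrete, the following minimal Python sketch perturbs a Drebin-style binary feature vector by greedily adding the features that push a classifier towards the benign class, counting distortion as the number of flipped bits. The synthetic data, the logistic-regression stand-in, and the evade helper are illustrative assumptions, not the attack models studied in the thesis.

    # Minimal sketch of feature-space evasion: starting from a malware
    # feature vector, greedily add (0 -> 1) the features whose weights
    # push hardest towards the benign class, until the sample is
    # misclassified. Distortion = number of flipped bits.
    # All names and data here are illustrative assumptions.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Toy stand-in for Drebin-style binary feature vectors
    # (1 = app uses the feature, e.g. a permission or API call).
    X = rng.integers(0, 2, size=(500, 30))
    y = rng.integers(0, 2, size=500)  # 1 = malware, 0 = benign (synthetic labels)
    clf = LogisticRegression(max_iter=1000).fit(X, y)

    def evade(x_mal, clf, max_flips=10):
        """Greedily set absent features whose weights most favour the
        benign class; only additions are used, so malware behaviour
        encoded by the existing features is preserved."""
        x = x_mal.copy()
        w = clf.coef_[0]               # positive weight => pushes towards "malware"
        candidates = np.argsort(w)     # most benign-leaning features first
        flips = 0
        for j in candidates:
            if flips >= max_flips or clf.predict(x.reshape(1, -1))[0] == 0:
                break
            if x[j] == 0 and w[j] < 0:
                x[j] = 1
                flips += 1
        return x, flips                # flips = distortion

    x_adv, distortion = evade(X[y == 1][0], clf)
    print("evades detection:", clf.predict(x_adv.reshape(1, -1))[0] == 0,
          "| distortion:", distortion)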
Based on this first study, we propose a GAN-based model that minimises distortion without compromising the ability to mislead classifiers. The model's performance is examined against ML-based malware detectors with high detection accuracy, namely Drebin and MaMaDroid. We show that the proposed model is highly successful against the most accurate ML classifier while applying only slight modifications to the malware feature vector.
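The distortion/evasion trade-off can be expressed directly in a GAN-style training objective. The sketch below, loosely in the spirit of such feature-space attacks, trains a generator to add features so that a substitute detector labels the result benign, with an L1 penalty on the added features to keep distortion low. The layer sizes, LAMBDA_DIST, and the untrained substitute network are assumptions for illustration, not the thesis's exact architecture.

    # Minimal sketch of a distortion-aware GAN-style attack: a generator
    # learns to add features so a substitute detector outputs "benign",
    # while an L1 penalty on the added features keeps distortion low.
    # In practice the substitute would first be trained to mimic the
    # target detector; here it is an untrained stand-in.
    import torch
    import torch.nn as nn

    N_FEATURES, NOISE_DIM, LAMBDA_DIST = 30, 8, 0.05  # illustrative values

    generator = nn.Sequential(
        nn.Linear(N_FEATURES + NOISE_DIM, 64), nn.ReLU(),
        nn.Linear(64, N_FEATURES), nn.Sigmoid(),   # per-feature "add" probabilities
    )
    substitute = nn.Sequential(                    # stand-in for the target detector
        nn.Linear(N_FEATURES, 32), nn.ReLU(),
        nn.Linear(32, 1),                          # logit: > 0 means "malware"
    )
    opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

    x_mal = torch.randint(0, 2, (128, N_FEATURES)).float()  # toy malware batch

    for step in range(200):
        z = torch.randn(x_mal.size(0), NOISE_DIM)
        add = generator(torch.cat([x_mal, z], dim=1))
        # Only additions: keep every original feature, softly add new ones.
        x_adv = torch.clamp(x_mal + (1 - x_mal) * add, 0, 1)
        logits = substitute(x_adv).squeeze(1)
        # Adversarial term pushes the detector towards "benign" (logit < 0);
        # the L1 term on added features penalises distortion.
        loss = logits.mean() + LAMBDA_DIST * ((1 - x_mal) * add).sum(dim=1).mean()
        opt.zero_grad(); loss.backward(); opt.step()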
Lastly, an Iterative Adversarial Retraining (IAR) model is proposed to help classifiers better recognise malware. At each retraining iteration, the model augments the training set with successful and previously unseen AEs produced by an attack model. The retrained classifiers demonstrate greater robustness that extends to previously unseen malware. The results also indicate that, among the attack models used for training the classifiers, the GAN-based attack model proposed in this thesis provides the highest robustness against other attacks while having the least impact on the classifier's performance on the original malware samples.
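A minimal sketch of such a retraining loop, assuming an attack function with the same interface as the evade helper above: at each iteration, only AEs that both evade the current classifier and have not been seen before are added to the training set before refitting. The names and iteration count are illustrative.

    # Minimal sketch of Iterative Adversarial Retraining (IAR), assuming
    # attack(x, clf) returns (x_adv, distortion) as in the earlier sketch.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def iterative_adversarial_retraining(X, y, attack, n_iters=3):
        X_aug, y_aug = X.copy(), y.copy()
        clf = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
        seen = set()
        for _ in range(n_iters):
            new_aes = []
            for x in X[y == 1]:                    # attack each malware sample
                x_adv, _ = attack(x, clf)
                key = x_adv.tobytes()
                # Keep only successful AND previously unseen AEs.
                if clf.predict(x_adv.reshape(1, -1))[0] == 0 and key not in seen:
                    seen.add(key)
                    new_aes.append(x_adv)
            if not new_aes:
                break                              # the attack no longer succeeds
            # Augment with the AEs, labelled as malware, and refit.
            X_aug = np.vstack([X_aug, np.array(new_aes)])
            y_aug = np.concatenate([y_aug, np.ones(len(new_aes), dtype=int)])
            clf = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
        return clf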
Overall, GAN-based models offer promising performance compared with other attack models, both in generating successful attacks and in enhancing classifier robustness. These results have important applications in reliable malware detection in adversarial environments, providing stronger protection for end-user devices and reducing the risk of harm to them.