Posted on 2022-03-28, 18:33, authored by Shakila Mahjabin Tonni
Machine Learning (ML) techniques are used by most data-driven organisations to extract insights. In addition, Machine-Learning-as-a-Service (MLaaS), where models are trained on potentially sensitive user data and then queried by external parties, is becoming a reality. However, these systems have recently been shown to be vulnerable to Membership Inference Attacks (MIA), in which an adversary can infer whether or not a target's data belongs to the training data. While the key factors behind the success of MIA are not yet fully understood, existing defence mechanisms consider only model-specific properties. In this thesis, we investigate the impact of both data and ML model properties on the vulnerability of ML techniques to MIA. Our analysis indicates a strong relationship between MIA success and properties of the data in use, such as dataset size and class balance, as well as model properties including prediction fairness and the mutual information between the data and the model's parameters. We provide recommendations for assessing the possible information leakage from a given dataset and propose new approaches to protect ML models from MIA by using several of these properties, e.g. the model's fairness and the mutual information between the data and the model's parameters, as regularizers. These approaches reduce attack accuracy by 25% while yielding a fairer and better-performing ML model.
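As a minimal sketch of the regularization idea described in the abstract (assuming PyTorch, a toy linear classifier, and a binary group attribute; the names and the exact penalty are illustrative assumptions, not the thesis's formulation), a training loss can combine cross-entropy with a penalty on the loss gap between groups:

import torch
import torch.nn as nn

def regularized_loss(model, x, y, group, lam=1.0):
    """Cross-entropy plus a fairness-style penalty: the absolute gap
    between the mean per-sample losses of two groups. Shrinking this gap
    makes per-record behaviour more uniform, which is the intuition behind
    regularizing a model against membership inference."""
    per_sample = nn.functional.cross_entropy(model(x), y, reduction="none")
    gap = (per_sample[group == 0].mean() - per_sample[group == 1].mean()).abs()
    return per_sample.mean() + lam * gap

# Toy usage on synthetic data (hypothetical shapes and lam value).
model = nn.Linear(10, 2)
x = torch.randn(64, 10)
y = torch.randint(0, 2, (64,))
group = torch.arange(64) % 2  # deterministic split so both groups are non-empty
loss = regularized_loss(model, x, y, group, lam=0.5)
loss.backward()

A mutual-information regularizer would replace the gap term with an estimate of the dependence between the training data and the model's parameters, which generally requires an auxiliary estimator and is omitted here.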
Table of Contents
1 Introduction
2 Research Methodology
3 Experimental Setup
4 Exploratory Analysis of the Impact of Different Properties on MIA
5 Towards MIA-resilient ML models
6 Conclusion & Future Work
Notes
Theoretical thesis.
Bibliography: pages 39-41
Awarding Institution
Macquarie University
Degree Type
Thesis MRes
Degree
MRes, Macquarie University, Faculty of Science and Engineering, Department of Computing