Sybil attacks on differential privacy based federated learning
Machine learning and deep learning techniques have become prevalent in artificial intelligence. The rise of applications in autonomous vehicles, healthcare, and finance raises practical challenges, as these systems are exposed to a variety of attacks. Federated learning was proposed to train global models on distributed devices while learning from unbalanced and non-IID (not independently and identically distributed) data and preserving privacy. In federated learning, each device acts as a client and owns a private training dataset that is invisible to other parties, which protects data privacy and data security. However, the loose federation of participating devices in this decentralized approach introduces potential security threats in the communication among these nodes. The state-of-the-art privacy-preserving technique in the context of federated learning is user-level differential privacy, which bounds the information disclosed about each client's entire dataset rather than about individual records. Despite this, such a mechanism is vulnerable to specific model poisoning attacks such as sybil attacks, in which a malicious adversary creates multiple fake clients or coordinates compromised devices to directly manipulate model updates. Recent defenses against model poisoning attacks struggle to detect sybil attacks when a differential privacy mechanism is in use, because the added perturbation masks clients' model updates.

This thesis focuses on federated learning settings where user-level differential privacy is deployed, and it makes three contributions. The first contribution is to implement sybil attacks on differential-privacy-based federated learning architectures and to show the impact of these model poisoning attacks on model convergence. The attack intensity depends on the number of sybil clients and on the noise level of each sybil, which is governed by its local differential privacy budget ε. The second contribution is to propose a method to detect and defend against sybil attacks in a differential-privacy-based federated learning setting. The key insight is that poisoned model parameters from sybil clients can be identified because, in each training round, they induce higher prediction loss on the global model than updates from honest clients. When the central server aggregates the clients' models, the updates from sybil clients therefore drive up the global loss and hinder the convergence of the global model. The third contribution is to apply our attacks to two recent Byzantine-resilient aggregation defense mechanisms, Krum and Trimmed Mean.

Our evaluation on the MNIST and CIFAR-10 datasets demonstrates that the proposed sybil attacks dramatically increase the training loss of the global model under these state-of-the-art defense mechanisms. We also conduct an empirical study showing that our defense approach effectively mitigates the impact of our model poisoning attacks on model convergence.
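
To make the threat model concrete, the following is a minimal sketch, not the exact implementation evaluated in this thesis, of how a sybil client could craft a poisoned update while still appearing to follow the user-level differential privacy protocol: the update is pushed along an adversarial direction, clipped to the expected norm bound, and perturbed with Gaussian noise whose scale is calibrated to the local privacy budget ε. The parameter names (clip_norm, boost, delta) and the standard Gaussian-mechanism calibration are illustrative assumptions.

```python
import numpy as np

def sybil_update(global_weights, adversarial_direction, clip_norm=1.0,
                 boost=5.0, epsilon=1.0, delta=1e-5, rng=None):
    """Illustrative poisoned update from a single sybil client.

    The sybil scales an adversarial direction so it survives averaging with
    honest clients, clips the result to the norm bound the server expects
    from differentially private clients, and adds Gaussian noise calibrated
    to its local privacy budget (epsilon, delta).
    """
    if rng is None:
        rng = np.random.default_rng()

    # Boosted malicious update along the adversarial direction.
    update = boost * adversarial_direction

    # Clip to the L2 norm bound used by the DP mechanism.
    norm = np.linalg.norm(update)
    if norm > clip_norm:
        update = update * (clip_norm / norm)

    # Gaussian mechanism: standard noise scale for (epsilon, delta)-DP
    # with L2 sensitivity equal to clip_norm.
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noisy_update = update + rng.normal(0.0, sigma, size=update.shape)

    return global_weights + noisy_update
```

Because a smaller ε implies a larger noise scale σ, the attack intensity trades off between the number of sybil clients and the per-sybil noise level, matching the dependence on ε described above.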
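
The loss-based defense can likewise be illustrated with a sketch of a loss-aware aggregation step: the server evaluates each submitted model on a small held-out batch and excludes updates whose loss is anomalously high before averaging. The held-out batch, the median-based threshold, and the helper evaluate_loss are assumptions for illustration, not the thesis's exact detection rule.

```python
import numpy as np

def loss_filtered_aggregate(client_models, evaluate_loss, tolerance=1.5):
    """Illustrative loss-based aggregation defense.

    client_models : list of weight vectors (np.ndarray), one per client.
    evaluate_loss : callable mapping a weight vector to its prediction loss
                    on a small server-side validation batch.
    tolerance     : updates whose loss exceeds tolerance * median loss are
                    treated as suspected sybils and excluded.
    """
    losses = np.array([evaluate_loss(w) for w in client_models])
    threshold = tolerance * np.median(losses)

    kept = [w for w, loss in zip(client_models, losses) if loss <= threshold]
    if not kept:  # fall back to plain averaging if everything was filtered
        kept = client_models

    # Federated averaging over the surviving (presumed honest) clients.
    return np.mean(np.stack(kept), axis=0)
```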
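
For context on the third contribution, coordinate-wise Trimmed Mean is one of the two Byzantine-resilient aggregation rules attacked in this thesis: for each model parameter, the β largest and β smallest client values are discarded and the remainder are averaged. A minimal sketch, with β as an assumed parameter name:

```python
import numpy as np

def trimmed_mean(client_updates, beta):
    """Coordinate-wise trimmed mean over a (num_clients, dim) array.

    For each coordinate, the beta largest and beta smallest client values
    are discarded and the remaining values are averaged.
    """
    updates = np.sort(np.asarray(client_updates), axis=0)  # sort per coordinate
    assert 2 * beta < len(updates), "need more clients than trimmed values"
    trimmed = updates[beta:len(updates) - beta]             # drop the extremes
    return trimmed.mean(axis=0)
```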