Macquarie University

A novel framework for author obfuscation using generalised differential privacy

posted on 2022-03-28, 14:16 authored by Natasha Fernandes
The problem of obfuscating the authorship of a text document has received little attention in the literature to date. Current approaches are ad hoc and rely on assumptions about an adversary's auxiliary knowledge, which makes it difficult to reason about the privacy properties of these methods. Another approach to privacy, known as differential privacy, is advocated in the literature for its strong privacy guarantees. However, differential privacy has been dismissed as an option for text document privacy due to its design around the release of aggregate statistics and its dependence on notions of 'adjacency', neither of which apply to text document privacy. In addition, differential privacy does not permit the release of individual data points, as required for text document publishing. However, a new approach to privacy known as generalised differential privacy extends differential privacy to arbitrary datasets with no notion of adjacency, and permits the private release of individual data points. In this thesis, we show how to apply generalised differential privacy to author obfuscation, drawing inspiration from the example of geo-location privacy, and utilising existing tools and methods from the stylometry and natural language processing literature.


Table of Contents

1: Introduction -- 2: Author Obfuscation -- 3: An Exploration of Privacy -- 4: Theoretical Framework -- 5: Experimental Results -- 6: Concluding Remarks.


Bibliography: pages 52-56. Theoretical thesis.

Awarding Institution

Macquarie University

Degree Type

Thesis MRes


MRes, Macquarie University, Faculty of Science and Engineering, Department of Biological Sciences

Department, Centre or School

Department of Biological Sciences

Year of Award



Copyright Natasha Fernandes 2017.




1 online resource (56 pages) graphs, tables

Former Identifiers