Large scale author identification and obfuscation via deep learning
Benefiting from advances in machine learning, and specifically neural networks, many natural language processing problems have seen significant improvements, including the two that are the focus of this thesis: author identification (i.e. identifying the author of a text) and author obfuscation (i.e. masking the author's identity). However, most systems for both tasks are designed for a limited number of authors or rely on manually engineered rules.
This thesis proposes a deep-learning-based solution for large-scale author identification and obfuscation, in which there is, in principle, no limit on the number of authors. We first target large-scale author identification using a similarity-based model. We then study the applicability of adversarial techniques to similarity-based classifiers, formulated as large-scale author obfuscation.
Author Identification

Author identification applies to important matters such as detecting plagiarism or identifying ghostwriters. Approaches to authorship identification are conventionally divided into classification-based and similarity-based. Classification-based approaches assign a piece of text to one of a closed set of known authorial classes; this works well for small numbers of candidate authors. Similarity-based methods compare an unknown (query) piece of text to known (reference) ones and assign the query to the author of the most similar reference; this applies to larger numbers of authors and to authors outside the training set. Existing similarity-based methods, however, embody only static notions of similarity. Deep learning methods, which blur the boundary between classification-based and similarity-based approaches, are promising in their ability to learn a notion of similarity, but have previously been used only in a conventional small closed-class classification setup.
Siamese networks have been used to learn notions of similarity in one-shot image tasks, and in NLP mostly for tasks of semantic relatedness. We examine their application to the stylistic task of authorship identification on datasets with large numbers of authors, exploring multiple energy functions and neural network architectures, and show that they can substantially outperform previous approaches. Our experiments also identify the effect of features such as topic and named entities on the decisions the network makes. Furthermore, in a deeper analysis of the similarity-based model, we examine the scores the network produces on smaller segments of text to study whether they are interpretable.
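The similarity-based setup described above can be sketched in miniature. The following is a purely illustrative, non-neural stand-in for the thesis's learned model: both texts pass through one shared encoder (here, hypothetical character-trigram counts rather than a trained Siamese network), an energy function (here, cosine similarity) scores the pair, and a threshold turns the score into a same-author decision. The function names and the threshold value are assumptions for illustration, not the thesis's actual architecture.

```python
from collections import Counter
from math import sqrt

def encode(text, n=3):
    """Shared 'encoder': character n-gram counts. Both inputs pass through
    this same function, mirroring the weight sharing in a Siamese network.
    (Illustrative stand-in, not the learned encoder from the thesis.)"""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine_similarity(a, b):
    """One possible energy function: cosine similarity of the two vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def same_author(query, reference, threshold=0.5):
    """Similarity-based decision: compare the query against a reference and
    accept the authorship claim if the energy exceeds a threshold."""
    return cosine_similarity(encode(query), encode(reference)) >= threshold
```

Because the decision compares a query to references rather than selecting among a fixed output layer of classes, the same model extends to any number of authors, including authors never seen during training, which is the property the thesis exploits.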
Author Obfuscation

Automatic author obfuscation is needed to avoid disclosing sensitive information, mainly for privacy reasons. Recent advances in deep neural networks have boosted author identification performance, making author obfuscation more challenging. Existing approaches to author obfuscation are largely heuristic. Obfuscation can, however, be framed as the construction of adversarial examples to attack author identification, suggesting that the deep learning architectures used for adversarial attacks could apply here. Current architectures construct adversarial examples against classification-based models, which in author identification would exclude the high-performing similarity-based models employed when facing large numbers of authorial classes.
In this thesis, we propose the first deep learning architecture for constructing adversarial examples against similarity-based learners in text processing, and explore its application to author obfuscation. We analyse the output for both obfuscation success and linguistic acceptability, and compare its performance with common baselines, showing promising results in balancing safety (i.e. how well author attributes are hidden) against the soundness of the perturbed texts.
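The adversarial framing above can be made concrete with a toy sketch. The thesis proposes a deep learning architecture for this; what follows is instead a hypothetical black-box baseline of the kind such an architecture competes with: a greedy search that applies the word substitution reducing the similarity score to a reference text the most, trading off attack strength (few edits) against text soundness. The scorer, substitution table, and edit budget are all assumptions for illustration.

```python
from collections import Counter
from math import sqrt

def trigram_score(a, b):
    """Stand-in for the similarity-based identifier under attack:
    cosine similarity of character-trigram counts (not the thesis's model)."""
    va = Counter(a.lower()[i:i + 3] for i in range(len(a) - 2))
    vb = Counter(b.lower()[i:i + 3] for i in range(len(b) - 2))
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    na = sqrt(sum(v * v for v in va.values()))
    nb = sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def obfuscate(text, reference, substitutions, max_edits=3):
    """Greedy black-box perturbation: on each pass, apply the single word
    substitution that lowers similarity to the reference the most; stop when
    no substitution helps or the edit budget is spent."""
    words = text.split()
    for _ in range(max_edits):
        best_score, best_edit = trigram_score(" ".join(words), reference), None
        for i, w in enumerate(words):
            for alt in substitutions.get(w.lower(), []):
                cand = words[:i] + [alt] + words[i + 1:]
                s = trigram_score(" ".join(cand), reference)
                if s < best_score:
                    best_score, best_edit = s, (i, alt)
        if best_edit is None:
            break
        words[best_edit[0]] = best_edit[1]
    return " ".join(words)
```

Limiting edits and drawing replacements from a fixed substitution table is what keeps the perturbed text sound in this sketch; a learned architecture can instead search the space of rewrites directly, which is where the safety/soundness balance the thesis evaluates comes in.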