The generation of natural descriptions: corpus-based investigations of referring expressions in visual domains
Thesis posted on 28.03.2022, 15:00 by Henriette Anna Elisabeth Viethen
Referring expression generation (REG) has been studied by computational linguists for nearly three decades. Although other aspects of the task have been examined, most investigations into REG are focussed on the selection of those attributes of an object that best distinguish it from all others in its environment. Historically, much of this work has suffered from two problems: firstly, it does not take account of empirical evidence for how people refer; and secondly, it has not been evaluated against human-produced corpora. This thesis is based on two related premises which I take to be self-evident if our ultimate goal is to explain how humans refer: first, that naturalness should be the primary goal of computational models of referring expression generation, and second, that the task therefore needs to be approached by using human-produced corpora for the development and testing of algorithms. Based on these premises, this thesis presents an extensive exploration into how corpora can be used in REG. It makes three main contributions in this area: (1) it presents a study that explores how corpora can be used to evaluate algorithms for the generation of referring expressions, and shows that existing algorithms cannot fully account for the way humans generate referring expressions; (2) it provides a detailed analysis of the different aspects of the human use of referring expressions in two large corpora in order to inform the development of REG algorithms; and (3) it presents experiments in using these corpora to train decision trees for attribute selection for referring expressions. The main conclusion of the analyses and experiments in this thesis is that speaker-specific variation plays a much larger role in the generation of referring expressions than existing algorithms acknowledge. Chapter 2 begins by surveying existing research in the field of REG.
Chapter 3 then provides an in-depth discussion of the methodological choices that have to be made when employing corpora to inform and evaluate REG algorithms. Chapter 4 presents an evaluation of three popular existing REG algorithms using a small corpus of human-produced data. It shows that, while one of the algorithms is capable of generating a large proportion of the referring expressions in the corpus, none of them are even in principle able to generate all of them. The experiment gives rise to a dissection of the issues involved in the evaluation of REG algorithms. Based on the analyses of the previous three chapters, Chapter 5 describes the design, collection and annotation of two large corpora of referring expressions, and analyses how speakers make use of different object properties. The first corpus includes spatial relations between objects, allowing a systematic analysis of the circumstances under which people use relations as well as other properties. The second corpus constitutes the largest systematically-designed single-domain collection of referring expressions to date. Finally, Chapter 6 explores the use of the corpora described in Chapter 5 to train algorithms which model the content selection behaviour of the human participants who contributed the data. Modelling this data using decision trees is a natural way to gain insights into the factors that influence a person's decision to include a particular property in a referring expression and how these factors interact.