Macquarie University

Variation and innovation in modern English: corpus-based studies in the grammaticalization of multiword units

Posted on 2022-03-28, 09:50. Authored by Adam Michael Richard Smith.
This dissertation is an empirical inquiry into the lexicalization and grammaticalization of several types of multiword units whose status as fixed lexical units has not been established, and whose grammatical structure and roles are still open to question. They remain on the fringes of codification and classification in current dictionaries and grammars. The set of four published papers embodied in this dissertation investigates light verbs (e.g. have a look), non-numerical quantifiers (e.g. a lot of), complex prepositions (e.g. in spite of) and complex subordinators (e.g. the moment). In their structure, each of these includes a noun phrase, but as units they serve different grammatical functions: those of verb, determiner, preposition and subordinator respectively. These four types of multiword unit have been examined to assess how well they meet the standard criteria for grammaticalization, such as fixity, decategorialization and syntactic reanalysis. A range of standard corpora were used for this study, allowing investigation into the synchronic variation of the items under discussion across different English-language regions and registers, along with some research into recent diachronic developments. Corpora of different sizes were selected to provide sufficient data on high- and low-frequency items. For higher-frequency items, the Australian, British and New Zealand components of the 1-million-word International Corpus of English (ICE) were used, together with ICE-US (written only), complemented by the spoken Santa Barbara Corpus. These smaller corpora also allowed the individual linguistic contexts of examples to be examined more closely. For lower-frequency items, the British National Corpus and the Corpus of Contemporary American English were used, as well as the Corpus of Historical American English, which provided some diachronic data.
Selected examples of linguistic contexts were drawn from these larger corpora (100 million words and over), and non-relevant usages were excluded from the frequency counts by the use of search strings adapted to each item. For each data set, the frequencies of fixed and variable forms of the multiword units were compared, and the wider context was also examined to find examples of indeterminate grammatical use, manifested by factors such as clause position and inconsistent patterns of concord. Data was also gathered from comprehensive and learner grammars, and from dictionaries for first- and second-language users, to gauge the degree of recognition of these marginal or emergent items. The body of research finds that, while each of the multiword units investigated is lexicalized to some extent, there is also syntagmatic evidence of grammaticalization in two cases. For non-numerical quantifiers, the grammatical status of the unit was indicated by whether the singular or plural quantifying noun agrees with the following verb; for complex subordinators, it was indicated by the absence of a preceding preposition and a following relative pronoun, and especially by the unit's position at the start of a clause. The thesis shows that several criteria are needed to establish the grammatical status of a multiword unit, and that some criteria, such as decategorialization, may be less indicative than others. The study proposes a systematic, corpus-based approach towards identifying and classifying emerging multiword units, so as to improve coverage of their contemporary lexicogrammatical functions within grammars and dictionaries.


Table of Contents

Chapter 1. Preliminary discussion -- Chapter 2. Light verbs in Australian, New Zealand and British English (Paper 1) -- Chapter 3. Non-numerical quantifiers (Paper 2) -- Chapter 4. Complex prepositions and variation within the PNP construction (Paper 3) -- Chapter 5. Newly emerging subordinators in spoken/written English (Paper 4) -- Chapter 6. Conclusion.


Bibliography: pages 163-170. Thesis by publication.

Awarding Institution

Macquarie University

Degree Type

Thesis PhD


PhD, Macquarie University, Faculty of Human Sciences, Department of Linguistics

Department, Centre or School

Department of Linguistics

Year of Award


Principal Supervisor

Pam Peters

Additional Supervisor 1

Jan Tent


Copyright Adam Michael Richard Smith 2016. Copyright disclaimer:




1 online resource (xiv, 170 pages) diagrams, tables

Former Identifiers