Knowns and unknowns: an assessment of knowledge shortfalls in the digitised collection of Australia's flora
Thesis posted on 2022-03-29, 01:45, authored by Md Mohasinul Haque
Massive digitisation of natural history collections (NHCs), the predominant source of primary biodiversity data (i.e. species occurrence information), has provided myriad opportunities for studying biological diversity across space and time. Despite recent efforts to collate centuries of biodiversity inventories into comprehensive databases, these collections suffer inherent limitations in their spatial, temporal and taxonomic dimensions. Identifying these limitations is a priority to ensure that multiple targets specified by the Convention on Biological Diversity are met. In this thesis, which consists of four data chapters, I assess spatial, temporal and taxonomic patterns in the digitisation of data held within the Australasian Virtual Herbarium (AVH), the largest electronic source of plant occurrence records in the country. In Chapter 2, I document spatial biases in the number of occurrence records from across Australia, with the Human Influence Index being a strong predictor of this bias. In Chapter 3, I demonstrate temporal biases, with 80% of records collected between 1970 and 1999. Furthermore, only 18% of the continent is represented by a relatively complete inventory consistently sampled over the last 200 years. I also found that around 25% of digitised specimens are missing key attribute information (i.e. collection date, taxonomic identification or geographic coordinates). An assessment of taxonomic bias in Chapter 4 indicates that, for one-third of Australia's plant families, the number of preserved specimens per family is not proportional to the family's known species richness. There is also a strong positive correlation between the number of collectors sampling a family and the taxonomic bias of that family. Finally, in Chapter 5, I demonstrate that digitisation effort over the last three decades varies significantly among Australia's herbaria: a time lag in digitisation means that only 30% of specimens are digitised within a year of collection.
As the uses of primary biodiversity data continue to expand, my findings can direct future strategic sampling and digitisation efforts to increase our knowledge of Australia's flora.