The Taxonomy 2028 Challenge: Validation of Australia’s vascular plant collection data

Herbarium collections are the primary source of verifiable data on Australia’s flora. The information associated with each collection, including the taxon name and its locality, underpins research across a broad range of disciplines. Thanks to advances in cyberinfrastructure and the development of novel bioinformatics tools and techniques, biodiversity and distribution data can now be explored and analysed within a phylogenetic and environmental framework, providing a greater evolutionary understanding of our flora and novel data to inform conservation planning. However, to maximise the outcomes of big data analyses it is imperative that we improve the quality of the data upon which these analyses are based.

Specimen identification errors are commonplace in herbaria and are not confined to taxonomic groups that lack a recent monograph. They also exist (to varying degrees) in groups that have been revised in the past 40 years, most notably in collections that were made after a taxonomic treatment was published or that have otherwise not been examined by the treatment’s author. Geocode errors (e.g. mistakes made at the point of data entry, miscalculations and labelling errors) are similarly rife, and range from extreme to relatively minor in magnitude.

Taxonomic and geographic errors in our biodiversity data reduce our knowledge of a taxon’s distribution and habitat requirements, and result in the dissemination of inaccurate information to our stakeholders (e.g. incorrect distribution maps, wrongly identified voucher specimens and photographs). They also undermine the results and interpretation of phylogenetic studies and the accuracy of spatial analyses or environmental modelling, and may significantly affect conservation planning at the species or regional level. Furthermore, a significant amount of useful, high-quality data remains inaccessible in specimen backlogs.

By 2028, all vascular plant collections in Australian herbaria will be audited for taxonomic and geographic accuracy

The audit should include: a taxonomic assessment of all collections, with an emphasis on those that have not been verified by a taxonomic expert; cross-checking duplicates of a single gathering housed at different herbaria to ensure that they have matching identifications (and are therefore represented by just one dot on Australasia’s Virtual Herbarium); preparation, databasing and taxonomic verification of all backlog materials, including undatabased collections that are currently on loan to other institutions; and validation of locality and geocode information, particularly for all geographical outliers that have been taxonomically confirmed. Value could be added to this process by capturing information on the reproductive state of every specimen (i.e. whether flowering, fruiting or sterile), thereby informing future collection needs and phenological research.
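As an illustration of the duplicate cross-check described above, the following minimal Python sketch groups specimen records by gathering and flags any gathering whose duplicates carry different identifications. It is a sketch under stated assumptions only: the field names (herbarium, collector, number, taxon), the matching key (collector plus collector number) and the sample records are all hypothetical and do not reflect any real herbarium database schema or real specimens.

```python
from collections import defaultdict

# Invented sample records; field names and values are illustrative only.
records = [
    {"herbarium": "PERTH", "collector": "Collector X", "number": "1021", "taxon": "Taxon A"},
    {"herbarium": "CANB", "collector": "Collector X", "number": "1021", "taxon": "Taxon A"},
    {"herbarium": "MEL", "collector": "Collector X", "number": "1021", "taxon": "Taxon B"},
    {"herbarium": "NSW", "collector": "Collector Y", "number": "77", "taxon": "Taxon C"},
    {"herbarium": "AD", "collector": "Collector Y", "number": "77", "taxon": "Taxon C"},
]


def find_conflicting_duplicates(records):
    """Return gatherings whose duplicates carry more than one taxon name.

    A gathering is assumed to be identified by collector and collector
    number; a real audit would likely also need the collection date and
    some normalisation of collector names.
    """
    names_by_gathering = defaultdict(lambda: defaultdict(list))
    for rec in records:
        key = (rec["collector"].strip().lower(), rec["number"].strip())
        names_by_gathering[key][rec["taxon"].strip()].append(rec["herbarium"])

    # Keep only gatherings with two or more distinct identifications.
    return {
        key: {name: sorted(herbaria) for name, herbaria in names.items()}
        for key, names in names_by_gathering.items()
        if len(names) > 1
    }


conflicts = find_conflicting_duplicates(records)
for gathering, names in conflicts.items():
    print(gathering, names)
```

A check of this kind only flags mismatches; deciding which identification is correct still requires examination of the specimens by a taxonomic expert.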

An audit would lead to the discovery of new taxa and new populations of conservation-listed taxa. Indeed, many scientists are undoubtedly already aware of new taxa that are represented in herbarium collections but are not yet on the National Species List. This knowledge should be captured as part of this process, particularly if taxonomic publications are unlikely to be forthcoming in the short term.

An audit would improve the quality of data fundamental to our understanding of Australia’s biodiversity and its evolution. It would underpin the eFlora of Australia, enabling more accurate descriptions and distribution maps to be generated, and would improve the quality of derivative products such as regional or taxon-specific apps and field guides. An audit would have tangible conservation outcomes, providing better information for individual taxa and improving analyses directed at conservation planning and decision-making, and would also reduce the amount of time required for data cleaning prior to a large-scale analysis.

Identification errors often arise from imperfect taxonomic knowledge, so an audit of this nature could not be completely uncoupled from baseline taxonomic research. It would, however, focus attention on collection gaps and on future research needs (e.g. specimens, species or groups in need of further research could be flagged and prioritised; potential student research projects could be highlighted).

We will need a significant number of skilled research scientists and identification botanists to conduct a taxonomic audit of collections at their home institution, collections originating from their home state but housed at other national herbaria, and specimens belonging to their taxonomic speciality groups. We will also need additional curatorial staff to database backlog material, validate questionable geocodes and localities, perform database edits and maintain existing collections (e.g. duplicates from other states). Staffing levels will need to be maintained into the future to ensure incoming collections are processed and verified without major delay.

Our collections underpin everything — let’s give them the attention they deserve.