The Taxonomy 2028 Challenge: Validation of Australia’s vascular plant collection data

Herbarium collections are the primary source of verifiable data on Australia’s flora. The information associated with each collection, including the taxon name and its locality, underpins research across a broad range of disciplines. Thanks to advances in cyberinfrastructure and the development of novel bioinformatics tools and techniques, biodiversity and distribution data can now be explored and analysed within a phylogenetic and environmental framework, providing a greater evolutionary understanding of our flora and novel data to inform conservation planning. However, to maximise the outcomes of big data analyses, it is imperative that we improve the quality of the data upon which these analyses are based.

Specimen identification errors are commonplace in herbaria and are not confined to taxonomic groups that lack a recent monograph. They also exist (to varying degrees) in groups that have been revised in the past 40 years, most notably in collections that were made subsequent to a taxonomic treatment or have otherwise not been examined by the treatment’s author. Geocode errors (e.g. mistakes made at the point of data entry, miscalculations and labelling errors) are similarly rife, and range from extreme to relatively minor in magnitude.
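As a rough illustration of how the more extreme geocode errors might be screened automatically, the sketch below tests whether a coordinate pair falls inside a coarse bounding box and, if not, whether a simple data-entry slip (a dropped minus sign or swapped latitude and longitude) would explain it. The bounding box values and the function names are assumptions made for this example; a real audit would need finer checks, and offshore territories fall outside this box.

```python
# Coarse bounding box for mainland Australia and Tasmania, in decimal degrees.
# These limits are assumed, rounded values used only for illustration.
LAT_RANGE = (-44.0, -9.0)
LON_RANGE = (112.0, 154.0)


def in_box(lat, lon):
    """Return True if the point lies inside the assumed bounding box."""
    return (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
            and LON_RANGE[0] <= lon <= LON_RANGE[1])


def check_geocode(lat, lon):
    """Classify a geocode as plausible, a likely data-entry slip, or outside."""
    if in_box(lat, lon):
        return "ok"
    if in_box(-lat, lon):
        # A positive latitude that becomes plausible once negated.
        return "latitude sign flipped?"
    if in_box(lon, lat):
        # Values that become plausible once the two fields are exchanged.
        return "latitude/longitude swapped?"
    return "outside"


if __name__ == "__main__":
    for point in [(-31.95, 115.86), (31.95, 115.86), (115.86, -31.95)]:
        print(point, check_geocode(*point))
```

A screen like this only catches gross errors. Records flagged as slips would still need to be checked against the specimen label before any database edit is made.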

Taxonomic and geographic errors in our biodiversity data reduce our knowledge of a taxon’s distribution and habitat requirements, result in the dissemination of inaccurate information to our stakeholders (e.g. incorrect distribution maps, wrongly identified voucher specimens and photographs), undermine the results and interpretation of phylogenetic studies and the accuracy of spatial analyses or environmental modelling, and may significantly impact conservation planning at the species or regional level. Furthermore, a significant amount of useful, high-quality data remains inaccessible in specimen backlogs.

By 2028, all vascular plant collections in Australian herbaria will be audited for taxonomic and geographic accuracy.

The audit should include: a taxonomic assessment of all collections, with an emphasis on those that have not been verified by a taxonomic expert; cross-checking of duplicates of a single gathering housed at different herbaria to ensure that they have matching identifications (and are therefore represented by just one dot on the Australasian Virtual Herbarium); preparation, databasing and taxonomic verification of all backlog materials, including undatabased collections that are currently on loan to other institutions; and validation of locality and geocode information, particularly for all geographical outliers that have been taxonomically confirmed. It would be possible to add value to this process by capturing information on the reproductive state of every specimen (i.e. whether flowering, fruiting or sterile), thereby informing future collection needs and phenological research.
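The duplicate cross-check described above could be sketched as follows. This is a minimal example under stated assumptions: the record fields (`collector`, `number`, `herbarium`, `taxon`), the function name and the sample data are all hypothetical, and a gathering is assumed to be identifiable by collector name plus collecting number, which real data will not always support.

```python
from collections import defaultdict


def find_mismatched_duplicates(records):
    """Return gatherings whose duplicates carry more than one identification.

    Each record is a dict with the hypothetical keys 'collector', 'number',
    'herbarium' and 'taxon'. Duplicates are matched on collector and number.
    """
    names_by_gathering = defaultdict(dict)
    for rec in records:
        # Normalise the key so trivial case or spacing differences still match.
        key = (rec["collector"].strip().lower(), rec["number"].strip())
        names_by_gathering[key][rec["herbarium"]] = rec["taxon"].strip()

    # Keep only gatherings where the herbaria disagree on the name.
    return {
        key: by_herbarium
        for key, by_herbarium in names_by_gathering.items()
        if len(set(by_herbarium.values())) > 1
    }


# Invented sample data: placeholder collector and taxon names.
records = [
    {"collector": "A. Collector", "number": "101", "herbarium": "PERTH", "taxon": "Taxon alpha"},
    {"collector": "A. Collector", "number": "101", "herbarium": "MEL", "taxon": "Taxon beta"},
    {"collector": "A. Collector", "number": "102", "herbarium": "PERTH", "taxon": "Taxon gamma"},
    {"collector": "A. Collector", "number": "102", "herbarium": "CANB", "taxon": "Taxon gamma"},
]

if __name__ == "__main__":
    for gathering, names in find_mismatched_duplicates(records).items():
        print(gathering, names)
```

Exact string comparison will also flag duplicates that differ only by synonymy or spelling, so flagged gatherings are best treated as candidates for expert review rather than as confirmed misidentifications.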

An audit would lead to the discovery of new taxa and new populations of conservation-listed taxa. Indeed, many scientists are undoubtedly already aware of new taxa that are represented in herbarium collections but are not yet on the National Species List. This knowledge should be captured as part of the audit, particularly if taxonomic publications are unlikely to be forthcoming in the short term.

An audit would improve the quality of data fundamental to our understanding of Australia’s biodiversity and its evolution. It would underpin the eFlora of Australia, enabling more accurate descriptions and distribution maps to be generated, and would improve the quality of derivative products such as regional or taxon-specific apps and field guides. An audit would have tangible conservation outcomes, providing better information for individual taxa and improving analyses directed at conservation planning and decision making, and would also reduce the amount of time required for data cleaning prior to a large-scale analysis.

Identification errors often arise from imperfect taxonomic knowledge and as such an audit of this nature could not be completely uncoupled from baseline taxonomic research; however, it would focus attention on future research needs (e.g. specimens, species or groups in need of further research could be flagged and prioritised; potential student research projects could be highlighted) and collection gaps.

We will need a significant number of skilled research scientists and identification botanists to conduct a taxonomic audit of collections at their home institution, collections originating from their home state but housed at other national herbaria, and specimens belonging to their taxonomic speciality groups. We will also need additional curatorial staff to database backlog material, validate questionable geocodes and localities, perform database edits and maintain existing collections (e.g. duplicates from other states). Staffing levels will need to be maintained into the future to ensure incoming collections are processed and verified without major delay.

Our collections underpin everything — let’s give them the attention they deserve.

The Taxonomy 2028 Challenge: Obtain high-quality collections of all undescribed vascular plant taxa

New taxa continue to be discovered through examination of herbarium collections, regional surveys and botanical assessment of areas proposed for development; however, their taxonomic resolution and publication are often hampered by a lack of high-quality (or even reasonable-quality) material to serve as a type gathering or to enable the taxon to be adequately described. Many putative new taxa are represented by just one or a few collections that are fragmentary or lack key diagnostic features such as flowers or fruits.

By 2028, we will ensure that high-quality collections of all undescribed vascular plant species (our known unknowns) are available for study in herbaria.

In the face of escalating threats to our biodiversity, there is a pressing need for a targeted collection effort to underpin taxonomic and systematic research, conservation planning and decision making. We need to act now or we risk undescribed species going extinct before they are adequately recorded. High-quality collections can serve as type material and will enable reliable morphological descriptions to be generated, thereby facilitating accurate identification and on-ground conservation actions. Ancillary collections (e.g. samples for molecular studies, photographs, live material) could feed into other proposed 2028 goals (e.g. a genomic ark, stakeholder engagement) and ex situ conservation strategies.

For some undescribed species, obtaining good collections will be fraught with difficulties: many occur in remote or otherwise difficult-to-access areas, lack the accurate geocode or locality information needed to relocate them easily, or require good seasonal conditions or fire to stimulate flowering. Furthermore, repeated visits to the same site may be required in order to collect adequate samples. We will therefore need skilled and energetic personnel to assess collection gaps, to plan and conduct complex, targeted field expeditions, or to otherwise co-ordinate regional personnel and skilled citizen scientists. Curatorial support will be essential for specimen processing, databasing and maintenance so that the specimens and their data can be made available for use by scientists.

An effort such as this would remove a major impediment to describing our vascular plant flora. And perhaps by the time this material is obtained, processed and ready for study, a future generation of skilled taxonomists with permanent positions will be in place and able to use these collections to best effect.