Taxonomy 2028 Challenge: Providing certainty in taxonomic applications through next generation sequencing

Holotypes are the universal standards we use to build our taxonomic knowledge. Since genetic data were integrated into our methods, our knowledge of diversity has increased rapidly, but that knowledge can rarely be linked back to species names. This has led to a divergence of knowledge systems, in which we sometimes hold parallel sets of understanding that cannot be accurately tied together. To progress taxonomy, we need to link all this new information with the centuries of foundational work based largely on morphology. Most holotypes are old, or preserved in a way that doesn’t easily allow the extraction of genetic information. However, newer technologies have overcome many of these issues (e.g. formalin preservation, degraded samples), and it should be possible to retrieve a barcode that could link historical type material with molecular studies.

By 2028 we will barcode 50% of holotypes in Australasian collections. This will make it possible to link genetic information with available names and will provide context for interpreting all molecular studies. It will resolve some seemingly intractable taxonomic questions and provide an essential resource for the future. This matters because, at best, molecular studies can currently include material re-collected from type localities, but most don’t, or can’t, since these areas may be highly impacted by human development. The proposed project will provide absolute certainty for contemporary identifications.

While no single gene can be used across all life, the key is to identify the relevant marker for each group of interest. Shotgun sequencing can be used to produce barcodes for type material. It breaks DNA up into small pieces, which are sequenced as short, overlapping fragments. These fragments are assembled into continuous pieces, which will usually contain the high-copy genes that we often use for species-level studies.
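The assembly step described above can be illustrated with a toy example. The following is a minimal sketch only, assuming error-free reads from a single strand and a simple greedy merge of the best-overlapping pair; real assemblers handle sequencing errors, reverse complements and repeats, and use far more sophisticated methods. The function names, the `min_len` threshold and the example sequence are all hypothetical.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Return the length of the longest suffix of `a` that matches a
    prefix of `b`, or 0 if that overlap is shorter than `min_len`."""
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap.

    Toy illustration only: assumes error-free, same-strand reads.
    Returns the list of contigs left when no overlaps remain.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        # Discard fragments wholly contained in another fragment.
        contigs = [r for r in contigs
                   if not any(r != o and r in o for o in contigs)]
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        # Join the two fragments, writing the shared region only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# Hypothetical 21-base sequence cut into three overlapping fragments.
sequence = "ATGGCGTACGTTAGCCTAGGA"
reads = [sequence[12:21], sequence[0:10], sequence[6:16]]
contigs = greedy_assemble(reads)
```

In this example the three fragments overlap by four bases each, so they merge back into a single contig equal to the original sequence.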

Project-level hubs would be maintained at the involved institutions. Existing databases would continue to hold the complete metadata record, while a purpose-built (very simple) database could list the species name and collection registration number for each type. Using these two terms in a search of the public database GenBank, where data will be deposited, will retrieve all available data for that type. Other resources required would be salaries for project managers and scientists, a budget for sequencing, and the support of the involved institutions.
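The two-term lookup could be sketched as follows. This is an illustration under stated assumptions, not a description of the proposed database: it assumes the registration number appears somewhere in the deposited record (so it is searched with `[All Fields]`), it uses the public NCBI E-utilities `esearch` endpoint against the `nuccore` database, and the species name and registration number shown are hypothetical placeholders. The code only builds the query and URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_type_query(species_name: str, registration_number: str) -> str:
    """Combine the two terms into a single Entrez search expression.

    Assumption: the registration number is matched anywhere in the record.
    """
    return f'"{species_name}"[Organism] AND "{registration_number}"[All Fields]'


def build_esearch_url(species_name: str, registration_number: str) -> str:
    """Return a URL that searches GenBank nucleotide records for a type."""
    params = {
        "db": "nuccore",  # GenBank nucleotide records
        "term": build_type_query(species_name, registration_number),
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical species name and registration number, for illustration only.
query = build_type_query("Examplea hypothetica", "REG 12345")
url = build_esearch_url("Examplea hypothetica", "REG 12345")
```

Keeping the purpose-built database down to these two fields means the search expression can be generated automatically for every type, while the full metadata stays in the institutions' existing systems.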

This project will increase the utility and integration of existing genetic information and provide absolute certainty of species-level identifications. It will also reduce the need for loaning or handling type material, which, in many cases, becomes more fragile with age. This is future-proofing taxonomy!

By Nerida Wilson and Kym Abrams