Taxonomy 2028 Challenge: Let’s digitally image poorly known described species and undescribed species in an Australian Biodiversity Portal

by Volker W. Framenau (Perth)

I am an invertebrate scientist publishing taxonomic and systematic studies predominantly on spiders. I am also an environmental practitioner conducting invertebrate surveys in Western Australia trying to identify spiders from often poorly sampled, remote regions.

I am therefore as much a taxonomist as I am an end-user. I am familiar with the taxonomic literature of described species in my area of expertise (and some other invertebrate groups such as millipedes and ants) and have little trouble identifying species that have had a very recent taxonomic treatment and for which biodiversity and distribution data are readily available (i.e. through identified specimens in collections published in the Atlas of Living Australia).

What I need, as an expert end-user, is a system that helps me identify species that are not properly illustrated (i.e. historically named, but with a poor original description) and those that are undescribed.

As an end-user trying to protect rare species, I don’t need a Linnaean name for this, but I need to know if a species is potentially rare or widespread, which determines if it is subject to an environmental assessment or not. In Western Australia, a species does not have to be named scientifically to be protected by the Wildlife Conservation Act.

Imagine, I could find images of all undescribed and poorly described species online, with diagnostic images, i.e. all pedipalps of male spiders in ventral view for a family or genus side-by-side? Apply the ‘retrolateral’ filter, and then I get them all in a different view for identification. Or images of heads of undescribed ants of a genus side by side? Impossible by 2028? Of course not!

Check out www.antweb.org and you can find thousands of images of ants (described and undescribed, the latter with morphocodes), which you can filter by bioregion, taxonomy and morphological view. Or closer to home, check out the Barrow Island QIM (http://www.padil.gov.au/barrow-island/search?queryType=all), funded by Chevron Australia as part of its biosecurity efforts for its Gorgon Project, which illustrates in excess of 2,000 terrestrial invertebrate species, many with morphospecies codes. I have used this resource extensively for identifications of spiders and ants in the nearby Pilbara. Imagine a Barrow Island QIM for the whole of Australia, just better!

This of course will not work for all taxa, some cannot be easily identified by images alone, but for many it will work, as long as species-specific images are being presented (here the Barrow Island QIM falls short, at least for spiders).

There are three main elements that need to be developed for this:

The database structure and gallery type web-design with appropriate filters for images that are meta-tagged for these filters.
An Australia-wide pseudotaxonomic/morphospecies framework for undescribed species with unique species identifiers. This can be modelled on the Linnaean system, i.e. it will require ‘morphotypes’ to fix a morphospecies.
Expert curators for specific groups, possibly at state level, who oversee the addition of new species.
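The first element, images meta-tagged for gallery filters, could look roughly like the minimal sketch below. The field names, the identifier format and the example records are all assumptions made for illustration; they are not taken from AntWeb, the Barrow Island QIM or any existing portal.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ImageRecord:
    """One diagnostic image with the meta-tags a gallery filter needs (hypothetical schema)."""
    species_id: str  # unique identifier: Linnaean name or morphospecies code
    family: str
    genus: str
    structure: str   # e.g. "male pedipalp" or "head"
    view: str        # e.g. "ventral" or "retrolateral"
    described: bool  # False for morphospecies without a Linnaean name
    url: str


def filter_images(records: Iterable[ImageRecord], **tags) -> List[ImageRecord]:
    """Return the records whose fields equal every given tag."""
    return [r for r in records
            if all(getattr(r, name) == value for name, value in tags.items())]


# Invented placeholder records; genus names, codes and URLs are not real data.
GALLERY = [
    ImageRecord("GenusA-WA-0001", "Lycosidae", "GenusA", "male pedipalp", "ventral", False, "https://example.org/1.jpg"),
    ImageRecord("GenusA-WA-0001", "Lycosidae", "GenusA", "male pedipalp", "retrolateral", False, "https://example.org/2.jpg"),
    ImageRecord("GenusA-WA-0002", "Lycosidae", "GenusA", "male pedipalp", "ventral", False, "https://example.org/3.jpg"),
    ImageRecord("GenusB-WA-0001", "Lycosidae", "GenusB", "male pedipalp", "ventral", False, "https://example.org/4.jpg"),
]

# All pedipalps of one genus in ventral view, side by side ...
ventral = filter_images(GALLERY, genus="GenusA", structure="male pedipalp", view="ventral")
# ... then apply the 'retrolateral' filter to see the same structure in a different view.
retrolateral = filter_images(GALLERY, genus="GenusA", structure="male pedipalp", view="retrolateral")
```

Exact-match tags keep the filter logic trivial, which is the point: the effort lies in curating consistent tags, not in the software.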

Addition of new species will likely be managed at the state level, so let’s think this through for WA and spiders. There are approximately 900 described species in the state; our best estimate of the total number of species is probably three times as many (round it up to 3,000). For argument’s sake, let’s assume that about half of the described species have been recently revised and can be identified based on published revisions. That leaves us with ca. 2,500 to be illustrated online for identification (however, there is no reason not to include described species as well by using the published images, copyright permitting). Many of these will occur in the neighbouring states or even Australia-wide (Australia-wide, about 3,800 spider species are described of an estimated 10,000+ species).
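For anyone who wants to check the back-of-the-envelope numbers, here they are spelled out. All figures are the rough estimates from the paragraph above; only the variable names are new.

```python
described = 900                    # described spider species in WA (approximate)
estimated_total = 3 * described    # "three times as many"
rounded_total = 3000               # rounded up for the sake of the argument
recently_revised = described // 2  # about half of the described species

# Everything that cannot be identified from a recent revision needs images online.
to_illustrate = rounded_total - recently_revised

print(estimated_total)   # 2700
print(recently_revised)  # 450
print(to_illustrate)     # 2550, i.e. ca. 2,500
```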

This number for WA spiders is just about as many species as the Barrow Island QIM has illustrated since 2004! Not only is it possible, it has actually been done.

Let’s now assume that, like AntWeb, an online image catalogue is being contributed to by the whole scientific community, overseen by expert curators to guarantee taxonomic consistency of the system. For example, if I, as an environmental consultant with expert identification skills, find a species and cannot find it online, I submit standard images and the specimen to the ‘curator’, who simply has to upload the images, establish a morphotype and add distribution data to the database (maybe by IBRA region?). In a well-established online Content Management System (CMS) this may take 15 min per species all up. It’s almost like the Encyclopedia of Life for undescribed species. Once it is set up with a core number of species for each taxon, it would hopefully gain momentum. Imagine then that museum curators use the system to identify new accessions and database these, and that the respective distribution data would be available on the ALA (which by then allows listing of the established morphotypes).
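The curator's three steps (upload the images, establish a morphotype, add distribution data) can be sketched as below. This is only an illustration under stated assumptions: the class names, the code format "Family-State-0001" and the specimen number are hypothetical and do not describe an existing CMS.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Morphospecies:
    code: str              # unique species identifier (format is an assumption)
    morphotype: str        # registration number of the specimen that fixes the morphospecies
    image_urls: List[str]  # standard diagnostic images submitted with the specimen
    ibra_regions: Set[str] = field(default_factory=set)


class Registry:
    """Hypothetical curator-side register of morphospecies for one state."""

    def __init__(self, state: str) -> None:
        self.state = state
        self._by_code: Dict[str, Morphospecies] = {}
        self._counters: Dict[str, int] = {}

    def register(self, family: str, morphotype: str,
                 image_urls: List[str], ibra_region: str) -> Morphospecies:
        """Establish a new morphospecies and give it the next free code in its family."""
        number = self._counters.get(family, 0) + 1
        self._counters[family] = number
        code = f"{family}-{self.state}-{number:04d}"
        entry = Morphospecies(code, morphotype, list(image_urls), {ibra_region})
        self._by_code[code] = entry
        return entry

    def add_record(self, code: str, ibra_region: str) -> None:
        """Add a distribution record for an already registered morphospecies."""
        self._by_code[code].ibra_regions.add(ibra_region)


# Invented example: a consultant submits images and a specimen from the Pilbara.
registry = Registry("WA")
first = registry.register("Lycosidae", "SPECIMEN-0001",
                          ["https://example.org/palp_ventral.jpg"], "Pilbara")
registry.add_record(first.code, "Gascoyne")  # a later record from another IBRA region
```

At 15 min per species, registering the ca. 2,500 WA spider species discussed above would add up to roughly 625 curator hours.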

We would move from species description to species registration, which, of course, would ultimately enormously facilitate future taxonomic revisions.

Of course there will be errors in the system, but a species is only a hypothesis after all.

We won’t be able to scientifically describe all invertebrate species by 2028, but we can document a large proportion of these within 10 years!

9 responses

This is a great idea. I wonder how much of this could be done using existing imagery, from both specimens and the literature. GBIF has limited images for spiders, all from DNA barcoding https://www.gbif.org/occurrence/gallery?q=Arana... But there may be richer images for other groups. It would also be interesting to extract images from the published literature, both photos and line drawings. There’s been some work on doing this by the Plazi project. Lots of scope here for machine learning, for example using it to try and identify images, or perhaps an easier task such as automatically identifying what orientation a specimen has in a given image (I’ve been very impressed with the effectiveness of iNaturalist’s use of “deep learning” to suggest identifications for organisms in photos). I guess your suggestion is an argument for mass digitisation of collections, and for making all those images available online as soon as possible, even if unidentified. It’s pretty much the equivalent of sequencing everything and postponing identification of the sequences.

— Roderic Page

Hi Volker, great idea. I am a mature (75) PhD student presently reviewing a group of marine protists and finding new species using light and electron microscopy along with morphometry and genomics. Protists are often difficult to preserve as type specimens in their natural form. I am developing a database which also uses digital imagery, including video and 3-D images using Z-stacking. Regards John

— JOHN DOUGLAS

Thanks Roderic, there are some great initiatives out there for mass digitisation of collections; a recent paper on iCollections (butterflies) jumps to mind. They digitise whole and massive collections, but I am just thinking about unknown 'types' (although for some species you would have to illustrate the variation). And yes, uploading published images for described species could be done very easily within such a system, although I think its strength would still be in the undocumented biota. It needs to be regulated to make sure the critical diagnostic images, let's say a 'core image library', are well curated. Let us get carried away by the idea of an 'Encyclopaedia of (unknown) Life': if we establish a reward system for submitting good quality diagnostic images of a 'type' and that specimen itself to the 'curator' of a taxon (e.g. by first right to name the species, or even a minor monetary reward), I believe this would gain momentum very quickly and documenting our fauna would speed up enormously. Think 100 AUD for an undiscovered species. Documenting the whole Australian spider fauna would cost a mere 1-1.5 million AUD! Of course, once money is involved, we need to have strict regulations in place. But it would be the ultimate citizen science movement. When I look at how many excited amateurs are currently involved in taking images of Peacock Spiders, or at the Flickr group on Australian Spiders, there is massive untapped potential out there. You just have to establish the system, maybe at a small scale as proof of concept. It is then upscalable to infinity... (just add more geographic regions and more taxa...). As mentioned, it won't work for all groups, but it will work for many.

— Volker Framenau

Thanks John, there are so many scientists, and probably serious amateurs, out there now with massive amounts of unpublished images (I have 1000s...). One just has to provide a (regulated) online platform, the 'Encyclopaedia of (unknown) Life', which also recognises the individual contribution in some way, and the opportunities are limitless. But for this to work, many scientists need to recognise that the goal of documenting Australia's biota is bigger than their own contributions and the increasing of impact factors... It has been amazing how the Flickr group on Australian spiders (https://www.flickr.com/groups/australianspiders/) has taken off, with now more than 35,000 images online and 700+ members. Once the platform was established in 2009, it all took off with citizen science contributions. It's fairly unregulated, but imagine if you channel that enthusiasm into a more regulated form to document our biota...

Credit/attribution always seems to be a sticking point, an obstacle to sharing. Maybe the "data paper" approach would work here. Any scientist uploading a large number of unpublished images would get a data paper that describes the images, and perhaps cites relevant work by that researcher. The paper gets a DOI and hence is citable, and anyone using the corresponding images would be encouraged to cite that paper if and when they make use of them. Indeed, each image could get its own DOI, which would make them even more citable.

Attribution is only a sticking point for professional scientists, not for the vast number of 'citizen scientists' who will in fact have to do the bulk of the work on this. These will probably be quite happy with simply having their name meta-tagged on images and their profile displayed. The few taxonomists working on a group (with the exception of the 'curator') will most likely be insignificant in relation to all others, probably amateurs or end-users, who will actually drive this and upload images and submit reference specimens. And they have to: there are too few taxonomists now and there will be fewer in the future. I like what the Europeans had to say 10 years ago (https://drive.google.com/file/d/0B2Ukbp3fwytfNH...): "Current approaches to taxon description will need to be radically reviewed. The current approach is inadequate to meet needs so simply ramping up productivity using existing nomenclatural and publication tools will not suffice. Formal description might only be used in taxa or instances where a formal name is essential. Emerging biodiversity informatics techniques can associate different kinds of information with unique identifiers that do not require a formal name. These changes need to be led from within taxonomy." This is exactly what I think is necessary: an applied system to document biodiversity. That won't stop scientists naming species the traditional way, although I would foresee that becoming less and less important. We will have an online species registration system, depicting morphologically important structures and possibly genetic codes. These species will be linked to the biodiversity databases that document the actual distribution records of each species. Expert taxonomists will become the curators of those data and won't have to spend their valuable time writing long single-species descriptions which then take even longer to publish. They become scientific managers, hopefully adequately funded.
The collection manager becomes the most important person in a museum and doesn't have to be a high-flying scientist at all, just good at identifying species. I know many unemployed scientists with great expertise in a taxonomic group who would love just such a job, even if it only provided part-time employment.

Reading this document again, I believe it has the core elements for how future taxonomy will work, and it's almost 10 years old: https://drive.google.com/file/d/0B2Ukbp3fwytfNH...

noto|biotica

Australasian taxonomy and systematics
