Friday, October 4, 2013

Barcode library construction

As I am scrambling with last-minute preparations for the upcoming online course, I was extremely grateful when my colleague Rodger Gwiazdowski offered to write a guest post. So here we go:

Workflow of specimens from museum drawers, into designed-for-DNA-Barcode-sampling Schmidt boxes, and preliminary data capture.

DNA Barcode-based identifications depend on the taxon coverage of reference libraries. Museums are primary sources of specimens, and Dirk has blogged about this idea here before. As I help assemble a reference database for all Hemiptera native to Canada (~4K species of aphids, stinkbugs, cicadas etc.), I notice the biggest challenge is not finding all the species we need in museums, but obtaining barcode-compliant sequences from many of them, mostly because of specimen age.

To address this, Paul Hebert and colleagues recently published predictive data on barcode-compliant sequence recovery from their large ‘Barcode Blitz’ across the ANIC Lep collection. Their findings affirm that museum sampling is the main tool for library construction. However, variation in specimen availability and age may require additional tools for comprehensive coverage.

As an example, the Canadian Hemiptera library still needs at least one specimen from 779 species across the Heteroptera (one of the four main Hemipteran suborders). I recently surveyed the Smithsonian’s collection for all 779. Of these, ~290 were collected in the past 30+ years (many through curator Tom Henry’s additions and IDs), and are likely to yield barcode-compliant sequences. About 80 species have their youngest specimens from the late 19th and early 20th centuries, and ~320 species could not be sampled because they were on loan, were never accessioned, or were represented by so few specimens (2-4) that the museum preferred to keep them in-house (the vast majority of these were from the early 20th century anyway). My observations, unfortunately, don’t include the valuable Carl John Drake Collection, where specimens cannot be taken off-site as a condition of the collection donation. On-site ‘Barcode Blitz’ sampling efforts have clear advantages here, but can’t address unavailable species, or those too old for barcodes.

Tingid specimens from the late 19th and early 20th centuries alongside recently collected specimens to be DNA barcoded.

Hard-to-get species can be found through other museums and a collaborator network, but the search can be expensive. All curators tell me that if they don’t have recent specimens (as in, from the past 30-50+ years), they’d like some. Because fresh specimens are ideal for barcoding, it seems possible to acquire rare taxa and deposit them as vouchers using (at least) two new tools: DNA Barcoding at BioBlitzes and eBOL.

About 5K specimens collected recently during the world’s largest BioBlitz (Rouge Park, north of Toronto) will have their DNA Barcodes sequenced, and eBOL projects (curriculum-based DNA Barcoding) are pairing students with taxonomists to analyze data using BOLD. As library construction efforts learn which taxa are hard to get, regional BioBlitz participants can be alerted to look for these taxa, and students could take on sampling these species as an eBOL project. In both cases, citizen scientists and educators could directly fill gaps in taxon libraries, with vouchers residing in public collections.

Images were taken in the Smithsonian's Hemiptera collection, August 2013
