Wednesday, January 2, 2013

The seven deadly sins of DNA barcoding

A Happy New Year to everyone!
I hope 2013 will be a successful, healthy, and prosperous year for all readers out there. 

After some much-needed rest I am back at the keyboard, ready to provide you with short bits and pieces on everything related to DNA-based identification and DNA Barcoding research, and whatever else I find on the web.

Today I would like to recommend an opinion piece just published online early in Molecular Ecology Resources. The authors present the 'seven deadly sins' of DNA Barcoding, or in other words, deficiencies that they identified as common in DNA Barcoding research. I am sure some of the points the authors make will be met with criticism, but overall there is a lot I agree with. They summarize some of the potential problems and provide suggestions to overcome them. I am flattered that one suggested solution is actually a paper I wrote with a colleague. Nice to see it made an impact on what is probably the single most important point raised in the paper.

But now, without further ado - the ‘seven deadly sins’ of DNA barcoding, with consequences and solutions in short form as provided in the paper. I am thinking of taking on each one of them in a separate post over the next couple of weeks.

1. Failure to test clear hypotheses
Consequence: Choice of inappropriate or suboptimal analytical method due to confusion as to the objectives of the study
Solution: Explicitly state each hypothesis, and for each distinct aspect of the study present separate headings in methods and results sections

2. Inadequate a priori identification of specimens
Consequence: Conflicting identifications made by different labs can compromise the effectiveness of reference libraries that are ultimately used as a resource for scientific or regulatory purposes
Solution: Present a bibliography of references, as well as the distinguishing morphological characters used in the identification process. Follow recommendations outlined by Steinke & Hanner (2011)

3. The use of the term ‘species identification’
Consequence: Confusion between identification of individuals, and delimitation/discovery of species
Solution: To clarify objectives, use the term ‘specimen identification’ or ‘species discovery’ where appropriate

4. Inappropriate use of neighbour-joining trees
Consequence: (a) Relying on strict monophyly for identification can reduce the apparent effectiveness of DNA barcoding as an identification tool. This can be due to either mtDNA paraphyly or mis-identification of specimens. (b) For biodiversity assessment and species discovery, NJ trees cannot estimate the number of species independently with respect to the taxonomic names
Solution: (a) Alternative criteria such as ‘best close match’ are readily available, and have higher rates of identification success. This method can be implemented using the free software packages TaxonDNA (Meier et al. 2006) or Spider (Brown et al. 2012). (b) Estimate species richness using ABGD (Puillandre et al. 2012), GMYC (Monaghan et al. 2009) or BOLD's BIN system (

5. Inappropriate use of bootstrap resampling
Consequence: For specimen identification purposes, bootstrap resampling can further reduce the already low identification success rates associated with NJ trees
Solution: Only use bootstrapping where appropriate: e.g. as part of a species delimitation process on preestimated groups

6. Inappropriate use of fixed distance thresholds
Consequence: For specimen identification purposes, a generic threshold which is set too low or high can reduce or bias identification error rates
Solution: Thresholds can now be optimized for specific data sets using the method of Virgilio et al. (2012), or with software such as ABGD (Puillandre et al. 2012) and Spider (Brown et al. 2012)

7. Incorrectly interpreting the barcoding gap
Consequence: Overlapping distributions of intra-/interspecific distances do not necessarily mean that barcodes perform poorly for identification
Solution: For specimen identification studies, dotplots of intra-/interspecific distances are a better way to illustrate the barcoding gap (e.g. Robinson et al. 2009)
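
As a rough illustration of the ‘best close match’ criterion mentioned above as an alternative to NJ trees, here is a minimal Python sketch. It is not the TaxonDNA or Spider implementation. The function names, the use of uncorrected p-distances, and the 1% default threshold are all assumptions made for this example, and in practice the threshold should be tuned to the data set, as the entry on fixed thresholds argues.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    (Assumption: sequences are already aligned and of equal length.)
    """
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites between the two sequences")
    return sum(a != b for a, b in pairs) / len(pairs)


def best_close_match(query, references, threshold=0.01):
    """Identify a query barcode against (species, sequence) references.

    Returns the species name of the closest reference if it lies within
    the threshold, "ambiguous" if equally close references belong to
    different species, and "no identification" if nothing is close enough.
    The threshold value is a placeholder, not a recommendation.
    """
    distances = [(p_distance(query, seq), species) for species, seq in references]
    best = min(d for d, _ in distances)
    if best > threshold:
        return "no identification"
    closest_species = {species for d, species in distances if d == best}
    if len(closest_species) > 1:
        return "ambiguous"
    return closest_species.pop()


# Toy, made-up reference library (hypothetical species names and sequences).
toy_library = [
    ("Species A", "ACGTACGTACGTACGTACGT"),
    ("Species B", "ACGTACGTACTTACGTACGA"),
]
print(best_close_match("ACGTACGTACGTACGTACGT", toy_library))
```

Note that the query is assigned by its nearest reference sequence only, so the result does not depend on whether the species forms a monophyletic cluster in a tree.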
