I just finished reading a paper by some of my colleagues here at the institute. It summarizes the results of a large study that used a DNA barcode library for the vascular plants of Canada to determine which assignment method gives the best species resolution and which barcode marker (rbcL, matK, ITS2) performs best.
My colleagues built a barcode reference library for 4923 of the 5108 species of non-hybrid origin (~96%), with coverage for all 1153 genera and 171 families in the Database of Vascular Plants of Canada. Of course, coverage differs among the three markers. The rbcL dataset is the most complete, with almost 94% coverage. The ITS2 library includes almost 60% of the species and the matK dataset 39%. Overall, 78% of the species have records for some combination of two markers, but only 1074 species (22%) have data for all three. Despite these gaps, the results are very impressive. The library contains barcode sequences from at least one marker for almost all vascular plants in Canada, and given how effective each marker is on its own, genus and species assignments are possible for a large share of specimens:
Analyses based on this library indicate that any one of the three barcode regions is very effective (>90%) in delivering a generic assignment while species resolution is often possible with ITS2 (72%) and matK (80%). BLAST demonstrated higher performance than mothur in assigning specimens to a species in all datasets, including those at a community level and for 1074 species with data for all three barcode regions. The higher performance of BLAST reflects its consideration of indel variation and absolute length of the marker, leading matK to deliver the highest resolution. Although ITS2 showed slightly lower performance, it has two important advantages; its short length makes it suitable for HTS-based applications, and it is readily recovered from diverse taxa, including vascular plants and fungi.
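For anyone who wants to double-check the coverage figures quoted above, here is a quick sanity check in Python. It only uses the species counts given in this post. The variable names are mine, and the choice of denominator for the 22% figure (the 4923 species in the library rather than all 5108 species) is my own reading, not something the paper states here.

```python
# Species counts quoted in the post.
TOTAL_SPECIES = 5108  # non-hybrid vascular plant species listed for Canada
IN_LIBRARY = 4923     # species with a record for at least one marker
ALL_THREE = 1074      # species with rbcL, matK and ITS2 records


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of the flora represented in the reference library.
library_coverage = percent(IN_LIBRARY, TOTAL_SPECIES)

# Share of library species with all three markers.
# Assumption: the 22% in the post is relative to the 4923 library species.
all_three_share = percent(ALL_THREE, IN_LIBRARY)

# Species still lacking any barcode record.
missing = TOTAL_SPECIES - IN_LIBRARY

print(f"Library coverage: {library_coverage:.1f}%")
print(f"All three markers: {all_three_share:.1f}%")
print(f"Species without any record: {missing}")
```

Both percentages round to the values in the post (96% and 22%). Using all 5108 species as the denominator for the three-marker figure would give about 21% instead, which is why I assume the library total was used.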