Monday, September 25, 2017

Metabarcoding and Metagenomics Journal is out


As announced before, the new journal Metabarcoding and Metagenomics has been created by Pensoft, and it is now officially running with the first few articles published. Here is part of the official press release:

A new innovative open-access academic journal Metabarcoding and Metagenomics (MBMG) is launched to welcome novel papers from both basic and applied aspects.

Focusing on genetic approaches to study biodiversity across all ecosystems, MBMG covers a considerably large scope of research including environmental, microbial and applied metabarcoding and metagenomics (especially DNA-based bioassessment and -monitoring, quarantine, nature conservation, species invasions, eDNA surveillance), as well as associated topics, such as molecular ecology, DNA-based species delimitation and identification, and other emerging related fields. Submissions of bioinformatic approaches to MBMG (algorithms, software) are also encouraged.

Featuring novel article formats and data publishing workflows, MBMG is to reflect the rapid growth in the use of metabarcoding and metagenomics in life and environmental sciences.

For the full release please read here.

I am serving as deputy Editor-in-chief for the journal and I am really looking forward to the deluge of publications to come.

Friday, September 15, 2017

Weekend reads

More to read including some from the rather large backlog. Have a great weekend with some good reads.

DNA barcoding involves the use of one or more short, standardized DNA fragments for the rapid identification of species. A 648-bp segment near the 5' terminus of the mitochondrial cytochrome c oxidase subunit I (COI) gene has been adopted as the universal DNA barcode for members of the animal kingdom, but its utility in mushrooms is complicated by the frequent occurrence of large introns. As a consequence, ITS has been adopted as the standard DNA barcode marker for mushrooms despite several shortcomings. This study employed newly designed primers coupled with cDNA analysis to examine COI sequence diversity in six species of Pleurotus and compared these results with those for ITS. The ability of the COI gene to discriminate six species of Pleurotus, the commonly cultivated oyster mushroom, was examined by analysis of cDNA. The amplification success, sequence variation within and among species, and the ability to design effective primers were tested. We compared ITS sequences to their COI cDNA counterparts for all isolates. ITS discriminated between all six species, but some sequence results were uninterpretable because of length variation among ITS copies. By comparison, complete COI sequences were recovered from all but three individuals of Pleurotus giganteus, where only the 5' region was obtained. The COI sequences permitted the resolution of all species when partial data was excluded for P. giganteus. Our results suggest that COI can be a useful barcode marker for mushrooms when cDNA analysis is adopted, permitting identifications in cases where ITS cannot be recovered or where it offers higher resolution when fresh tissue is available. The suitability of this approach remains to be confirmed for other mushrooms.

Environmental bulk samples often contain many different taxa that vary several orders of magnitude in biomass. This can be problematic in DNA metabarcoding and metagenomic high-throughput sequencing approaches, as large specimens contribute disproportionately high amounts of DNA template. Thus, a few specimens of high biomass will dominate the dataset, potentially leading to smaller specimens remaining undetected. Sorting of samples by specimen size (as a proxy for biomass) and balancing the amounts of tissue used per size fraction should improve detection rates, but this approach has not been systematically tested. Here, we explored the effects of size sorting on taxa detection using two freshwater macroinvertebrate bulk samples, collected from a low-mountain stream in Germany. Specimens were morphologically identified and sorted into three size classes (body size < 2.5 × 5, 5 × 10, and up to 10 × 20 mm). Tissue powder from each size category was extracted individually and pooled based on tissue weight to simulate samples that were not sorted by biomass ("Unsorted"). Additionally, size fractions were pooled so that each specimen contributed approximately equal amounts of biomass ("Sorted"). Mock samples were amplified using four different DNA metabarcoding primer sets targeting the Cytochrome c oxidase I (COI) gene. Sorting taxa by size and pooling them proportionately according to their abundance led to a more equal amplification of taxa compared to the processing of complete samples without sorting. The sorted samples recovered 30% more taxa than the unsorted samples at the same sequencing depth. Our results imply that sequencing depth can be decreased approximately fivefold when sorting the samples into three size classes and pooling by specimen abundance. Even coarse size sorting can substantially improve taxa detection using DNA metabarcoding.
While high-throughput sequencing will become more accessible and cheaper within the next years, sorting bulk samples by specimen biomass or size is a simple yet efficient method to reduce current sequencing costs.
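
To make the difference between the two pooling schemes in this abstract a bit more tangible, here is a minimal Python sketch. It only illustrates the arithmetic: the function name `pooling_shares` and all specimen counts and tissue weights are made up for illustration and are not taken from the paper. "Unsorted" pools each size fraction in proportion to its tissue weight, while "Sorted" pools each fraction in proportion to its number of specimens, so that every specimen contributes roughly the same amount of tissue.

```python
def pooling_shares(fractions):
    """Share of the pooled tissue contributed by each size fraction.

    fractions: dict mapping fraction name -> (number of specimens, tissue weight in mg).
    Returns a dict mapping fraction name -> {"unsorted": share, "sorted": share}.
    """
    total_weight = sum(weight for _, weight in fractions.values())
    total_specimens = sum(count for count, _ in fractions.values())
    shares = {}
    for name, (count, weight) in fractions.items():
        shares[name] = {
            # Pooled by tissue weight: heavy fractions dominate the template.
            "unsorted": weight / total_weight,
            # Pooled by specimen abundance: each specimen contributes equally.
            "sorted": count / total_specimens,
        }
    return shares


# Hypothetical bulk sample (numbers invented for illustration only).
example = {
    "small": (80, 40.0),
    "medium": (15, 160.0),
    "large": (5, 800.0),
}

for name, share in pooling_shares(example).items():
    print(f"{name}: unsorted={share['unsorted']:.2f}, sorted={share['sorted']:.2f}")
```

In this invented sample the 80 small specimens make up only 4% of the tissue when pooling by weight, but 80% when pooling by specimen abundance, which is the intuition behind the improved detection of small taxa.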

Second-generation, high-throughput sequencing methods have greatly improved our understanding of the ecology of soil microorganisms, yet the short barcodes (< 500 bp) provide limited taxonomic and phylogenetic information for species discrimination and taxonomic assignment. Here, we utilized the third-generation Pacific Biosciences (PacBio) RSII and Sequel instruments to evaluate the suitability of full-length internal transcribed spacer (ITS) barcodes and longer rRNA gene amplicons for metabarcoding Fungi, Oomycetes and other eukaryotes in soil samples. Metabarcoding revealed multiple errors and biases: Taq polymerase substitution errors and mis-incorporating indels in sequencing homopolymers constitute major errors; sequence length biases occur during PCR, library preparation, loading to the sequencing instrument and quality filtering; primer-template mismatches bias the taxonomic profile when using regular and highly degenerate primers. The RSII and Sequel platforms enable the sequencing of amplicons up to 3000 bp, but the sequence quality remains slightly inferior to Illumina sequencing especially in longer amplicons. The full ITS barcode and flanking rRNA small subunit gene greatly improve taxonomic identification at the species and phylum levels, respectively. We conclude that PacBio sequencing provides a viable alternative for metabarcoding of organisms that are of relatively low diversity, require > 500-bp barcode for reliable identification or when phylogenetic approaches are intended.

INTRODUCTION:
Herbal medicines play an important role globally in the health care sector and in industrialised countries they are often considered as an alternative to mono-substance medicines. Current quality and authentication assessment methods rely mainly on morphology and analytical phytochemistry-based methods detailed in pharmacopoeias. Herbal products however are often highly processed with numerous ingredients, and even if these analytical methods are accurate for quality control of specific lead or marker compounds, they are of limited suitability for the authentication of biological ingredients.
OBJECTIVE:
To review the benefits and limitations of DNA barcoding and metabarcoding in complementing current herbal product authentication.
METHOD:
Recent literature relating to DNA-based authentication of medicinal plants, herbal medicines and products is summarised to provide a basic understanding of how DNA barcoding and metabarcoding can be applied to this field.
RESULTS:
Different methods of quality control and authentication have varying resolution and usefulness along the value chain of these products. DNA barcoding can be used for authenticating products based on single herbal ingredients and DNA metabarcoding for assessment of species diversity in processed products, and both methods should be used in combination with appropriate hyphenated chemical methods for quality control.
CONCLUSIONS:
DNA barcoding and metabarcoding have potential in the context of quality control of both well and poorly regulated supply systems. Standardisation of protocols for DNA barcoding and DNA sequence-based identification is necessary before DNA-based biological methods can be implemented as routine analytical approaches and approved by the competent authorities for use in regulated procedures.

An understanding of how biotic interactions shape species' distributions is central to predicting host-symbiont responses under climate change. Switches to locally adapted algae have been proposed to be an adaptive strategy of lichen-forming fungi to cope with environmental change. However, it is unclear how lichen photobionts respond to environmental gradients, and whether they play a role in determining the fungal host's upper and lower elevational limits. Deep-coverage Illumina DNA metabarcoding was used to track changes in the community composition of Trebouxia algae associated with two phylogenetically closely related, but ecologically divergent fungal hosts along a steep altitudinal gradient in the Mediterranean region. We detected the presence of multiple Trebouxia species in the majority of thalli. Both altitude and host genetic identity were strong predictors of photobiont community assembly in these two species. The predominantly clonally dispersing fungus showed stronger altitudinal structuring of photobiont communities than the sexually reproducing host. Elevation ranges of the host were not limited by the lack of compatible photobionts. Our study sheds light on the processes guiding the formation and distribution of specific fungal-algal combinations in the lichen symbiosis. The effect of environmental filtering acting on both symbiotic partners appears to shape the distribution of lichens.

Friday, September 8, 2017

Weekend reads

Back after a longer hiatus with more reads for you. Too much work and a little bit of vacation in between didn't allow for much posting. Let's see if some reshuffling of things works better. Well, enough about me, back to others' papers (although the first is mine ;-) ):

Continuously increasing demand for plant and animal products causes unsustainable depletion of biological resources. It is estimated that one-quarter of sharks and rays are threatened worldwide and although the global fin trade is widely recognized as a major driver, demand for meat, liver oil, and gill plates also represents a significant threat. This study used DNA barcoding and 16S rRNA sequencing as a method to identify shark and ray species from dried fins and gill plates, obtained in Canada, China, and Sri Lanka. 129 fins and gill plates were analysed and searches on BOLD produced matches to 20 species of sharks and five species of rays or – in two cases – to a species pair. Twelve of the species found are listed or have been approved for listing in 2017 in the appendices of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES), including the whale shark (Rhincodon typus), which was surprisingly found among both shark fin and gill plate samples. More than half of identified species fall under the IUCN Red List categories ‘Endangered’ and ‘Vulnerable’, raising further concerns about the impacts of this trade on the sustainability of these low productivity species.

Community assembly is determined by a combination of historical events and contemporary processes that are difficult to disentangle, but eco-evolutionary mechanisms may be uncovered by the joint analysis of species and genetic diversity across multiple sites. Mountain streams across Europe harbour highly diverse macroinvertebrate communities whose composition and turnover (replacement of taxa) among sites and regions remain poorly known. We studied whole-community biodiversity within and among six mountain regions along a latitudinal transect from Morocco to Scandinavia at three levels of taxonomic hierarchy: genus, species and haplotypes. Using DNA barcoding of four insect families (>3100 individuals, 118 species) across 62 streams, we found that measures of local and regional diversity and intraregional turnover generally declined slightly towards northern latitudes. However, at all hierarchical levels we found complete (haplotype) or high (species, genus) turnover among regions (and even among sites within regions), which counters the expectations of Pleistocene postglacial northward expansion from southern refugia. Species distributions were mostly correlated with environmental conditions, suggesting a strong role of lineage- or species-specific traits in determining local and latitudinal community composition, lineage diversification and phylogenetic community structure (e.g., loss of Coleoptera, but not Ephemeroptera, at northern sites). High intraspecific genetic structure within regions, even in northernmost sites, reflects species-specific dispersal and demographic histories and indicates postglacial migration from geographically scattered refugia, rather than from only southern areas. Overall, patterns were not strongly concordant across hierarchical levels, but consistent with the overriding influence of environmental factors determining community composition at the species and genus levels.

Throughout the world DNA banks are used as storage repositories for genetic diversity of organisms ranging from plants to insects to mammals. Designed to preserve the genetic information for organisms of interest, these banks also indirectly preserve organisms’ associated microbiomes, including fungi associated with plant tissues. Studies of fungal biodiversity lag far behind those of macroorganisms, such as plants, and estimates of global fungal richness are still widely debated. Utilizing previously collected specimens to study patterns of fungal diversity could significantly increase our understanding of overall patterns of biodiversity from snapshots in time. Here, we investigated the fungi inhabiting the phylloplane among species of the endemic Hawaiian plant genus, Clermontia (Campanulaceae). Utilizing next generation DNA amplicon sequencing, we uncovered approximately 1,780 fungal operational taxonomic units from just 20 DNA bank samples collected throughout the main Hawaiian Islands. Using these historical samples, we tested the macroecological pattern of decreasing community similarity with decreasing geographic proximity. We found a significant distance decay pattern among Clermontia associated fungal communities. This study provides the first insights into elucidating patterns of microbial diversity through the use of DNA bank repository samples.

Metabarcoding of environmental samples has many challenges and limitations that require carefully considered laboratory and analysis workflows to ensure reliable results. We explore how decisions regarding study design, laboratory set-up, and bioinformatic processing affect the final results, and provide guidelines for reliable study of environmental samples.
We evaluate the performance of four primer sets targeting COI and 16S regions characterizing arthropod diversity in bat faecal samples, and investigate how metabarcoding results are affected by parameters including: (1) number of PCR replicates per sample, (2) sequencing depth, (3) PCR replicate processing strategy (i.e. either additively, by combining the sequences obtained from the PCR replicates, or restrictively, by only retaining sequences that occur in multiple PCR replicates for each sample), (4) minimum copy number for sequences to be retained, (5) chimera removal, and (6) similarity thresholds for Operational Taxonomic Unit (OTU) clustering. Lastly, we measure within- and between-taxa dissimilarities when using sequences from public databases to determine the most appropriate thresholds for OTU clustering and taxonomy assignment.
Our results show that the use of multiple primer sets reduces taxonomic biases and increases taxonomic coverage. Taxonomic profiles resulting from each primer set are principally affected by how many PCR replicates are carried out per sample and how sequences are filtered across them, the sequence copy number threshold and the OTU clustering threshold. We also report considerable diversity differences between PCR replicates from each sample. Sequencing depth increases the dissimilarity between PCR replicates unless the bioinformatic strategies to remove allegedly artefactual sequences are adjusted according to the number of analysed sequences. Finally, we show that the appropriate identity thresholds for OTU clustering and taxonomy assignment differ between markers.
Metabarcoding of complex environmental samples ideally requires (1) investigation of whether more than one primer set targeting the same taxonomic group is needed to offset primer biases, (2) more than one PCR replicate per sample, (3) bioinformatic processing of sequences that balances diversity detection with removal of artefactual sequences, and (4) empirical selection of OTU clustering and taxonomy assignment thresholds tailored to each marker and the obtained taxa.
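
The additive and restrictive replicate strategies described in this abstract can be sketched in a few lines of Python. This is a simplified illustration and not the authors' pipeline: the function `merge_replicates`, its parameters `min_replicates` and `min_copies`, and the toy read counts are all hypothetical, and applying the copy number threshold to the summed count across replicates is my own assumption.

```python
from collections import Counter


def merge_replicates(replicates, strategy="additive", min_replicates=2, min_copies=1):
    """Combine per-replicate sequence counts for one sample.

    replicates: list of dicts, one per PCR replicate, mapping sequence -> read count.
    strategy: "additive" keeps every sequence seen in any replicate;
              "restrictive" keeps only sequences seen in at least min_replicates replicates.
    min_copies: minimum summed read count for a sequence to be retained (assumption:
                the threshold is applied to the total across replicates).
    """
    if strategy not in ("additive", "restrictive"):
        raise ValueError("strategy must be 'additive' or 'restrictive'")

    totals = Counter()    # summed read count per sequence
    presence = Counter()  # number of replicates in which each sequence occurs
    for replicate in replicates:
        for sequence, count in replicate.items():
            if count > 0:
                totals[sequence] += count
                presence[sequence] += 1

    kept = {}
    for sequence, total in totals.items():
        if total < min_copies:
            continue
        if strategy == "restrictive" and presence[sequence] < min_replicates:
            continue
        kept[sequence] = total
    return kept


# Toy data: three PCR replicates of one sample (sequence labels are placeholders).
pcr_replicates = [
    {"seq_A": 120, "seq_B": 3},
    {"seq_A": 95, "seq_C": 1},
    {"seq_A": 110, "seq_B": 2},
]

print(merge_replicates(pcr_replicates, strategy="additive"))
print(merge_replicates(pcr_replicates, strategy="restrictive", min_replicates=2))
```

With these toy counts the additive strategy keeps all three sequences, while the restrictive strategy drops `seq_C`, which appears in only one replicate. Raising `min_copies` removes low-abundance sequences under either strategy.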

Precision and reliability of barcode-based biodiversity assessment can be affected at several steps during acquisition and analysis of data. Identification of operational taxonomic units (OTUs) is one of the crucial steps in the process and can be accomplished using several different approaches, namely, alignment-based, probabilistic, tree-based and phylogeny-based. The number of identified sequences in the reference databases affects the precision of identification. This paper compares the identification of marine nematode OTUs using alignment-based, tree-based and phylogeny-based approaches. Because the nematode reference dataset is limited in its taxonomic scope, OTUs can only be assigned to higher taxonomic categories, families. The phylogeny-based approach using the evolutionary placement algorithm provided the largest number of positively assigned OTUs and was least affected by erroneous sequences and limitations of reference data, compared to alignment-based and tree-based approaches.

Biota monitoring in ports is increasingly needed for biosecurity reasons and safeguarding marine biodiversity from biological invasion. Present and future international biosecurity directives can be accomplished only if the biota acquired by maritime traffic in ports is controlled. Methodologies for biota inventory are diverse and now rely principally on extensive and labor-intensive sampling along with taxonomic identification by experts. In this study, we employed an extremely simplified environmental DNA (eDNA) sampling methodology from only three 1-L bottles of water per port, followed by metabarcoding (high-throughput sequencing and DNA-based species identification) using 18S rDNA and Cytochrome oxidase I as genetic barcodes. Eight Bay of Biscay ports with available inventory of fouling invertebrates were employed as a case study. Despite minimal sampling efforts, three invasive invertebrates were detected: the barnacle Austrominius modestus, the tubeworm Ficopomatus enigmaticus and the polychaete Polydora triglanda. The same species have been previously found from visual and DNA barcoding (genetic identification of individuals) surveys in the same ports. The current costs of visual surveys, conventional DNA barcoding and this simplified metabarcoding protocol were compared. The results encourage the use of metabarcoding for early biosecurity alerts.

The DNA barcode reference library for Lepidoptera holds much promise as a tool for taxonomic research and for providing the reliable identifications needed for conservation assessment programs. We gathered sequences for the barcode region of the mitochondrial cytochrome c oxidase subunit I gene from 160 of the 176 nominal species of Erebidae moths (Insecta: Lepidoptera) known from the Iberian Peninsula. These results arise from a research project which is constructing a DNA barcode library for the insect species of Spain. New records for 271 specimens (122 species) are coupled with preexisting data for 38 species from the Iberian fauna. Mean interspecific distance was 12.1%, while the mean nearest neighbour divergence was 6.4%. All 160 species possessed diagnostic barcode sequences, but one pair of congeneric taxa (Eublemma rosea and Eublemma rietzi) were assigned to the same BIN. As well, intraspecific sequence divergences higher than 1.5% were detected in four species which likely represent species complexes. This study reinforces the effectiveness of DNA barcoding as a tool for monitoring biodiversity in particular geographical areas and the strong correspondence between sequence clusters delineated by BINs and species recognized through detailed taxonomic analysis.
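
For readers less familiar with the divergence measures mentioned in this abstract, here is a rough Python sketch of how maximum intraspecific divergence and nearest neighbour divergence can be computed from aligned barcodes. It uses the simple uncorrected p-distance as a stand-in; the abstract does not say which distance model was used, and the function names and toy sequences are invented for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences of equal length.

    Positions where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a in "ACGT" and b in "ACGT"]
    if not compared:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in compared) / len(compared)


def divergence_summary(barcodes):
    """barcodes: dict mapping species name -> list of aligned sequences.

    Returns (max_intraspecific, nearest_neighbour), two dicts keyed by species.
    Requires at least two species so that every species has a neighbour.
    """
    max_intraspecific = {}
    nearest_neighbour = {}
    for species, own in barcodes.items():
        # Largest distance between any two sequences of the same species
        # (0.0 if the species is represented by a single sequence).
        max_intraspecific[species] = max(
            (p_distance(a, b) for a, b in combinations(own, 2)), default=0.0
        )
        # Smallest distance to any sequence of any other species.
        nearest_neighbour[species] = min(
            p_distance(a, b)
            for other, others in barcodes.items()
            if other != species
            for a in own
            for b in others
        )
    return max_intraspecific, nearest_neighbour


# Toy alignment of 10 bp (real barcodes are about 650 bp; names are placeholders).
toy = {
    "species_1": ["ACGTACGTAC", "ACGTACGTAT"],
    "species_2": ["ACGTTCGTAC"],
}
intra, nearest = divergence_summary(toy)
print(intra)
print(nearest)
```

A species whose maximum intraspecific divergence approaches or exceeds its nearest neighbour divergence is a candidate for closer taxonomic scrutiny, similar in spirit to the species flagged in the abstract.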

In this experimental study the patterns in early marine biofouling communities and possible implications for surveillance and environmental management were explored using metabarcoding, viz. 18S ribosomal RNA gene barcoding in combination with high-throughput sequencing. The community structure of eukaryotic assemblages and the patterns of initial succession were assessed from settlement plates deployed in a busy port for one, five and 15 days. The metabarcoding results were verified with traditional morphological identification of taxa from selected experimental plates. Metabarcoding analysis identified > 400 taxa at a comparatively low taxonomic level and morphological analysis resulted in the detection of 25 taxa at varying levels of resolution. Despite the differences in resolution, data from both methods were consistent at high taxonomic levels and similar patterns in community shifts were observed. A high percentage of sequences belonging to genera known to contain non-indigenous species (NIS) were detected after exposure for only one day.

Thursday, September 7, 2017

Online course on Metabarcoding

It's that time of the year! In collaboration with the University of Guelph Open Ed department, we are running another iteration of the distance education course on Metabarcoding taught by myself.
There are still spots available and the course will be running from September 25 to October 20, 2017.

This 4-week, web-based course provides an overview of the state of current technology and the various platforms used. The course consists of a series of online lectures and research exercises introducing different aspects of metabarcoding and environmental DNA research. I will also touch on the suite of bioinformatics tools available for sequence analysis and data interpretation.

We tried to cover as much as possible given the online format and the limited time participants usually have available to do such training. I am quite proud of it and feedback on last year's course was quite positive. The course is also designed with limited time resources of participants in mind. It usually takes an average of four hours per week to go through the content and the materials.

If you are interested there is still time to join. Sign up is here.