Last Friday I was so absorbed in some data analysis that I forgot everything around me, including my Friday blog post with weekend reading material. My apologies for that. Nevertheless, here are my weekly favourites, this time as reading for throughout the week.
Efficient DNA extraction is fundamental to molecular studies. However, commercial kits are expensive when a large number of samples need to be processed. Here we present a simple, modular and adaptable DNA extraction ‘toolkit’ for the isolation of high purity DNA from multiple sample types (modular universal DNA extraction method or Mu-DNA). We compare the performance of our method to that of widely used commercial kits across a range of soil, stool, tissue and water samples. Mu-DNA produced DNA extractions of similar or higher yield and purity to that of the commercial kits. As a proof of principle, we carried out replicate fish metabarcoding of aquatic eDNA extractions, which confirmed that the species detection efficiency of our method is similar to that of the most frequently used commercial kit. Our results demonstrate the reliability of Mu-DNA along with its modular adaptability to challenging sample types and sample collection methods. Mu-DNA can substantially reduce the costs and increase the scope of experiments in molecular studies.
The monitoring of impacts of anthropic activities in marine environments, such as aquaculture, oil-drilling platforms or deep-sea mining, relies on Benthic Biotic Indices (BBI). Several indices have been formalised to reduce the multivariate composition data into a single continuous value that is ascribed to a discrete ecological quality status. Such composition data is traditionally obtained from macrofaunal inventories, which is time-consuming and expertise-demanding. Important efforts are ongoing towards using High-Throughput Sequencing of environmental DNA (eDNA metabarcoding) to replace or complement morpho-taxonomic surveys for routine biomonitoring. The computation of BBI from such composition data is usually undertaken by practitioners with Excel spreadsheets or custom scripts. Hence, the updating of reference morpho-taxonomic tables and cross-study comparisons could be hampered. Here we introduce the R package BBI for the computation of BBI from composition data, either obtained from traditional morpho-taxonomic inventories or from metabarcoding data. Its aim is to provide an open-source, transparent and centralised method to compute BBI for routine biomonitoring.
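To make the "single continuous value ascribed to a discrete status" idea concrete, here is a minimal Python sketch that uses the Shannon diversity index as a stand-in for an index. It is not the BBI package's code or API. The function names are my own, and the class boundaries are purely hypothetical placeholders, not values from the paper.

```python
import math
from bisect import bisect_right

def shannon_index(counts):
    """Shannon diversity H' = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no individuals or reads in sample")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# HYPOTHETICAL class boundaries, for illustration only (not from the paper).
CUTOFFS = (1.0, 2.0, 3.0, 4.0)
LABELS = ("bad", "poor", "moderate", "good", "high")

def quality_status(value, cutoffs=CUTOFFS, labels=LABELS):
    """Map a continuous index value to a discrete quality class."""
    return labels[bisect_right(cutoffs, value)]

# Four equally abundant taxa give H' = ln(4).
h = shannon_index([10, 10, 10, 10])
status = quality_status(h)
```

The same two-step pattern (compute a continuous index, then bin it) applies whether the counts come from a morphological inventory or from metabarcoding reads.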
The degradation of freshwater ecosystems has become a common ecological and environmental problem globally. Owing to the complexity of biological communities, there remain tremendous technical challenges for investigating influence of environmental stressors (e.g. chemical pollution) on biological communities. High-throughput sequencing-based metabarcoding provides a powerful tool to reveal complex interactions between environments and biological communities. Among many technical issues, the clustering strategies for Operational Taxonomic Units (OTUs) which are crucial for assessing biodiversity of communities, may affect final conclusions. Here, we used zooplankton communities along an environmental pollution gradient in the Chaobai River in Northern China to test different clustering strategies, including non-clustering and clustering with varied thresholds. Our results showed that though the number of OTUs estimated by non-clustering strategies and clustering strategies with divergence thresholds of 99-97% largely varied, they were able to identify the same set of significant environmental and spatial variables responsible for geographical distributions of zooplankton communities. In addition, the ecological conclusions obtained by clustering thresholds of 99-97% were consistent with non-clustering strategies, where for all eight clustering scenarios we detected that species sorting predicted by environmental variables overrode dispersal as the dominant factor in structuring zooplankton communities. However, clustering with the divergence thresholds of <95% affected the environmental and spatial variables identified. We conclude that both newly developed non-clustering methods and traditional clustering methods with divergence thresholds ≥97% were reliable to reveal mechanisms of complex community-environment interactions, although different clustering strategies could lead to largely varied biodiversity estimates such as those for α-diversity.
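For readers less familiar with why the clustering threshold changes OTU counts, here is a toy Python sketch of greedy centroid-based clustering. It assumes equal-length, pre-aligned sequences and uses simple position-wise identity. Real pipelines use proper alignment and dedicated tools, so this is not the method used in the study, and the sequences below are made up.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy example assumes equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_cluster(seqs, threshold):
    """Assign each sequence to the first centroid it matches at >= threshold,
    otherwise start a new cluster with the sequence as its centroid."""
    centroids, clusters = [], []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                clusters[i].append(s)
                break
        else:
            centroids.append(s)
            clusters.append([s])
    return clusters

# Made-up 20 bp sequences: v1 differs from base at 1 site, v3 at 3 sites.
base = "ACGTACGTACGTACGTACGT"
v1 = "ACGTACGTACGTACGTACGA"
v3 = "TCGTTCGTTCGTACGTACGT"
seqs = [base, v1, v3]

for t in (0.97, 0.90, 0.80):
    print(t, len(greedy_cluster(seqs, t)))
```

Lowering the similarity threshold merges more variants into one OTU, which is why richness estimates shift even when the broader ecological pattern stays the same.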
Sediment bypass tunnels (SBTs) are guiding structures used to reduce sediment accumulation in reservoirs during high flows by transporting sediments to downstream reaches during operation. Previous studies monitoring the ecological effects of SBT operations on downstream reaches suggest a positive influence of SBTs on riverbed sediment conditions and macroinvertebrate communities based on traditional morphology-based surveys. Morphology-based macroinvertebrate assessments are costly and time-consuming, and the large number of morphologically cryptic, small-sized and undescribed species usually results in coarse taxonomic identification. Here, we used DNA metabarcoding analysis to assess the influence of SBT operations on macroinvertebrates downstream of SBT outlets by estimating species diversity and pairwise community dissimilarity between upstream and downstream locations in dam-fragmented rivers with operational SBTs in comparison to dam-fragmented (i.e., no SBTs) and free-flowing rivers (i.e., no dam). We found that macroinvertebrate community dissimilarity decreases with increasing operation time and frequency of SBTs. These factors of SBT operation influence changes in riverbed features, e.g. sediment relations, that subsequently affect the recovery of downstream macroinvertebrate communities to their respective upstream communities. Macroinvertebrate abundance using morphologically-identified specimens was positively correlated to read abundance using metabarcoding. This supports and reinforces the use of quantitative estimates for diversity analysis with metabarcoding data.
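As a small illustration of what "pairwise community dissimilarity" between an upstream and a downstream site can look like, here is a Python sketch of two common measures: Jaccard (presence/absence) and Bray-Curtis (abundance-based). The abstract does not say which measure the authors used, so treat the choice as my assumption. The taxa and read counts are invented.

```python
def jaccard_dissimilarity(a, b):
    """1 - |shared taxa| / |all taxa|, using presence/absence only."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)

def bray_curtis(x, y):
    """Sum of absolute abundance differences divided by total abundance."""
    taxa = set(x) | set(y)
    num = sum(abs(x.get(t, 0) - y.get(t, 0)) for t in taxa)
    den = sum(x.values()) + sum(y.values())
    return num / den if den else 0.0

# Invented read counts per taxon at two sites (illustration only).
upstream = {"Baetis": 10, "Hydropsyche": 5, "Chironomus": 5}
downstream = {"Baetis": 5, "Chironomus": 5, "Simulium": 10}

print(jaccard_dissimilarity(upstream, downstream))
print(bray_curtis(upstream, downstream))
```

A value of 0 means identical communities and 1 means no overlap, so "dissimilarity decreases with operation time" reads as the downstream community converging on the upstream one.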
Metabarcoding is a popular application which warrants continued methods optimization. To maximize barcoding inferences, hierarchy-based sequence classification methods are increasingly common. We present methods for the construction and curation of a database designed for hierarchical classification of a 157 bp barcoding region of the arthropod cytochrome c oxidase subunit I (COI) locus. We produced a comprehensive arthropod COI amplicon dataset including annotated arthropod COI sequences and COI sequences extracted from arthropod whole mitochondrion genomes, the latter of which provided the only source of representation for Zoraptera, Callipodida and Holothyrida. The database contains extracted sequences of the target amplicon from all major arthropod clades, including all insect orders, all arthropod classes and Onychophora, Tardigrada and Mollusca outgroups. During curation, we extracted the COI region of interest from approximately 81 percent of the input sequences, corresponding to 73 percent of the genus-level diversity found in the input data. Further, our analysis revealed a high degree of sequence redundancy within the NCBI nucleotide database, with a mean of approximately 11 sequence entries per species in the input data. The curated, low-redundancy database is included in the Metaxa2 sequence classification software. Using this database with the Metaxa2 classifier, we performed a cross-validation analysis to characterize the relationship between the Metaxa2 reliability score, an estimate of classification confidence, and classification error probability. We used this analysis to select a reliability score threshold which minimized error. We then estimated classification sensitivity, false discovery rate and overclassification, the propensity to classify sequences from taxa not represented in the reference database. Our work will help researchers design and evaluate classification databases and conduct metabarcoding on arthropods and alternate taxa.
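The threshold selection step can be pictured roughly as follows. This Python sketch takes cross-validation results as (reliability score, was the classification correct) pairs and picks the candidate threshold with the lowest error proportion among accepted classifications. It is my own simplification, not the authors' procedure or any Metaxa2 functionality, and the scores are invented.

```python
def evaluate_threshold(results, threshold):
    """Return (error proportion among accepted, fraction of sequences accepted),
    or None if no classification reaches the threshold."""
    accepted = [correct for score, correct in results if score >= threshold]
    if not accepted:
        return None
    errors = sum(1 for correct in accepted if not correct)
    return errors / len(accepted), len(accepted) / len(results)

def pick_threshold(results, candidates):
    """Pick the lowest candidate threshold that minimises the error proportion."""
    best = None
    for t in sorted(candidates):
        evaluation = evaluate_threshold(results, t)
        if evaluation is None:
            continue
        error, accepted_fraction = evaluation
        if best is None or error < best[1]:
            best = (t, error, accepted_fraction)
    return best

# Invented cross-validation outcomes: (reliability score, correct?).
results = [(95, True), (90, True), (85, True), (80, False),
           (75, True), (60, False), (50, False)]

print(pick_threshold(results, [50, 70, 85]))
```

The trade-off is visible in the returned tuple: a stricter threshold lowers the error proportion but also shrinks the fraction of sequences that receive a classification at all.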
Honeydew produced from the excretion of plant-sucking insects (order Hemiptera) is a carbohydrate-rich material that is foraged by honey bees to integrate their diets. In this study, we used DNA extracted from honey as a source of environmental DNA to disclose its entomological signature determined by honeydew producing Hemiptera that was recovered not only from honeydew honey but also from blossom honey. We designed PCR primers that amplified a fragment of mitochondrial cytochrome c oxidase subunit 1 (COI) gene of Hemiptera species using DNA isolated from unifloral, polyfloral and honeydew honeys. Ion Torrent next generation sequencing metabarcoding data analysis assigned Hemiptera species using a customized bioinformatic pipeline. The forest honeydew honeys reported the presence of high abundance of Cinara pectinatae DNA, confirming their silver fir forest origin. In all other honeys, most of the sequenced reads were from the planthopper Metcalfa pruinosa for which it was possible to evaluate the frequency of different mitotypes. Aphids of other species were identified from honeys of different geographical and botanical origins. This unique entomological signature derived by environmental DNA contained in honey opens new applications for honey authentication and to disclose and monitor the ecology of plant-sucking insects in agricultural and forest landscapes.
Introduced species of mammals in New Zealand have had catastrophic effects on populations of diverse native species. Quantifying the diets of these omnivorous and predatory species is critical for understanding which native species are most impacted, and to prioritize which mammal species and locations should be targeted with control programmes. A variety of methods have been applied to quantify diet components in animals, including visual inspection of gut contents (Daniel 1973; Pierce and Boyle 1991), stable isotope analysis (Major et al. 2007; Carreon-Martinez and Heath 2010), and time-lapse video (Brown and Brown 1997; Dunlap and Pawlik 1996). Increasingly, DNA-based metabarcoding methods are being used (King et al. 2008; Soininen et al. 2009). These metabarcoding methods require a PCR step using primers that bind to highly conserved genomic regions (e.g. mitochondrial COI) to amplify specific regions for sequencing. This step introduces significant bias, primarily due to the lack of a universal primer set (King et al. 2008). Here we show that direct metagenomic sequencing using the Oxford Nanopore MinION allows rapid quantification of rat diets. Using a sample of rats collected from within 100 km of Auckland, NZ, we show that these rats consume a wide variety of plant, invertebrate, vertebrate, and fungal taxa, with substantial differences in diet content between locales. We then show that, based on diet content alone, it is possible to pinpoint the sampling location of an individual rat within tens of kilometres. We expect that the rapidly increasing accuracy and throughput of nanopore-based sequencing, as well as increases in the species diversity of genomic databases, will soon allow rapid and unbiased assessments of animal diets in field settings.