Here in Canada we are having a long weekend, which means that for some there is even more time to read. No worries, I won't add more than usual to this blog post, although quite a few new papers were published over the past two weeks. Here we go:
A clear insight into the large-scale community structure of planktonic copepods is critical to understanding the mechanisms controlling diversity and biogeography of marine taxa in terms of their high abundance, ubiquity, and sensitivity to environmental changes. Here, we applied a 28S metabarcoding approach to large-scale communities of epipelagic and mesopelagic copepods at 70 stations across the Pacific Ocean and three stations in the Arctic Ocean. Major patterns of community structure and diversity, influenced by water mass structures, agreed with results from previous morphology-based studies. However, a large-scale metabarcoding approach could detect community changes even under stable environmental conditions, including changes in the north/south subtropical gyres and east/west areas within each subtropical gyre. There were strong effects of the epipelagic environment on mesopelagic communities, and community subdivisions were observed in the environmentally stable mesopelagic layer. In each sampling station, higher operational taxonomic unit (OTU) numbers and lower phylogenetic diversity were observed in the mesopelagic layer than in the epipelagic layer, indicating a recent rapid increase in species numbers in the mesopelagic layer. The phylogenetic analysis utilizing representative sequences of OTUs revealed trends of recent emergence of cold-water OTUs, which are mainly distributed at high latitudes with low water temperatures. Conversely, the high diversity of copepods at low latitudes was suggested to have been formed through long evolution under high water temperature conditions. The metabarcoding results suggest that evolutionary processes have strong impacts on current patterns of copepod diversity, and support the "out of the tropics" theory explaining latitudinal diversity gradients of copepods. 
Diversity patterns in both epipelagic and mesopelagic copepods were highly correlated to sea surface temperature; thus, predicted global warming may have a significant impact on copepod diversity in both layers.
Biological conclusions based on DNA barcoding and metabarcoding analyses can be strongly influenced by the methods utilized for data generation and curation, leading to varying levels of success in the separation of biological variation from experimental error. The 5' region of cytochrome c oxidase subunit I (COI-5P) is the most common barcode gene for animals, with conserved structure and function that allows for biologically informed error identification. Here, we present coil ( https://CRAN.R-project.org/package=coil ), an R package for the pre-processing and frameshift error assessment of COI-5P animal barcode and metabarcode sequence data. The package contains functions for placement of barcodes into a common reading frame, accurate translation of sequences to amino acids, and highlighting insertion and deletion errors. The analysis of 10 000 barcode sequences of varying quality demonstrated how coil can place barcode sequences in reading frame and distinguish sequences containing indel errors from error-free sequences with greater than 97.5% accuracy. Package limitations were tested through the analysis of COI-5P sequences from the plant and fungal kingdoms as well as the analysis of potential contaminants: nuclear mitochondrial pseudogenes and Wolbachia COI-5P sequences. Results demonstrated that coil is a strong technical error identification method but is not reliable for detecting all biological contaminants.
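To make the idea of "placing a barcode into reading frame" concrete, here is a minimal Python sketch. It is not coil's algorithm and not its API: it is a much cruder stand-in that only counts stop codons in each of the three forward frames and picks the cleanest one. The function names are hypothetical, and the stop codon set assumes the invertebrate mitochondrial genetic code (NCBI translation table 5); vertebrate sequences would need a different set.

```python
# Crude stop-codon heuristic for frame placement (illustration only, not coil).
# Assumption: invertebrate mitochondrial code (NCBI table 5), where the only
# stop codons are TAA and TAG.
STOP_CODONS = {"TAA", "TAG"}


def codons(seq, frame):
    """Split seq into complete codons, starting at offset frame (0, 1 or 2)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def stop_count(seq, frame):
    """Number of stop codons encountered when reading seq in the given frame."""
    return sum(codon in STOP_CODONS for codon in codons(seq, frame))


def best_frame(seq):
    """Return (frame, stops) for the forward frame with the fewest stop codons."""
    counts = [stop_count(seq, frame) for frame in range(3)]
    frame = min(range(3), key=counts.__getitem__)
    return frame, counts[frame]


def looks_frameshifted(seq):
    """Flag a sequence when even its best frame still contains a stop codon."""
    _, stops = best_frame(seq)
    return stops > 0
```

A sequence with an internal stop codon in every frame is a candidate for an indel error or a pseudogene, but a stop-free frame proves nothing on its own, which is consistent with the abstract's caution about biological contaminants.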
The meiofauna is an important part of the marine ecosystem, but its composition and distribution patterns are relatively unexplored. Here we assessed the biodiversity and community structure of meiofauna from five locations on the Swedish western and southern coasts using a high-throughput DNA sequencing (metabarcoding) approach. The mitochondrial cytochrome oxidase 1 (COI) mini-barcode and nuclear 18S small ribosomal subunit (18S) V1-V2 region were amplified and sequenced using Illumina MiSeq technology. Our analyses revealed a higher number of species than previously found in other areas: thirteen samples comprising 6.5 dm³ sediment revealed 708 COI and 1,639 18S metazoan OTUs. Across all sites, the majority of the metazoan biodiversity was assigned to Arthropoda, Nematoda and Platyhelminthes. Alpha and beta diversity measurements showed that community composition differed significantly amongst sites. OTUs initially assigned to Acoela, Gastrotricha and the two Platyhelminthes sub-groups Macrostomorpha and Rhabdocoela were further investigated and assigned to species using a phylogeny-based taxonomy approach. Our results demonstrate that there is great potential for discovery of new meiofauna species even in some of the most extensively studied locations.
The complexity and natural variability of ecosystems present a challenge for reliable detection of change due to anthropogenic influences. This issue is exacerbated by necessary trade-offs that reduce the quality and resolution of survey data for assessments at large scales. The Peace–Athabasca Delta (PAD) is a large inland wetland complex in northern Alberta, Canada. Despite its geographic isolation, the PAD is threatened by encroachment of oil sands mining in the Athabasca watershed and hydroelectric dams in the Peace watershed. Methods capable of reliably detecting changes in ecosystem health are needed to evaluate and manage risks. Between 2011 and 2016, aquatic macroinvertebrates were sampled across a gradient of wetland flood frequency, applying both microscope-based morphological identification and DNA metabarcoding. By using multispecies occupancy models, we demonstrate that DNA metabarcoding detected a much broader range of taxa and more taxa per sample compared to traditional morphological identification and was essential to identifying significant responses to flood and thermal regimes. We show that family-level occupancy masks high variation among genera and quantify the bias of barcoding primers on the probability of detection in a natural community. Interestingly, patterns of community assembly were nearly random, suggesting a strong role of stochasticity in the dynamics of the metacommunity. This variability seriously compromises effective monitoring at local scales but also reflects resilience to hydrological and thermal variability. Nevertheless, simulations showed the greater efficiency of metabarcoding, particularly at a finer taxonomic resolution, provided the statistical power needed to detect change at the landscape scale.
Better knowledge of food webs and related ecological processes is fundamental to understanding the functional role of biodiversity in ecosystems. This is particularly true for pest regulation by natural enemies in agroecosystems. However, it is generally difficult to decipher the impact of predators, as they often leave no direct evidence of their activity. Metabarcoding via high-throughput sequencing (HTS) offers new opportunities for unraveling trophic linkages between generalist predators and their prey, and ultimately identifying key ecological drivers of natural pest regulation. Here, this approach proved effective in deciphering the diet composition of key predatory arthropods (nine species; 27 prey taxa), insectivorous birds (one species; 13 prey taxa) and bats (one species; 103 prey taxa) sampled in a millet-based agroecosystem in Senegal. Such information makes it possible to identify the diet breadth and preferences of predators (e.g., mainly moths for bats), to design a qualitative trophic network, and to identify patterns of intraguild predation across arthropod predators, insectivorous vertebrates and parasitoids. Appropriateness and limitations of the proposed molecular-based approach for assessing the diet of crop pest predators and trophic linkages are discussed.
PREPRINTS
Increasing evidence for global insect declines is prompting a renewed interest in the survey of whole insect communities. DNA metabarcoding can contribute to assessing diverse insect communities over a range of spatial and temporal scales, but efforts are still needed to optimise and standardise procedures, from field sampling, through laboratory analysis, to bioinformatic processing.
Here we describe and test a methodological pipeline for surveying nocturnal flying insects, combining a customised automatic light trap and DNA metabarcoding. We optimised laboratory procedures and then tested the methodological pipeline using 12 field samples collected in northern Portugal in 2017. We focused on Lepidoptera to compare metabarcoding results with those from morphological identification, using three types of bulks produced from each sample (individuals, legs and the unsorted mixture).
The customised trap was highly efficient at collecting nocturnal flying insects, allowing a small team to operate several traps per night, and a fast field processing of samples for subsequent metabarcoding with low contamination risks. Morphological processing yielded 871 identifiable individuals of 102 Lepidoptera species. Metabarcoding detected a total of 528 taxa, most of which were Lepidoptera (31.1%), Diptera (26.1%) and Coleoptera (14.7%). There was a reasonably high matching in community composition between morphology and metabarcoding when considering the ‘individuals’ and ‘legs’ bulk samples, with few errors mostly associated with morphological misidentification of small microlepidoptera. Regarding the ‘mixture’ bulk sample, metabarcoding identified nearly four times more Lepidoptera species than morphological examination.
Our study provides a methodological metabarcoding pipeline that can be used in standardised surveys of nocturnal flying insects, showing that it can overcome limitations and potential shortcomings of traditional methods based on morphological identification. Our approach efficiently collects highly diverse taxonomic groups such as nocturnal Lepidoptera that are poorly represented when using Malaise traps and other widely used field methods. To enhance the potential of this pipeline in ecological studies, efforts are needed to test its effectiveness and potential biases across habitat types and to extend the DNA barcode databases for important groups such as Diptera.
Modern ecosystem models have the potential to greatly enhance our capacity to predict community responses to change, but they demand comprehensive spatial distribution information, creating the need for new approaches to gather and synthesize biodiversity data. Metabarcoding or metagenomics can generate comprehensive biodiversity data sets at species-level resolution but they are limited to point samples. CommDivMap contains a number of functions that can be used to turn OTU tables resulting from metabarcoding runs of bulk samples into species richness maps. We tested the method on a series of arthropod bulk samples obtained from various experimental agricultural plots. The script runs smoothly and is reasonably fast. We hope that our assemble first, predict later approach to statistical modelling of species richness will set the stage for the transition from data-rich but finite sets of point samples to spatially continuous biodiversity maps.
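For readers who want a feel for the step from OTU table to richness map, here is a small Python sketch. It is not CommDivMap's code or its statistical model: the function names, the dictionary layout of the OTU table, and the use of inverse-distance weighting as the spatial predictor are all assumptions made for illustration. It only mirrors the order of operations, first assembling richness per point sample and then predicting values between the points.

```python
from math import hypot


def richness(otu_table, min_reads=1):
    """Count OTUs with at least min_reads reads in each sample.

    otu_table is assumed to look like {sample: {otu_id: read_count}}.
    """
    return {
        sample: sum(1 for reads in counts.values() if reads >= min_reads)
        for sample, counts in otu_table.items()
    }


def idw_estimate(point, coords, values, power=2.0):
    """Inverse-distance-weighted richness estimate at an unsampled point.

    coords maps sample -> (x, y); values maps sample -> richness.
    IDW is a stand-in here, not the model used by the preprint.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sample, (x, y) in coords.items():
        distance = hypot(point[0] - x, point[1] - y)
        if distance == 0:
            # The point coincides with a sampled location: use its value.
            return float(values[sample])
        weight = 1.0 / distance ** power
        weighted_sum += weight * values[sample]
        weight_total += weight
    return weighted_sum / weight_total
```

Evaluating `idw_estimate` over a regular grid of points would give a continuous surface that can be plotted as a map; a real analysis would also want environmental covariates and some measure of uncertainty.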
The task of recognizing species names in scientific articles is a quintessential step for a large number of applications in high-throughput text mining and data analytics, such as species-specific information collection, construction of species food networks and trophic relationship extraction. These tasks become even more important in fast-paced species-discovery areas such as entomology, where an impressive number of new arthropod species are discovered each year. This article explores the use of two-character n-grams (bigrams) in machine learning models for arthropod species name recognition. This particular method has been previously applied successfully to the task of language identification but the application to species name identification had yet to be explored.
Arthropod species names, regular English words used in scientific publications and person names were collected from the public domain and bigrams were extracted and used as classifier features. A number of learning classifiers spanning 7 algorithmic categories (tree-based, rule-based, artificial neural network, Bayesian, boosting, lazy and kernel-based) were tested and the highest accuracies were consistently obtained with LIBLINEAR, Bayesian Logistic Regression, the Multilayer Perceptron, Random Forest, and the LIBSVM classifiers. When compared with dictionary-based external software tools such as GNRD and TaxonFinder, our top-3 classifiers were insensitive to word capitalization and were able to correctly classify novel species names that are absent in dictionary-based approaches with accuracies between 88.6% and 91.6%.
Our results suggest that character bigram-based classification is a suitable method for distinguishing arthropod species names from regular English words and person names commonly found in scientific literature. Moreover, our method can also be used to reduce the number of false positives produced by dictionary-based methods.
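The bigram idea is easy to try out. The Python sketch below extracts character bigrams from a word and feeds them to a tiny multinomial naive Bayes classifier with add-one smoothing. Naive Bayes is used here only because it fits in a few lines; it is not one of the classifiers reported in the preprint, and the class name, labels and training words are made up for illustration.

```python
from collections import Counter
from math import log


def bigrams(word):
    """Lower-cased two-character n-grams, e.g. 'Apis' -> ['ap', 'pi', 'is']."""
    w = word.lower()
    return [w[i:i + 2] for i in range(len(w) - 1)]


class BigramNB:
    """Minimal multinomial naive Bayes over character bigrams (toy stand-in)."""

    def __init__(self):
        self.counts = {}        # label -> Counter of bigram frequencies
        self.words = Counter()  # label -> number of training words
        self.vocab = set()      # all bigrams seen during training

    def fit(self, words, labels):
        for word, label in zip(words, labels):
            grams = bigrams(word)
            self.counts.setdefault(label, Counter()).update(grams)
            self.words[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, word):
        total_words = sum(self.words.values())
        vocab_size = len(self.vocab)

        def score(label):
            grams_for_label = self.counts[label]
            n = sum(grams_for_label.values())
            # Log prior plus add-one smoothed log likelihood of each bigram.
            s = log(self.words[label] / total_words)
            for gram in bigrams(word):
                s += log((grams_for_label[gram] + 1) / (n + vocab_size))
            return s

        return max(self.counts, key=score)
```

Because the bigrams are lower-cased before counting, the classifier ignores capitalization, which is the property the abstract contrasts with dictionary-based tools. With only a handful of training words this is a toy; the accuracies quoted above come from the authors' much larger training sets and different classifiers.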