After some crazily busy weeks, here is a quick weekend-reads blog post. Some really good papers have appeared in the last couple of weeks, and it was hard to make a selection.
DNA barcodes are useful for species discovery and species identification, but obtaining barcodes currently requires a well-equipped molecular laboratory, is time-consuming, and/or expensive. We here address these issues by developing a barcoding pipeline for Oxford Nanopore MinION™ and demonstrate that one flowcell can generate barcodes for ~500 specimens despite the high base-call error rates of MinION™ reads. The pipeline overcomes these errors by first summarizing all reads for the same tagged amplicon as a consensus barcode. Consensus barcodes are overall mismatch-free but retain indel errors that are concentrated in homopolymeric regions. These indels are addressed with an optional error-correction pipeline that corrects them based on conserved amino-acid motifs from publicly available barcodes. The effectiveness of this pipeline is documented by analysing reads from three MinION™ runs that represent three different stages of MinION™ development. They generated data for (1) 511 specimens of a mixed Diptera sample, (2) 575 specimens of ants, and (3) 50 specimens of Chironomidae. The run based on the latest chemistry yielded MinION barcodes for 490 of the 511 specimens, which were assessed against reference Sanger barcodes (N=471). Overall, the MinION barcodes have an accuracy of 99.3%-100%, with the number of ambiguous bases after correction ranging from <0.01-1.5% depending on which correction pipeline is used. We demonstrate that it requires ~2 hours of sequencing to gather all information needed for obtaining reliable barcodes for most specimens (>90%). We estimate that up to 1000 barcodes can be generated in one flowcell and that the cost per barcode can be <USD 2.
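To make the "consensus barcode" step concrete, here is a minimal sketch of a column-wise majority consensus. It is not the authors' pipeline: it assumes the reads for one tagged amplicon have already been aligned to equal length (with `-` for gaps), and the function name `consensus` and the `min_fraction` threshold are made up for illustration.

```python
from collections import Counter


def consensus(aligned_reads, min_fraction=0.5):
    """Column-wise majority consensus of equal-length, pre-aligned reads.

    A column is called only if its most common symbol is supported by more
    than `min_fraction` of the reads; otherwise it becomes 'N'. Columns in
    which the gap symbol '-' wins are dropped from the consensus.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    called = []
    for column in zip(*aligned_reads):
        symbol, count = Counter(column).most_common(1)[0]
        if count / len(column) <= min_fraction:
            called.append("N")      # no clear majority: ambiguous base
        elif symbol != "-":
            called.append(symbol)   # majority base
        # majority gap: skip the column entirely
    return "".join(called)


# Three noisy reads of the same amplicon (hypothetical toy data).
reads = ["ACGT-A", "ACGTTA", "ACCT-A"]
print(consensus(reads))
```

Random substitution errors tend to be outvoted column by column, which fits the abstract's description of consensus barcodes as largely mismatch-free. Systematic indels in homopolymers are not fixed by voting, which is why a separate correction step is needed.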
While the high species diversity of tropical arthropod communities has often been linked to marked spatial heterogeneity, their temporal dynamics have received little attention. This study addresses this gap by examining spatio-temporal variation in the arthropod communities of a tropical montane forest in Honduras. By employing DNA barcode analysis and Malaise trap sampling across four years and five sites, 51,596 specimens were assigned to 8,193 presumptive species. High beta diversity was linked more strongly to elevation than geographic distance, decreasing by 12% when only the dominant species were considered. When sampling effort was increased by deploying more traps at a site, beta diversity only decreased by 2%, but extending sampling across years decreased beta diversity by 27%. Species inconsistently detected among years, likely transients from other settings, drove the low similarity in species composition among traps only a few metres apart. The dominant, temporally persistent species substantially influenced the cyclic pattern of change in community composition among years. This pattern likely results from divergence-convergence dynamics, suggesting a stable baseline of temporal turnover in each community. The overall results establish that large sample sizes are necessary to reveal species richness, but are not essential for quantifying beta diversity. This study further highlights the need for standardized methods of sampling and species identification to generate the comparative data required to evaluate biodiversity change in space and time.
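A small sketch of why pooling samples across years can lower measured beta diversity. The abstract does not say which index was used, so the Sørensen dissimilarity here is an assumption, and the species labels (`s*` for persistent species, `t*` for transients) are invented toy data.

```python
def sorensen_dissimilarity(a, b):
    """Sørensen dissimilarity between two species lists: 1 - 2|A∩B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical catches at two sites over two years.
site_1 = {2015: {"s1", "s2", "t1"}, 2016: {"s1", "s2", "t2"}}
site_2 = {2015: {"s1", "s3", "t3"}, 2016: {"s1", "s3", "t1"}}

# Beta diversity from a single year of sampling.
single_year = sorensen_dissimilarity(site_1[2015], site_2[2015])

# Beta diversity after pooling both years at each site.
pooled = sorensen_dissimilarity(set().union(*site_1.values()),
                                set().union(*site_2.values()))

print(f"single year: {single_year:.3f}, pooled: {pooled:.3f}")
```

In this toy case the transient `t1` shows up at both sites, but in different years, so it only counts as shared once the years are pooled.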
BACKGROUND:
Reduced representation genomic datasets are increasingly becoming available from a variety of organisms. These datasets do not target specific genes, and so may contain sequences from parasites and other organisms present in the target tissue sample. In this paper, we demonstrate that (1) RADseq datasets can be used for exploratory analysis of tissue-specific metagenomes, and (2) tissue collections house complete metagenomic communities, which can be investigated and quantified by a variety of techniques.
METHODS:
We present an exploratory method for mining metagenomic "bycatch" sequences from a range of host tissue types. We use a combination of the pyRAD assembly pipeline, NCBI's blastn software, and custom R scripts to isolate metagenomic sequences from RADseq type datasets.
RESULTS:
When we focus on sequences that align with existing references in NCBI's GenBank, we find that between three and five percent of identifiable double-digest restriction site associated DNA (ddRAD) sequences from host tissue samples are from phyla that contain known blood parasites. In addition to tissue samples, we examine ddRAD sequences from metagenomic DNA extracted from snake and lizard hind-gut samples. We find that the sequences recovered from these samples match with expected bacterial and eukaryotic gut microbiome phyla.
DISCUSSION:
Our results suggest that (1) museum tissue banks originally collected for host DNA archiving are also preserving valuable parasite and microbiome communities, (2) that publicly available RADseq datasets may include metagenomic sequences that could be explored, and (3) that restriction site approaches are a useful exploratory technique to identify microbiome lineages that could be missed by primer-based approaches.
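The bycatch-mining idea can be sketched roughly as follows. The authors used pyRAD, blastn and custom R scripts; this Python version is only an illustration. It assumes blastn was run with the standard 12-column tabular output (`-outfmt 6`), and the `subject_to_phylum` lookup is a hypothetical stand-in for a real taxonomy mapping.

```python
import csv
from collections import Counter


def best_hits(blast_lines):
    """Keep the highest-bitscore subject per query from BLAST outfmt 6 lines."""
    best = {}
    for row in csv.reader(blast_lines, delimiter="\t"):
        if not row:
            continue
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def phylum_fractions(hits, subject_to_phylum):
    """Fraction of identified loci assigned to each phylum."""
    counts = Counter(subject_to_phylum.get(s, "unassigned") for s in hits.values())
    total = sum(counts.values())
    return {phylum: n / total for phylum, n in counts.items()} if total else {}


# Toy BLAST output: two hits for locus1, one for locus2 (hypothetical IDs).
lines = [
    "locus1\tsubjA\t98.0\t90\t2\t0\t1\t90\t1\t90\t1e-40\t160",
    "locus1\tsubjB\t90.0\t90\t9\t0\t1\t90\t1\t90\t1e-20\t100",
    "locus2\tsubjC\t95.0\t88\t4\t0\t1\t88\t1\t88\t1e-35\t140",
]
lookup = {"subjA": "Chordata", "subjB": "Nematoda", "subjC": "Apicomplexa"}
print(phylum_fractions(best_hits(lines), lookup))
```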
Study of all flies (Diptera) collected for one year from a four-hectare (150 x 266 meter) patch of cloud forest at 1,600 meters above sea level at Zurquí de Moravia, San José Province, Costa Rica (hereafter referred to as Zurquí), revealed an astounding 4,332 species. This amounts to more than half the number of named species of flies for all of Central America. Specimens were collected with two Malaise traps running continuously and with a wide array of supplementary collecting methods for three days of each month. All morphospecies from all 73 families recorded were fully curated by technicians before submission to an international team of 59 taxonomic experts for identification. Overall, a Malaise trap on the forest edge captured 1,988 species or 51% of all collected dipteran taxa (other than of Phoridae, subsampled only from this and one other Malaise trap). A Malaise trap in the forest sampled 906 species. Of other sampling methods, the combination of four other Malaise traps and an intercept trap, aerial/hand collecting, 10 emergence traps, and four CDC light traps added the greatest number of species to our inventory. This complement of sampling methods was an effective combination for retrieving substantial numbers of species of Diptera. Comparison of select sampling methods (considering 3,487 species of non-phorid Diptera) provided further details regarding how many species were sampled by various methods. Comparison of species numbers from each of two permanent Malaise traps from Zurquí with those of single Malaise traps at each of Tapantí and Las Alturas, 40 and 180 km distant from Zurquí respectively, suggested significant species turnover. Comparison of the greater number of species collected in all traps from Zurquí did not markedly change the degree of similarity between the three sites, although the actual number of species shared did increase. 
The total number of named and unnamed species of Diptera from four hectares at Zurquí is equivalent to 51% of all flies named from Central America, greater than all the named fly fauna of Colombia, equivalent to 14% of named Neotropical species and equal to about 2.7% of all named Diptera worldwide. Clearly the number of species of Diptera in tropical regions has been severely underestimated and the actual number may surpass the number of species of Coleoptera. Various published extrapolations from limited data to estimate total numbers of species of larger taxonomic categories (e.g., Hexapoda, Arthropoda, Eukaryota, etc.) are highly questionable, and certainly will remain uncertain until we have more exhaustive surveys of all and diverse taxa (like Diptera) from multiple tropical sites. Morphological characterization of species in inventories provides identifications placed in the context of taxonomy, phylogeny, form, and ecology. DNA barcoding of species is a valuable tool to estimate species numbers but, used alone, fails to provide a broader context for the species identified.
Given the ongoing decline of both pollinators and plants, it is crucial to implement effective methods to describe complex pollination networks across time and space in a comprehensive and high-throughput way. Here we tested if metabarcoding may circumvent the limits of conventional methodologies in detecting and quantifying plant-pollinator interactions. Metabarcoding experiments on pollen DNA mixtures described a positive relationship between the amounts of DNA from focal species and the number of trnL and ITS1 sequences yielded. The study of pollen loads of insects captured in plant communities showed that, compared with the observation of visits, metabarcoding revealed 2.5 times more plant species involved in plant-pollinator interactions. We further observed a tight positive relationship between the pollen-carrying capacities of insect taxa and the number of trnL and ITS1 sequences. The number of visits received per plant species also positively correlated with the number of their ITS1 and trnL sequences in insect pollen loads. By revealing interactions hard to observe otherwise, metabarcoding significantly enlarges the spatiotemporal observation window of pollination interactions. By providing new qualitative and quantitative information, metabarcoding holds great promise for investigating diverse facets of interactions and will provide a new perception of pollination networks as a whole.
The DNA present in the environment is a unique and increasingly exploited source of information for conducting fast and standardized biodiversity assessments for any type of organisms. The datasets resulting from these surveys are however rarely compared to the quantitative predictions of biodiversity models. In this study, we simulate neutral taxa-abundance datasets, and artificially noise them by simulating noise terms typical of DNA-based biodiversity surveys. The resulting noised taxa abundances are used to assess whether the two parameters of Hubbell's neutral theory of biodiversity can still be estimated. We find that parameters can be inferred provided that PCR noise on taxa abundances does not exceed a certain threshold. However, inference is seriously biased by the presence of artifactual taxa. The uneven contribution of organisms to environmental DNA owing to size differences and barcode copy number variability does not impede neutral parameter inference, provided that the number of sequence reads used for inference is smaller than the number of effectively sampled individuals. Hence, estimating neutral parameters from DNA-based taxa abundance patterns is possible but requires some caution. In studies that include empirical noise assessments, our comprehensive simulation benchmark provides objective criteria to evaluate the robustness of neutral parameter inference.
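To illustrate what "noising" a taxa-abundance dataset might look like, here is a minimal sketch of one noise term only. It assumes PCR noise can be modelled as a lognormal multiplier on each taxon's abundance, followed by random sampling of sequence reads. That model, the function name `noisy_read_counts` and its parameters are assumptions for illustration, not the simulation scheme of the paper, and the neutral community itself is replaced here by fixed toy abundances.

```python
import random
from collections import Counter


def noisy_read_counts(abundances, n_reads, pcr_sigma, seed=0):
    """Turn true taxon abundances into noisy read counts.

    Each abundance is multiplied by a lognormal factor (assumed PCR noise),
    then `n_reads` reads are drawn in proportion to the noised abundances.
    """
    rng = random.Random(seed)
    taxa = list(abundances)
    weights = [abundances[t] * rng.lognormvariate(0.0, pcr_sigma) for t in taxa]
    reads = Counter(rng.choices(taxa, weights=weights, k=n_reads))
    return {t: reads.get(t, 0) for t in taxa}


# Hypothetical community of four taxa with uneven abundances.
community = {"taxon_a": 500, "taxon_b": 300, "taxon_c": 150, "taxon_d": 50}
for sigma in (0.1, 1.0):
    print(sigma, noisy_read_counts(community, n_reads=1000, pcr_sigma=sigma))
```

Raising `pcr_sigma` distorts the rank-abundance pattern more strongly, which is the kind of perturbation the abstract says neutral parameter inference tolerates only up to a threshold. Artifactual taxa and organism size effects are not modelled here.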
DNA barcodes are widely used for identification and discovery of species. While such use draws on information at the DNA level, the current amassment of ca. 4.7 million COI barcodes also offers a unique resource for exploring functional constraints on DNA evolution. Here, we explore amino acid variation in a crosscut of the entire animal kingdom. Patterns of DNA variation were linked to functional constraints at the level of the amino acid sequence in functionally important parts of the enzyme. Six amino acid sites show variation with possible effects on enzyme function. Overall, patterns of amino acid variation suggest convergent or parallel evolution at the protein level connected to the transition into a parasitic life style. Denser sampling of two diverse insect taxa revealed that the beetles (Coleoptera) show more amino acid variation than the butterflies and moths (Lepidoptera), indicating fundamental difference in patterns of molecular evolution in COI. Several amino acid sites were found to be under notably strong purifying selection in Lepidoptera as compared to Coleoptera. Overall, these findings demonstrate the utility of the global DNA barcode library to extend far beyond identification and taxonomy, and will hopefully be followed by a multitude of work.
Moths are globally relevant as pollinators but nocturnal pollination remains poorly understood. Plant-pollinator interaction networks are traditionally constructed using either flower-visitor observations or pollen-transport detection using microscopy. Recent studies have shown the potential of DNA metabarcoding for detecting and identifying pollen-transport interactions. However, no study has directly compared the realised observations of pollen-transport networks between DNA metabarcoding and conventional light microscopy. Using matched samples of nocturnal moths, we construct pollen-transport networks using two methods: light microscopy and DNA metabarcoding. Focussing on the feeding mouthparts of moths, we develop and provide reproducible methods for merging DNA metabarcoding and ecological network analysis to better understand species-interactions. DNA metabarcoding detected pollen on more individual moths, and detected multiple pollen types on more individuals than microscopy, but the average number of pollen types per individual was unchanged. However, after aggregating individuals of each species, metabarcoding detected more interactions per moth species. Pollen-transport network metrics differed between methods, because of variation in the ability of each to detect multiple pollen types per moth and to separate morphologically-similar or related pollen. We detected unexpected but plausible moth-plant interactions with metabarcoding, revealing new detail about nocturnal pollination systems. The nocturnal pollination networks observed using metabarcoding and microscopy were similar, yet distinct, with implications for network ecologists. Comparisons between networks constructed using metabarcoding and traditional methods should therefore be treated with caution. Nevertheless, the potential applications of metabarcoding for studying plant-pollinator interaction networks are encouraging, especially when investigating understudied pollinators such as moths.
In recent decades, show caves have begun to suffer from microorganism proliferation due to artificial lighting installations for touristic activity. In addition to the aesthetic problem, light encourages microorganisms that are responsible for physical and chemical degradation of limestone walls, speleothems and prehistoric paintings of cultural value. Microorganisms have previously been described by microscopy or culture-dependent methods, but data provided by new generation sequencing are rare. The authors identified, for the first time, microorganisms proliferating in one Swiss and in four French show caves using three different primers. The results showed that both photosynthetic and non-photosynthetic bacteria were the dominant taxa present in biofilms. Microalgae were heavily represented by the Trebouxiophyceae, Eustigmatophyceae and Chlorophyceae groups. Twelve diatoms were also recorded, with dominance of Syntrichia sp. (96.1%). Fungi were predominantly represented by Ascomycota, Zygomycota and Basidiomycota, fully half of the sampled biofilms where Fungi were detected. Comparing microbial communities from bleach-treated caves to those in untreated caves showed no significant difference except for a low-level change in the abundance of certain taxa. These findings provided by Illumina sequencing reveal a complex community structure in the 5 caves based on the assembly of bacteria, cyanobacteria, algae, diatoms, fungi and mosses.