Thursday, November 12, 2020

Metabarcoding remote learning course - March 01, 2021 to March 28, 2021

I will be teaching our Metabarcoding course again this coming March (March 01, 2021 to March 28, 2021).

This course will provide an overview of the state of current technology and the various sequencing platforms used. The course consists of a series of online lectures and research exercises introducing different aspects of metabarcoding and metagenomics. We will also touch on the suite of bioinformatics tools available for sequence analysis and data interpretation. The course runs over 4 weeks but is designed so that you can work through the content at your own pace and on your own schedule, with a workload of about 4-8 hours per week. The course is fully asynchronous to accommodate participants from various time zones, but we still strive to make it as interactive as possible.

For more information, please visit the course enrolment page at the University of Guelph.

Thursday, August 20, 2020

Important message from the World Register of Marine Species (WoRMS)

 WoRMS needs YOU! 


WoRMS is a highly collaborative effort of over 500 involved experts, but we need all users – taxonomists, ecologists and non-scientists – to help us to keep WoRMS up-to-date and correct. If you find an error or an omission, please get in touch with us directly. Direct contact can fix errors a lot faster and more efficiently than the WoRMS Team having to learn about these through peer-reviewed publications.


The World Register of Marine Species is a community driven effort to provide an authoritative and comprehensive list of names of marine organisms. The only way to achieve this goal is through broad-scale collaboration between taxonomic experts from a wide range of disciplines, regions and backgrounds. The past thirteen years have been a story of success, with more than 500 taxonomic and thematic editors volunteering their time to participate in the creation of this unique and freely available resource.


WoRMS is truly collaborative and does not rely on the taxonomic editors alone to improve its content and functionality. The input of its users is critical to the work of WoRMS, to provide feedback, spot omissions and errors, and in making suggestions for improved tools and new features the community needs. The support of the Data Management Team, in processing the numerous enquiries from users, answering or directing them to the right editor, and ensuring they are dealt with swiftly, is also key to the success of the database.


We write this plea for direct contact with the WoRMS team in response to a number of publications written with the aim of highlighting errors and omissions in WoRMS, but without contacting the WoRMS team to inform us of the issues. Although the WoRMS team can fix omissions and errors quite rapidly – on average within a few days – we do need to be aware of them.


With over 500 editors making edits on the database on a daily and voluntary basis, the Steering Committee and the Data Management Team cannot 'police' everything that is being edited, and thus we rely on the trust, expertise and goodwill of users and experts to inform us of problems that we can then look into.


If you notice any errors or omissions in WoRMS we ask that you please simply contact us directly, rather than writing editorials or blogs or publishing about them. Once the WoRMS Data Management Team have been alerted to the issue then the feedback can be logged and dealt with swiftly and efficiently by either addressing it directly or rerouting it to the responsible editor and/or the WoRMS Steering Committee. It would be most useful if you can also provide relevant documents/research papers together with your feedback to help us process it quickly.


If we do not know about the problem, we cannot fix it – but we do promise to work to solve issues once we are informed of them. Working together, we can improve WoRMS for all users.

Friday, May 15, 2020

Weekend reads - Week 20/2020

Here in Canada we are having a long weekend, which for some means even more time to read. No worries, I won't add more than usual to this blog post, although quite a few new papers have been published over the past two weeks. Here we go:

A clear insight into the large-scale community structure of planktonic copepods is critical to understanding the mechanisms controlling diversity and biogeography of marine taxa in terms of their high abundance, ubiquity, and sensitivity to environmental changes. Here, we applied a 28S metabarcoding approach to large-scale communities of epipelagic and mesopelagic copepods at 70 stations across the Pacific Ocean and three stations in the Arctic Ocean. Major patterns of community structure and diversity, influenced by water mass structures, agreed with results from previous morphology-based studies. However, a large-scale metabarcoding approach could detect community changes even under stable environmental conditions, including changes in the north/south subtropical gyres and east/west areas within each subtropical gyre. There were strong effects of the epipelagic environment on mesopelagic communities, and community subdivisions were observed in the environmentally stable mesopelagic layer. In each sampling station, higher operational taxonomic unit (OTU) numbers and lower phylogenetic diversity were observed in the mesopelagic layer than in the epipelagic layer, indicating a recent rapid increase in species numbers in the mesopelagic layer. The phylogenetic analysis utilizing representative sequences of OTUs revealed trends of recent emergence of cold-water OTUs, which are mainly distributed at high latitudes with low water temperatures. Conversely, the high diversity of copepods at low latitudes was suggested to have been formed through long evolution under high water temperature conditions. The metabarcoding results suggest that evolutionary processes have strong impacts on current patterns of copepod diversity, and support the "out of the tropics" theory explaining latitudinal diversity gradients of copepods. 
Diversity patterns in both epipelagic and mesopelagic copepods were highly correlated to sea surface temperature; thus, predicted global warming may have a significant impact on copepod diversity in both layers.

Biological conclusions based on DNA barcoding and metabarcoding analyses can be strongly influenced by the methods utilized for data generation and curation, leading to varying levels of success in the separation of biological variation from experimental error. The 5' region of cytochrome c oxidase subunit I (COI-5P) is the most common barcode gene for animals, with conserved structure and function that allows for biologically informed error identification. Here, we present coil, an R package for the pre-processing and frameshift error assessment of COI-5P animal barcode and metabarcode sequence data. The package contains functions for placement of barcodes into a common reading frame, accurate translation of sequences to amino acids, and highlighting insertion and deletion errors. The analysis of 10 000 barcode sequences of varying quality demonstrated how coil can place barcode sequences in reading frame and distinguish sequences containing indel errors from error-free sequences with greater than 97.5% accuracy. Package limitations were tested through the analysis of COI-5P sequences from the plant and fungal kingdoms as well as the analysis of potential contaminants: nuclear mitochondrial pseudogenes and Wolbachia COI-5P sequences. Results demonstrated that coil is a strong technical error identification method but is not reliable for detecting all biological contaminants.
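For readers who want a feel for why a protein-coding barcode allows "biologically informed" error checks, here is a minimal Python sketch. It is not coil's algorithm or API (coil is an R package); it only illustrates the underlying idea that a genuine COI fragment should translate without stop codons in its true reading frame, while an indel shifts the frame and usually introduces stops. The function names and the toy sequence are my own, and I am assuming the invertebrate mitochondrial code (NCBI translation table 5).

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 5 (invertebrate mitochondrial), codons in the
# standard order TTT, TTC, TTA, TTG, TCT, ... ; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate a DNA string starting at offset `frame` (0, 1 or 2).

    Codons containing ambiguous bases are reported as "X".
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def stop_free_frames(seq):
    """Return the forward frames (0-2) whose translation has no stop codon."""
    return [frame for frame in range(3) if "*" not in translate(seq, frame)]


# Toy sequence, made up for illustration only (not real COI data).
toy = "ATGACATTAGGAACA"
print(translate(toy))         # frame 0 translation
print(stop_free_frames(toy))  # frames without stop codons
```

A sequence with no stop-free frame at all would be a candidate for a frameshift or a pseudogene, but short fragments often have more than one stop-free frame by chance, so a check like this is only a first filter.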

The meiofauna is an important part of the marine ecosystem, but its composition and distribution patterns are relatively unexplored. Here we assessed the biodiversity and community structure of meiofauna from five locations on the Swedish western and southern coasts using a high-throughput DNA sequencing (metabarcoding) approach. The mitochondrial cytochrome oxidase 1 (COI) mini-barcode and nuclear 18S small ribosomal subunit (18S) V1-V2 region were amplified and sequenced using Illumina MiSeq technology. Our analyses revealed a higher number of species than previously found in other areas: thirteen samples comprising 6.5 dm3 sediment revealed 708 COI and 1,639 18S metazoan OTUs. Across all sites, the majority of the metazoan biodiversity was assigned to Arthropoda, Nematoda and Platyhelminthes. Alpha and beta diversity measurements showed that community composition differed significantly amongst sites. OTUs initially assigned to Acoela, Gastrotricha and the two Platyhelminthes sub-groups Macrostomorpha and Rhabdocoela were further investigated and assigned to species using a phylogeny-based taxonomy approach. Our results demonstrate that there is great potential for discovery of new meiofauna species even in some of the most extensively studied locations.

The complexity and natural variability of ecosystems present a challenge for reliable detection of change due to anthropogenic influences. This issue is exacerbated by necessary trade-offs that reduce the quality and resolution of survey data for assessments at large scales. The Peace–Athabasca Delta (PAD) is a large inland wetland complex in northern Alberta, Canada. Despite its geographic isolation, the PAD is threatened by encroachment of oil sands mining in the Athabasca watershed and hydroelectric dams in the Peace watershed. Methods capable of reliably detecting changes in ecosystem health are needed to evaluate and manage risks. Between 2011 and 2016, aquatic macroinvertebrates were sampled across a gradient of wetland flood frequency, applying both microscope-based morphological identification and DNA metabarcoding. By using multispecies occupancy models, we demonstrate that DNA metabarcoding detected a much broader range of taxa and more taxa per sample compared to traditional morphological identification and was essential to identifying significant responses to flood and thermal regimes. We show that family-level occupancy masks high variation among genera and quantify the bias of barcoding primers on the probability of detection in a natural community. Interestingly, patterns of community assembly were nearly random, suggesting a strong role of stochasticity in the dynamics of the metacommunity. This variability seriously compromises effective monitoring at local scales but also reflects resilience to hydrological and thermal variability. Nevertheless, simulations showed the greater efficiency of metabarcoding, particularly at a finer taxonomic resolution, provided the statistical power needed to detect change at the landscape scale.

Better knowledge of food webs and related ecological processes is fundamental to understanding the functional role of biodiversity in ecosystems. This is particularly true for pest regulation by natural enemies in agroecosystems. However, it is generally difficult to decipher the impact of predators, as they often leave no direct evidence of their activity. Metabarcoding via high-throughput sequencing (HTS) offers new opportunities for unraveling trophic linkages between generalist predators and their prey, and ultimately identifying key ecological drivers of natural pest regulation. Here, this approach proved effective in deciphering the diet composition of key predatory arthropods (nine species; 27 prey taxa), insectivorous birds (one species; 13 prey taxa) and bats (one species; 103 prey taxa) sampled in a millet-based agroecosystem in Senegal. Such information makes it possible to identify the diet breadth and preferences of predators (e.g., mainly moths for bats), to design a qualitative trophic network, and to identify patterns of intraguild predation across arthropod predators, insectivorous vertebrates and parasitoids. Appropriateness and limitations of the proposed molecular-based approach for assessing the diet of crop pest predators and trophic linkages are discussed.


Increasing evidence for global insect declines is prompting a renewed interest in the survey of whole insect communities. DNA metabarcoding can contribute to assessing diverse insect communities over a range of spatial and temporal scales, but efforts are still needed to optimise and standardise procedures, from field sampling, through laboratory analysis, to bioinformatic processing.
Here we describe and test a methodological pipeline for surveying nocturnal flying insects, combining a customised automatic light trap and DNA metabarcoding. We optimised laboratory procedures and then tested the methodological pipeline using 12 field samples collected in northern Portugal in 2017. We focused on Lepidoptera to compare metabarcoding results with those from morphological identification, using three types of bulks produced from each sample (individuals, legs and the unsorted mixture).
The customised trap was highly efficient at collecting nocturnal flying insects, allowing a small team to operate several traps per night, and a fast field processing of samples for subsequent metabarcoding with low contamination risks. Morphological processing yielded 871 identifiable individuals of 102 Lepidoptera species. Metabarcoding detected a total of 528 taxa, most of which were Lepidoptera (31.1%), Diptera (26.1%) and Coleoptera (14.7%). There was a reasonably high matching in community composition between morphology and metabarcoding when considering the ‘individuals’ and ‘legs’ bulk samples, with few errors mostly associated with morphological misidentification of small microlepidoptera. Regarding the ‘mixture’ bulk sample, metabarcoding identified nearly four times more Lepidoptera species than morphological examination.
Our study provides a methodological metabarcoding pipeline that can be used in standardised surveys of nocturnal flying insects, showing that it can overcome limitations and potential shortcomings of traditional methods based on morphological identification. Our approach efficiently collects highly diverse taxonomic groups such as nocturnal Lepidoptera that are poorly represented when using Malaise traps and other widely used field methods. To enhance the potential of this pipeline in ecological studies, efforts are needed to test its effectiveness and potential biases across habitat types and to extend the DNA barcode databases for important groups such as Diptera.

Modern ecosystem models have the potential to greatly enhance our capacity to predict community responses to change, but they demand comprehensive spatial distribution information, creating the need for new approaches to gather and synthesize biodiversity data. Metabarcoding or metagenomics can generate comprehensive biodiversity data sets at species-level resolution but they are limited to point samples. CommDivMap contains a number of functions that can be used to turn OTU tables resulting from metabarcoding runs of bulk samples into species richness maps. We tested the method on a series of arthropod bulk samples obtained from various experimental agricultural plots. The script runs smoothly and is reasonably fast. We hope that our assemble first, predict later approach to statistical modelling of species richness will set the stage for the transition from data-rich but finite sets of point samples to spatially continuous biodiversity maps.
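The first step in any such workflow is turning an OTU table into per-sample richness values. The short Python sketch below shows only that step; it is not CommDivMap code (CommDivMap is a separate package with its own functions), and the table layout, the `min_reads` threshold and the plot names are assumptions I made up for illustration.

```python
def richness_per_sample(otu_table, min_reads=1):
    """Count the OTUs detected in each sample.

    otu_table maps OTU id -> {sample id: read count}. An OTU counts as
    present in a sample when it has at least `min_reads` reads there.
    """
    richness = {}
    for counts in otu_table.values():
        for sample, reads in counts.items():
            richness.setdefault(sample, 0)
            if reads >= min_reads:
                richness[sample] += 1
    return richness


# Toy OTU table with invented read counts for two hypothetical plots.
toy_table = {
    "OTU1": {"plotA": 10, "plotB": 0},
    "OTU2": {"plotA": 3, "plotB": 7},
    "OTU3": {"plotA": 0, "plotB": 1},
}
print(richness_per_sample(toy_table))
print(richness_per_sample(toy_table, min_reads=2))
```

Raising `min_reads` is a crude way to discount singletons; the richness values per sample would then be attached to sampling coordinates before any spatial interpolation.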

The task of recognizing species names in scientific articles is a quintessential step for a large number of applications in high-throughput text mining and data analytics, such as species-specific information collection, construction of species food networks and trophic relationship extraction. These tasks become even more important in fast-paced species-discovery areas such as entomology, where an impressive number of new arthropod species are discovered each year. This article explores the use of two-character n-grams (bigrams) in machine learning models for arthropod species name recognition. This particular method has been previously applied successfully to the task of language identification but the application to species name identification had yet to be explored.
Arthropod species names, regular English words used in scientific publications and person names were collected from the public domain and bigrams were extracted and used as classifier features. A number of learning classifiers spanning 7 algorithmic categories (tree-based, rule-based, artificial neural network, Bayesian, boosting, lazy and kernel-based) were tested and the highest accuracies were consistently obtained with LIBLINEAR, Bayesian Logistic Regression, the Multilayer Perceptron, Random Forest, and the LIBSVM classifiers. When compared with dictionary-based external software tools such as GNRD and TaxonFinder, our top-3 classifiers were insensitive to word capitalization and were able to correctly classify novel species names that are absent in dictionary-based approaches with accuracies between 88.6% and 91.6%.
Our results suggest that character bigram-based classification is a suitable method for distinguishing arthropod species names from regular English words and person names commonly found in scientific literature. Moreover, our method can also be used to reduce the number of false positives produced by dictionary-based methods. 
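To make the bigram idea concrete, here is a small Python sketch of how character bigrams can be extracted from a word and turned into count features. It is only an illustration of the feature type, not the authors' pipeline: the padding character, the lowercasing (my guess at one way to be insensitive to capitalization) and the function names are all assumptions.

```python
from collections import Counter


def char_bigrams(word, pad="_"):
    """Return the overlapping two-character n-grams of a word.

    The word is lowercased and padded on both sides so that the first
    and last letters also form bigrams.
    """
    token = f"{pad}{word.lower()}{pad}"
    return [token[i:i + 2] for i in range(len(token) - 1)]


def bigram_features(word):
    """Bigram counts for one word, usable as sparse classifier features."""
    return Counter(char_bigrams(word))


print(char_bigrams("Apis"))
print(bigram_features("mellifera").most_common(3))
```

Feature dictionaries like these could then be passed to any standard classifier; Latin-derived endings such as "us" or "ae" are the kind of signal one would expect a model to pick up on.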

Friday, May 1, 2020

Weekend reads - Week 18/2020

I am reviving an older tradition of this blog: a weekly (very subjective) collection of papers relating to DNA barcoding, metabarcoding and everything related.

Insects form an established part of the diet in many parts of the world and insect food products are emerging into the European and North American marketplaces. Consumer confidence in the product is key in developing this market, and accurate labelling of content identity is an important component of this. We used DNA barcoding to assess the accuracy of insect food products sold in the UK. We purchased insects sold for human consumption from online retailers in the UK and compared the identity of the material ascertained from DNA barcoding to that stated on the product packaging. To this end, the COI sequence of mitochondrial DNA was amplified and sequenced, and the sequences produced were compared to reference sequences in NCBI and the Barcode of Life Data System (BOLD). The barcode identity of all insects that were farmed was consistent with the packaging label. In contrast, disparity between barcode identity and package contents was revealed in two cases of foraged material (mopane worm and winged termites). One case of very broad family-level description was also highlighted, where material described as grasshopper was identified as Locusta migratoria from DNA barcode. Overall these data indicate the need to establish tight protocols to validate product identity in this developing market. Maintaining biosafety and consumer confidence relies on accurate and consistent product labelling that provides a clear chain of information from producer to consumer.

Walnut (Juglans regia L.) is one of the most widely cultivated nuts. Walnut milk beverage is very popular in China due to its nutritional value. However, adulterated walnut milk ingredients have been detected in the Chinese market. Peanut and soybean are sold at much lower prices than walnut and are reported to be commonly used for adulteration in the industrial chain of walnut milk production. The purpose of this study is therefore to develop an accurate and efficient method for detecting the authenticity of the raw materials used in walnut milk beverage. DNA barcoding and high‐resolution melting (HRM) analyses were used to identify common adulterated raw ingredients such as peanut and soybean in commercial walnut milk beverage samples. The chloroplast psbA‐trnH gene was used for sequencing, and HRM analysis was performed. We also prepared experimental mixtures, in the laboratory, with different quantities of walnut, peanut, and soybean. High‐resolution melting analysis of the experimental mixtures clearly distinguished all of them. The results revealed that most of the walnut milk beverage samples fell in the same cluster of walnut species. Several samples fell in the peanut cluster, confirming that they were adulterated products. The results revealed that HRM analysis based on the psbA‐trnH barcode sequence can be used to identify raw ingredients in walnut milk beverages. 

Accurate and cost-effective methods for tracking changes in arthropod communities are needed to develop integrative environmental monitoring programs in the Arctic. To date, even baseline data on their species composition at established ecological monitoring sites are severely lacking. We present the results of a pilot assessment of non-marine arthropod diversity in a middle arctic tundra area near Ikaluktutiak (Cambridge Bay), Victoria Island, Nunavut, undertaken in 2018 using DNA barcodes. A total of 1264 Barcode Index Number (BIN) clusters, used as a proxy for species, were recorded. The efficacy of widely used sampling methods was assessed. Yellow pan traps captured 62% of the entire BIN diversity at the study sites. When complemented with soil and leaf litter sifting, the coverage rose up to 74.6%. Combining community-based data collection with high-throughput DNA barcoding has the potential to overcome many of the logistic, financial, and taxonomic obstacles for large-scale monitoring of the Arctic arthropod fauna.

Improved taxonomic methods are needed to quantify declining populations of insect pollinators. This study devises a high‐throughput DNA barcoding protocol for a regional fauna (United Kingdom) of bees (Apiformes), consisting of reference library construction, a proof‐of‐concept monitoring scheme, and the deep barcoding of individuals to assess potential artefacts and organismal associations. A reference database of cytochrome oxidase c subunit 1 (cox1) sequences including 92.4% of 278 bee species known from the UK showed high congruence with morphological taxon concepts, but molecular species delimitations resulted in numerous split and (fewer) lumped entities within the Linnaean species. Double tagging permitted deep Illumina sequencing of 762 separate individuals of bees from a UK‐wide survey. Extracting the target barcode from the amplicon mix required a new protocol employing read abundance and phylogenetic position, which revealed 180 molecular entities of Apiformes identifiable to species. An additional 72 entities were ascribed to nuclear pseudogenes based on patterns of read abundance and phylogenetic relatedness to the reference set. Clustering of reads revealed a range of secondary operational taxonomic units (OTUs) in almost all samples, resulting from traces of insect species caught in the same traps, organisms associated with the insects including a known mite parasite of bees, and the common detection of human DNA, besides evidence for low‐level cross‐contamination in pan traps and laboratory procedures. Custom scripts were generated to conduct critical steps of the bioinformatics protocol. The resources built here will greatly aid DNA‐based monitoring to inform management and conservation policies for the protection of pollinators.

Freshwaters face some of the highest rates of species loss, caused by strong human impact. To decrease or even revert this strong impact, ecological restorations are increasingly applied to restore and maintain the natural ecological status of freshwaters. Their ecological status can be determined by assessing the presence of indicator species (e.g., certain fish species), which is called biomonitoring. However, traditional biomonitoring of fish, such as electrofishing, is often challenging and invasive. To augment traditional biomonitoring of fish, the analysis of environmental DNA (eDNA) has recently been proposed as an alternative, sensitive approach. The present study employed this modern approach to monitor the Rhine sculpin (Cottus rhenanus), a fish species that has been reintroduced into a recently restored stream within the Emscher catchment in Germany, in order to validate the success of the applied restorations and to monitor the species’ dispersal. We monitored the dispersal of the Rhine sculpin using replicated 12S end-point nested PCR eDNA surveillance at a fine spatial and temporal scale. In that way, we investigated if eDNA analysis can be applied for freshwater assessments. We also performed traditional electrofishing in one instance to validate our eDNA-based approach. We could track the dispersal of the Rhine sculpin and showed a higher dispersal potential of the species than we assumed. eDNA detection indicated the species’ dispersal across a potential dispersal barrier and showed a steep increase of positive detections once the reintroduced population had established. In contrast to that, false negative eDNA results occurred at early reintroduction stages. Our results show that eDNA detection can be used to confirm and monitor reintroductions and to contribute to the assessment and modeling of the ecological status of streams.

Environmental DNA (eDNA) is usually defined as genetic material obtained directly from environmental samples, such as soil, water, or ice. Coupled to DNA metabarcoding, eDNA is a powerful tool in biodiversity assessments. Results from eDNA approaches have provided valuable insights into studies of past and contemporary biodiversity in terrestrial and aquatic environments. However, the state and fate of eDNA are still being investigated, and knowledge about the form of eDNA (i.e., extracellular vs. intracellular) or DNA degradation under different environmental conditions is limited. Here, we tackle this issue by analyzing foraminiferal sedimentary DNA (sedDNA) from different size fractions of marine sediments: >500 µm, 500-100 µm, 100-63 µm, and < 63 µm. Surface sediment samples were collected at 15 sampling stations located in the Svalbard archipelago. Sequences of the foraminifera-specific 37f region were generated using Illumina technology. The presented data may be used as a reference for a wide range of eDNA-based studies, including biomonitoring and biodiversity assessments across time and space.


Environmental DNA (eDNA) analysis utilises trace DNA released by organisms into their environment for species detection and is revolutionising non-invasive species monitoring. The use of this technology requires rigorous validation - from field sampling to interpretation of PCR-based results - for meaningful application and interpretation. Assays targeting eDNA released by individual species are typically validated with no predefined criteria to answer specific research questions in one ecosystem. Their general applicability, uncertainties and limitations often remain undetermined. The absence of clear guidelines prevents targeted eDNA assays from being incorporated into species monitoring and policy, thus their establishment will be key for the future implementation of eDNA-based surveys. We describe the measures and tests necessary for successful validation of targeted eDNA assays and the associated pitfalls to form the basis of guidelines. A list of 122 variables was compiled and consolidated into a scale to assess the validation status of individual assays. These variables were evaluated for 546 published single-species assays. The resulting dataset was used to provide an overview of current validation practices and test the applicability of the validation scale for future assay rating. The 122 variables representing assay validation status were classified into 14 thematic blocks, such as "in silico analysis", and arranged on a 5-level validation scale from "incomplete" to "operational". Additionally, minimum validation criteria were defined for each level. The largest share (30%) of investigated assays were classified as Level 1 (incomplete), and 15% did not achieve this first level. These assays were characterised by minimal in silico and in vitro testing, but their share in annually published eDNA assays has declined since 2014. The total number of reported variables ranged from 20% to 76% and deviated both between and within levels.
The meta-analysis demonstrates the suitability of the 5-level validation scale for assessing targeted eDNA assays. It is a user-friendly tool to evaluate previously published assays for future research and routine monitoring, while also enabling appropriate interpretation of results. Finally, it provides guidance on validation and reporting standards for newly developed assays.

We used two large-scale metabarcoding datasets to evaluate phylogenetic signals at global marine and regional terrestrial scales using co-occurrence and co-exclusion networks. Phylogenetic relatedness was estimated using either global pairwise sequence distance or phylogenetic distance, and the significance of observed patterns relating networks and phylogenies was evaluated against two null models. In all datasets, we found that phylogenetically close OTUs significantly co-occurred more often, and OTUs with intermediate phylogenetic relatedness co-occurred less often, than expected by chance. Phylogenetically close OTUs co-excluded less often than expected by chance in the marine datasets only. A simultaneous excess of co-occurrences and co-exclusions was observed in the inversion zone between close and intermediate phylogenetic distance classes in the marine surface environment. Similar patterns were observed by using either pairwise sequence or phylogenetic distances, and by using both null models. These results suggest that environmental filtering and dispersal limitation are the preponderant forces driving co-occurrence of protists in both environments, while a signal of competitive exclusion was only detected in the marine surface environment. The discrepancy in the co-exclusion pattern is potentially linked to the individual environments: water bodies are more homogeneous while tropical forest soils contain a myriad of nutrient-rich micro-environments, reducing the strength of mutual exclusion.

The Bees@Schools Program

Some years after the last run of our successful School Malaise Trap Program we started thinking about new ways to involve citizen scientists at schools in our research. We pitched an idea to the Natural Sciences and Engineering Research Council of Canada (NSERC) and were granted some funds to set it up and start with a few runs.

The Bees@Schools project initially involved 100 school classrooms in discerning critical information on the changing geographic distributions of plant-pollinator interactions across Canada. By combining state-of-the-art DNA barcoding of bees, and the pollen they carry, with distribution and climate change data, we are collecting data to show how distributions of Canada’s bee species are changing along with climate. The project will also help to determine how pollination services shift across Canada, with impacts on food production. The ultimate hope is to provide landscape management advice to improve vital species' chances of persisting in agricultural landscapes and to alleviate pollination deficits.

Each participating school receives a wild bee nest box (perhaps better known as a bee hotel) in the spring, which stays installed throughout the summer. In the fall, nest boxes are sent back to our institute. Here, the contents of the nest boxes will be analyzed using DNA barcoding of the larvae we find and metabarcoding of the pollen that was provisioned for them.

The project is run by one of my grad students, Sage Handler. She is doing pretty much everything from communication with schools and the public to the laboratory work and data analysis. As you can imagine, her planning for this year's run was thrown into chaos once the COVID-19 pandemic caused major lockdowns including the closure of schools here in Canada. However, we were able to shift gears and run the program regardless. Thanks to the support of so many teachers, 200 traps are currently deployed across Canada. You'll find them in teachers' backyards, school yards or public spaces, and now we are developing material (videos, activities for kids at home etc.) to keep this as educational as possible amidst the school closures. This video is just an example of how kids can learn and interact with the program.

Thursday, April 30, 2020

18 Million Euros to explore biodiversity

Our friends at Naturalis in Leiden, Netherlands released this piece of great news today: 

An organized overview of all multicellular flora and fauna in the Netherlands and the infrastructure to identify them semi-automatically. This is what the ARISE megaproject wants to achieve in five to ten years’ time. The Dutch Research Council (NWO), Naturalis Biodiversity Center, the University of Amsterdam, the University of Twente and the Westerdijk Fungal Biodiversity Institute are investing a combined total of over 18 million euros. Koos Biesmeijer, leader of the consortium, is very pleased: "ARISE will give a big boost to our understanding of food webs and ecosystems and in the status and trends of our biodiversity."

The ARISE project aims to construct an infrastructure, the only one of its kind in the world, in order to identify and monitor every species of multicellular flora and fauna in the Netherlands. This infrastructure will combine information from DNA, visual/audio recognition and radar data to yield a comprehensive picture of the country's biodiversity. The international community is following the project with great interest as well. 
The ARISE project may prove highly valuable as a means to supply policymakers, water authorities, provinces and other stakeholders with more reliable input in the field of biodiversity.

A new era in species identification
Edwin van Huis, General manager of Naturalis Biodiversity Center, views this as an important step. “The loss of biodiversity is one of the chief threats to humanity's survival. For this reason, we urgently need better instruments for species identification and for monitoring biodiversity. Because only if we know what is, we can make an effort to preserve it.”
According to Annemarie van Wezel, scientific director of the Institute for Biodiversity and Ecosystem Dynamics (IBED/UvA), ARISE will lay the foundation for a new era of systematic ecological research into biodiversity. “Together with scientific partners, we will be demonstrating the added value of this infrastructure at demo sites. We'll be able to better understand patterns of biodiversity and changes in those patterns, which in turn will help us improve efforts to manage that biodiversity.”
Joost Kok, dean at the Faculty of Electrical Engineering, Mathematics and Computer Science at the University of Twente, is enthusiastic about the unprecedented opportunities this project offers. “The proposed infrastructure brings together many new insights in the fields of artificial intelligence and data science. This wouldn't have been feasible even ten years ago.” Pedro Crous, director of the Westerdijk Fungal Biodiversity Institute, adds: “ARISE will make it possible for researchers to identify every species they come across, even if it's one that was previously undiscovered.”

Unique combination: multidisciplinary collaboration and integration of expertise
The Netherlands has long had a leading role in international collaborative partnerships relating to species identification and biodiversity. Because ARISE integrates a variety of techniques, the project is extremely advanced and the only one of its kind in the world. The ARISE project will make it possible to identify species quickly and semi-automatically based on the reference collections of Naturalis and the Westerdijk Institute. The University of Amsterdam is supplying expertise with regard to how ecosystems function and the University of Twente is contributing knowledge of state-of-the-art data science and artificial intelligence. Each of these parties was already exploring the use of sensors, DNA, image recognition, radar, audio and data science for the purpose of species identification.

Access to the most advanced near-real-time identification service
This integrated infrastructure and facility will provide Dutch researchers, nature conservation organizations, government bodies and the business community with access to the most advanced near-real-time identification service for monitoring biodiversity and species detection. This, in turn, will yield new opportunities for understanding how ecosystems function, identifying trends and better integrating attention to biodiversity into solutions for major societal challenges such as the circular economy, nature-inclusive cities and the agricultural cycle. 

Thursday, April 9, 2020

The Centre for Biodiversity Genomics: a look inside the world's leading DNA barcoding facility

A new video showing what our institute was busily doing not too long ago. Great way to spend six minutes of your home office time. A tour through the institute by the "Father of DNA Barcoding" himself:

Thursday, March 26, 2020

From the inbox: Looking for a project coordinator in CaBOL

Dear fellow barcoders,

we plan to extend our ongoing barcoding efforts in the Caucasus region by a new project, CaBOL. We submitted the CaBOL proposal a while ago and recently received very positive signals from the funder, although not yet the official grant notification.
In order to be ready to start the project soon after COVID loosens its grip, I would like to circulate the job offer for a CaBOL project coordinator (based at ZFMK, Bonn, Germany, but with frequent visits to Georgia and Armenia). Please get in touch or forward this:

The Zoological Research Museum A. Koenig in Bonn, Germany, is looking for a full-time project coordinator (m/f/d)

within the project "CaBOL: A Georgian-Armenian-German Initiative to establish a joint Caucasian Biodiversity Research Platform". Conditional on the funding of the CaBOL project through BMBF, the position can be filled for three years.

Your tasks:
• Coordinating the network of CaBOL partner institutes (in Germany, Georgia, and Armenia), monitoring schedules and project funds, reporting, organization of project meetings
• PR, dissemination of results, representing CaBOL
• Coordinating web page development, curating content of web page
• Managing the CaBOL subproject at Museum Koenig
• Guest liaison and supervision of students
• Maintaining contact with taxon experts and providing them with samples for species identification
• Data analysis

Your profile:
• University degree in biology or related life science, in the organismic domain
• Especially outstanding organizational skills and bargaining strength
• Very good knowledge of the English language and experience with composing scientific texts and reports
• Basic skills in DNA barcoding, in working with databases, and entomological expertise of advantage
• Intercultural competence
• Readiness to travel frequently (esp. into the Caucasus)

We offer a highly motivating environment at a renowned and pioneering research facility and the possibility to work independently. Salary and benefits are according to a public service position in Germany, TV-L E 13.

Equally qualified severely disabled applicants will be given preference. Qualified women are strongly encouraged to apply.

Please send your application by e-mail attachment, including a detailed CV, by April 19, 2020 to Mrs. Heike Lenz at Stiftung Zoologisches Forschungsmuseum Alexander Koenig, Leibniz-Institut für Biodiversität der Tiere, Adenauerallee 160, 53113 Bonn or via E-Mail as a summarized PDF to:

In case of questions concerning the position please send an email to Jonas Astrin:

For more information about our institution see

Best wishes
 Jonas Astrin

Wednesday, February 26, 2020

5 Million Specimens at the CBG collection

Over the past decade and a half, our natural history collection here at the Centre for Biodiversity Genomics has grown continuously, and today our media team publicly announced that we have surpassed the 5 million specimen mark. Mind you, all of those are digitized (and barcoded).

Here the official press release:

February 2020 marks an important milestone for the Centre for Biodiversity Genomics (CBG) at the University of Guelph. CBG’s in-house natural history collection (CBG Collections) has reached five million barcode voucher specimens. Each specimen is fully digitized, sequenced and available online through the Barcode of Life Data System platform (BOLD).
Since its inception in 2006, the barcode reference collection at CBG has grown steadily. The development of DNA barcoding high-throughput workflows has enabled the rapid collection, processing and sequencing of millions of specimens from across the globe. In the last five years, CBG has sorted, prepared and DNA barcoded 800,000 specimens per year before archiving in this unique reference collection. With the recent addition of the PacBio Sequel sequencing platform and launch of the BIOSCAN program, the CBG aspires to grow the collection by one to two million specimens each year.
The final array of 95 specimens added to CBG Collections to reach five million was collected in a Malaise trap located in Guanacaste, Costa Rica as part of the ongoing BIOSCAN project BioAlfa. The array was very diverse, and included some of the following insects:
  • Hawk moth (Sphingidae: Xylophanes hannemanni) – these strong-flying moths have been recorded year-round in Costa Rica and have multiple generations per year.
  • Spider wasp (Pompilidae) – attack and paralyze spiders to lay their eggs on and feed their young; the adults feed on nectar.
  • Tortoise beetle (Chrysomelidae: Hybosa) – a common beetle that can be encountered in your own backyard; they feed on plant tissues and some are agricultural pests.
  • Lanternfly (Fulgoridae) – a colourful true-bug found throughout the tropics; despite their name, they do not emit light like a lantern.
  • Robber fly or assassin fly (Asilidae) – are ambush predators that target other insects for a meal.
  • Leafhopper (Cicadellidae: Erythrogonia) – come in a variety of colours and patterns and are found worldwide; leafhoppers are small insects with over 20,000 species described.

Tuesday, January 21, 2020

BIOSCAN - new video

In case you did not come across the new version of the BIOSCAN video - really worth watching!

BIOSCAN is iBOL's new seven-year, $180 million global research program that aims to revolutionize our understanding of biodiversity and our capacity to manage it. Involving scientists, research organizations, and citizens, BIOSCAN will explore three major research themes: Species Discovery, Species Interactions, Species Dynamics.

iBOL (International Barcode of Life Consortium) involves researchers in 30+ nations who share a mission to transform biodiversity science through DNA-based approaches with DNA barcoding at its core. iBOL works in partnership with academic, government, and private sector organizations.

Monday, January 6, 2020

PostDoc Bioinformatics and Environmental Genomics

A position to work at McGill partly in collaboration with our lab:

Preferred Disciplines: Biology, Bioinformatics (Postdoc position)
Project length: 2 years, renewable for 3rd year 
Approx. start date: February 15, 2020
McGill University, Montreal, QC

Summary of Project:
The Postdoctoral Fellow will be involved in long-term and highly replicated laboratory and field experiments on the effect of multiple stressors on the structure and function of aquatic communities. The research will involve developing and implementing bioinformatic tools for analysing metabarcoding, metagenomics and transcriptomics data sets and assessing biodiversity trends for broad taxonomic groups (bacterial, phytoplankton, zooplankton). The fellow will compare biodiversity estimates obtained from traditional sampling techniques with estimates based on refined metabarcoding approaches to describe the biodiversity of contaminated aquatic habitats. The project involves the biodiversity group at McGill University and collaborators from the Centre for Biodiversity Genomics (CBG), University of Guelph, University of Quebec at Montreal and University of Montreal.

Research Objectives/Sub-Objectives: 1) Develop sensitive metabarcoding bioinformatics protocols for describing aquatic communities; 2) Investigate the impact of multiple stressors on complex aquatic communities.

Methodology: 1) Use high-throughput sequencing to develop metabarcoding and metagenomics protocols for describing aquatic communities in complex environmental samples; 2) Validate protocols; 3) Apply protocols on highly replicated field experiments.

Expertise and Skills Needed:
Experience with next generation sequencing or large sequence data and related bioinformatics / computational / programming skills is required. Familiarity with one or more of the following would be an advantage: genomics, transcriptomics, phylogenetic analyses, genome evolution / programming language (R/Unix/Python or Perl). Experience working with aquatic organisms would be an asset. The candidate should have a PhD in evolution / genetics / computational biology, a good publication record and the ability to work well in a collaborative research environment.
Applicants should send a curriculum vitae, short statements of research interests, and 3 representative publications to melania.cristescu@mcgill.ca. The application deadline is January 31, 2020.

McGill University is strongly committed to diversity and equity within its community. McGill University is among Canada’s leading research-intensive universities with students from over 140 countries. The university is located in Montreal, a cosmopolitan city with great cultural and linguistic diversity.