Thursday, December 3, 2015

Quantitative metabarcoding

It is something of a holy grail for metabarcoding and a dream of many: using it not only to determine species identities but also to quantify relative species abundances. It is widely accepted that metabarcoding has its limits and is biased both biologically and technically. For example, chimeric sequences, contaminants and clustering algorithms can bias even the most basic outputs of DNA metabarcoding studies, such as species richness. This becomes very problematic if one tries to infer abundance from the proportions of each species' DNA. Attempts have been made to estimate differences in mass or abundance of species from differences in sequence read abundance, but biases have been reported repeatedly.

Previous attempts to control biasing factors in DNA metabarcoding studies have primarily focused on correcting for a single source of bias, or altering protocol steps that are known to introduce bias...An alternative approach to correcting for individual biases is to create control materials for target organisms, which when sequenced alongside environmental samples can be used to create correction factors that account for multiple sources of bias simultaneously.

A new study that just showed up in the accepted article section of Molecular Ecology Resources went the latter route and used sequencing of 50/50 mixtures (target species/control species) to establish relative correction factors (RCF) that account for multiple sources of bias and are applicable to field studies.
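To make the idea concrete, here is a minimal sketch of how a correction factor from a 50/50 mixture might be computed and applied. The formula (control reads divided by target reads), the function names and all read counts are illustrative assumptions of mine, not the calculation or data from the paper.

```python
# Hypothetical sketch of a 50/50 correction-factor workflow.
# Assumption: with equal input of target and control species, an unbiased
# assay yields a 1:1 read ratio, so the factor is control reads / target reads.

def correction_factor(target_reads, control_reads):
    """Return an assumed relative correction factor for one target species."""
    if target_reads <= 0 or control_reads <= 0:
        raise ValueError("read counts must be positive")
    return control_reads / target_reads


def corrected_proportions(read_counts, factors):
    """Weight each species' reads by its factor, then renormalise to sum to 1."""
    weighted = {sp: n * factors[sp] for sp, n in read_counts.items()}
    total = sum(weighted.values())
    return {sp: w / total for sp, w in weighted.items()}


# Made-up 50/50 mixture results: (target reads, control reads) per species.
mixtures = {"species_a": (2000, 8000), "species_b": (5000, 5000)}
factors = {sp: correction_factor(t, c) for sp, (t, c) in mixtures.items()}

# Made-up field sample in which species_a is under-represented in the reads.
sample = {"species_a": 1000, "species_b": 4000}
corrected = corrected_proportions(sample, factors)
print(factors)
print(corrected)
```

In this toy case the raw reads suggest a 20/80 split, while the corrected proportions come out even, which shows how large a single species-specific bias can be.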

The authors focused on a rather small model system containing a few fish and a small prey library for Pacific harbour seals. However, they also applied the correction factors derived from the prey library to some wild seal scat samples to determine the impact of the correction method in a real-world scenario. Their results show that the 50/50 RCF approach is an effective tool for evaluating and correcting biases in the chosen model system, and likely in other studies as well. The authors also clearly state that their new method will not solve the problem for all possible scenarios, and that there will be many cases where it is simply unrealistic to expect accurate estimates of species proportions based on DNA sequence read abundances.

However, in study systems focused on a limited number of species which have conserved barcode priming regions, 50/50 RCFs offer potential to improve proportional estimates by accounting for multiple sources of bias. The 50/50 RCF approach will be particularly useful when biases to sequence read abundance are substantial and the resulting species correction factor magnitudes are large. Even when it is not possible to generate a complete tissue library, a 50/50 RCF library consisting of a subset of key species could be used to screen for large species-specific biases and aid in the interpretation of sequencing results.
