Thursday, December 4, 2014

Computational tools for environmental samples

Microbes of interest to science rarely exist in isolation. Organisms that, for example, are essential to breaking down pollutants or that cause illness live in complex communities, and separating one microbe from hundreds of companion species can be challenging. Environmental samples can be equally complex, containing hundreds or even thousands of different species (not necessarily microbes).

The ability to identify and enumerate the organisms in complex communities using culture-independent genomic technologies and associated bioinformatics algorithms is becoming more important as scientists study organisms that can't be grown in the lab or deal with samples so rich that presorting is simply impractical. The majority of the world's organisms resist traditional lab culture, meaning they have to be studied in the field and identified through genetic information. Of course, the DNA Barcoder in me immediately thinks of metabarcoding and its very similar challenges.

While we have a number of extremely powerful sequencing platforms available to us, there is a shortage of computational tools designed to help identify and characterize the genetic diversity of the residents of these environmental samples.

A new National Science Foundation-supported project, run by the Georgia Institute of Technology and Michigan State University, aims to equip researchers with the tools needed to compare the genomic information of organisms they encounter against the growing volumes of data provided by the world's scientific community. The tools will be hosted on a web server designed to be used by researchers who may not have training in the latest bioinformatics techniques. A prototype system containing a limited number of computational tools is already available and is attracting more than 500 users each month. The system will initially operate on servers at Georgia Tech and Michigan State University, but if demand and data grow, additional resources may be sought, such as the National Science Foundation's XSEDE supercomputing resources.

The tools developed in the project will not be specific to any one discipline, as some of the research questions and issues are universal. The main task is to develop computational solutions that can keep up with all the new data becoming available. These solutions are meant to provide high throughput in order to deal with data volumes that are increasing geometrically.

Although all the tools are being developed with the intent to deal with microbial communities, they allow for much more, and that is what makes me hope we can use a good number of them for metabarcoding challenges. Current techniques, for example, identify individual microbes by examining their small subunit ribosomal RNA (SSU rRNA) genes, but the new tools will allow us to analyze entire genomes and metagenomes and provide the flexibility needed to deal with a plethora of problems.

I will certainly keep an eye on this project, and I believe every metabarcoder should too.
