Thursday, January 31, 2013

The seven deadly sins of DNA Barcoding (3)

It is time for sin number three in the series on the Collins and Cruickshank paper.

The use of the term ‘species identification’

The authors criticize the universal use of the term 'species identification' even though two different subdisciplines of DNA Barcoding can be meant by it.

What's the problem here?
There are indeed two different applications for DNA Barcoding that shouldn't be confused. The primary motivation to do DNA Barcoding is the prospect of being able to identify an unknown organism based on a short sequence. I concur with the authors that this process of querying a DNA Barcode reference library with sequences of an unidentified specimen should bear the name species identification.
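In practice, species identification in this sense is a nearest-neighbour query against a curated reference library. Here is a minimal sketch in Python; the sequences, species names, and the deliberately generous distance threshold are all invented for illustration, and a real identification would of course go through the BOLD ID engine against full-length COI records:

```python
# Toy illustration of "species identification" in the barcoding sense:
# a query sequence is matched against a pre-built reference library.

def p_distance(a, b):
    """Proportion of mismatched sites between two aligned sequences."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b)) / len(a)

def identify(query, reference_library, threshold=0.1):
    """Return the nearest reference species, or None if no match is close enough.

    The 0.1 threshold is generous for this toy data; in real COI barcoding a
    much tighter divergence cutoff (often around 2%) is the usual convention.
    """
    best_species, best_dist = None, 1.0
    for species, seq in reference_library.items():
        d = p_distance(query, seq)
        if d < best_dist:
            best_species, best_dist = species, d
    return (best_species, best_dist) if best_dist <= threshold else (None, best_dist)

# Invented placeholder "barcodes" - real ones are ~650 bp of COI.
library = {
    "Species A": "ACGTACGTACGTACGTACGT",
    "Species B": "ACGTTCGTACGAACGTACCT",
}
print(identify("ACGTACGTACGTACGTACCT", library))  # → ('Species A', 0.05)
```

The point of the sketch is the dependency it makes explicit: the `identify` step is only as good as the reference library it queries, which is exactly why library building and identification need to be kept apart conceptually.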

However, Collins and Cruickshank are concerned that species identification is often confused with species discovery. Building DNA Barcode reference libraries often exposes new species. Viewed purely semantically, this should indeed be termed species discovery, as identification is the process of assigning a pre-existing name to an individual organism.

I have to admit that I never stumbled over this when reading DNA Barcoding papers, although I might simply have overlooked it and figured out the main purpose of a study by other means. That being said, I am an insider when it comes to DNA Barcoding, and it is rather easy for me to figure such details out on the fly, while someone less familiar with the topic might well be confused. What I have noticed quite often, though, is a confusion of species identification (now in its correct definition) and DNA Barcode library building. These are completely different tasks. In fact, the former wouldn't work without the latter, and the two should be clearly separated in a publication. This also relates to the first sin, as formulating a clear hypothesis based on an array of data points would go a long way towards avoiding this conflict.

Overall, not a particularly bad sin: in most cases I would think it is only a matter of semantic accuracy and not a real problem of study design.

Wednesday, January 30, 2013

Barcoding Life's Matrix

A comprehensive review conducted by the US National Academies' National Research Council suggests that most high school laboratory experiences fail to conform to established guidelines for effective science instruction. These guidelines advocate designing laboratory experiences such that they produce clear and discernible outcomes, merge science content with learning about the process of science, and integrate hands-on activities into a sequence of traditional didactic course instruction.

So much from the introduction of a community paper that has just been published in PLoS Biology. I remember my own high school days, when laboratory experiments had been conducted the same way over and over again for decades. I also remember my disappointment, as this had nothing to do with 'real' science. Everything was designed so that it could be easily done, and the outcome was predictable. There wasn't any surprise element or even a remote sense of discovery. Unfortunately, this was also the case in my first two years at university. This is one of the reasons why I am particularly interested in new approaches to science education, both in schools and at universities.

The community paper mentioned above describes one of those new approaches, and I think it is a very successful one. It presents a project that brought together scientists and educators from the Coastal Marine Biolabs in Ventura, California, the Biodiversity Institute of Ontario, and the New York Hall of Science. Barcoding Life's Matrix includes two main strategies for student engagement: a residential research experience for high school students (grades 11/12), and professional development workshops for high school science teachers.

The students go through an intensive, seven-day residential research experience hosted at Coastal Marine Biolabs' biosciences laboratory in Ventura Harbor, where they work alongside scientists on interrelated field, laboratory, and bioinformatics activities. The results are DNA Barcode reference data that are submitted to BOLD and GenBank. So, the next time you query the BOLD database with a sequence from an unknown marine sample, it might match one of the records contributed by these young investigators.

BOLD-SDP's five customized consoles
In order to provide a student-friendly online environment, the BOLD team designed the BOLD Student Data Portal (BOLD-SDP), a classroom-focused interface to the BOLD database. It provides both instructors and students with the tools necessary to contribute to the DNA Barcode library. It also gives students the opportunity to integrate and analyze specimen and sequence data, while providing instructors with tools to monitor student progress and evaluate their work. Students can explore BOLD, but they are also able to add their own data to it.

High school science teachers are offered a professional development program which provides in-depth pedagogical and procedural training, multimedia instructional materials, and the research equipment needed to engage students in the generation and submission of high-quality barcode data within their own school laboratories. I wish my teachers had enrolled in such a course.

The Barcoding Life's Matrix project represents a striking new example of how discovery-based science can be effectively translated into secondary educational settings to address science education reform agendas, overcome extremely difficult challenges associated with molecular life science teaching and learning, and engage large numbers of students in the creation of a valuable public and scientific resource (an aspect of the project that students regard as particularly exciting and transformative).

Thumbs up! A very good read with a commendable strategy. There have been a lot of great projects that engage high school students in DNA Barcoding research but this paper represents a step forward as it formalizes the approach. I am sure it will set a precedent.

Tuesday, January 29, 2013

The seven deadly sins of DNA Barcoding (2)

Inadequate a priori identification of specimens

Let's get to sin number two in our series on the Collins and Cruickshank paper.

The authors point out an issue that limits the use of DNA Barcoding as a practical resource. Human error and uncertainty in creating and curating reference libraries result in conflicting identifications because multiple labs work on the same taxa and, in the course of their morphological identifications, ascribe different taxonomic names to the same species.

The problem is as old as taxonomy. The accuracy of an identification relies heavily on the experience of the identifier and the availability of proper taxonomic keys. I am sure that all collections have specimens in their drawers whose identifications conflict within and among institutions.

The good thing about DNA Barcoding is the fact that it can unsheathe such issues. The public availability of DNA Barcode data through BOLD allowed Collins and Cruickshank, for example, to show that there are many ambiguous species-level identifications for ornamental cyprinid fishes and that the amount of error has increased over time. This is indeed a problem for DNA Barcoding when it comes to practical applications, but it certainly is not a problem that was generated by it.

Over the years I've collaborated with quite a few museum curators, and many of them have started using DNA Barcoding to clean up their collections. While identifying incorrect names within a collection using traditional approaches would be a life-long undertaking, they use DNA Barcoding as a first filtering mechanism: they flag the specimens that are not placed within their expected group and subsequently revisit the actual voucher specimens to determine the reason for the wrong placement, thereby reducing the number of errors in their collections.

But back to the problem at hand. The rapid accumulation of DNA Barcodes over the last 10 years has led to an increasing number of contradicting identifications, and the introduction of Barcode Index Numbers (BINs) in BOLD has made it quite simple to spot them. The authors emphasize that a crucial aspect of DNA Barcoding is the maintenance of records, supporting information, and voucher specimens; this is what sets BOLD apart from GenBank. I find it important to stress that all this information associated with a barcode record makes it so much easier to investigate contradictions. BOLD has also begun to provide a framework for community-based annotation of barcode data that can facilitate subsequent communication between researchers on the subject. As a result, names can hopefully be changed and harmonised much more easily and quickly. In addition, it can spur necessary discussions about the taxonomic status of some species.

Often the problem of different taxonomic names being ascribed to the same species could be solved by increased diligence over how identifications are generated and justified. This would require a little more information to be provided with each record, but it would go a long way to help. In 2011, Bob Hanner and I therefore proposed the implementation of a system of identification confidence to the FishBOL community. It is based on a system already in use at the Commonwealth Scientific and Industrial Research Organisation (CSIRO) in Australia. Identifications are rated according to the degree of expertise used and effort made. The system has five levels spanning a range of expertise. Highly reliable identifications should be provided by either an internationally recognized authority on the group or a specialist presently studying the group in the respective region. The lowest level is defined as superficial: the specimen was identified by a trained identifier who is uncertain of the family placement of the species, by an untrained identifier using, at best, figures in a guide, or by someone whose status and expertise are unknown. In our paper we also stressed a sixth option: the case of an unknown specimen identified using only the BOLD ID engine or another genetic database. It is essential to introduce such labels to avoid the creation of a self-referential database.
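Such a confidence rating is easy to picture as an extra field on each barcode record. A sketch of how it might be encoded follows; note that only the top level, the 'superficial' level, and the database-only category come from the proposal described above, while the intermediate level names are placeholders I invented:

```python
from enum import IntEnum

class IDConfidence(IntEnum):
    """Hypothetical encoding of an identification-confidence scale
    like the one proposed for FishBOL (intermediate names invented)."""
    EXPERT = 5         # recognized authority or active specialist for the group
    HIGH = 4           # placeholder intermediate level
    MODERATE = 3       # placeholder intermediate level
    LOW = 2            # placeholder intermediate level
    SUPERFICIAL = 1    # untrained identifier, guide figures, or status unknown
    DATABASE_ONLY = 0  # identified solely via the BOLD ID engine or GenBank

def usable_for_reference(confidence):
    """Exclude database-only identifications: letting them back into the
    library would make the database self-referential."""
    return confidence > IDConfidence.DATABASE_ONLY

print(usable_for_reference(IDConfidence.EXPERT))         # → True
print(usable_for_reference(IDConfidence.DATABASE_ONLY))  # → False
```

The one rule the text above insists on - records identified only by querying the database itself must be flagged and kept out of the reference set - is exactly the kind of check that becomes trivial once the rating travels with the record.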

Our authors go one step further and stipulate that it should be mandatory for publication to provide a bibliography of the reference material and morphological characters used for identification. I think this is a very helpful idea, and I would go a step further still and ask identifiers to provide this information in the database itself. Many records will probably never be published formally in a paper but still represent very valuable data. Some taxonomic keys provide a hierarchical numbering system for every species. In such a case, a reference to the source and number would be all that is needed to reduce the amount of detective work necessary to clarify conflicting identifications.

Monday, January 28, 2013

Open Access, seriously (2)

Let's continue with the open access theme...

(2) Do not do ANY work for non open access journals. 

This includes reviewing, suggesting reviewers, etc. Let's assume a bunch of people have made the conscious decision to stop submitting their work to non-open access journals. What impact would that actually have on the publishers? Frankly, not a big immediate one unless we're assuming a massive migration to open access journals. That is rather unlikely and therefore, everyone should consider step two: stop all other work for those journals, i.e. decline to review articles for them, don't even make suggestions for alternative reviewers, and do not serve on the editorial board. Yes - ignore the emails!

For those who feel uneasy about publishing exclusively open access but are still skeptical of the traditional publishing market, this might be a good starting point, and if done by many it probably hurts even more. Why is that?

All these jobs are done on a voluntary basis. Peer review is an essential part of scientific publishing, and when done right it helps to improve publications and also prevents bad science from getting any attention. At least that's the theory, and most often it works. All journals need our expertise in order to assess the quality and scientific integrity of a publication, and we all happily provide it free of charge. We even volunteer some of our time to serve as editors, thereby essentially relieving publishers of at least one management level in their company. As good and important as this community-driven review process is, there is no real return on investment for me as editor or reviewer aside from providing a service to the community. Don't get me wrong here, I am not arguing for paid services, as this would drive up costs even more. But for a moment just picture this scenario: you serve as a volunteer for a local sports club or, as I do, for a choral organisation. My time and effort are spent supporting the members of the organisation by enabling the group to do more for them, but certainly not to increase revenue. That's why I am not volunteering for a company, which would gain the benefit of reduced expenses and higher profit if I did.

I am not an economist, but this business model seems flawed, as it relies very strongly on other people's goodwill. Instead of changing it over time to make it more self-sustaining, the publishing companies have kept building on it, and over the years they started to push more and more responsibilities towards authors, reviewers, and editors. One example is the ever-growing set of formatting rules (length, line spacing, fonts, citations...). We even had to learn to include DOIs for some journals. In any other branch of publishing this is the job of the publishing company; in our little world it is a requirement for submission (and grounds for rejection, for that matter). I hope nobody believes that these things are intended to help reviewers or editors. They are not! Actually, it is a nuisance to pay attention to this instead of focusing solely on content. I wouldn't mind if it helped to bring down the costs for us authors and our institutions, but this doesn't seem to be the case. On the contrary, with increased outsourcing, subscription prices paradoxically went up. I wonder whether prices will climb again when all of a sudden all reviewers and editors resign and move to open access journals. For once, some companies would face the true costs of publishing and the real hardship other sectors of the industry are already going through.

For my part, I feel much better providing my services as reviewer and editor to open access journals. I still devote part of my time to serving the scientific community - as we all should - but I think I am doing the right thing to set us up for the future of scientific publishing.

Friday, January 25, 2013

Open Access, seriously (1)

Last week I talked about open access and the tribute to Aaron Swartz through the #PDFTribute movement. I also announced that I plan to change my habits and consequently publish and review exclusively in fully open access journals from now on - or use my blog as an outlet. This was mainly triggered by a post by Jonathan Eisen in which he presented a list of things one can do to really support open access. I know that some of them are perceived as not so easy to follow through on, especially by those of us not yet in a tenured position. That might be true, but I thought it would be a good idea to go through them one by one and offer my ideas for making such a shift perhaps more palatable. The first suggestion in the list of ten is perhaps the one with the most immediate impact, but it is also the most controversial, as there is so much anxiety connected to it.

(1) Only publish in fully open access journals.

Let's first clarify what is meant by fully open access journals. Fully open access content is easily accessible online, available to anyone free of charge, and available for re-use without restriction except that attribution be given to the source. The latter in particular is not always granted, although authors are usually charged for open access options. Some publishers, for example, have adopted a business model through which authors pay for immediate publication on the Internet, but the publisher nonetheless keeps commercial reuse rights for itself. This is not full open access!

If you want to make sure you publish with the right journals and what your options are when it comes to full open access have a look at the Directory of Open Access Journals.

Make no mistake - open access costs money. Authors are charged for publications. Your paper won't be published out of the sheer goodness of the publishers' hearts. With open access also came a shift of expenses. In the traditional publishing system, authors didn't have to pay, as publishers made their profit by charging for subscriptions or per view of an article. The average scientist at a larger university was able to enjoy the advantages of institutional subscriptions to a multitude of journals without the need to budget for the costs; at worst, they were part of the overhead already deducted. However, researchers at institutions with lower budgets had no access whatsoever, and they perhaps still constitute the majority on our planet. Furthermore, the interested public was excluded as well, although most research is financed with taxpayers' money. One particular field in my line of research is actually very bad when it comes to the accessibility of research publications - taxonomy. Most descriptions, even the more recent ones, appear either in exotic journals nobody has access to, or, more and more, are published online in Zootaxa. Unfortunately, this journal is anything but open access (although it offers that option to authors at proof stage for a fee). It is the biggest player in the field and many species descriptions are published there, but only a fraction of scientists have access to them because they can't afford a subscription and neither can their institutions. Species descriptions in particular should be considered a common good and as such accessible to everyone without having to pay any fee. By the way, there is an open access alternative for taxonomists (ZooKeys).

Most open access journals charge at the other end, and the fees vary greatly. This puts authors under considerable pressure, as it is an expense they need to cover through their budget. However, I see a good chance that these costs can often be included in grant proposals as a separate line item. Most funding agencies are aware of the shift in the publication market, and some actually expect grantees to publish exclusively open access. My experience with this has been consistently positive, and I have always justified the additional costs with the need for open access. Also, there are new fee models in the works - one good example is a membership approach for authors such as that of PeerJ.

That leaves us with the omnipresent impact factor argument. I know, the open access idea is rather young and certainly not fully established in our scientific circles. As a consequence, impact factors are low to medium. If you want to make a big splash à la Science or Nature, you won't stand a chance with open access, although in recent years the press in particular has been paying a lot of attention to journals such as PLoSONE or those of the BMC series. But do we really need to publish in big journals at all costs? Do we really need the impact factor and all related indices to assess the performance of a researcher? Is it really necessary that search committees at universities use these indices, or the mere fact that somebody has published in a high-profile journal, as a selection criterion?

The internet has changed a lot in that respect, and modern web tools provide better and perhaps more objective metrics. From my point of view, the number of times one of my articles has been cited is of less value than the actual number of times it has been accessed over a given period. The first value measures how often my work was recognized by my colleagues, but the second tells me how many people in general found it interesting enough to download. For example, in less than half a year my blog has been accessed by more people than all my papers have been cited altogether in the last 10 years. None of these metrics provides any indication of the quality of my research. Like it or not, if you want to know how good my work really is, you have to read my publications.

What it comes down to is what we want for our precious little study. Is it more important to present it to as many people as possible, researchers and non-researchers alike? Or do we continue to try to place it in one of the big journals with ridiculously high rejection and subscription rates, or have it bedded comfortably in a more specialized journal read by a handful of like-minded colleagues?

I made my decision in favor of accessibility because I think it will work for me and not damage my career (and I am not tenured or even tenure-track!). I am convinced that the world of science will eventually change to open access, and it is about time to join the crowd.

Thursday, January 24, 2013

The seven deadly sins of DNA Barcoding (1)

Failure to test clear hypotheses


As promised, I will have a closer look at a recent publication that listed deficiencies the authors identified as common in DNA Barcoding research. Seven deadly sins were identified. I decided to give each 'sin' its own blog post in which I briefly comment on it. And now, without further ado, the first sin.

According to the authors, Collins and Cruickshank, one of the gravest sins in many DNA Barcoding studies is the lack of clearly stated, objective hypotheses. I've heard similar criticism before, especially of barcoding-related research. Actually, some of those concerns were already raised when researchers began to sequence larger amounts of DNA, such as in the Human Genome Project. Objectively, the criticism has some merit. It is based on the classical hypothetico-deductive model, which states that scientific inquiry progresses by formulating a hypothesis that could be falsified by a test on observable data. Scientific hypotheses are generally based on previous observations that cannot satisfactorily be explained with the available theories. Unfortunately, many colleagues are not willing to accept that the massive accumulation of sequence information also qualifies as previous observation in this respect, and some (not the authors of the paper at hand) dismiss DNA Barcoding studies and theses as endeavors without a leading hypothesis and their authors as "stamp collectors".

I strongly recommend that the assembly of DNA Barcode libraries be considered a hypothesis-generating exercise. Today we are in the fortunate situation that biological observation is not limited to visual perception and experiments alone. Genetic information is relatively easy to retrieve and enables us to ask questions because we have assembled large amounts of standardized data. As a consequence, the barcode community has been approaching journals (PLoSONE, Molecular Ecology Resources) that are willing to publish articles called data release papers. These are outlets for researchers who have assembled DNA Barcode libraries for their own work and for the community at large. In the classical hypothetico-deductive environment there would be no venue to present such data to the public, let alone get credit for it, as the work was primarily a collection of data points generated to derive hypotheses from. Data release papers usually come with only a limited number of analyses and are very descriptive. Nevertheless, this strategy has two main advantages. One is that the data are released much earlier and made available to the public through databases such as BOLD and GenBank (sharing, sharing, sharing!). The other is that a researcher is credited for work that does not necessarily follow age-old standards (it's all about incentives!).

Collins and Cruickshank also state "If the data collected are intended to be used as an identification tool, then they should be tested as such. Conversely, if a study aims to test the suitability of DNA barcoding as a biodiversity assessment tool (species discovery), then hypotheses of species richness should be estimated independently of the taxonomic names, and then compared a posteriori." 

I believe the main criticism is that many DNA Barcoding studies do not explicitly test identification success. Usually a more descriptive analysis is given, showing that there is sufficient variation between species and very low variation within them. The famous "barcode gap" is what authors present, and not necessarily a statistical test based on simulations or independent data. We could start heated discussions about whether such tests are needed when a barcode gap is clearly present, especially in sufficiently sampled groups and/or environments, just because only a hypothetico-deductive study counts as a good study.
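The descriptive "barcode gap" check can be made concrete in a few lines: compare the largest distance within species to the smallest distance between species. A minimal sketch with invented toy sequences follows; a real analysis would use aligned COI barcodes from many specimens and, as the authors argue, ideally a proper statistical test rather than this simple comparison:

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of mismatched sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def barcode_gap(records):
    """records: list of (species, aligned_sequence) tuples.
    Returns (max intraspecific distance, min interspecific distance,
    whether a gap is present)."""
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    max_intra, min_inter = max(intra), min(inter)
    return max_intra, min_inter, min_inter > max_intra

# Toy data, invented for illustration only.
records = [
    ("sp. A", "ACGTACGTAC"),
    ("sp. A", "ACGTACGTAT"),  # one difference within species A
    ("sp. B", "TCGAACGATC"),
    ("sp. B", "TCGAACGTTC"),  # one difference within species B
]
print(barcode_gap(records))  # → (0.1, 0.3, True)
```

A study relying on this pattern shows that the smallest between-species distance exceeds the largest within-species distance, which is exactly the descriptive evidence most barcoding papers present in place of a formal test of identification success.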

As for the second part of the paragraph, I am a bit puzzled, as this would also call into question the more traditional, morphology-based form of species discovery. For me, there is no big difference between discovering a new species based on DNA differences or on morphological variation. In both cases it is essential to show that these differences are characteristic of the respective species, backed up by other characteristics, and different enough to actually qualify as a species rather than merely reflecting a more pronounced population structure.

In summary, I understand where the criticism comes from, but I do not necessarily agree with it on all points. Actually, when it comes to the hypothetico-deductive model, I would love to see some more flexibility, also for the sake of the many students who have to fight through a lot of skepticism when they decide to take on a DNA Barcoding project.

One thing is for sure - I don't think the failure to test clear hypotheses is a deadly sin in the context of DNA Barcoding and the paper at hand. It is certainly not a grave one.

Wednesday, January 23, 2013

Invasive seaweed arrived in Canada

In a post last October I reported on invasive seaweeds and, among other examples, talked about an invasion of the beaches of Massachusetts that happened in June 2012. A large area was blanketed with thickly packed red fibers resembling matted hair, causing a stink for beachfront residents and tourists alike. The Pacific native Heterosiphonia japonica was likely introduced through ship ballast water. It was first discovered on US coasts back in 2009 but had already caused a lot of damage on European coasts in the early 1980s.

At the time, findings were restricted to the coasts of Massachusetts and Rhode Island. But now the student Amanda Savoie (her first paper - congratulations!) and her supervisor Gary Saunders from the University of New Brunswick have reported that in August 2012 they collected four specimens of the invasive red alga at Mahone Bay, Nova Scotia, Canada. This was actually a chance find confirmed by DNA Barcoding. I had a brief email exchange with Gary Saunders, and he stated: "we actually overlooked this species in the field - too busy with two dives a day, processing samples, etc. Plus, we simply weren't expecting it this far north already. The barcode results were a slap in the face that we quickly confirmed by accessing the vouchers!".

It was also a very lucky find, as this group of algae experts in Fredericton has almost no money for barcode sequencing, let alone to finance the trip to Nova Scotia. They squeezed the last bit out of their grant money to confirm these important findings. What concerns me most is that the surveys needed to investigate the extent of the invasion along the Canadian Atlantic coast will not happen, because no money has been allocated to the researchers who are actually capable of doing the job. It's a shame, because such an investment would be minimal compared to the costs a further spread of Heterosiphonia japonica could cause, as it poses a serious threat to the health of this coastal ecosystem. It has the potential to overgrow native seaweed, starving it of light and nutrients and thereby damaging a habitat and food source for many marine animals.

Tuesday, January 22, 2013

DNA Barcoding reveals a hybrid

Doesn't that sound strange to you?

For years we have emphasized that hybridisation can complicate the use of DNA-based approaches for species identification such as DNA Barcoding. The fact that we use a fragment of the mitochondrial genome, which is almost exclusively inherited maternally, can lead to results where two species seemingly share a DNA Barcode despite all morphological differences. However, the complicating effects of hybridisation are not restricted to DNA-based identification methods; they can also strongly affect morphological identification. As a matter of fact, hybrids can be different enough from both parent species to be described as a distinct species.

Modified from Rougerie et al. 2012
A neat little study just published in Invertebrate Systematics shows that DNA Barcodes can actually help to unravel such problems. An initial discordance between morphological identifications and the cohesiveness of DNA Barcode clusters provoked a deeper investigation of the situation with both a nuclear marker and morphology. Paratypes of the hawkmoth species Gnathothlibus collardi possessed a barcode sequence identical to that of Philodila astyanor. Furthermore, Gnathothlibus collardi resembles Gnathothlibus eras at first glance. Once contamination and other technical errors were ruled out, the best possible explanation was that the Gnathothlibus collardi specimens represent F1 hybrids between Philodila astyanor females and Gnathothlibus eras males. Examination of the D2 region of the 28S rDNA gene and a more thorough morphological re-examination of specimens of the three species then confirmed the findings. A taxonomist had described an invalid species from a hybrid.

Who would have thought that well tended DNA Barcode libraries may one day contribute to the early detection of hybridisation?

Monday, January 21, 2013

Burgers - where's the beef?

Last week the Food Safety Authority of Ireland (FSAI) published the findings of a targeted study examining the authenticity of a number of beef burger, beef meal, and salami products available from retail outlets in Ireland. The study, which tested for the presence of horse and pig DNA, revealed horse DNA in some beef burger products.

A total of 27 beef burger products were analysed, with 37% testing positive for horse DNA and 85% testing positive for pig DNA. In addition, 31 beef meal products (cottage pie, beef curry pie, lasagne, etc.) were analysed, of which 21 were positive for pig DNA. Only the 19 salami products were free of DNA from any other species.

Efforts to trace the source of adulteration in the burgers are focusing on additives used in the manufacturing process. Cheap burgers are made with so-called "beef ingredient products", which can make up 37% of a burger. So only 63% is actually meat, and the rest is an additive mix of concentrated proteins extracted from animal carcasses and offcuts. The horse and pig DNA are therefore more likely to have originated from these high-protein powders than from any fresh meat.

Initially it was thought that the problem was limited to Ireland, but within a few days more facilities producing adulterated burgers were found, and investigators extended their efforts to the UK, the Netherlands, and Spain. The British Food Standards Agency announced yesterday that it has initiated a sampling programme to investigate the authenticity - that is, the content compared with the label's listed ingredients - of a range of meat products.

I wonder what would happen if someone here in North America - the paradise for meat burger lovers - were to take a similarly close look at the patties.

Friday, January 18, 2013

Cheese Maggots

Calliphora vicina
In criminal forensics, flies of the species Calliphora vicina play an important role in determining the post mortem interval, the time that has elapsed since a person died. Factors such as region, weather, temperature, time of day and the conditions under which a body was found all feed into this estimate. For a precise calculation a forensic entomologist must combine what is commonly known about the species with experimental data gathered from the crime scene; it is essential to know how the blow fly behaves in the specific area where the body was discovered. Calliphora vicina and a few other fly species are widely used in criminal forensics because of their consistent time of arrival at and colonization of a corpse.

This knowledge played a big role when researchers at the Zoological State Collection in Munich, Germany used DNA Barcoding to determine the species identity of maggots found in soft cheese. The sample came from a large German cheese manufacturer that suspected improper storage had led to the infestation. Knowing the species would allow the researchers to extrapolate the time of oviposition, which in turn would allow the manufacturer to confirm its suspicions by crosschecking against its chain-of-custody records.

Stinking Bishop (just an example for a
smelly cheese. Not the one sampled!)
Indeed the researchers were able to help. The maggots clearly belonged to Calliphora vicina, which not only likes putrescent meat but also smelly cheese. I'd better refrain from speculating about what rotten meat and strong-smelling soft cheese have in common, but this is clearly one of those creative applications of DNA Barcoding in combination with forensic science that I would never have thought of before.
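The back-calculation from species identity to oviposition time is typically done with an accumulated degree-hour model. A minimal sketch, with placeholder numbers - the base temperature and degree-hour requirement below are invented for illustration, not published Calliphora vicina values, which must come from the experimental literature:

```python
# Minimal accumulated degree-hour (ADH) sketch for back-calculating
# the time of oviposition from the developmental stage of a maggot.
# Both constants below are placeholders, NOT real Calliphora vicina
# data; casework uses experimentally derived developmental tables.

BASE_TEMP_C = 6.0      # hypothetical lower developmental threshold
ADH_TO_STAGE = 1800.0  # hypothetical degree-hours needed to reach the observed stage

def hours_since_oviposition(mean_temp_c, base=BASE_TEMP_C, adh=ADH_TO_STAGE):
    """Estimate elapsed hours, assuming a constant mean temperature."""
    effective = mean_temp_c - base
    if effective <= 0:
        raise ValueError("no development below the base temperature")
    return adh / effective

# At a constant 18 C storage temperature:
print(hours_since_oviposition(18.0))  # 1800 / 12 = 150 hours
```

With real developmental data and the storage temperature log, the same arithmetic pins down when the eggs were laid - and hence whether the cheese was stored improperly.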

Thursday, January 17, 2013

Shiitake pest

In case you didn't know or haven't eaten it yet Shiitake (Lentinula edodes) is an edible mushroom native to East Asia. It is a feature of many Asian cuisines. It is also considered a medicinal mushroom in some forms of traditional medicine.
Shiitake is generally cultivated in greenhouses under controlled conditions, mainly in Asian countries. Recently, farmers in many shiitake nurseries in Korea have experienced serious crop losses and some have even had to abandon mushroom farming because of severely damaged cultures. Mycophagous maggots had caused serious damage by preventing the formation of the mushroom's fruiting body. However, they couldn't be identified because larvae and adult females lack diagnostic morphological characters. As a consequence, farmers and entomologists have been unable to determine which species are the primary cause of the shiitake damage, which in turn means there are no adequate ways to fight this pest. Members of the two fly families Cecidomyiidae and Sciaridae are believed to be the main culprits, but their identity remained uncertain as only a few males could be identified using genital morphology.

Life stages of Camptomyia migdes (A-D)
and the mushroom culture(E)
A group of Korean researchers has now conducted a study using DNA Barcoding to test whether the method can help to identify the maggots and female flies. Their results suggest that five dipteran species were present among the samples collected from shiitake farms, which is consistent with the morphological identifications of some male specimens. The two main species they found were Camptomyia heterobia and Camptomyia corticalis. Interestingly, last year another group from Korea had tested a variety of plant essential oils for their potential to serve as larvicides, and the organism they used to test the oils was the gall midge Camptomyia corticalis. A nice coincidence (if it is one). I guess it is time the two groups got together to discuss potential applications of both their findings.

Wednesday, January 16, 2013

The Dirty 22

Meet the Dirty 22. 

The US Food and Drug Administration has identified the 22 pests that most commonly contribute to the spread of foodborne pathogens. The most common foodborne biological hazards are bacterial pathogens, and all 22 species have been identified as carriers of the most common and concerning among them.
So far the FDA has visually inspected samples via microscopy, identifying insect carcasses and parts to detect these species. This method is very time-consuming and may lead to inaccurate or insufficient identification when the appropriate expertise is lacking. The FDA already uses DNA Barcoding to identify seafood, and it seems a logical step to extend this to other branches of the agency's work.

Together with a few of my colleagues here in Guelph, the FDA generated DNA Barcodes for all 22 species and developed a systematic primer strategy to obtain them. It is no surprise to me that they were successful, but it is worth mentioning that their tests resulted in a very simple protocol, with one primer cocktail working for 17 of the species. Two alternative cocktails were needed to yield results for the remaining five.
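The tiered logic of such a protocol is simple: try the primary cocktail first, and fall back to the alternatives only when it fails. A sketch of that control flow - the cocktail names, species names and success table here are invented for illustration, not taken from the FDA study:

```python
# Sketch of a tiered primer-cocktail strategy: attempt amplification
# with a primary cocktail, then fall back to alternatives in order.
# Cocktail names, species and the success table are hypothetical.

def amplify(species, cocktail, success_table):
    """Pretend amplification: succeeds if the table says this cocktail works."""
    return species in success_table.get(cocktail, set())

def barcode(species, cocktails, success_table):
    """Return the first cocktail that amplifies the species, else None."""
    for cocktail in cocktails:
        if amplify(species, cocktail, success_table):
            return cocktail
    return None

SUCCESS = {
    "C1": {"housefly", "flour beetle"},  # primary cocktail (17 of 22 spp.)
    "C2": {"booklouse"},                 # alternative cocktails cover the rest
    "C3": {"spider beetle"},
}

print(barcode("booklouse", ["C1", "C2", "C3"], SUCCESS))  # C2
```

The appeal for routine lab work is that most samples never need the fallback step, so one cocktail handles the bulk of the throughput.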

The result is a regulatory database built on vouchered specimens, making this approach potentially useful for supplementing or replacing morphological methods in routine work at the FDA. The agency already plans to generate a more comprehensive library of DNA Barcodes of other common pests that contribute to the spread of foodborne disease.

Tuesday, January 15, 2013

New Lizard

Calotes bachae (credit Peter Geissler)
The lizard genus Calotes currently comprises 23 species. Some species are known as forest lizards, others as "bloodsuckers" because of their red heads, and yet others as garden lizards (Calotes versicolor). The latter natively inhabits a large area from eastern Iran through Asia to Indonesia (Sumatra), while the majority of species in this genus are restricted to relatively small geographical regions in India, Sri Lanka and Myanmar.

A group of German and Russian researchers recently collected specimens of the genus Calotes in Vietnam. Based on their general morphology they preliminarily identified them as Calotes mystaceus, a species regarded as widespread and found mostly in the southern parts of the country, with some isolated records known from the north. However, when they looked closer and compared their animals with specimens of Calotes mystaceus from western Cambodia and Thailand, they spotted a significantly different color pattern, for example the lack of three characteristic dark brown blotches on the back. This geographic variation was already recognized way back in 1921 and again very recently in 2009; authors characterized two forms, one on each side of the Mekong River, but nobody made any taxonomic decisions.

Only extensive geographical sampling of comparative specimens and DNA Barcoding made it possible to distinguish two genetically distinct units within the specimens currently referred to as Calotes mystaceus. As a consequence the authors describe a new species of this agamid genus under the name Calotes bachae. Males of the species impress with their astonishingly rich coloration: during mating season their azure heads shine brightly, as if in a contest to impress the females. Like chameleons, they can also tone down their play of colors; at night, for example, they appear inconspicuously dark and brownish.

Monday, January 14, 2013


You might have heard of the tragic death of Aaron Swartz, a software developer and internet activist. Swartz co-authored the first RSS specifications at the age of 14 and was a co-owner of Reddit. Later he also focused on sociology, civic awareness and activism and was a strong advocate for open access in general. He committed suicide at the age of 26 and was found dead in his apartment last Friday.

In July 2011, Swartz was indicted on federal charges of gaining illegal access to JSTOR, a subscription-only service for distributing scientific and literary journals, and downloading 4.8 million articles and documents, nearly the entire library. The pending charges carried potential penalties of up to 35 years in prison and $1 million in fines, and many believe that these, and the pressure built up by prosecutorial overreach, are what eventually killed him.

Apparently he had struggled with depression for a while already. Depression is a very serious and unfortunately underestimated problem. It is not "just a phase" that people go through. It won't go away without proper treatment and can lead to a number of other serious problems if it is not dealt with properly.

Although I don't believe in any of the conspiracy theories out there, and I find it an exaggeration to hold others responsible for someone's suicide, I can fully understand that a person with chronic depression, facing disproportionate charges for something well-intentioned, might not see any other way out of this dilemma. That is the tragedy of this story, which eventually took the life of a bright mind that had already accomplished so much for the cause of open access in all its forms.

As a reaction to Aaron Swartz's death, supporters responded with an effort called #pdftribute to promote open access. In tribute to him, researchers have begun posting PDFs to Twitter to honor his campaign for open access.

I have long pondered whether I should join them and post my PDFs elsewhere, but I think many are already public, as I have tried for years to publish and review as much as possible open access. There is also the growing ResearchGate, which is home to quite a few PDFs, but in all honesty this won't get us anywhere beyond serving as a tribute. What we need is a change of publishing attitude: researchers should exclusively publish open access. Jonathan Eisen has assembled on his Tree of Life blog a great list of things we can do to really support open access. I intend to change my habits accordingly and consequently publish and review only in real open access journals from now on - or simply here in my blog.

However, such paradigm shifts also require a change in attitude when it comes to valuing a scientist's work. Search committees and funding agencies have to stop looking for high-impact-factor publications and look instead for open access contributions. After all, it is mostly taxpayers' money that pays us. We owe it to them to make our research accessible instead of wasting some of it on insanely high subscription fees. We just need to start. #pdftribute could serve as a good starting point for such a movement, but it is sad to realize that, once again, it took such a tragedy to get the ball rolling.

Friday, January 11, 2013

What's for dinner?

A rapid marine invasion is currently occurring in the western Atlantic. In the mid-1980s lionfish (Pterois volitans) were released in Florida. Since then, they have become established in >4 million km2 of the western Atlantic, Caribbean, and Gulf of Mexico. 

The problem is that these invasive lionfish reach higher densities and larger sizes than in their native range (Indonesia). Their hunting method is unlike that of any Atlantic predator: they use prey herding to catch fish and crustaceans, which they ingest whole (prey can be half their own body size).

This has an immense impact on the native reef fish populations in the western Atlantic. Furthermore, little is known of how lionfish numbers are kept stable within their native range. The problem until recently was the lack of an in-depth understanding of their diet; such an understanding would in turn help to assess the impact on native species. Given their hunting mode, lionfish could prey on most fish species within their gape size limits.

The only solution is to look at the stomach contents of lionfish, which is problematic when relying on morphology alone. Two studies used DNA Barcoding of lionfish stomach contents to identify prey species and thereby better estimate the breadth of the invaders' diet. A first paper by Mexican scientists came out last summer and described efforts to analyze the prey composition of lionfish collected along the Mexican part of the Mesoamerican Coral Reef. Two days ago we (researchers from Canada and the US, including myself) were able to add some more information: our fish were collected at reef sites off southwest New Providence, Bahamas.

The results of both studies are similar. Not surprisingly, DNA Barcoding considerably increased the resolution of the diet analyses. The number of fish species found in a comparable number of stomachs was around 35 in each study, with some overlap; the differences are likely the result of regional differences in fauna. The overall picture, however, is alarming: prey fish belong to most groups that fit within the gape size range of lionfish, and the invaders are not picky. Both studies even found evidence for cannibalism - lionfish eat juvenile lionfish as long as they are the right size. And fish is not the only item on the menu: the Mexican group showed that a quarter of the diet consisted of crustaceans, mainly decapods.
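The core of such a diet study is the query step: match each stomach-content sequence against a reference library and accept the best hit only if it clears an identity threshold. A toy version of that logic - the sequences, species names and the 98% cut-off are illustrative, not the parameters of either study:

```python
# Toy barcode query: assign a stomach-content sequence to the
# reference species with the highest pairwise identity, provided
# the match clears a threshold. Reference sequences, species names
# and the 98% cut-off are invented for illustration.

def identity(a, b):
    """Fraction of matching positions between two aligned sequences."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

def assign(query, reference, threshold=0.98):
    """Best-hit species for a query sequence, or None below threshold."""
    best_species, best_id = None, 0.0
    for species, seq in reference.items():
        score = identity(query, seq)
        if score > best_id:
            best_species, best_id = species, score
    return best_species if best_id >= threshold else None

REFERENCE = {
    "fairy basslet": "ACGTACGTAC",
    "bluehead wrasse": "ACGTTCGGAC",
}

print(assign("ACGTACGTAC", REFERENCE))  # fairy basslet
```

Real studies of course query full-length COI barcodes against a curated library rather than ten-base toy strings, and half-digested tissue often yields only partial sequences, which is exactly why a completeness threshold matters.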

Invasive species are often generalists. The ecological implications of these findings are profound because of the large number of interspecific interactions they can create or disrupt, particularly in species-rich ecosystems like coral reefs. Only taxonomically well-resolved diet information combined with prey availability data can help to identify the species most at risk from lionfish predation.