Scott Miller (senior author of the study)
About ten years ago, researchers concluded a program that reared insects from native fruit collected at sites throughout Kenya. The 5-year survey initially focused on fruit flies (Tephritidae) and their parasitoids, but as it turned out, 19% of the catch was actually Lepidoptera. Today I got my hands on the third in a series of papers on the Lepidoptera of this large collection; it uses DNA Barcoding to sort through the microlepidoptera. The authors provide a data release paper that aims to make DNA Barcode data available to document ongoing research, to contribute to the effort of building a global library of DNA Barcodes, and to encourage improved identifications, following the Fort Lauderdale principles for genetic data.
What are those principles?
In response to requests from the large-genome sequencing scientific community, in January 2003 an international meeting was organised to discuss pre-publication data release. Held in Fort Lauderdale, Florida, the meeting primarily brought together representatives of the producers of large-genome sequence data, the users of such data, funding agencies and scientific journals. The meeting concluded that pre-publication release of sequence data was beneficial to the scientific research community in general and the following suggestions were made:
- The meeting attendees enthusiastically reaffirmed the 1996 Bermuda Principles, which expressly called for rapid release to the public international DNA sequence databases (GenBank, EMBL, and DDBJ) of sequence assemblies of 2kb or greater by large-scale sequencing efforts and recommended that that agreement be extended to apply to all sequence data, including both the raw traces submitted to the Trace Repositories at NCBI and Ensembl and whole genome shotgun assemblies.
- The attendees recommended that the principle of rapid pre-publication release should apply to other types of data from other large-scale production centers specifically established as ‘community resource projects’.
- The attendees recognized that pre-publication data release might conflict with a fundamental scientific incentive – publishing the first analysis of one's own data. The attendees noted that it would not be possible to absolutely guarantee this incentive without applying restrictions that would undermine the rationale for rapid, unrestricted release of data from community resources. Nonetheless, it is essential that excellent scientists continue to be attracted to these projects. To encourage this, the scientific community should understand that pre-publication data release needs active community-wide support if it is to continue to receive widespread support from the producers. The contributions and interests of the large-scale data producers should be recognized and respected by the users of the data, and the ability of the production centres to analyse and publish their own data should be supported by their funding agencies.
Many of the records made public in the paper on microlepidoptera represent undescribed species, and the authors have purposefully refrained from assigning new names until the relevant taxa can be studied in sufficient detail. Under the Fort Lauderdale principles, they ask others to refrain from assigning new species names to these records outside the context of proper systematic study. The authors have made their data publicly available knowing that it can take many years until all the specimens are properly named or described. I think this is a great example of how we should change our academic workflow. When data are made available while identifications are still in progress, DNA cluster-based species (or MOTUs, BINs or whatever you want to call them) can serve as species hypotheses for as long as we need to wait for confirmation by taxonomic experts or follow-up studies.
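To make the idea of a DNA cluster-based species hypothesis a bit more concrete, here is a minimal sketch of how barcode records might be grouped into MOTUs. It is not the method used in the paper, nor the algorithm behind BINs; it simply links aligned sequences whose uncorrected pairwise distance falls below a cutoff (single-linkage clustering). The record IDs, the toy 20 bp sequences, the function names and the threshold are all assumptions made for illustration; real barcodes are much longer, and cutoffs of around 2% are often quoted as a rule of thumb rather than a fixed standard.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return 1.0  # no comparable sites: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def cluster_motus(seqs: dict[str, str], threshold: float = 0.02) -> list[set[str]]:
    """Single-linkage clustering of records by p-distance (union-find)."""
    parent = {name: name for name in seqs}

    def find(name: str) -> str:
        # Follow parent links to the cluster root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Merge the clusters of any two records within the threshold.
    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical records: toy sequences, far shorter than real barcodes.
records = {
    "KE-001": "ACGTACGTACGTACGTACGT",
    "KE-002": "ACGTACGTACGTACGTACGA",  # 1 of 20 sites differs from KE-001
    "KE-003": "TTGTACCTAGGTACATACGT",  # 5 of 20 sites differ from KE-001
}

# A generous cutoff is used here only because the toy sequences are so short.
for i, members in enumerate(cluster_motus(records, threshold=0.1), start=1):
    print(f"MOTU {i}: {sorted(members)}")
```

Each resulting cluster is only a working hypothesis: it gives unnamed specimens a stable handle that others can refer to until a taxonomist confirms, splits or merges it.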