Please use this identifier to cite or link to this item: http://dx.doi.org/10.25673/108798
Title: The archives are half-empty : an assessment of the availability of microbial community sequencing data
Author(s): Jurburg, Stephanie D.
Konzack, Maximilian
Eisenhauer, Nico
Heintz-Buschart, Anna
Issue Date: 2020
Type: Article
Language: English
Abstract: As DNA sequencing has become more popular, the public genetic repositories where sequences are archived have experienced explosive growth. These repositories now hold invaluable collections of sequences, e.g., for microbial ecology, but whether these data are reusable has not been evaluated. We assessed the availability and state of 16S rRNA gene amplicon sequences archived in public genetic repositories (SRA, EBI, and DDBJ). We screened 26,927 publications in 17 microbiology journals, identifying 2,015 16S rRNA gene sequencing studies. Of these, 7.2% had not made their data public at the time of analysis. Among a subset of 635 studies sequencing the same gene region, 40.3% contained data which was not available or not reusable, and an additional 25.5% contained faults in data formatting or data labeling, creating obstacles for data reuse. Our study reveals gaps in data availability, identifies major contributors to data loss, and offers suggestions for improving data archiving practices.
URI: https://opendata.uni-halle.de//handle/1981185920/110753
http://dx.doi.org/10.25673/108798
Open Access: Open access publication
License: (CC BY 4.0) Creative Commons Attribution 4.0
Journal Title: Communications biology
Publisher: Springer Nature
Publisher Place: London
Volume: 3
Original Publication: 10.1038/s42003-020-01204-9
Page Start: 1
Page End: 8
Appears in Collections: Open Access Publikationen der MLU

Files in This Item:
File: s42003-020-01204-9.pdf
Size: 632.94 kB
Format: Adobe PDF