Importance of timely metadata curation to the global surveillance of genetic diversity
Authors: Eric D. Crandall, Rachel H. Toczydlowski, Libby Liggins, Ann E. Holmes, Maryam Ghoojaei, Michelle R. Gaither, Briana E. Wham, Andrea L. Pritt, Cory Noble, Tanner J. Anderson, Randi L. Barton, Justin T. Berg, Sofia G. Beskid, Alonso Delgado, Emily Farrell, Nan Himmelsbach, Samantha R. Queeno, Thienthanh Trinh, Courtney Weyand, Andrew Bentley, John Deck, Cynthia Riginos, Gideon S. Bradburd, Robert J. Toonen
Affiliations: 1. Department of Biology, Pennsylvania State University, University Park, Pennsylvania, USA; 2. Ecology, Evolution, and Behavior Program, Department of Integrative Biology, Michigan State University, East Lansing, Michigan, USA; 3. School of Natural Sciences, Massey University, Auckland, New Zealand; 4. Department of Animal Science, University of California, Davis, Davis, California, USA; 5. Department of Biology, University of Central Florida, Orlando, Florida, USA; 6. Department of Research Informatics and Publishing, The Pennsylvania State University Libraries, Pennsylvania State University, University Park, Pennsylvania, USA; 7. Madlyn L. Hanes Library, The Pennsylvania State University Libraries, Pennsylvania State University, Middletown, Pennsylvania, USA; 8. Department of Anthropology, University of Oregon, Eugene, Oregon, USA; 9. Department of Marine Science, California State University Monterey Bay, Seaside, California, USA; 10. UOG Marine Laboratory, University of Guam, Mangilao, Guam; 11. Department of Integrative Biology, University of Texas at Austin, Austin, Texas, USA; 12. Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, Ohio, USA; 13. Department of Natural Science, Hawai‘i Pacific University, Honolulu, Hawaii, USA; 14. Department of Biological Sciences, Auburn University, Auburn, Alabama, USA; 15. Biodiversity Institute, University of Kansas, Lawrence, Kansas, USA; 16. Berkeley Natural History Museums, University of California, Berkeley, Berkeley, California, USA; 17. School of Biological Sciences, The University of Queensland, Brisbane, Queensland, Australia; 18. Hawai‘i Institute of Marine Biology, University of Hawai‘i at Mānoa, Kaneohe, Hawaii, USA
Abstract: Genetic diversity within species represents a fundamental yet underappreciated level of biodiversity. Because genetic diversity can indicate species resilience to changing climate, its measurement is relevant to many national and global conservation policy targets. Many studies produce large amounts of genome-scale genetic diversity data for wild populations, but most (87%) do not include the associated spatial and temporal metadata necessary for them to be reused in monitoring programs or for acknowledging the sovereignty of nations or Indigenous peoples. We undertook a distributed datathon to quantify the availability of these missing metadata and to test the hypothesis that their availability decays with time. We also worked to remediate missing metadata by extracting them from associated published papers, online repositories, and direct communication with authors. Starting with 848 candidate genomic data sets (reduced representation and whole genome) from the International Nucleotide Sequence Database Collaboration, we determined that 561 contained mostly samples from wild populations. We successfully restored spatiotemporal metadata for 78% of these 561 data sets (n = 440 data sets with data on 45,105 individuals from 762 species in 17 phyla). Examining papers and online repositories was much more fruitful than contacting 351 authors, who replied to our email requests 45% of the time. Overall, 23% of our email queries to authors unearthed useful metadata. The probability of retrieving spatiotemporal metadata declined significantly as the age of the data set increased. Availability of metadata associated with published papers or online repositories decreased by 13.5% per year, and availability of metadata obtainable only from authors decreased by up to 22% per year. This rapid decay in metadata availability, mirrored in studies of other types of biological data, should motivate swift updates to data-sharing policies and researcher practices to ensure that the valuable context provided by metadata is not lost to conservation science forever.
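To give a rough sense of what yearly decreases of this size imply over time, the sketch below compounds a constant yearly decrease over several years. It assumes, purely for illustration, that the reported percentages act multiplicatively on availability each year; the abstract does not say whether the figures refer to odds or to probabilities, so this is not the authors' model. The function name `remaining_fraction` is hypothetical.

```python
def remaining_fraction(yearly_decrease: float, years: int) -> float:
    """Fraction of the starting availability left after `years` years.

    Assumption (not from the abstract): the yearly decrease is constant
    and compounds multiplicatively, i.e. (1 - rate) ** years.
    """
    if not 0.0 <= yearly_decrease <= 1.0:
        raise ValueError("yearly_decrease must be between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    return (1.0 - yearly_decrease) ** years


if __name__ == "__main__":
    # Rates taken from the abstract: 13.5% (papers or repositories)
    # and up to 22% (metadata available only from authors).
    for label, rate in (("papers/repositories", 0.135), ("authors only", 0.22)):
        for years in (1, 5, 10):
            share = remaining_fraction(rate, years)
            print(f"{label}: {share:.1%} of starting availability after {years} y")
```

Under this simplified reading, the author-only rate erodes availability noticeably faster than the paper and repository rate, which is the qualitative pattern the abstract reports.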
Keywords: biodiversity, conservation genetics, Convention on Biological Diversity, digital sequence information, evolution, genetic diversity, metadata, molecular ecology, open data