Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis

Arlin Stoltzfus, Brian O'Meara, Jamie Whitacre, Ross Mounce, Emily L. Gillespie, Sudhir Kumar, Dan F. Rosauer, Rutger A. Vos

Research output: Contribution to journal (Article)

28 Citations (Scopus)

Abstract

Background: Recently, various evolution-related journals adopted policies to encourage or require archiving of phylogenetic trees and associated data. Such attention to practices that promote sharing of data reflects rapidly improving information technology, and rapidly expanding potential to use this technology to aggregate and link data from previously published research. Nevertheless, little is known about current practices, or best practices, for publishing trees and associated data so as to promote re-use.

Findings: Here we summarize results of an ongoing analysis of current practices for archiving phylogenetic trees and associated data, current practices of re-use, and current barriers to re-use. We find that the technical infrastructure is available to support rudimentary archiving, but the frequency of archiving is low. Currently, most phylogenetic knowledge is not easily re-used due to a lack of archiving, lack of awareness of best practices, and lack of community-wide standards for formatting data, naming entities, and annotating data. Most attempts at data re-use seem to end in disappointment. Nevertheless, we find many positive examples of data re-use, particularly those that involve customized species trees generated by grafting to, and pruning from, a much larger tree.

Conclusions: The technologies and practices that facilitate data re-use can catalyze synthetic and integrative research. However, success will require engagement from various stakeholders including individual scientists who produce or consume shareable data, publishers, policy-makers, technology developers and resource-providers. The critical challenges for facilitating re-use of phylogenetic trees and associated data, we suggest, include: a broader commitment to public archiving; more extensive use of globally meaningful identifiers; development of user-friendly technology for annotating, submitting, searching, and retrieving data and their metadata; and development of a minimum reporting standard (MIAPA) indicating which kinds of data and metadata are most important for a re-useable phylogenetic record.

Original language: English (US)
Article number: 574
Journal: BMC Research Notes
Volume: 5
DOI: https://doi.org/10.1186/1756-0500-5-574
State: Published - 2012

Keywords

  • Bioinformatics
  • Data sharing
  • Evolution
  • Phylogeny
  • Phyloinformatics
  • Standards

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Medicine (all)

Cite this

Stoltzfus, A., O'Meara, B., Whitacre, J., Mounce, R., Gillespie, E. L., Kumar, S., ... Vos, R. A. (2012). Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis. BMC Research Notes, 5, [574]. https://doi.org/10.1186/1756-0500-5-574

@article{32521ad4adb448e99374070c666d9687,
title = "Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis",
keywords = "Bioinformatics, Data sharing, Evolution, Phylogeny, Phyloinformatics, Standards",
author = "Arlin Stoltzfus and Brian O'Meara and Jamie Whitacre and Ross Mounce and Gillespie, {Emily L.} and Sudhir Kumar and Rosauer, {Dan F.} and Vos, {Rutger A.}",
year = "2012",
doi = "10.1186/1756-0500-5-574",
language = "English (US)",
volume = "5",
journal = "BMC Research Notes",
issn = "1756-0500",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis

AU - Stoltzfus, Arlin

AU - O'Meara, Brian

AU - Whitacre, Jamie

AU - Mounce, Ross

AU - Gillespie, Emily L.

AU - Kumar, Sudhir

AU - Rosauer, Dan F.

AU - Vos, Rutger A.

PY - 2012

Y1 - 2012

KW - Bioinformatics

KW - Data sharing

KW - Evolution

KW - Phylogeny

KW - Phyloinformatics

KW - Standards

UR - http://www.scopus.com/inward/record.url?scp=84867641403&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867641403&partnerID=8YFLogxK

U2 - 10.1186/1756-0500-5-574

DO - 10.1186/1756-0500-5-574

M3 - Article

C2 - 23088596

AN - SCOPUS:84867641403

VL - 5

JO - BMC Research Notes

JF - BMC Research Notes

SN - 1756-0500

M1 - 574

ER -