TY - JOUR
T1 - Phylotastic! Making tree-of-life knowledge accessible, reusable and convenient
AU - Stoltzfus, Arlin
AU - Lapp, Hilmar
AU - Matasci, Naim
AU - Deus, Helena
AU - Sidlauskas, Brian
AU - Zmasek, Christian M.
AU - Vaidya, Gaurav
AU - Pontelli, Enrico
AU - Cranston, Karen
AU - Vos, Rutger
AU - Webb, Campbell O.
AU - Harmon, Luke J.
AU - Pirrung, Megan
AU - O'Meara, Brian
AU - Pennell, Matthew W.
AU - Mirarab, Siavash
AU - Rosenberg, Michael S.
AU - Balhoff, James P.
AU - Bik, Holly M.
AU - Heath, Tracy A.
AU - Midford, Peter E.
AU - Brown, Joseph W.
AU - McTavish, Emily Jane
AU - Sukumaran, Jeet
AU - Westneat, Mark
AU - Alfaro, Michael E.
AU - Steele, Aaron
AU - Jordan, Greg
N1 - Funding Information:
We thank Mark Holder (TreeStore), Ben Vandervalk (Architecture), Chris Baron (Shiny) and Jon Eastman (DateLife) for their contributions to the hackathon, and we thank Mark Wilkinson and Sergei Pond for participating in the first Leadership Team meeting. We thank Danielle Wilson, David Palmer, and Mattison Ward for administrative and IT support. Supported by NESCent (the National Evolutionary Synthesis Center, NSF #EF-0905606), the iPlant Collaborative (NSF #DBI-0735191), and the Biodiversity Synthesis Center (BioSync) of the Encyclopedia of Life. Additional funding for travel expenses was provided to RV by the Naturalis Research Incentive budget. The identification of any specific commercial products is for the purpose of specifying a protocol, and does not imply a recommendation or endorsement by the National Institute of Standards and Technology.
PY - 2013/5/13
Y1 - 2013/5/13
N2 - Background: Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great "Tree of Life" (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify expert phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch lengths and annotations, resulting in a custom phylogeny suited to the user's needs. Such a system could become a sustainable community resource if implemented as a distributed system of loosely coupled parts that interact through clearly defined interfaces. Results: With the aim of building such a "phylotastic" system, the NESCent Hackathons, Interoperability, Phylogenies (HIP) working group recruited 2 dozen scientist-programmers to a weeklong programming hackathon in June 2012. During the hackathon (and a three-month follow-up period), 5 teams produced designs, implementations, documentation, presentations, and tests including: (1) a generalized scheme for integrating components; (2) proof-of-concept pruners and controllers; (3) a meta-API for taxonomic name resolution services; (4) a system for storing, finding, and retrieving phylogenies using semantic web technologies for data exchange, storage, and querying; (5) an innovative new service, DateLife.org, which synthesizes pre-computed, time-calibrated phylogenies to assign ages to nodes; and (6) demonstration projects. These outcomes are accessible via a public code repository (GitHub.com), a website (http://www.phylotastic.org), and a server image. Conclusions: Approximately 9 person-months of effort (centered on a software development hackathon) resulted in the design and implementation of proof-of-concept software for 4 core phylotastic components, 3 controllers, and 3 end-user demonstration tools. While these products have substantial limitations, they suggest considerable potential for a distributed system that makes phylogenetic knowledge readily accessible in computable form. Widespread use of phylotastic systems will create an electronic marketplace for sharing phylogenetic knowledge that will spur innovation in other areas of the ToL enterprise, such as annotation of sources and methods and third-party methods of quality assessment.
KW - Data reuse
KW - Hackathon
KW - Phylogeny
KW - Taxonomy
KW - Tree of life
KW - Web services
UR - http://www.scopus.com/inward/record.url?scp=84877613686&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84877613686&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-14-158
DO - 10.1186/1471-2105-14-158
M3 - Article
C2 - 23668630
AN - SCOPUS:84877613686
SN - 1471-2105
VL - 14
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - 158
ER -