TY - JOUR
T1 - Verbalizing phylogenomic conflict
T2 - Representation of node congruence across competing reconstructions of the neoavian explosion
AU - Franz, Nico M.
AU - Musher, Lukas J.
AU - Brown, Joseph W.
AU - Yu, Shizhuo
AU - Ludäscher, Bertram
N1 - Funding Information:
Support for NMF research on this manuscript (as PI): National Science Foundation; award DEB-1155984; https://nsf.gov/awardsearch/showAward?AWD_ID=1155984 National Science Foundation; award DBI-1342595; https://www.nsf.gov/awardsearch/showAward?AWD_ID=1342595 Support for JWB research on this manuscript (as postdoc): National Science Foundation; award DEB-1207915; https://www.nsf.gov/awardsearch/showAward?AWD_ID=1207915 Support for BL research on this manuscript (as PI): National Science Foundation; award IIS-1118088; https://www.nsf.gov/awardsearch/showAward?AWD_ID=1118088 National Science Foundation; award DBI-1147273; https://www.nsf.gov/awardsearch/showAward?AWD_ID=1147273 The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Publisher Copyright:
© 2019 Franz et al.
PY - 2019/2
Y1 - 2019/2
N2 - Phylogenomic research is accelerating the publication of landmark studies that aim to resolve deep divergences of major organismal groups. Meanwhile, systems for identifying and integrating the products of phylogenomic inference–such as newly supported clade concepts–have not kept pace. However, the ability to verbalize node concept congruence and conflict across multiple, in effect simultaneously endorsed phylogenomic hypotheses, is a prerequisite for building synthetic data environments for biological systematics and other domains impacted by these conflicting inferences. Here we develop a novel solution to the conflict verbalization challenge, based on a logic representation and reasoning approach that utilizes the language of Region Connection Calculus (RCC–5) to produce consistent alignments of node concepts endorsed by incongruent phylogenomic studies. The approach employs clade concept labels to individuate concepts used by each source, even if these carry identical names. Indirect RCC–5 modeling of intensional (property-based) node concept definitions, facilitated by the local relaxation of coverage constraints, allows parent concepts to attain congruence in spite of their differentially sampled children. To demonstrate the feasibility of this approach, we align two recent phylogenomic reconstructions of higher-level avian groups that entail strong conflict in the "neoavian explosion" region. According to our representations, this conflict is constituted by 26 instances of input "whole concept" overlap. These instances are further resolvable in the output labeling schemes and visualizations as "split concepts", which provide the labels and relations needed to build truly synthetic phylogenomic data environments. Because the RCC–5 alignments fundamentally reflect the trained, logic-enabled judgments of systematic experts, future designs for such environments need to promote a culture where experts routinely assess the intensionalities of node concepts published by our peers–even and especially when we are not in agreement with each other.
AB - Phylogenomic research is accelerating the publication of landmark studies that aim to resolve deep divergences of major organismal groups. Meanwhile, systems for identifying and integrating the products of phylogenomic inference–such as newly supported clade concepts–have not kept pace. However, the ability to verbalize node concept congruence and conflict across multiple, in effect simultaneously endorsed phylogenomic hypotheses, is a prerequisite for building synthetic data environments for biological systematics and other domains impacted by these conflicting inferences. Here we develop a novel solution to the conflict verbalization challenge, based on a logic representation and reasoning approach that utilizes the language of Region Connection Calculus (RCC–5) to produce consistent alignments of node concepts endorsed by incongruent phylogenomic studies. The approach employs clade concept labels to individuate concepts used by each source, even if these carry identical names. Indirect RCC–5 modeling of intensional (property-based) node concept definitions, facilitated by the local relaxation of coverage constraints, allows parent concepts to attain congruence in spite of their differentially sampled children. To demonstrate the feasibility of this approach, we align two recent phylogenomic reconstructions of higher-level avian groups that entail strong conflict in the "neoavian explosion" region. According to our representations, this conflict is constituted by 26 instances of input "whole concept" overlap. These instances are further resolvable in the output labeling schemes and visualizations as "split concepts", which provide the labels and relations needed to build truly synthetic phylogenomic data environments. Because the RCC–5 alignments fundamentally reflect the trained, logic-enabled judgments of systematic experts, future designs for such environments need to promote a culture where experts routinely assess the intensionalities of node concepts published by our peers–even and especially when we are not in agreement with each other.
UR - http://www.scopus.com/inward/record.url?scp=85062759750&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85062759750&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1006493
DO - 10.1371/journal.pcbi.1006493
M3 - Article
C2 - 30768597
AN - SCOPUS:85062759750
SN - 1553-734X
VL - 15
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 2
M1 - e1006493
ER -