Going back to the roots: Evaluating Bayesian phylogeographic models with discrete trait uncertainty

Matteo A. Vaiente, Matthew Scotch

Research output: Contribution to journal, Article, peer-review

Abstract

Phylogeography is a popular way to analyze virus sequences annotated with discrete, epidemiologically relevant trait data. For applied public health surveillance, a key quantity of interest is often the state at the root of the inferred phylogeny. In epidemiological terms, this represents the geographic origin of the observed outbreak. Since determining the origin of an outbreak is often critical for public health intervention, it is prudent to understand how well phylogeographic models perform this root state classification task under various analytical scenarios. Specifically, we investigate how the size of the discrete state space and the size of the sequence data set influence root state classification accuracy. We performed phylogeographic inference on several simulated DNA data sets while i) increasing the number of sequences and ii) increasing the total number of possible discrete trait values. We show that phylogeographic models tend to perform best at intermediate sequence data set sizes. Further, we demonstrate that a popular metric used for evaluation of phylogeographic models, the Kullback-Leibler (KL) divergence, increases with both discrete state space size and data set size. Moreover, by modeling phylogeographic root state classification accuracy using logistic regression, we show that KL is not supported as a predictor of model accuracy, indicating its limited utility for assessing phylogeographic model performance on empirical data. These results suggest that relying solely on the KL metric may lead to artificially inflated support for models with finer discretization schemes and larger data set sizes. These results will be important for public health practitioners seeking to use phylogeographic models for applied infectious disease surveillance.
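As a rough illustration of why the KL metric can grow with the number of discrete states, the sketch below computes the KL divergence of a root state posterior from a uniform prior over K states. This is an assumption-laden toy rather than the authors' analysis: the uniform prior, the function names `kl_from_uniform` and `concentrated_posterior`, and the fixed 0.9 posterior mass on a single state are all hypothetical choices made for illustration. Under a uniform prior, the KL divergence equals log K minus the entropy of the posterior, so its upper bound log K rises as the state space is discretized more finely.

```python
import math


def kl_from_uniform(posterior):
    """KL(posterior || uniform prior) in nats, for a discrete distribution.

    Assumes the prior over the K root states is uniform (an assumption of
    this sketch, not something stated in the abstract).
    """
    k = len(posterior)
    # Terms with p == 0 contribute nothing (limit of p * log p as p -> 0).
    return sum(p * math.log(p * k) for p in posterior if p > 0)


def concentrated_posterior(k, top=0.9):
    """Hypothetical posterior: `top` mass on one state, the rest spread evenly."""
    rest = (1.0 - top) / (k - 1)
    return [top] + [rest] * (k - 1)


if __name__ == "__main__":
    # Same posterior "shape", increasingly fine discretization of the trait.
    for k in (4, 8, 16, 32):
        kl = kl_from_uniform(concentrated_posterior(k))
        print(f"K={k:2d}  KL={kl:.3f}  upper bound log(K)={math.log(k):.3f}")
```

In this toy setting the posterior is equally confident at every K, yet the KL value still increases with K, which is consistent with the caution above against comparing KL across discretization schemes.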

Original language: English (US)
Article number: 104501
Journal: Infection, Genetics and Evolution
Volume: 85
State: Published - Nov 2020

Keywords

  • Bayesian statistics
  • Model evaluation
  • Phylogenetics
  • Phylogeography

ASJC Scopus subject areas

  • Microbiology
  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics
  • Microbiology (medical)
  • Infectious Diseases
