The Effects of Sampling Location and Predictor Point Estimate Certainty on Posterior Support in Bayesian Phylogeographic Generalized Linear Models

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

The use of generalized linear models in Bayesian phylogeography has enabled researchers to simultaneously reconstruct the spatiotemporal history of a virus and quantify the contribution of predictor variables to that process. However, little is known about the sensitivity of this method to the choice of the discrete state partition. Here we investigate this question by analyzing a data set containing 299 sequences of the West Nile virus envelope gene sampled in the United States and fifteen predictors aggregated at four spatial levels. We demonstrate that although the topology of the viral phylogenies was consistent across analyses, support for the predictors depended on the level of aggregation. In particular, we found that the variance of the predictor support metrics was minimized at the most precise level for several predictors and maximized at more sparse levels of aggregation. These results suggest that caution should be taken when partitioning a region into discrete locations to ensure that interpretable, reproducible posterior estimates are obtained. These results also demonstrate why researchers should use the most precise discrete states possible to minimize the posterior variance in such estimates and reveal what truly drives the diffusion of viruses.

Original language: English (US)
Article number: 5905
Journal: Scientific Reports
Volume: 8
Issue number: 1
DOI: 10.1038/s41598-018-24264-8
State: Published - Dec 1 2018

Fingerprint

Linear Models
Research Personnel
Phylogeography
Viruses
West Nile virus
Phylogeny
History
Genes
Datasets

ASJC Scopus subject areas

  • General

Cite this

@article{62969e19274b43b59766bcf23200e707,
title = "The Effects of Sampling Location and Predictor Point Estimate Certainty on Posterior Support in Bayesian Phylogeographic Generalized Linear Models",
abstract = "The use of generalized linear models in Bayesian phylogeography has enabled researchers to simultaneously reconstruct the spatiotemporal history of a virus and quantify the contribution of predictor variables to that process. However, little is known about the sensitivity of this method to the choice of the discrete state partition. Here we investigate this question by analyzing a data set containing 299 sequences of the West Nile virus envelope gene sampled in the United States and fifteen predictors aggregated at four spatial levels. We demonstrate that although the topology of the viral phylogenies was consistent across analyses, support for the predictors depended on the level of aggregation. In particular, we found that the variance of the predictor support metrics was minimized at the most precise level for several predictors and maximized at more sparse levels of aggregation. These results suggest that caution should be taken when partitioning a region into discrete locations to ensure that interpretable, reproducible posterior estimates are obtained. These results also demonstrate why researchers should use the most precise discrete states possible to minimize the posterior variance in such estimates and reveal what truly drives the diffusion of viruses.",
author = "Daniel Magee and Jesse Taylor and Matthew Scotch",
year = "2018",
month = "12",
day = "1",
doi = "10.1038/s41598-018-24264-8",
language = "English (US)",
volume = "8",
journal = "Scientific Reports",
issn = "2045-2322",
publisher = "Nature Publishing Group",
number = "1",

}

TY - JOUR

T1 - The Effects of Sampling Location and Predictor Point Estimate Certainty on Posterior Support in Bayesian Phylogeographic Generalized Linear Models

AU - Magee, Daniel

AU - Taylor, Jesse

AU - Scotch, Matthew

PY - 2018/12/1

Y1 - 2018/12/1

AB - The use of generalized linear models in Bayesian phylogeography has enabled researchers to simultaneously reconstruct the spatiotemporal history of a virus and quantify the contribution of predictor variables to that process. However, little is known about the sensitivity of this method to the choice of the discrete state partition. Here we investigate this question by analyzing a data set containing 299 sequences of the West Nile virus envelope gene sampled in the United States and fifteen predictors aggregated at four spatial levels. We demonstrate that although the topology of the viral phylogenies was consistent across analyses, support for the predictors depended on the level of aggregation. In particular, we found that the variance of the predictor support metrics was minimized at the most precise level for several predictors and maximized at more sparse levels of aggregation. These results suggest that caution should be taken when partitioning a region into discrete locations to ensure that interpretable, reproducible posterior estimates are obtained. These results also demonstrate why researchers should use the most precise discrete states possible to minimize the posterior variance in such estimates and reveal what truly drives the diffusion of viruses.

UR - http://www.scopus.com/inward/record.url?scp=85045477140&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045477140&partnerID=8YFLogxK

U2 - 10.1038/s41598-018-24264-8

DO - 10.1038/s41598-018-24264-8

M3 - Article

C2 - 29651124

AN - SCOPUS:85045477140

VL - 8

JO - Scientific Reports

JF - Scientific Reports

SN - 2045-2322

IS - 1

M1 - 5905

ER -