The Effects of Sampling Location and Predictor Point Estimate Certainty on Posterior Support in Bayesian Phylogeographic Generalized Linear Models

Daniel Magee; Jesse Taylor; Matthew Scotch

doi:10.1038/s41598-018-24264-8

Research output: Contribution to journal › Article › peer-review

Abstract

The use of generalized linear models in Bayesian phylogeography has enabled researchers to simultaneously reconstruct the spatiotemporal history of a virus and quantify the contribution of predictor variables to that process. However, little is known about the sensitivity of this method to the choice of the discrete state partition. Here we investigate this question by analyzing a data set containing 299 sequences of the West Nile virus envelope gene sampled in the United States and fifteen predictors aggregated at four spatial levels. We demonstrate that although the topology of the viral phylogenies was consistent across analyses, support for the predictors depended on the level of aggregation. In particular, we found that the variance of the predictor support metrics was minimized at the most precise level for several predictors and maximized at more sparse levels of aggregation. These results suggest that caution should be taken when partitioning a region into discrete locations to ensure that interpretable, reproducible posterior estimates are obtained. These results also demonstrate why researchers should use the most precise discrete states possible to minimize the posterior variance in such estimates and reveal what truly drives the diffusion of viruses.

Original language	English (US)
Article number	5905
Journal	Scientific Reports
Volume	8
Issue number	1
DOIs	https://doi.org/10.1038/s41598-018-24264-8
State	Published - Dec 1 2018

ASJC Scopus subject areas

General

Cite this

@article{62969e19274b43b59766bcf23200e707,

title = "The Effects of Sampling Location and Predictor Point Estimate Certainty on Posterior Support in Bayesian Phylogeographic Generalized Linear Models",

abstract = "The use of generalized linear models in Bayesian phylogeography has enabled researchers to simultaneously reconstruct the spatiotemporal history of a virus and quantify the contribution of predictor variables to that process. However, little is known about the sensitivity of this method to the choice of the discrete state partition. Here we investigate this question by analyzing a data set containing 299 sequences of the West Nile virus envelope gene sampled in the United States and fifteen predictors aggregated at four spatial levels. We demonstrate that although the topology of the viral phylogenies was consistent across analyses, support for the predictors depended on the level of aggregation. In particular, we found that the variance of the predictor support metrics was minimized at the most precise level for several predictors and maximized at more sparse levels of aggregation. These results suggest that caution should be taken when partitioning a region into discrete locations to ensure that interpretable, reproducible posterior estimates are obtained. These results also demonstrate why researchers should use the most precise discrete states possible to minimize the posterior variance in such estimates and reveal what truly drives the diffusion of viruses.",

author = "Daniel Magee and Jesse Taylor and Matthew Scotch",

note = "Publisher Copyright: {\textcopyright} 2018 The Author(s).",

year = "2018",

month = dec,

day = "1",

doi = "10.1038/s41598-018-24264-8",

language = "English (US)",

volume = "8",

journal = "Scientific Reports",

issn = "2045-2322",

publisher = "Nature Publishing Group",

number = "1",

}

TY - JOUR

T1 - The Effects of Sampling Location and Predictor Point Estimate Certainty on Posterior Support in Bayesian Phylogeographic Generalized Linear Models

AU - Magee, Daniel

AU - Taylor, Jesse

AU - Scotch, Matthew

PY - 2018/12/1

Y1 - 2018/12/1

N2 - The use of generalized linear models in Bayesian phylogeography has enabled researchers to simultaneously reconstruct the spatiotemporal history of a virus and quantify the contribution of predictor variables to that process. However, little is known about the sensitivity of this method to the choice of the discrete state partition. Here we investigate this question by analyzing a data set containing 299 sequences of the West Nile virus envelope gene sampled in the United States and fifteen predictors aggregated at four spatial levels. We demonstrate that although the topology of the viral phylogenies was consistent across analyses, support for the predictors depended on the level of aggregation. In particular, we found that the variance of the predictor support metrics was minimized at the most precise level for several predictors and maximized at more sparse levels of aggregation. These results suggest that caution should be taken when partitioning a region into discrete locations to ensure that interpretable, reproducible posterior estimates are obtained. These results also demonstrate why researchers should use the most precise discrete states possible to minimize the posterior variance in such estimates and reveal what truly drives the diffusion of viruses.

AB - The use of generalized linear models in Bayesian phylogeography has enabled researchers to simultaneously reconstruct the spatiotemporal history of a virus and quantify the contribution of predictor variables to that process. However, little is known about the sensitivity of this method to the choice of the discrete state partition. Here we investigate this question by analyzing a data set containing 299 sequences of the West Nile virus envelope gene sampled in the United States and fifteen predictors aggregated at four spatial levels. We demonstrate that although the topology of the viral phylogenies was consistent across analyses, support for the predictors depended on the level of aggregation. In particular, we found that the variance of the predictor support metrics was minimized at the most precise level for several predictors and maximized at more sparse levels of aggregation. These results suggest that caution should be taken when partitioning a region into discrete locations to ensure that interpretable, reproducible posterior estimates are obtained. These results also demonstrate why researchers should use the most precise discrete states possible to minimize the posterior variance in such estimates and reveal what truly drives the diffusion of viruses.

UR - http://www.scopus.com/inward/record.url?scp=85045477140&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045477140&partnerID=8YFLogxK

U2 - 10.1038/s41598-018-24264-8

DO - 10.1038/s41598-018-24264-8

M3 - Article

C2 - 29651124

AN - SCOPUS:85045477140

SN - 2045-2322

VL - 8

JO - Scientific Reports

JF - Scientific Reports

IS - 1

M1 - 5905

ER -
