Abstract

Background: When attempting to statistically distinguish between a null and an alternative hypothesis, many researchers in the life and social sciences turn to binned statistical analysis methods, or to methods based simply on the moments of a distribution (such as the mean and variance). These methods have the advantage of being simple to implement and simple to explain. However, when the null and alternative hypotheses manifest themselves as subtle differences in patterns in the data, binned analysis methods may be insensitive to these differences, and researchers may erroneously fail to reject a null hypothesis that is actually false, where a more sensitive statistical analysis method might have rejected it. Here, with a focus on two recent conflicting studies of contagion in mass killings as instructive examples, we discuss how unbinned likelihood methods make optimal use of the information in the data, a fact that has long been known in statistical theory but is perhaps not as widely appreciated among researchers in the life and social sciences.

Methods: In 2015, Towers et al. published a paper that quantified the long-suspected contagion effect in mass killings. In 2017, Lankford & Tomek published a paper, based on the same data, that claimed to contradict the results of the earlier study. The former used unbinned likelihood methods; the latter used binned methods and a comparison of distribution moments. Using these analyses, we also discuss how visualization of the data can aid in determining the most appropriate statistical analysis methods for distinguishing between a null and an alternative hypothesis. We also discuss the importance of assessing the robustness of analysis results to the methodological assumptions made (for example, arbitrary choices of the number of bins and of bin widths when using binned methods), an issue that is widely overlooked in the literature but is critical to the reproducibility and robustness of an analysis.

Conclusions: When an analysis cannot distinguish between a null and an alternative hypothesis, care must be taken to ensure that the analysis methodology itself maximizes the use of the information in the data that can distinguish between the two hypotheses. The binned method used by Lankford & Tomek (2017), which examined how many mass killings fell within a 14-day window of a previous mass killing, substantially reduced the sensitivity of their analysis to contagion effects. The unbinned likelihood methods used by Towers et al. (2015) did not suffer from this problem. While a binned analysis might be favorable for simplicity and clarity of presentation, unbinned likelihood methods are preferable when effects might be somewhat subtle.
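
As a rough illustration of why reducing event timings to a single window count can lose sensitivity, the sketch below compares the Fisher information about an event rate carried by exact waiting times with that carried by a yes/no flag recording only whether each waiting time fell inside a fixed window. It is a toy calculation, not the analysis of either study: it assumes exponentially distributed waiting times between events, and the function names, the mean gaps, and the window widths other than 14 days are hypothetical choices made for illustration.

```python
import math


def unbinned_information(rate: float) -> float:
    """Fisher information per observation about `rate` when each
    exponential waiting time is used exactly (unbinned likelihood)."""
    return 1.0 / rate**2


def binned_information(rate: float, window: float) -> float:
    """Fisher information per observation about `rate` when only a
    yes/no flag is kept: did the waiting time fall inside `window`?"""
    p_inside = 1.0 - math.exp(-rate * window)    # P(waiting time <= window)
    dp_drate = window * math.exp(-rate * window)  # derivative of p_inside w.r.t. rate
    # Information of a Bernoulli outcome whose probability depends on `rate`.
    return dp_drate**2 / (p_inside * (1.0 - p_inside))


def relative_efficiency(rate: float, window: float) -> float:
    """Fraction of the unbinned information that survives the binning."""
    return binned_information(rate, window) / unbinned_information(rate)


if __name__ == "__main__":
    # Illustrative (assumed) mean gaps between events, in days.
    mean_gaps = (5.0, 10.0, 20.0, 50.0, 100.0)
    # 14 days is the window named in the abstract; 7 and 28 are assumed
    # alternatives used to show the dependence on the binning choice.
    for window in (7.0, 14.0, 28.0):
        for gap in mean_gaps:
            eff = relative_efficiency(1.0 / gap, window)
            print(f"window={window:5.1f} d  mean gap={gap:6.1f} d  "
                  f"information retained={eff:.3f}")
```

Under these assumptions the window flag retains at most about 65% of the information in the exact timings, and the retained fraction changes with the window width, which is one reason to check how a result depends on the binning choice.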

Original language: English (US)
Article number: e0196863
Journal: PLoS One
Volume: 13
Issue number: 5
DOI: 10.1371/journal.pone.0196863
State: Published - May 1 2018

Fingerprint

Statistical methods
Social sciences
Bins
Visualization
Methodology
Biological science disciplines
Statistical analysis
Research personnel
Researchers
Reproducibility
Data analysis

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

@article{0366bba8f3494766a1b615c955b80bab,
title = "Detecting the contagion effect in mass killings; a constructive example of the statistical advantages of unbinned likelihood methods",
author = "Sherry Towers and Anuj Mubayi and Carlos Castillo-Chavez",
year = "2018",
month = "5",
day = "1",
doi = "10.1371/journal.pone.0196863",
language = "English (US)",
volume = "13",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "5",

}

TY - JOUR

T1 - Detecting the contagion effect in mass killings; a constructive example of the statistical advantages of unbinned likelihood methods

AU - Towers, Sherry

AU - Mubayi, Anuj

AU - Castillo-Chavez, Carlos

PY - 2018/5/1

Y1 - 2018/5/1

UR - http://www.scopus.com/inward/record.url?scp=85046813966&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85046813966&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0196863

DO - 10.1371/journal.pone.0196863

M3 - Article

VL - 13

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 5

M1 - e0196863

ER -