A cautionary note on exact unconditional inference for a difference between two independent binomial proportions

Devan V. Mehrotra, Ivan S F Chan, Roger L. Berger

Research output: Contribution to journal (Article)

73 Citations (Scopus)

Abstract

Fisher's exact test for comparing response proportions in a randomized experiment can be overly conservative when the group sizes are small or when the response proportions are close to zero or one. This is primarily because the null distribution of the test statistic becomes too discrete, a partial consequence of the inference being conditional on the total number of responders. Accordingly, exact unconditional procedures have gained in popularity, on the premise that power will increase because the null distribution of the test statistic will presumably be less discrete. However, we caution researchers that a poor choice of test statistic for exact unconditional inference can actually result in a substantially less powerful analysis than Fisher's conditional test. To illustrate, we study a real example and provide exact test size and power results for several competing tests, for both balanced and unbalanced designs. Our results reveal that Fisher's test generally outperforms exact unconditional tests based on using as the test statistic either the observed difference in proportions, or the observed difference divided by its estimated standard error under the alternative hypothesis, the latter for unbalanced designs only. On the other hand, the exact unconditional test based on the observed difference divided by its estimated standard error under the null hypothesis (score statistic) outperforms Fisher's test, and is recommended. Boschloo's test, in which the p-value from Fisher's test is used as the test statistic in an exact unconditional test, is uniformly more powerful than Fisher's test, and is also recommended.
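The contrast the abstract draws between conditional and exact unconditional inference can be illustrated with a small standard-library sketch. It is not the authors' implementation. It assumes a one-sided alternative (group 1 response rate greater than group 2) and approximates the supremum over the common nuisance proportion with a plain grid search, which is cruder than the Berger and Boos confidence interval search listed in the keywords. All function names are hypothetical.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """Binomial probability P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def score_stat(x1, n1, x2, n2):
    """Observed difference divided by its standard error under the null (pooled)."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    return 0.0 if var == 0 else (x1 / n1 - x2 / n2) / sqrt(var)


def fisher_p(x1, n1, x2, n2):
    """One-sided Fisher p-value: P(X1 >= x1 | X1 + X2 fixed), hypergeometric null."""
    m = x1 + x2
    tail = sum(comb(n1, k) * comb(n2, m - k) for k in range(x1, min(n1, m) + 1))
    return tail / comb(n1 + n2, m)


def neg_fisher_stat(x1, n1, x2, n2):
    """Boschloo-style ordering: a smaller Fisher p-value counts as more extreme."""
    return -fisher_p(x1, n1, x2, n2)


def exact_unconditional_p(x1, n1, x2, n2, stat, grid=1000):
    """Grid approximation to sup over p of P(stat >= observed) under p1 = p2 = p."""
    observed = stat(x1, n1, x2, n2)
    # Tables at least as extreme as the observed one (small tolerance for ties).
    region = [
        (a, b)
        for a in range(n1 + 1)
        for b in range(n2 + 1)
        if stat(a, n1, b, n2) >= observed - 1e-12
    ]
    worst = 0.0
    for i in range(1, grid):  # assumption: interior grid points only
        p = i / grid
        pmf1 = [binom_pmf(a, n1, p) for a in range(n1 + 1)]
        pmf2 = [binom_pmf(b, n2, p) for b in range(n2 + 1)]
        worst = max(worst, sum(pmf1[a] * pmf2[b] for a, b in region))
    return worst
```

For a toy table with 3 of 3 responders in one group and 0 of 3 in the other, `fisher_p(3, 3, 0, 3)` is 0.05, while `exact_unconditional_p` with either `score_stat` or `neg_fisher_stat` gives 1/64 (about 0.0156). This is a small instance of the conservativeness of the conditional test that the abstract describes; it says nothing about the power comparisons reported in the paper.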

Original language: English (US)
Pages (from-to): 441-450
Number of pages: 10
Journal: Biometrics
Volume: 59
Issue number: 2
DOI: 10.1111/1541-0420.00051
State: Published - Jun 2003
Externally published: Yes

Keywords

  • 2 × 2 contingency table
  • Berger and Boos confidence interval search
  • Boschloo's test
  • Conditional test
  • Discreteness
  • Fisher's exact test
  • Score test

ASJC Scopus subject areas

  • Agricultural and Biological Sciences (all)
  • Public Health, Environmental and Occupational Health
  • Agricultural and Biological Sciences (miscellaneous)
  • Applied Mathematics
  • Statistics and Probability

Cite this

A cautionary note on exact unconditional inference for a difference between two independent binomial proportions. / Mehrotra, Devan V.; Chan, Ivan S F; Berger, Roger L.

In: Biometrics, Vol. 59, No. 2, 06.2003, p. 441-450.

@article{0d759d9cf7964a9b953d6607c0d127a0,
title = "A cautionary note on exact unconditional inference for a difference between two independent binomial proportions",
keywords = "2 × 2 contingency table, Berger and Boos confidence interval search, Boschloo's test, Conditional test, Discreteness, Fisher's exact test, Score test",
author = "Mehrotra, {Devan V.} and Chan, {Ivan S F} and Berger, {Roger L.}",
year = "2003",
month = "6",
doi = "10.1111/1541-0420.00051",
language = "English (US)",
volume = "59",
pages = "441--450",
journal = "Biometrics",
issn = "0006-341X",
publisher = "Wiley-Blackwell",
number = "2",

}
