### Abstract

Fisher's exact test for comparing response proportions in a randomized experiment can be overly conservative when the group sizes are small or when the response proportions are close to zero or one. This is primarily because the null distribution of the test statistic becomes too discrete, a partial consequence of the inference being conditional on the total number of responders. Accordingly, exact unconditional procedures have gained in popularity, on the premise that power will increase because the null distribution of the test statistic will presumably be less discrete. However, we caution researchers that a poor choice of test statistic for exact unconditional inference can actually result in a substantially less powerful analysis than Fisher's conditional test. To illustrate, we study a real example and provide exact test size and power results for several competing tests, for both balanced and unbalanced designs. Our results reveal that Fisher's test generally outperforms exact unconditional tests based on using as the test statistic either the observed difference in proportions, or the observed difference divided by its estimated standard error under the alternative hypothesis, the latter for unbalanced designs only. On the other hand, the exact unconditional test based on the observed difference divided by its estimated standard error under the null hypothesis (score statistic) outperforms Fisher's test, and is recommended. Boschloo's test, in which the p-value from Fisher's test is used as the test statistic in an exact unconditional test, is uniformly more powerful than Fisher's test, and is also recommended.
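The three recommended or baseline procedures named above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' implementation. It makes several assumptions that are not in the paper: a one-sided alternative (group 1 responds more often than group 2), a plain grid search over the common response probability in place of the Berger and Boos confidence interval search listed in the keywords, and a hypothetical table (7 of 10 responders versus 2 of 10). The function names are likewise made up for illustration.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """Binomial probability of k responders out of n."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def fisher_p(x1, n1, x2, n2):
    """One-sided Fisher p-value (H1: p1 > p2), conditional on x1 + x2."""
    total = x1 + x2
    upper = min(n1, total)
    tail = sum(comb(n1, k) * comb(n2, total - k) for k in range(x1, upper + 1))
    return tail / comb(n1 + n2, total)

def score_z(x1, n1, x2, n2):
    """Observed difference divided by its standard error under the null."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return 0.0 if se == 0 else (x1 / n1 - x2 / n2) / se

def unconditional_p(x1, n1, x2, n2, stat, grid=200):
    """Exact unconditional p-value: the largest null probability, over a
    grid of common response probabilities (an assumption of this sketch),
    of a table whose statistic is at least as extreme as the observed one."""
    observed = stat(x1, n1, x2, n2)
    extreme = [(a, b) for a in range(n1 + 1) for b in range(n2 + 1)
               if stat(a, n1, b, n2) >= observed - 1e-12]
    best = 0.0
    for i in range(1, grid):
        p = i / grid
        prob = sum(binom_pmf(a, n1, p) * binom_pmf(b, n2, p) for a, b in extreme)
        best = max(best, prob)
    return best

# Hypothetical data, not taken from the paper.
x1, n1, x2, n2 = 7, 10, 2, 10

p_fisher = fisher_p(x1, n1, x2, n2)
p_score = unconditional_p(x1, n1, x2, n2, score_z)
# Boschloo: use Fisher's p-value as the statistic (smaller is more extreme,
# so negate it to fit the "larger is more extreme" convention above).
p_boschloo = unconditional_p(x1, n1, x2, n2,
                             lambda a, m, b, n: -fisher_p(a, m, b, n))

print(f"Fisher:   {p_fisher:.4f}")
print(f"Score:    {p_score:.4f}")
print(f"Boschloo: {p_boschloo:.4f}")
```

Boschloo's p-value can never exceed Fisher's for the same table, which is the mechanism behind the "uniformly more powerful" statement in the abstract. A finer grid, or a search restricted to a confidence interval for the nuisance parameter, would change the unconditional p-values slightly.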

Original language | English (US) |
---|---|
Pages (from-to) | 441-450 |
Number of pages | 10 |
Journal | Biometrics |
Volume | 59 |
Issue number | 2 |
DOIs | https://doi.org/10.1111/1541-0420.00051 |
State | Published - Jun 2003 |
Externally published | Yes |

### Keywords

- 2 × 2 contingency table
- Berger and Boos confidence interval search
- Boschloo's test
- Conditional test
- Discreteness
- Fisher's exact test
- Score test

### ASJC Scopus subject areas

- Agricultural and Biological Sciences (all)
- Public Health, Environmental and Occupational Health
- Agricultural and Biological Sciences (miscellaneous)
- Applied Mathematics
- Statistics and Probability

### Cite this

Mehrotra, D. V., Chan, I. S. F., & Berger, R. L. (2003). A cautionary note on exact unconditional inference for a difference between two independent binomial proportions. *Biometrics*, *59*(2), 441-450. https://doi.org/10.1111/1541-0420.00051

Research output: Contribution to journal › Article

TY - JOUR

T1 - A cautionary note on exact unconditional inference for a difference between two independent binomial proportions

AU - Mehrotra, Devan V.

AU - Chan, Ivan S F

AU - Berger, Roger L.

PY - 2003/6

Y1 - 2003/6

KW - 2 × 2 contingency table

KW - Berger and Boos confidence interval search

KW - Boschloo's test

KW - Conditional test

KW - Discreteness

KW - Fisher's exact test

KW - Score test

UR - http://www.scopus.com/inward/record.url?scp=0038544636&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0038544636&partnerID=8YFLogxK

U2 - 10.1111/1541-0420.00051

DO - 10.1111/1541-0420.00051

M3 - Article

VL - 59

SP - 441

EP - 450

JO - Biometrics

JF - Biometrics

SN - 0006-341X

IS - 2

ER -