Hierarchical models for cross-classified overdispersed multinomial data

Jeffrey Wilson, Kenneth J. Koehler

Research output: Contribution to journal (Article)

7 Citations (Scopus)

Abstract

When a vector of sample proportions is not obtained through simple random sampling, the covariance matrix for the sample vector can differ substantially from the one corresponding to the multinomial model (Wilson 1989). For example, clustering effects or subject effects in repeated-measure experiments can cause the variance of the observed proportions to be much larger than variances under the multinomial model. The phenomenon is generally referred to as overdispersion. Tallis (1962) proposed a model for identically distributed multinomials with a common measure of correlation and referred to it as the generalized multinomial model. This generalized multinomial model is extended in this article to account for overdispersion by allowing the vectors of proportions to vary according to a Dirichlet distribution. The generalized Dirichlet-multinomial model (as it is referred to here) allows for a second order of pairwise correlation among units, a type of assumption found reasonable in some biological data (Kupper and Haseman 1978) and introduced here to business data. An alternative derivation allowing for two kinds of variation is also considered. Asymptotic normal properties of parameter estimators are used to construct Wald statistics for testing hypotheses. The methods are illustrated with applications to performance evaluation monthly data and an integrated circuit yield analysis.
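
As a rough illustration of the overdispersion described in the abstract, the Python sketch below compares the variance of a sample proportion under a plain multinomial with the variance obtained when each cluster's probability vector is first drawn from a Dirichlet distribution. It uses the ordinary Dirichlet-multinomial, not the generalized model developed in the article. The function names, the parameter vector `alpha`, and the cluster size `n` are illustrative assumptions. For the ordinary Dirichlet-multinomial, the variance is inflated by the factor (n + A) / (1 + A), where A is the sum of the Dirichlet parameters.

```python
import random
from statistics import pvariance


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_proportions(probs, n, rng):
    """Run n multinomial trials and return the vector of sample proportions."""
    counts = [0] * len(probs)
    for category in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[category] += 1
    return [c / n for c in counts]


def inflation_factor(alpha, n):
    """Theoretical variance inflation (n + A) / (1 + A), with A = sum(alpha)."""
    total = sum(alpha)
    return (n + total) / (1 + total)


def simulate(alpha, n, clusters, seed=0):
    """Return (multinomial variance, Dirichlet-multinomial variance) of the
    first category's sample proportion across simulated clusters."""
    rng = random.Random(seed)
    total = sum(alpha)
    mean = [a / total for a in alpha]  # common mean probability vector
    # Every cluster shares the same probabilities: plain multinomial sampling.
    plain = [sample_proportions(mean, n, rng)[0] for _ in range(clusters)]
    # Each cluster gets its own probabilities drawn from the Dirichlet.
    mixed = [
        sample_proportions(sample_dirichlet(alpha, rng), n, rng)[0]
        for _ in range(clusters)
    ]
    return pvariance(plain), pvariance(mixed)


if __name__ == "__main__":
    alpha = [2.0, 3.0, 5.0]  # hypothetical Dirichlet parameters
    n = 20                   # hypothetical number of units per cluster
    var_plain, var_mixed = simulate(alpha, n, clusters=5000)
    print("multinomial variance:          ", var_plain)
    print("Dirichlet-multinomial variance:", var_mixed)
    print("empirical inflation:           ", var_mixed / var_plain)
    print("theoretical inflation:         ", inflation_factor(alpha, n))
```

With these settings the empirical ratio should land near the theoretical factor, which is the kind of excess variation that a test based on the plain multinomial covariance matrix would ignore.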

Original language: English (US)
Pages (from-to): 103-110
Number of pages: 8
Journal: Journal of Business and Economic Statistics
Volume: 9
Issue number: 1
DOI: 10.1080/07350015.1991.10509832
State: Published - 1991

Fingerprint

Multinomial Model
Hierarchical Model
Overdispersion
Proportion
Wald Statistic
Dirichlet Distribution
Testing Hypotheses
Simple Random Sampling
Repeated Measures
Integrated Circuits
hypothesis testing
Identically distributed
Covariance matrix
Dirichlet
Performance Evaluation
Pairwise
Clustering
Vary
data analysis

Keywords

  • Correlated
  • Crossed
  • Dirichlet
  • Generalized multinomial model
  • Nested

ASJC Scopus subject areas

  • Statistics and Probability
  • Economics and Econometrics
  • Statistics, Probability and Uncertainty
  • Social Sciences (miscellaneous)

Cite this

Hierarchical models for cross-classified overdispersed multinomial data. / Wilson, Jeffrey; Koehler, Kenneth J.

In: Journal of Business and Economic Statistics, Vol. 9, No. 1, 1991, p. 103-110.

@article{b77559e88e9b433d83878e647cef1758,
title = "Hierarchical models for cross-classified overdispersed multinomial data",
abstract = "When a vector of sample proportions is not obtained through simple random sampling, the covariance matrix for the sample vector can differ substantially from the one corresponding to the multinomial model (Wilson 1989). For example, clustering effects or subject effects in repeated-measure experiments can cause the variance of the observed proportions to be much larger than variances under the multinomial model. The phenomenon is generally referred to as overdispersion. Tallis (1962) proposed a model for identically distributed multinomials with a common measure of correlation and referred to it as the generalized multinomial model. This generalized multinomial model is extended in this article to account for overdispersion by allowing the vectors of proportions to vary according to a Dirichlet distribution. The generalized Dirichlet-multinomial model (as it is referred to here) allows for a second order of pairwise correlation among units, a type of assumption found reasonable in some biological data (Kupper and Haseman 1978) and introduced here to business data. An alternative derivation allowing for two kinds of variation is also considered. Asymptotic normal properties of parameter estimators are used to construct Wald statistics for testing hypotheses. The methods are illustrated with applications to performance evaluation monthly data and an integrated circuit yield analysis.",
keywords = "Correlated, Crossed, Dirichlet, Generalized multinomial model, Nested",
author = "Wilson, Jeffrey and Koehler, Kenneth J.",
year = "1991",
doi = "10.1080/07350015.1991.10509832",
language = "English (US)",
volume = "9",
pages = "103--110",
journal = "Journal of Business and Economic Statistics",
issn = "0735-0015",
publisher = "American Statistical Association",
number = "1"
}

TY - JOUR

T1 - Hierarchical models for cross-classified overdispersed multinomial data

AU - Wilson, Jeffrey

AU - Koehler, Kenneth J.

PY - 1991

Y1 - 1991

AB - When a vector of sample proportions is not obtained through simple random sampling, the covariance matrix for the sample vector can differ substantially from the one corresponding to the multinomial model (Wilson 1989). For example, clustering effects or subject effects in repeated-measure experiments can cause the variance of the observed proportions to be much larger than variances under the multinomial model. The phenomenon is generally referred to as overdispersion. Tallis (1962) proposed a model for identically distributed multinomials with a common measure of correlation and referred to it as the generalized multinomial model. This generalized multinomial model is extended in this article to account for overdispersion by allowing the vectors of proportions to vary according to a Dirichlet distribution. The generalized Dirichlet-multinomial model (as it is referred to here) allows for a second order of pairwise correlation among units, a type of assumption found reasonable in some biological data (Kupper and Haseman 1978) and introduced here to business data. An alternative derivation allowing for two kinds of variation is also considered. Asymptotic normal properties of parameter estimators are used to construct Wald statistics for testing hypotheses. The methods are illustrated with applications to performance evaluation monthly data and an integrated circuit yield analysis.

KW - Correlated

KW - Crossed

KW - Dirichlet

KW - Generalized multinomial model

KW - Nested

UR - http://www.scopus.com/inward/record.url?scp=0038842365&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0038842365&partnerID=8YFLogxK

U2 - 10.1080/07350015.1991.10509832

DO - 10.1080/07350015.1991.10509832

M3 - Article

VL - 9

SP - 103

EP - 110

JO - Journal of Business and Economic Statistics

JF - Journal of Business and Economic Statistics

SN - 0735-0015

IS - 1

ER -