When a vector of sample proportions is not obtained through a simple random sampling, the covariance matrix for the sample vector can differ substantially from the one corresponding to the multinomial model (Wilson 1989). For example, clustering effects of subject effects in repeated-measure experiments can cause the variance of the observed proportions to be much larger than variances under the multinomial model. The phenomenon is generally referred to as overdispersion. Tallis (1962) proposed a model for identically distributed multinomials with a common measure of correlation and referred to it as the generalized multinomial model. This generalized multinomial model is extended in this article to account for overdispersion by allowing the vectors of proportions to vary according to a Dirichlet distribution. The generalized Dirichlet- multinomial model (as it is referred to here) allows for a second order of pairwise correlation among units, a type of assumption found reasonable in some biological data (Kupper and Haseman 1978) and introduced here to business data. An alternative derivation allowing for two kinds of variation is also considered. Asymptotic normal properties of parameter estimators are used to construct Wald statistics for testing hypotheses. The methods are illustrated with applications to performance evaluation monthly data and an integrated circuit yield analysis.
- Generalized multinomial model
ASJC Scopus subject areas
- Statistics and Probability
- Social Sciences (miscellaneous)
- Economics and Econometrics
- Statistics, Probability and Uncertainty