Modeling Clustered Data with Very Few Clusters

Daniel McNeish; Laura M. Stapleton

doi:10.1080/00273171.2016.1167008

Research output: Contribution to journal › Article › peer-review

264 Scopus citations

Abstract

Small-sample inference with clustered data has received increased attention recently in the methodological literature, with several simulation studies being presented on the small-sample behavior of many methods. However, nearly all previous studies focus on a single class of methods (e.g., only multilevel models, only corrections to sandwich estimators), and the differential performance of various methods that can be implemented to accommodate clustered data with very few clusters is largely unknown, potentially due to the rigid disciplinary preferences. Furthermore, a majority of these studies focus on scenarios with 15 or more clusters and feature unrealistically simple data-generation models with very few predictors. This article, motivated by an applied educational psychology cluster randomized trial, presents a simulation study that simultaneously addresses the extreme small sample and differential performance (estimation bias, Type I error rates, and relative power) of 12 methods to account for clustered data with a model that features a more realistic number of predictors. The motivating data are then modeled with each method, and results are compared. Results show that generalized estimating equations perform poorly; the choice of Bayesian prior distributions affects performance; and fixed effect models perform quite well. Limitations and implications for applications are also discussed.

Original language	English (US)
Pages (from-to)	495-518
Number of pages	24
Journal	Multivariate Behavioral Research
Volume	51
Issue number	4
DOIs	https://doi.org/10.1080/00273171.2016.1167008
State	Published - Jul 3 2016
Externally published	Yes

Keywords

Bayesian
GEE
HLM
cluster randomized trial
fixed effect model
multilevel model
small sample

ASJC Scopus subject areas

Statistics and Probability
Experimental and Cognitive Psychology
Arts and Humanities (miscellaneous)

Cite this

@article{21caca590d4a410f84a52f0a9926ec52,

title = "Modeling Clustered Data with Very Few Clusters",

abstract = "Small-sample inference with clustered data has received increased attention recently in the methodological literature, with several simulation studies being presented on the small-sample behavior of many methods. However, nearly all previous studies focus on a single class of methods (e.g., only multilevel models, only corrections to sandwich estimators), and the differential performance of various methods that can be implemented to accommodate clustered data with very few clusters is largely unknown, potentially due to the rigid disciplinary preferences. Furthermore, a majority of these studies focus on scenarios with 15 or more clusters and feature unrealistically simple data-generation models with very few predictors. This article, motivated by an applied educational psychology cluster randomized trial, presents a simulation study that simultaneously addresses the extreme small sample and differential performance (estimation bias, Type I error rates, and relative power) of 12 methods to account for clustered data with a model that features a more realistic number of predictors. The motivating data are then modeled with each method, and results are compared. Results show that generalized estimating equations perform poorly; the choice of Bayesian prior distributions affects performance; and fixed effect models perform quite well. Limitations and implications for applications are also discussed.",

keywords = "Bayesian, GEE, HLM, cluster randomized trial, fixed effect model, multilevel model, small sample",

author = "McNeish, Daniel and Stapleton, {Laura M.}",

note = "Publisher Copyright: {\textcopyright} 2016 Taylor \& Francis Group, LLC.",

year = "2016",

month = jul,

day = "3",

doi = "10.1080/00273171.2016.1167008",

language = "English (US)",

volume = "51",

pages = "495--518",

journal = "Multivariate Behavioral Research",

issn = "0027-3171",

publisher = "Psychology Press Ltd",

number = "4",

}

TY - JOUR

T1 - Modeling Clustered Data with Very Few Clusters

AU - McNeish, Daniel

AU - Stapleton, Laura M.

PY - 2016/7/3

Y1 - 2016/7/3

N2 - Small-sample inference with clustered data has received increased attention recently in the methodological literature, with several simulation studies being presented on the small-sample behavior of many methods. However, nearly all previous studies focus on a single class of methods (e.g., only multilevel models, only corrections to sandwich estimators), and the differential performance of various methods that can be implemented to accommodate clustered data with very few clusters is largely unknown, potentially due to the rigid disciplinary preferences. Furthermore, a majority of these studies focus on scenarios with 15 or more clusters and feature unrealistically simple data-generation models with very few predictors. This article, motivated by an applied educational psychology cluster randomized trial, presents a simulation study that simultaneously addresses the extreme small sample and differential performance (estimation bias, Type I error rates, and relative power) of 12 methods to account for clustered data with a model that features a more realistic number of predictors. The motivating data are then modeled with each method, and results are compared. Results show that generalized estimating equations perform poorly; the choice of Bayesian prior distributions affects performance; and fixed effect models perform quite well. Limitations and implications for applications are also discussed.

KW - Bayesian

KW - GEE

KW - HLM

KW - cluster randomized trial

KW - fixed effect model

KW - multilevel model

KW - small sample

UR - http://www.scopus.com/inward/record.url?scp=84973596359&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84973596359&partnerID=8YFLogxK

U2 - 10.1080/00273171.2016.1167008

DO - 10.1080/00273171.2016.1167008

M3 - Article

C2 - 27269278

AN - SCOPUS:84973596359

SN - 0027-3171

VL - 51

SP - 495

EP - 518

JO - Multivariate Behavioral Research

JF - Multivariate Behavioral Research

IS - 4

ER -
