Clustered data with small sample sizes

Comparing the performance of model-based and design-based approaches

Daniel McNeish, Jeffery R. Harring

Research output: Contribution to journal (Article)

15 Citations (Scopus)

Abstract

Two classes of methods properly account for clustering of data: design-based methods and model-based methods. Estimates from both methods have been shown to be approximately equal with large samples. However, both classes are known to produce biased standard error estimates with small samples. This paper compares the bias of standard errors and statistical power of marginal effects for generalized estimating equations (a design-based method) and generalized/linear mixed effects models (model-based methods) with small sample sizes via a simulation study. Provided that the distributional assumptions are met, model-based methods produced the least-biased standard error estimates and greater relative statistical power.
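The quantity the abstract compares, the bias of standard error estimates under clustering, can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design, and it does not fit GEE or mixed models. As a stand-in it assumes a balanced random-intercept model with total variance 1 and contrasts a naive standard error that ignores clustering with a simple cluster-mean standard error. All function names, parameter names and default values (n_clusters, cluster_size, icc, reps, seed) are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_once(rng, n_clusters, cluster_size, icc):
    """Draw one clustered sample; return (estimate, naive_se, cluster_se)."""
    # Assumed data model: y_ij = u_j + e_ij with Var(u) = icc, Var(e) = 1 - icc.
    sd_between = icc ** 0.5
    sd_within = (1.0 - icc) ** 0.5
    cluster_means = []
    observations = []
    for _ in range(n_clusters):
        u = rng.gauss(0.0, sd_between)
        ys = [u + rng.gauss(0.0, sd_within) for _ in range(cluster_size)]
        cluster_means.append(statistics.fmean(ys))
        observations.extend(ys)
    estimate = statistics.fmean(observations)
    # Naive SE: treats all observations as independent.
    naive_se = statistics.stdev(observations) / len(observations) ** 0.5
    # Cluster-mean SE: treats the clusters as the independent units
    # (valid here because every cluster has the same size).
    cluster_se = statistics.stdev(cluster_means) / n_clusters ** 0.5
    return estimate, naive_se, cluster_se


def relative_se_bias(n_clusters=10, cluster_size=20, icc=0.2, reps=2000, seed=1):
    """Relative bias of each SE: mean estimated SE / empirical SD - 1."""
    rng = random.Random(seed)
    results = [simulate_once(rng, n_clusters, cluster_size, icc)
               for _ in range(reps)]
    estimates, naive, clustered = zip(*results)
    # The SD of the estimate across replications serves as the "true" SE.
    empirical_sd = statistics.stdev(estimates)
    return {
        "naive": statistics.fmean(naive) / empirical_sd - 1.0,
        "cluster": statistics.fmean(clustered) / empirical_sd - 1.0,
    }


if __name__ == "__main__":
    for name, bias in relative_se_bias().items():
        print(f"{name:8s} relative SE bias: {bias:+.3f}")
```

Under these assumptions the naive standard error should come out strongly negative (too small), while the cluster-mean version should sit much closer to zero. Lowering n_clusters shows how a small number of clusters affects the second figure, which is the regime the paper studies.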

Original language: English (US)
Pages (from-to): 855-869
Number of pages: 15
Journal: Communications in Statistics: Simulation and Computation
Volume: 46
Issue number: 2
DOI: 10.1080/03610918.2014.983648
State: Published - Feb 7 2017
Externally published: Yes

Fingerprint

Clustered Data
Small Sample Size
Model-based
Standard error
Statistical Power
Biased
Error Estimates
Linear Mixed Effects Model
Approximately equal
Generalized Estimating Equations
Design
Small Sample
Simulation Study
Clustering
Estimate

Keywords

  • GEE
  • Kenward-Roger
  • Mixed model
  • Multilevel model
  • Small sample size

ASJC Scopus subject areas

  • Statistics and Probability
  • Modeling and Simulation

Cite this

Clustered data with small sample sizes: Comparing the performance of model-based and design-based approaches. / McNeish, Daniel; Harring, Jeffery R.

In: Communications in Statistics: Simulation and Computation, Vol. 46, No. 2, 07.02.2017, p. 855-869.

@article{e41699f41abb45ada8151c6471c84e3b,
title = "Clustered data with small sample sizes: Comparing the performance of model-based and design-based approaches",
abstract = "Two classes of methods properly account for clustering of data: design-based methods and model-based methods. Estimates from both methods have been shown to be approximately equal with large samples. However, both classes are known to produce biased standard error estimates with small samples. This paper compares the bias of standard errors and statistical power of marginal effects for generalized estimating equations (a design-based method) and generalized/linear mixed effects models (model-based methods) with small sample sizes via a simulation study. Provided that the distributional assumptions are met, model-based methods produced the least-biased standard error estimates and greater relative statistical power.",
keywords = "GEE, Kenward-Roger, Mixed model, Multilevel model, Small sample size",
author = "McNeish, Daniel and Harring, {Jeffery R.}",
year = "2017",
month = "2",
day = "7",
doi = "10.1080/03610918.2014.983648",
language = "English (US)",
volume = "46",
pages = "855--869",
journal = "Communications in Statistics: Simulation and Computation",
issn = "0361-0918",
publisher = "Taylor and Francis Ltd.",
number = "2",

}

TY - JOUR

T1 - Clustered data with small sample sizes

T2 - Comparing the performance of model-based and design-based approaches

AU - McNeish, Daniel

AU - Harring, Jeffery R.

PY - 2017/2/7

Y1 - 2017/2/7

AB - Two classes of methods properly account for clustering of data: design-based methods and model-based methods. Estimates from both methods have been shown to be approximately equal with large samples. However, both classes are known to produce biased standard error estimates with small samples. This paper compares the bias of standard errors and statistical power of marginal effects for generalized estimating equations (a design-based method) and generalized/linear mixed effects models (model-based methods) with small sample sizes via a simulation study. Provided that the distributional assumptions are met, model-based methods produced the least-biased standard error estimates and greater relative statistical power.

KW - GEE

KW - Kenward-Roger

KW - Mixed model

KW - Multilevel model

KW - Small sample size

UR - http://www.scopus.com/inward/record.url?scp=84992437384&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84992437384&partnerID=8YFLogxK

U2 - 10.1080/03610918.2014.983648

DO - 10.1080/03610918.2014.983648

M3 - Article

VL - 46

SP - 855

EP - 869

JO - Communications in Statistics: Simulation and Computation

JF - Communications in Statistics: Simulation and Computation

SN - 0361-0918

IS - 2

ER -