Bootstrap ICC estimators in analysis of small clustered binary data

Bei Wang, Yi Zheng, Kyle M. Irimata, Jeffrey Wilson

Research output: Contribution to journal › Article

Abstract

Survey data are often obtained through a multilevel structure and, as such, require hierarchical modeling. While large sample approximation provides a mechanism to construct confidence intervals for the intraclass correlation coefficients (ICCs) in large datasets, challenges arise when we are faced with small-size clusters and binary outcomes. In this paper, we examine two bootstrapping methods: cluster bootstrapping and split bootstrapping. We use these methods to construct the confidence intervals for the ICCs (based on a latent variable approach) for small binary data obtained through a three-level or higher hierarchical data structure. We use 26 scenarios in our simulation study with the two bootstrapping methods. We find that the latent variable method performs well in terms of coverage. The split bootstrapping method provides confidence intervals close to the nominal coverage when the ratio of the ICC for the primary cluster to the ICC for the secondary cluster is small. Cluster bootstrapping, by contrast, is preferred when the cluster size is larger and the ratio of the ICCs is larger. A numerical example based on teacher effectiveness is assessed.
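The abstract does not spell out the estimators, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. It assumes a three-level random-intercept logistic model, for which the latent variable approach conventionally fixes the level-1 residual variance at π²/3, and it uses one common convention for the two ICCs (each variance component over the total); the paper may define them differently. The function names `latent_iccs` and `cluster_bootstrap_ci`, the row layout, and the percentile interval are illustrative choices, not the authors' implementation, and split bootstrapping is not sketched because the abstract does not define it.

```python
import math
import random
from collections import defaultdict

# Level-1 residual variance on the latent logistic scale (assumed logit link).
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def latent_iccs(var_primary, var_secondary):
    """Return (primary ICC, secondary ICC) on the latent scale.

    Assumed convention: each random-intercept variance divided by the
    total latent variance. Other conventions exist.
    """
    total = var_primary + var_secondary + LOGISTIC_RESIDUAL_VAR
    return var_primary / total, var_secondary / total


def cluster_bootstrap_ci(rows, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval from resampling whole primary clusters.

    rows: iterable of (primary_id, secondary_id, y) tuples.
    statistic: callable taking a list of such rows and returning a float.
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for row in rows:
        by_cluster[row[0]].append(row)
    ids = list(by_cluster)

    estimates = []
    for _ in range(n_boot):
        sample = []
        # Draw as many primary clusters as observed, with replacement.
        for new_id, old_id in enumerate(rng.choices(ids, k=len(ids))):
            # Relabel so a cluster drawn twice counts as two clusters.
            sample.extend((new_id, sec, y) for _, sec, y in by_cluster[old_id])
        estimates.append(statistic(sample))

    estimates.sort()
    lower = estimates[math.floor((alpha / 2) * (n_boot - 1))]
    upper = estimates[math.ceil((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper
```

In practice, `statistic` would refit a generalized linear mixed model to each resample and pass the estimated variance components to `latent_iccs`; that fitting step is left out here because it depends on the modeling software used.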

Original language: English (US)
Journal: Computational Statistics
DOI: 10.1007/s00180-019-00885-z
State: Published - Jan 1 2019

Fingerprint

Intraclass Correlation Coefficient
Clustered Data
Binary Data
Bootstrapping
Bootstrap
Data structures
Estimator
Confidence interval
Latent Variables
Coverage
Hierarchical Data
Binary Outcomes
Hierarchical Modeling
Survey Data
Hierarchical Structure
Large Data Sets
Categorical or nominal
Correlation coefficient
Simulation Study

Keywords

  • Generalized linear mixed model
  • Resampling scheme
  • Small sample inference

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Computational Mathematics

Cite this

Bootstrap ICC estimators in analysis of small clustered binary data. / Wang, Bei; Zheng, Yi; Irimata, Kyle M.; Wilson, Jeffrey.

In: Computational Statistics, 01.01.2019.


@article{deb6c2217554421dae9e076a342fc4eb,
title = "Bootstrap ICC estimators in analysis of small clustered binary data",
abstract = "Survey data are often obtained through a multilevel structure and, as such, require hierarchical modeling. While large sample approximation provides a mechanism to construct confidence intervals for the intraclass correlation coefficients (ICCs) in large datasets, challenges arise when we are faced with small-size clusters and binary outcomes. In this paper, we examine two bootstrapping methods: cluster bootstrapping and split bootstrapping. We use these methods to construct the confidence intervals for the ICCs (based on a latent variable approach) for small binary data obtained through a three-level or higher hierarchical data structure. We use 26 scenarios in our simulation study with the two bootstrapping methods. We find that the latent variable method performs well in terms of coverage. The split bootstrapping method provides confidence intervals close to the nominal coverage when the ratio of the ICC for the primary cluster to the ICC for the secondary cluster is small. Cluster bootstrapping, by contrast, is preferred when the cluster size is larger and the ratio of the ICCs is larger. A numerical example based on teacher effectiveness is assessed.",
keywords = "Generalized linear mixed model, Resampling scheme, Small sample inference",
author = "Wang, Bei and Zheng, Yi and Irimata, {Kyle M.} and Wilson, Jeffrey",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/s00180-019-00885-z",
language = "English (US)",
journal = "Computational Statistics",
issn = "0943-4062",
publisher = "Springer Verlag"
}

TY - JOUR

T1 - Bootstrap ICC estimators in analysis of small clustered binary data

AU - Wang, Bei

AU - Zheng, Yi

AU - Irimata, Kyle M.

AU - Wilson, Jeffrey

PY - 2019/1/1

Y1 - 2019/1/1

AB - Survey data are often obtained through a multilevel structure and, as such, require hierarchical modeling. While large sample approximation provides a mechanism to construct confidence intervals for the intraclass correlation coefficients (ICCs) in large datasets, challenges arise when we are faced with small-size clusters and binary outcomes. In this paper, we examine two bootstrapping methods: cluster bootstrapping and split bootstrapping. We use these methods to construct the confidence intervals for the ICCs (based on a latent variable approach) for small binary data obtained through a three-level or higher hierarchical data structure. We use 26 scenarios in our simulation study with the two bootstrapping methods. We find that the latent variable method performs well in terms of coverage. The split bootstrapping method provides confidence intervals close to the nominal coverage when the ratio of the ICC for the primary cluster to the ICC for the secondary cluster is small. Cluster bootstrapping, by contrast, is preferred when the cluster size is larger and the ratio of the ICCs is larger. A numerical example based on teacher effectiveness is assessed.

KW - Generalized linear mixed model

KW - Resampling scheme

KW - Small sample inference

UR - http://www.scopus.com/inward/record.url?scp=85064147765&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064147765&partnerID=8YFLogxK

U2 - 10.1007/s00180-019-00885-z

DO - 10.1007/s00180-019-00885-z

M3 - Article

JO - Computational Statistics

JF - Computational Statistics

SN - 0943-4062

ER -