Model Selection in Finite Mixture Models: A k-Fold Cross-Validation Approach

Kevin Grimm, Gina L. Mazza, Pega Davoudzadeh

Research output: Contribution to journal (Article)

21 Citations (Scopus)

Abstract

Finite mixture models, whether latent class models, growth mixture models, latent profile models, or factor mixture models, have become an important statistical tool in social science research. One of the biggest and most debated challenges in mixture modeling is the evaluation of model fit and model comparison. In the application of mixture models, researchers often fit a collection of models and then decide on a single optimal model based on a variety of model fit information. We propose a k-fold cross-validation procedure for model selection whereby the model is repeatedly fit to k − 1 different partitions of the data set, the resulting model is then applied to the kth partition of the sample, and the distribution of fit indexes is examined. This method is illustrated with growth mixture models fit to longitudinal data on reading ability collected as part of the Early Childhood Longitudinal Study–Kindergarten Cohort.
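The procedure described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: a univariate Gaussian mixture fit by EM stands in for the growth mixture models used in the article, the held-out log-likelihood per observation stands in for the fit indexes, the data are simulated, and every function name (kfold_indices, fit_mixture, cross_validate) is hypothetical.

```python
import math
import random
import statistics


def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_likelihood(data, params):
    """Total log-likelihood of data under a mixture [(weight, mean, sd), ...]."""
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, s) for w, m, s in params)
        total += math.log(max(density, 1e-300))
    return total


def fit_mixture(data, n_classes, n_iter=100):
    """Fit a univariate Gaussian mixture by EM (assumed stand-in model)."""
    ordered = sorted(data)
    n = len(ordered)
    start_sd = max(statistics.pstdev(data), 1e-3)
    # Start the class means at evenly spaced quantiles of the training data.
    params = [(1.0 / n_classes, ordered[int((c + 0.5) * n / n_classes)], start_sd)
              for c in range(n_classes)]
    for _ in range(n_iter):
        # E-step: posterior class probabilities for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s) for w, m, s in params]
            tot = max(sum(dens), 1e-300)
            resp.append([d / tot for d in dens])
        # M-step: update class weights, means, and standard deviations.
        new_params = []
        for c in range(n_classes):
            nc = max(sum(r[c] for r in resp), 1e-12)
            mean = sum(r[c] * x for r, x in zip(resp, data)) / nc
            var = sum(r[c] * (x - mean) ** 2 for r, x in zip(resp, data)) / nc
            new_params.append((nc / n, mean, max(math.sqrt(var), 1e-3)))
        params = new_params
    return params


def cross_validate(data, n_classes, k=5, seed=0):
    """Fit on k - 1 folds, score the held-out fold, and repeat for every fold."""
    scores = []
    for held_out in kfold_indices(len(data), k, seed):
        held = set(held_out)
        train = [x for i, x in enumerate(data) if i not in held]
        test = [data[i] for i in held_out]
        params = fit_mixture(train, n_classes)
        scores.append(log_likelihood(test, params) / len(test))
    return scores


# Simulated two-class data (an assumption made purely for illustration).
rng = random.Random(1)
data = ([rng.gauss(0.0, 1.0) for _ in range(150)]
        + [rng.gauss(6.0, 1.0) for _ in range(150)])

# Compare candidate models by the distribution of held-out fit across folds.
results = {c: cross_validate(data, c) for c in (1, 2, 3)}
for c, scores in results.items():
    print(c, round(statistics.mean(scores), 3),
          round(min(scores), 3), round(max(scores), 3))
```

Each candidate number of classes yields k held-out scores, so models can be compared on the whole distribution of fit across folds rather than on a single value from the full sample. In practice the mixture would be fit with dedicated software and several random starts.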

Original language: English (US)
Pages (from-to): 1-11
Number of pages: 11
Journal: Structural Equation Modeling
DOI: 10.1080/10705511.2016.1250638
State: Accepted/In press - Dec 4 2016

Fingerprint

Finite Mixture Models
Cross-validation
Model Selection
Fold
Mixture Model
Growth Model
Partition
Model
Latent Class Model
Mixture Modeling
Model Comparison
Factor Models
Social Sciences
Longitudinal Data
Model-based
Evaluation

Keywords

  • change
  • finite mixture
  • growth
  • growth mixture

ASJC Scopus subject areas

  • Decision Sciences(all)
  • Modeling and Simulation
  • Sociology and Political Science
  • Economics, Econometrics and Finance(all)

Cite this

Model Selection in Finite Mixture Models: A k-Fold Cross-Validation Approach. / Grimm, Kevin; Mazza, Gina L.; Davoudzadeh, Pega.

In: Structural Equation Modeling, 04.12.2016, p. 1-11.

@article{8601e778e2dd47639b1b5eafa66964eb,
title = "Model Selection in Finite Mixture Models: A k-Fold Cross-Validation Approach",
abstract = "Finite mixture models, whether latent class models, growth mixture models, latent profile models, or factor mixture models, have become an important statistical tool in social science research. One of the biggest and most debated challenges in mixture modeling is the evaluation of model fit and model comparison. In the application of mixture models, researchers often fit a collection of models and then decide on a single optimal model based on a variety of model fit information. We propose a k-fold cross-validation procedure for model selection whereby the model is repeatedly fit to $k - 1$ different partitions of the data set, the resulting model is then applied to the kth partition of the sample, and the distribution of fit indexes is examined. This method is illustrated with growth mixture models fit to longitudinal data on reading ability collected as part of the Early Childhood Longitudinal Study–Kindergarten Cohort.",
keywords = "change, finite mixture, growth, growth mixture",
author = "Grimm, Kevin and Mazza, {Gina L.} and Davoudzadeh, Pega",
year = "2016",
month = "12",
day = "4",
doi = "10.1080/10705511.2016.1250638",
language = "English (US)",
pages = "1--11",
journal = "Structural Equation Modeling",
issn = "1070-5511",
publisher = "Psychology Press Ltd",

}

TY - JOUR

T1 - Model Selection in Finite Mixture Models

T2 - A k-Fold Cross-Validation Approach

AU - Grimm, Kevin

AU - Mazza, Gina L.

AU - Davoudzadeh, Pega

PY - 2016/12/4

Y1 - 2016/12/4

N2 - Finite mixture models, whether latent class models, growth mixture models, latent profile models, or factor mixture models, have become an important statistical tool in social science research. One of the biggest and most debated challenges in mixture modeling is the evaluation of model fit and model comparison. In the application of mixture models, researchers often fit a collection of models and then decide on a single optimal model based on a variety of model fit information. We propose a k-fold cross-validation procedure for model selection whereby the model is repeatedly fit to k − 1 different partitions of the data set, the resulting model is then applied to the kth partition of the sample, and the distribution of fit indexes is examined. This method is illustrated with growth mixture models fit to longitudinal data on reading ability collected as part of the Early Childhood Longitudinal Study–Kindergarten Cohort.

AB - Finite mixture models, whether latent class models, growth mixture models, latent profile models, or factor mixture models, have become an important statistical tool in social science research. One of the biggest and most debated challenges in mixture modeling is the evaluation of model fit and model comparison. In the application of mixture models, researchers often fit a collection of models and then decide on a single optimal model based on a variety of model fit information. We propose a k-fold cross-validation procedure for model selection whereby the model is repeatedly fit to k − 1 different partitions of the data set, the resulting model is then applied to the kth partition of the sample, and the distribution of fit indexes is examined. This method is illustrated with growth mixture models fit to longitudinal data on reading ability collected as part of the Early Childhood Longitudinal Study–Kindergarten Cohort.

KW - change

KW - finite mixture

KW - growth

KW - growth mixture

UR - http://www.scopus.com/inward/record.url?scp=85001976691&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85001976691&partnerID=8YFLogxK

U2 - 10.1080/10705511.2016.1250638

DO - 10.1080/10705511.2016.1250638

M3 - Article

AN - SCOPUS:85001976691

SP - 1

EP - 11

JO - Structural Equation Modeling

JF - Structural Equation Modeling

SN - 1070-5511

ER -