Validity of the chi‐square test in dichotomous variable factor analysis when expected frequencies are small

Mark Reiser, Maria VandenBerg

Research output: Contribution to journal › Article

28 Citations (Scopus)

Abstract

This paper presents a comparison of results from two methods for estimating and testing a model for the factor analysis of dichotomous variables. For k manifest dichotomous variables, the data can be cross‐classified to form a vector of 2^k frequencies, and nonlinear methods that use the full information in these 2^k frequencies are available for factor analysis. In addition, another method that uses only the limited information in the first‐ and second‐order marginal frequencies is available for the same model. As k becomes larger, substantial differences between the full‐information and limited‐information methods become apparent in results from the test of fit. For large k, Type I and Type II error rates may be higher in the full‐information approach, because as the vector of 2^k frequencies becomes sparse, the chi‐square approximation for the distribution of the goodness‐of‐fit test statistic becomes poorer. In this paper, Monte Carlo experiments are used under a variety of conditions to compare the methods for the rate of Type I errors when the model matches the simulated data and for the rate of Type II errors when the model does not match the simulated data. © 1994 The British Psychological Society
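The sparseness argument in the abstract can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: it treats the k items as independent with fixed response probabilities (not the factor model studied in the paper), and it uses an expected count below 5 as the cutoff for a "small" cell, which is a common rule of thumb rather than a criterion taken from the paper. The function names `expected_frequencies` and `sparse_fraction`, and the sample size of 500, are hypothetical.

```python
from itertools import product


def expected_frequencies(n, probs):
    """Expected count for each of the 2**k response patterns.

    Assumes independent items, where probs[i] is the probability of a
    positive response on item i. This is an illustrative assumption,
    not the factor model from the paper.
    """
    freqs = []
    for pattern in product((0, 1), repeat=len(probs)):
        cell_prob = 1.0
        for x, p in zip(pattern, probs):
            cell_prob *= p if x == 1 else 1.0 - p
        freqs.append(n * cell_prob)
    return freqs


def sparse_fraction(n, probs, threshold=5.0):
    """Fraction of cells whose expected count falls below the threshold.

    The default threshold of 5 is a rule-of-thumb assumption.
    """
    freqs = expected_frequencies(n, probs)
    return sum(f < threshold for f in freqs) / len(freqs)


# Hypothetical sample size; the number of cells doubles with each added item.
n = 500
for k in (4, 6, 8, 10):
    probs = [0.5] * k
    print(k, 2 ** k, n / 2 ** k, sparse_fraction(n, probs))
```

With the sample size held fixed, the mean expected count per cell is n / 2^k, so it halves every time an item is added. That is the sense in which the full cross‐classification becomes sparse for large k, while the first‐ and second‐order margins do not.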

Original language: English (US)
Pages (from-to): 85-107
Number of pages: 23
Journal: British Journal of Mathematical and Statistical Psychology
Volume: 47
Issue number: 1
DOI: 10.1111/j.2044-8317.1994.tb01026.x
State: Published - 1994

ASJC Scopus subject areas

  • Statistics and Probability
  • Arts and Humanities (miscellaneous)
  • Psychology (all)

Cite this

@article{66068e61d0144b9b8040ded2339d9110,
title = "Validity of the chi‐square test in dichotomous variable factor analysis when expected frequencies are small",
abstract = "This paper presents a comparison of results from two methods for estimating and testing a model for the factor analysis of dichotomous variables. For k manifest dichotomous variables, the data can be cross‐classified to form a vector of $2^k$ frequencies, and nonlinear methods that use the full information in these $2^k$ frequencies are available for factor analysis. In addition, another method that uses only the limited information in the first‐ and second‐order marginal frequencies is available for the same model. As k becomes larger, substantial differences between the full‐information and limited‐information methods become apparent in results from the test of fit. For large k, Type I and Type II error rates may be higher in the full‐information approach, because as the vector of $2^k$ frequencies becomes sparse, the chi‐square approximation for the distribution of the goodness‐of‐fit test statistic becomes poorer. In this paper, Monte Carlo experiments are used under a variety of conditions to compare the methods for the rate of Type I errors when the model matches the simulated data and for the rate of Type II errors when the model does not match the simulated data. © 1994 The British Psychological Society",
author = "Mark Reiser and Maria VandenBerg",
year = "1994",
doi = "10.1111/j.2044-8317.1994.tb01026.x",
language = "English (US)",
volume = "47",
pages = "85--107",
journal = "British Journal of Mathematical and Statistical Psychology",
issn = "0007-1102",
publisher = "Wiley-Blackwell",
number = "1",

}

TY - JOUR

T1 - Validity of the chi‐square test in dichotomous variable factor analysis when expected frequencies are small

AU - Reiser, Mark

AU - VandenBerg, Maria

PY - 1994

Y1 - 1994

N2 - This paper presents a comparison of results from two methods for estimating and testing a model for the factor analysis of dichotomous variables. For k manifest dichotomous variables, the data can be cross‐classified to form a vector of 2^k frequencies, and nonlinear methods that use the full information in these 2^k frequencies are available for factor analysis. In addition, another method that uses only the limited information in the first‐ and second‐order marginal frequencies is available for the same model. As k becomes larger, substantial differences between the full‐information and limited‐information methods become apparent in results from the test of fit. For large k, Type I and Type II error rates may be higher in the full‐information approach, because as the vector of 2^k frequencies becomes sparse, the chi‐square approximation for the distribution of the goodness‐of‐fit test statistic becomes poorer. In this paper, Monte Carlo experiments are used under a variety of conditions to compare the methods for the rate of Type I errors when the model matches the simulated data and for the rate of Type II errors when the model does not match the simulated data. © 1994 The British Psychological Society

AB - This paper presents a comparison of results from two methods for estimating and testing a model for the factor analysis of dichotomous variables. For k manifest dichotomous variables, the data can be cross‐classified to form a vector of 2^k frequencies, and nonlinear methods that use the full information in these 2^k frequencies are available for factor analysis. In addition, another method that uses only the limited information in the first‐ and second‐order marginal frequencies is available for the same model. As k becomes larger, substantial differences between the full‐information and limited‐information methods become apparent in results from the test of fit. For large k, Type I and Type II error rates may be higher in the full‐information approach, because as the vector of 2^k frequencies becomes sparse, the chi‐square approximation for the distribution of the goodness‐of‐fit test statistic becomes poorer. In this paper, Monte Carlo experiments are used under a variety of conditions to compare the methods for the rate of Type I errors when the model matches the simulated data and for the rate of Type II errors when the model does not match the simulated data. © 1994 The British Psychological Society

UR - http://www.scopus.com/inward/record.url?scp=85004830124&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85004830124&partnerID=8YFLogxK

U2 - 10.1111/j.2044-8317.1994.tb01026.x

DO - 10.1111/j.2044-8317.1994.tb01026.x

M3 - Article

AN - SCOPUS:85004830124

VL - 47

SP - 85

EP - 107

JO - British Journal of Mathematical and Statistical Psychology

JF - British Journal of Mathematical and Statistical Psychology

SN - 0007-1102

IS - 1

ER -