A Monte Carlo Comparison Study of the Power of the Analysis of Covariance, Simple Difference, and Residual Change Scores in Testing Two-Wave Data

Yasemin Kisbu-Sakarya, David Mackinnon, Leona S. Aiken

Research output: Contribution to journal › Article

16 Citations (Scopus)

Abstract

This study compares the analysis of covariance (ANCOVA), difference score, and residual change score methods in testing the group effect for pretest-posttest data in terms of statistical power and Type I error rates using a Monte Carlo simulation. Previous research has mathematically shown the effect of stability of individual scores from pretest to posttest, reliability, and nonrandomization (i.e., pretest imbalance) on the performance of the ANCOVA, difference score, and residual change score methods. However, related power issues have not been adequately addressed. The authors examined the impact of stability of measurement over time, reliability of covariate and criterion, nonrandomization, sample size, and treatment effect size on statistical power of the three methods. Across conditions, ANCOVA and residual change score methods had similar power rates. When reliability was less than perfect, ANCOVA had more power than the difference score method when there was an increase from pretest to posttest and a positive baseline imbalance (i.e., treatment group had higher pretest scores than the control group), or when there was a decrease from pretest to posttest and a negative baseline imbalance, and vice versa. In case of perfect reliability, the statistical power of ANCOVA did not differ from the difference score method. For the difference score method, when reliability was low, there was no effect of stability on power, whereas when reliability was high or perfect, power increased as stability increased for medium and large effect sizes. Difference scores may be preferred over ANCOVA under certain circumstances.
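As a rough illustration of the kind of comparison described in the abstract, the following Python sketch estimates rejection rates for the group effect under ANCOVA, difference scores, and residual change scores. It is not the authors' simulation design: the data-generating model, the function and parameter names (`rejection_rates`, `stability`, `reliability`, `imbalance`), the default values, and the use of a normal-approximation critical value are all assumptions made for illustration only.

```python
import numpy as np
from statistics import NormalDist


def ols_t(design, y, col):
    """Return the t statistic of coefficient `col` in an OLS fit of y on design."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta[col] / np.sqrt(cov[col, col])


def simulate_once(rng, n_per_group, effect, stability, reliability, imbalance):
    """Generate one two-wave data set and return the three group-effect t statistics."""
    group = np.repeat([0.0, 1.0], n_per_group)
    n = group.size
    # Assumed model: unit-variance true scores; `stability` is the true
    # pretest-posttest correlation within a group.
    true_pre = imbalance * group + rng.standard_normal(n)
    true_post = (stability * true_pre
                 + np.sqrt(1 - stability**2) * rng.standard_normal(n)
                 + effect * group)
    # Measurement error scaled so that var(true) / var(observed) = reliability.
    err_sd = np.sqrt((1 - reliability) / reliability)
    pre = true_pre + err_sd * rng.standard_normal(n)
    post = true_post + err_sd * rng.standard_normal(n)

    ones = np.ones(n)
    with_group = np.column_stack([ones, group])
    # ANCOVA: posttest regressed on group and pretest.
    t_ancova = ols_t(np.column_stack([ones, group, pre]), post, 1)
    # Difference score: (posttest - pretest) regressed on group.
    t_diff = ols_t(with_group, post - pre, 1)
    # Residual change: residuals of posttest on pretest (pooled, ignoring
    # group), then regressed on group.
    pre_only = np.column_stack([ones, pre])
    beta, *_ = np.linalg.lstsq(pre_only, post, rcond=None)
    t_resid = ols_t(with_group, post - pre_only @ beta, 1)
    return t_ancova, t_diff, t_resid


def rejection_rates(n_per_group=50, effect=0.5, stability=0.5, reliability=0.8,
                    imbalance=0.0, reps=500, alpha=0.05, seed=0):
    """Estimate the proportion of replications in which each method rejects H0."""
    rng = np.random.default_rng(seed)
    # Large-sample (normal) critical value, used here as a simplification.
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    t = np.array([
        simulate_once(rng, n_per_group, effect, stability, reliability, imbalance)
        for _ in range(reps)
    ])
    rates = (np.abs(t) > crit).mean(axis=0)
    return dict(zip(("ancova", "difference", "residual_change"), rates.tolist()))
```

Under these assumptions, calling `rejection_rates(effect=0.0)` approximates each method's Type I error rate, while a nonzero `effect` approximates power; varying `stability`, `reliability`, and `imbalance` gives a feel for the conditions the study manipulates.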

Original language: English (US)
Pages (from-to): 47-62
Number of pages: 16
Journal: Educational and Psychological Measurement
Volume: 73
Issue number: 1
DOIs: 10.1177/0013164412450574
State: Published - Feb 2013

Keywords

  • ANCOVA
  • difference score
  • residual change score
  • statistical power

ASJC Scopus subject areas

  • Algebra and Number Theory
  • Psychology(all)
  • Developmental and Educational Psychology
  • Psychology (miscellaneous)
  • Education

Cite this

@article{207d8d3a970a4cd996be66cfe7dad39b,
title = "A Monte Carlo Comparison Study of the Power of the Analysis of Covariance, Simple Difference, and Residual Change Scores in Testing Two-Wave Data",
abstract = "This study compares the analysis of covariance (ANCOVA), difference score, and residual change score methods in testing the group effect for pretest-posttest data in terms of statistical power and Type I error rates using a Monte Carlo simulation. Previous research has mathematically shown the effect of stability of individual scores from pretest to posttest, reliability, and nonrandomization (i.e., pretest imbalance) on the performance of the ANCOVA, difference score, and residual change score methods. However, related power issues have not been adequately addressed. The authors examined the impact of stability of measurement over time, reliability of covariate and criterion, nonrandomization, sample size, and treatment effect size on statistical power of the three methods. Across conditions, ANCOVA and residual change score methods had similar power rates. When reliability was less than perfect, ANCOVA had more power than the difference score method when there was an increase from pretest to posttest and a positive baseline imbalance (i.e., treatment group had higher pretest scores than the control group), or when there was a decrease from pretest to posttest and a negative baseline imbalance, and vice versa. In case of perfect reliability, the statistical power of ANCOVA did not differ from the difference score method. For the difference score method, when reliability was low, there was no effect of stability on power, whereas when reliability was high or perfect, power increased as stability increased for medium and large effect sizes. Difference scores may be preferred over ANCOVA under certain circumstances.",
keywords = "ANCOVA, difference score, residual change score, statistical power",
author = "Kisbu-Sakarya, Yasemin and Mackinnon, David and Aiken, {Leona S.}",
year = "2013",
month = "2",
doi = "10.1177/0013164412450574",
language = "English (US)",
volume = "73",
pages = "47--62",
journal = "Educational and Psychological Measurement",
issn = "0013-1644",
publisher = "SAGE Publications Inc.",
number = "1",
}

TY  - JOUR
T1  - A Monte Carlo Comparison Study of the Power of the Analysis of Covariance, Simple Difference, and Residual Change Scores in Testing Two-Wave Data
AU  - Kisbu-Sakarya, Yasemin
AU  - Mackinnon, David
AU  - Aiken, Leona S.
PY  - 2013/2
Y1  - 2013/2
AB  - This study compares the analysis of covariance (ANCOVA), difference score, and residual change score methods in testing the group effect for pretest-posttest data in terms of statistical power and Type I error rates using a Monte Carlo simulation. Previous research has mathematically shown the effect of stability of individual scores from pretest to posttest, reliability, and nonrandomization (i.e., pretest imbalance) on the performance of the ANCOVA, difference score, and residual change score methods. However, related power issues have not been adequately addressed. The authors examined the impact of stability of measurement over time, reliability of covariate and criterion, nonrandomization, sample size, and treatment effect size on statistical power of the three methods. Across conditions, ANCOVA and residual change score methods had similar power rates. When reliability was less than perfect, ANCOVA had more power than the difference score method when there was an increase from pretest to posttest and a positive baseline imbalance (i.e., treatment group had higher pretest scores than the control group), or when there was a decrease from pretest to posttest and a negative baseline imbalance, and vice versa. In case of perfect reliability, the statistical power of ANCOVA did not differ from the difference score method. For the difference score method, when reliability was low, there was no effect of stability on power, whereas when reliability was high or perfect, power increased as stability increased for medium and large effect sizes. Difference scores may be preferred over ANCOVA under certain circumstances.
KW  - ANCOVA
KW  - difference score
KW  - residual change score
KW  - statistical power
UR  - http://www.scopus.com/inward/record.url?scp=84871297652&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84871297652&partnerID=8YFLogxK
U2  - 10.1177/0013164412450574
DO  - 10.1177/0013164412450574
M3  - Article
AN  - SCOPUS:84871297652
VL  - 73
SP  - 47
EP  - 62
JO  - Educational and Psychological Measurement
JF  - Educational and Psychological Measurement
SN  - 0013-1644
IS  - 1
ER  - 