Exogenous variables and value-added assessments: A fatal flaw

David Berliner

Research output: Contribution to journal › Article › peer-review

Abstract

Background: There has been rapid growth in value-added assessment of teachers to meet the widely supported policy goal of identifying the most effective and the most ineffective teachers in a school system. The former group is to be rewarded, while the latter group is to be helped or fired for poor performance. But value-added approaches to teacher evaluation have many problems. Chief among them is the commonly found class-to-class and year-to-year unreliability in the scores obtained. Teacher value-added scores appear to be highly unstable across two classes of the same subject taught in the same semester, or from class to class across two adjacent years.

Focus of Study: This literature review first focuses on the confusion in the minds of the public and politicians between teachers' effects on individual students, which may be great and usually positive, and teachers' effects on classroom mean achievement scores, which may be limited by the huge number of exogenous variables affecting classroom achievement scores. Exogenous variables are unaccounted-for influences on the data, such as peer classroom effects, school compositional effects, and characteristics of the neighborhoods in which some students live. Further, even if some of these variables are measured, the interactions among them often go unexamined. Yet two-way and three-way interactions are quite likely to be occurring and influencing classroom achievement. This analysis promotes the idea that the ubiquitous and powerful effects of these myriad exogenous variables on value-added scores are the reason that almost all current research finds instability in teachers' classroom behavior and instability in teachers' value-added scores. This may pose a fatal flaw in implementing value-added assessments of teaching competency.

Research Design: This is an analytic essay, including a selective literature review that includes some secondary analyses.

Conclusions: I conclude that because of the effects of countless exogenous variables on student classroom achievement, value-added assessments do not now and may never be stable enough from class to class or year to year to be used in evaluating teachers. The hope is that with three or more years of value-added data, the identification of extremely good and bad teachers might be possible; but that goal is not assured, and empirical results suggest that it is quite hard to reliably identify extremely good and extremely bad groups of teachers. In fact, when picking extremes among teachers, both luck and regression to the mean will combine with the interactions of many variables to produce instability in the value-added scores that are obtained. Examination of the apparently simple policy goal of identifying the best and worst teachers in a school system reveals a morally problematic and psychometrically inadequate base for those policies. In fact, the belief that there are thousands of consistently inadequate teachers may be like the search for welfare queens and disability scam artists: more sensationalism than reality.
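
The point about luck and regression to the mean when picking extremes can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the article's analysis: each observed classroom score is modeled as a stable teacher effect plus independent year-specific noise standing in for exogenous influences, and the function names and parameter values (`effect_sd`, `noise_sd`, the 10% cutoff) are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_scores(n_teachers=2000, effect_sd=1.0, noise_sd=2.0, seed=0):
    """Simulate two years of classroom-mean scores for the same teachers.

    Observed score = stable teacher effect + year-specific noise.
    The noise stands in for exogenous influences (peers, school
    composition, neighborhood); its size here is an assumption,
    not an estimate taken from the article.
    """
    rng = random.Random(seed)
    effects = [rng.gauss(0.0, effect_sd) for _ in range(n_teachers)]
    year1 = [e + rng.gauss(0.0, noise_sd) for e in effects]
    year2 = [e + rng.gauss(0.0, noise_sd) for e in effects]
    return year1, year2


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def top_group(scores, fraction=0.10):
    """Indices of the teachers in the top `fraction` of scores."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def summarize(year1, year2, fraction=0.10):
    """Follow the year-1 top group into year 2."""
    top1 = top_group(year1, fraction)
    top2 = top_group(year2, fraction)
    return {
        "year_to_year_r": pearson(year1, year2),
        "top_mean_year1": statistics.fmean(year1[i] for i in top1),
        "top_mean_year2": statistics.fmean(year2[i] for i in top1),
        "share_still_top": len(top1 & top2) / len(top1),
    }


if __name__ == "__main__":
    y1, y2 = simulate_scores()
    for name, value in summarize(y1, y2).items():
        print(f"{name}: {value:.2f}")
```

Under these assumed settings the noise variance is four times the teacher-effect variance, so the expected year-to-year correlation is 1 / (1 + 4) = 0.2. The teachers who rank highest in year 1 score closer to the average in year 2, and many of them drop out of the top group, even though no teacher's underlying effect changed.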

Original language	English (US)
Article number	17293
Journal	Teachers College Record
Volume	116
Issue number	1
State	Published - 2014

ASJC Scopus subject areas

Education

Cite this

@article{d3f1975aa66948b6b09baaab68977382,

title = "Exogenous variables and value-added assessments: A fatal flaw",

author = "David Berliner",

year = "2014",

language = "English (US)",

volume = "116",

journal = "Teachers College Record",

issn = "0161-4681",

publisher = "Teachers College Record",

number = "1",

}

TY - JOUR

T1 - Exogenous variables and value-added assessments

T2 - A fatal flaw

AU - Berliner, David

PY - 2014

Y1 - 2014

UR - http://www.scopus.com/inward/record.url?scp=84893178561&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893178561&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84893178561

SN - 0161-4681

VL - 116

JO - Teachers College Record

JF - Teachers College Record

IS - 1

M1 - 17293

ER -