TY - JOUR
T1 - The critical role of anchor paper selection in writing assessment
AU - Popp, Sharon E Osborn
AU - Ryan, Joseph M.
AU - Thompson, Marilyn
PY - 2009/7
Y1 - 2009/7
N2 - Scoring rubrics are routinely used to evaluate the quality of writing samples produced for writing performance assessments, with anchor papers chosen to represent score points defined in the rubric. Although the careful selection of anchor papers is associated with best practices for scoring, little research has been conducted on the role of anchor paper selection in writing assessment. This study examined the consequences of differential selection of anchor papers to represent a common scoring rubric. A set of writing samples was scored under two conditions: one using anchors selected from within grade and one using anchors selected from across three grade levels. Observed ratings were analyzed using three- and four-facet Rasch (one-parameter logistic) models. Ratings differed in magnitude and rank-order, with the difference presumed to be due to the anchor paper conditions and not a difference in overall severity between the rater groups. Results shed light on potential threats to validity within conventional context-dependent scoring practices and raise issues that have not been investigated with respect to the selection of anchor papers, such as the interpretation of results at different grade levels, implications for the assessment of progress over time, and the reliability of anchor paper selection within a scoring context.
UR - http://www.scopus.com/inward/record.url?scp=74949085384&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=74949085384&partnerID=8YFLogxK
U2 - 10.1080/08957340902984026
DO - 10.1080/08957340902984026
M3 - Article
AN - SCOPUS:74949085384
SN - 0895-7347
VL - 22
SP - 255
EP - 271
JO - Applied Measurement in Education
JF - Applied Measurement in Education
IS - 3
ER -