TY - JOUR
T1 - The components of paraphrase evaluations
AU - McCarthy, Philip M.
AU - Guess, Rebekah H.
AU - McNamara, Danielle S.
N1 - Funding Information:
This research was supported in part by the Institute for Education Sciences (Grants R305GA080589, R305G020018-02, and R305G040046), and in part by the National Science Foundation (Grant IIS-0735682). The views expressed in this article do not necessarily reflect the views of the IES or the NSF.
PY - 2009/8
Y1 - 2009/8
N2 - Two sentences are paraphrases if their meanings are equivalent but their words and syntax are different. Paraphrasing can be used to aid comprehension, stimulate prior knowledge, and assist in writing-skills development. As such, paraphrasing is a feature of fields as diverse as discourse psychology, composition, and computer science. Although automated paraphrase assessment is both commonplace and useful, research has centered solely on artificial, edited paraphrases and has used only binary dimensions (i.e., is or is not a paraphrase). In this study, we use an extensive database (N = 1,998) of natural paraphrases generated by high school students that have been assessed along 10 dimensions (e.g., semantic completeness, lexical similarity, syntactical similarity). This study investigates the components of paraphrase quality emerging from these dimensions and examines whether computational approaches can simulate those human evaluations. The results suggest that semantic and syntactic evaluations are the primary components of paraphrase quality, and that computationally light systems such as latent semantic analysis (semantics) and minimal edit distances (syntax) present promising approaches to simulating human evaluations of paraphrases.
AB - Two sentences are paraphrases if their meanings are equivalent but their words and syntax are different. Paraphrasing can be used to aid comprehension, stimulate prior knowledge, and assist in writing-skills development. As such, paraphrasing is a feature of fields as diverse as discourse psychology, composition, and computer science. Although automated paraphrase assessment is both commonplace and useful, research has centered solely on artificial, edited paraphrases and has used only binary dimensions (i.e., is or is not a paraphrase). In this study, we use an extensive database (N = 1,998) of natural paraphrases generated by high school students that have been assessed along 10 dimensions (e.g., semantic completeness, lexical similarity, syntactical similarity). This study investigates the components of paraphrase quality emerging from these dimensions and examines whether computational approaches can simulate those human evaluations. The results suggest that semantic and syntactic evaluations are the primary components of paraphrase quality, and that computationally light systems such as latent semantic analysis (semantics) and minimal edit distances (syntax) present promising approaches to simulating human evaluations of paraphrases.
UR - http://www.scopus.com/inward/record.url?scp=68949191151&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=68949191151&partnerID=8YFLogxK
U2 - 10.3758/BRM.41.3.682
DO - 10.3758/BRM.41.3.682
M3 - Article
C2 - 19587179
AN - SCOPUS:68949191151
SN - 1554-351X
VL - 41
SP - 682
EP - 690
JO - Behavior Research Methods
JF - Behavior Research Methods
IS - 3
ER -