Computational replication of human paraphrase assessment

Philip M. McCarthy, Zhiqiang Cai, Danielle McNamara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Two sentences are paraphrases if their meanings are equivalent but their words and syntax are different. Paraphrasing can be used to aid comprehension, stimulate prior knowledge, and assist in writing skills development. While automated paraphrase assessment is both commonplace and useful, research has centered solely on artificial, edited paraphrases and has used only binary dimensions (i.e., is or is not a paraphrase). In this study, we use 1,998 natural paraphrases generated by high school students that have been assessed along 10 dimensions of paraphrase (e.g., semantic completeness). This study investigates the components of paraphrase quality emerging from these dimensions, and examines whether computational approaches (e.g., latent semantic analysis [LSA], minimal edit distance [MED]) can simulate those human evaluations. The results suggest that semantic and syntactic evaluations are the primary components of paraphrase quality, and that computationally light systems such as LSA (semantics) and MED (syntax) present promising approaches to simulating human evaluations of paraphrases.
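The contrast the abstract draws between a semantic measure and a syntactic one can be illustrated with a minimal sketch. It treats MED as a minimal edit distance over word tokens; the abstract does not specify the tokenization or edit costs, so those are assumptions here. The semantic side is NOT LSA: a plain bag-of-words cosine is swapped in as a crude stand-in, since real LSA needs a vector space trained on a corpus. All function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tokens(sentence):
    """Lowercase and split on whitespace (a deliberately crude tokenizer; an assumption)."""
    return sentence.lower().split()


def word_edit_distance(a, b):
    """Levenshtein distance over word tokens; insert, delete, substitute each cost 1."""
    s, t = tokens(a), tokens(b)
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, sw in enumerate(s, start=1):
        curr = [i]
        for j, tw in enumerate(t, start=1):
            cost = 0 if sw == tw else 1
            curr.append(min(prev[j] + 1,          # delete sw
                            curr[j - 1] + 1,      # insert tw
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(a, b):
    """Edit distance divided by the longer sentence length, giving a value in [0, 1]."""
    longest = max(len(tokens(a)), len(tokens(b)))
    return word_edit_distance(a, b) / longest if longest else 0.0


def bow_cosine(a, b):
    """Cosine similarity of word-count vectors (a stand-in for LSA, not LSA itself)."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    target = "the student read the book"
    paraphrase = "the book the student read"
    print("word overlap (cosine):", bow_cosine(target, paraphrase))
    print("normalized edit distance:", normalized_edit_distance(target, paraphrase))
```

A reordered sentence keeps the same words, so the overlap score stays high while the edit distance grows. This is one way to see why the two kinds of measure might capture different components of paraphrase quality.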

Original language: English (US)
Title of host publication: Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
Pages: 266-271
Number of pages: 6
State: Published - 2009
Externally published: Yes
Event: 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22 - Sanibel Island, FL, United States
Duration: Mar 19 2009 to Mar 21 2009

Fingerprint

Semantics
Syntactics
Students

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Software

Cite this

McCarthy, P. M., Cai, Z., & McNamara, D. (2009). Computational replication of human paraphrase assessment. In Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22 (pp. 266-271).

@inproceedings{c7273cb4c29f4ee595c72e49685afb5f,
title = "Computational replication of human paraphrase assessment",
author = "McCarthy, {Philip M.} and Zhiqiang Cai and Danielle McNamara",
year = "2009",
language = "English (US)",
isbn = "9781577354192",
pages = "266--271",
booktitle = "Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22",

}

TY - GEN

T1 - Computational replication of human paraphrase assessment

AU - McCarthy, Philip M.

AU - Cai, Zhiqiang

AU - McNamara, Danielle

PY - 2009

Y1 - 2009

UR - http://www.scopus.com/inward/record.url?scp=70350521072&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70350521072&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9781577354192

SP - 266

EP - 271

BT - Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22

ER -