Computational replication of human paraphrase assessment

Philip M. McCarthy; Zhigiang Cai; Danielle S. McNamara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Two sentences are paraphrases if their meanings are equivalent but their words and syntax are different. Paraphrasing can be used to aid comprehension, stimulate prior knowledge, and assist in writing skills development. While automated paraphrase assessment is both commonplace and useful, research has centered solely on artificial, edited paraphrases and has used only binary dimensions (i.e., is or is-not a paraphrase). In this study, we use 1,998 natural paraphrases generated by high school students that have been assessed along 10 dimensions of paraphrase (e.g., semantic completeness). This study investigates the components of paraphrase quality emerging from these dimensions, and examines whether computational approaches (e.g., LSA, MED) can simulate those human evaluations. The results suggest that semantic and syntactic evaluations are the primary components of paraphrase quality, and that computationally light systems such as LSA (semantics) and MED (syntax) present promising approaches to simulating human evaluations of paraphrases.
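
To give a feel for what a "computationally light" surface comparison can look like, here is a minimal Python sketch of a word-level edit distance between a sentence and its paraphrase. It is an illustration only: the abstract does not define MED, so treating it as an edit-distance-style measure is an assumption, and the function names `word_edit_distance` and `normalized_distance` are hypothetical, not taken from the paper.

```python
def word_edit_distance(source: str, target: str) -> int:
    """Levenshtein distance over lowercased, whitespace-separated word tokens.

    Assumption: this is a generic word-level edit distance, not the
    authors' MED system, whose details the abstract does not give.
    """
    a = source.lower().split()
    b = target.lower().split()
    # prev[j] holds the distance between the words of `a` seen so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]  # turning i words into zero words costs i deletions
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(source: str, target: str) -> float:
    """Edit distance divided by the longer sentence's word count (0.0 to 1.0)."""
    longest = max(len(source.split()), len(target.split()))
    return word_edit_distance(source, target) / longest if longest else 0.0
```

A low normalized distance means the paraphrase reuses most of the original wording in the same order, while a high one means heavy rewording or reordering. A measure of this kind says nothing about meaning, which is why the abstract pairs a syntactic measure with a semantic one (LSA).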

Original language	English (US)
Title of host publication	Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
Pages	266-271
Number of pages	6
State	Published - 2009
Externally published	Yes
Event	22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22 - Sanibel Island, FL, United States
Duration	Mar 19 2009 → Mar 21 2009

Publication series

Name	Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22

Other

Other	22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
Country/Territory	United States
City	Sanibel Island, FL
Period	3/19/09 → 3/21/09

ASJC Scopus subject areas

Artificial Intelligence
Computer Networks and Communications
Software

Cite this

Computational replication of human paraphrase assessment. / McCarthy, Philip M.; Cai, Zhigiang; McNamara, Danielle S.
Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22. 2009. p. 266-271 (Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22).

McCarthy, PM, Cai, Z & McNamara, DS 2009, Computational replication of human paraphrase assessment. in Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22. Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22, pp. 266-271, 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22, Sanibel Island, FL, United States, 3/19/09.

@inproceedings{c7273cb4c29f4ee595c72e49685afb5f,

title = "Computational replication of human paraphrase assessment",

abstract = "Two sentences are paraphrases if their meanings are equivalent but their words and syntax are different. Paraphrasing can be used to aid comprehension, stimulate prior knowledge, and assist in writing skills development. While automated paraphrase assessment is both commonplace and useful, research has centered solely on artificial, edited paraphrases and has used only binary dimensions (i.e., is or is-not a paraphrase). In this study, we use 1,998 natural paraphrases generated by high school students that have been assessed along 10 dimensions of paraphrase (e.g., semantic completeness). This study investigates the components of paraphrase quality emerging from these dimensions, and examines whether computational approaches (e.g., LSA, MED) can simulate those human evaluations. The results suggest that semantic and syntactic evaluations are the primary components of paraphrase quality, and that computationally light systems such as LSA (semantics) and MED (syntax) present promising approaches to simulating human evaluations of paraphrases.",

author = "McCarthy, {Philip M.} and Cai, Zhigiang and McNamara, {Danielle S.}",

year = "2009",

language = "English (US)",

isbn = "9781577354192",

series = "Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22",

pages = "266--271",

booktitle = "Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22",

note = "22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22 ; Conference date: 19-03-2009 Through 21-03-2009",

}

TY - GEN

T1 - Computational replication of human paraphrase assessment

AU - McCarthy, Philip M.

AU - Cai, Zhigiang

AU - McNamara, Danielle S.

PY - 2009

Y1 - 2009

AB - Two sentences are paraphrases if their meanings are equivalent but their words and syntax are different. Paraphrasing can be used to aid comprehension, stimulate prior knowledge, and assist in writing skills development. While automated paraphrase assessment is both commonplace and useful, research has centered solely on artificial, edited paraphrases and has used only binary dimensions (i.e., is or is-not a paraphrase). In this study, we use 1,998 natural paraphrases generated by high school students that have been assessed along 10 dimensions of paraphrase (e.g., semantic completeness). This study investigates the components of paraphrase quality emerging from these dimensions, and examines whether computational approaches (e.g., LSA, MED) can simulate those human evaluations. The results suggest that semantic and syntactic evaluations are the primary components of paraphrase quality, and that computationally light systems such as LSA (semantics) and MED (syntax) present promising approaches to simulating human evaluations of paraphrases.

UR - http://www.scopus.com/inward/record.url?scp=70350521072&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70350521072&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:70350521072

SN - 9781577354192

T3 - Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22

SP - 266

EP - 271

BT - Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22

T2 - 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22

Y2 - 19 March 2009 through 21 March 2009

ER -
