Computational considerations in correcting user-language

Adam M. Renner; Philip M. McCarthy; Danielle S. McNamara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This study evaluates the robustness of established computational indices used to assess text relatedness in user-language. The original User-Language Paraphrase Corpus (ULPC) was compared to a corrected version, in which each paraphrase was corrected for typographical and grammatical errors. Error correction significantly affected values for each of five computational indices, indicating greater similarity of the target sentence to the corrected paraphrase than to the original paraphrase. Moreover, misspelled target words accounted for a large proportion of the differences. This study also evaluated potential effects on correlations between computational indices and human ratings of paraphrases. The corrections did not yield assessments that were any more or less comparable to trained human raters than were the original paraphrases containing typographical or grammatical errors. The results suggest that although correcting for errors may optimize certain computational indices, the corrections are not necessary for comparing the indices to expert ratings.

Original language	English (US)
Title of host publication	Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
Pages	278-283
Number of pages	6
State	Published - 2009
Externally published	Yes
Event	22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22 - Sanibel Island, FL, United States
Duration	Mar 19 2009 → Mar 21 2009

Publication series

Name	Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22

Other

Other	22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
Country/Territory	United States
City	Sanibel Island, FL
Period	3/19/09 → 3/21/09

ASJC Scopus subject areas

Artificial Intelligence
Computer Networks and Communications
Software

Cite this

Computational considerations in correcting user-language. / Renner, Adam M.; McCarthy, Philip M.; McNamara, Danielle S.
Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22. 2009. p. 278-283 (Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22).

Renner, AM, McCarthy, PM & McNamara, DS 2009, Computational considerations in correcting user-language. in Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22. Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22, pp. 278-283, 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22, Sanibel Island, FL, United States, 3/19/09.

@inproceedings{dde90613dc8c4157b7ac8fe45e296f30,
title = "Computational considerations in correcting user-language",
abstract = "This study evaluates the robustness of established computational indices used to assess text relatedness in user-language. The original User-Language Paraphrase Corpus (ULPC) was compared to a corrected version, in which each paraphrase was corrected for typographical and grammatical errors. Error correction significantly affected values for each of five computational indices, indicating greater similarity of the target sentence to the corrected paraphrase than to the original paraphrase. Moreover, misspelled target words accounted for a large proportion of the differences. This study also evaluated potential effects on correlations between computational indices and human ratings of paraphrases. The corrections did not yield assessments that were any more or less comparable to trained human raters than were the original paraphrases containing typographical or grammatical errors. The results suggest that although correcting for errors may optimize certain computational indices, the corrections are not necessary for comparing the indices to expert ratings.",
author = "Renner, {Adam M.} and McCarthy, {Philip M.} and McNamara, {Danielle S.}",
year = "2009",
language = "English (US)",
isbn = "9781577354192",
series = "Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22",
pages = "278--283",
booktitle = "Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22",
note = "22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22 ; Conference date: 19-03-2009 Through 21-03-2009",
}

TY - GEN
T1 - Computational considerations in correcting user-language
AU - Renner, Adam M.
AU - McCarthy, Philip M.
AU - McNamara, Danielle S.
PY - 2009
Y1 - 2009
AB - This study evaluates the robustness of established computational indices used to assess text relatedness in user-language. The original User-Language Paraphrase Corpus (ULPC) was compared to a corrected version, in which each paraphrase was corrected for typographical and grammatical errors. Error correction significantly affected values for each of five computational indices, indicating greater similarity of the target sentence to the corrected paraphrase than to the original paraphrase. Moreover, misspelled target words accounted for a large proportion of the differences. This study also evaluated potential effects on correlations between computational indices and human ratings of paraphrases. The corrections did not yield assessments that were any more or less comparable to trained human raters than were the original paraphrases containing typographical or grammatical errors. The results suggest that although correcting for errors may optimize certain computational indices, the corrections are not necessary for comparing the indices to expert ratings.
UR - http://www.scopus.com/inward/record.url?scp=68949182340&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=68949182340&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:68949182340
SN - 9781577354192
T3 - Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
SP - 278
EP - 283
BT - Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
T2 - 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
Y2 - 19 March 2009 through 21 March 2009
ER -
