Computational considerations in correcting user-language

Adam M. Renner, Philip M. McCarthy, Danielle S. McNamara

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Scopus citations

Abstract

This study evaluates the robustness of established computational indices used to assess text relatedness in user-language. The original User-Language Paraphrase Corpus (ULPC) was compared to a corrected version, in which each paraphrase was corrected for typographical and grammatical errors. Error correction significantly affected values for each of five computational indices, indicating greater similarity of the target sentence to the corrected paraphrase than to the original paraphrase. Moreover, misspelled target words accounted for a large proportion of the differences. This study also evaluated potential effects on correlations between computational indices and human ratings of paraphrases. The corrections did not yield assessments that were any more or less comparable to trained human raters than were the original paraphrases containing typographical or grammatical errors. The results suggest that although correcting for errors may optimize certain computational indices, the corrections are not necessary for comparing the indices to expert ratings.
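The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the five indices, so this example substitutes a simple Jaccard word-overlap score as a stand-in; it is not one of the study's indices. The function name `word_overlap` and the three example sentences are hypothetical and are not drawn from the ULPC.

```python
import re


def word_overlap(target: str, paraphrase: str) -> float:
    """Jaccard overlap between the word sets of two sentences.

    A stand-in relatedness index (an assumption for illustration only).
    """
    def tokenize(text: str) -> set:
        # Lowercase and keep alphabetic tokens (apostrophes allowed).
        return set(re.findall(r"[a-z']+", text.lower()))

    a, b = tokenize(target), tokenize(paraphrase)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical sentences: a target, a user paraphrase with typos,
# and the same paraphrase after spelling correction.
target = "The heart pumps blood through the body"
original = "the hart pumps blod thru the body"
corrected = "the heart pumps blood through the body"

score_original = word_overlap(target, original)
score_corrected = word_overlap(target, corrected)

print(f"original:  {score_original:.3f}")
print(f"corrected: {score_corrected:.3f}")
```

In this toy case the misspelled words fail to match their counterparts in the target, so the score rises once they are corrected. That is the direction of the effect the abstract reports, although the study's actual indices and magnitudes may differ.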

Original language: English (US)
Title of host publication: Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
Pages: 278-283
Number of pages: 6
State: Published - 2009
Externally published: Yes
Event: 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22 - Sanibel Island, FL, United States
Duration: Mar 19 2009 to Mar 21 2009

Publication series

Name: Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Software
