A continuing problem for applied natural language processing (ANLP) is that the language it must handle tends to be more natural than the language examined in more controlled natural language processing (NLP) studies. Specifically, faulty assessment of misspelled words can result in ineffective or misleading feedback. This chapter describes the Harmonizer, a system for addressing irregularities in user input (e.g., typos). The Harmonizer is specifically designed for Intelligent Tutoring Systems (ITSs) that use NLP to provide assessment and feedback based on the user's typed input. Our approach is to "harmonize" similar words to the same form in the benchmark, rather than correcting them to dictionary entries. We evaluate the Harmonizer's performance using various computational approaches on unedited input from high school students in the context of an ITS (i.e., iSTART). Our results indicate that various metric approaches to NLP (such as word-overlap cohesion scores) are moderately affected when student errors are filtered by the Harmonizer. Given the prevalence of typing errors in the sample, the study substantiates the need to "clean" typed input in comparable NLP-based learning systems. The Harmonizer provides this ability and is easy to implement, with light processing requirements.
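The idea of harmonizing input to the benchmark, rather than to a dictionary, can be illustrated with a minimal sketch. This is not the Harmonizer's actual implementation: the abstract does not specify how word similarity is measured, so the sketch assumes a string-similarity ratio from Python's standard `difflib` module, and the function names (`tokenize`, `harmonize`, `word_overlap`), the `cutoff` value, and the example sentences are all hypothetical.

```python
import difflib
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def harmonize(student_text, benchmark_text, cutoff=0.8):
    """Map each student token to the most similar benchmark word.

    Assumption: similarity is difflib's ratio with a hypothetical
    cutoff of 0.8; the chapter may use a different measure.
    Tokens with no sufficiently similar benchmark word are kept as typed.
    """
    benchmark_vocab = sorted(set(tokenize(benchmark_text)))
    harmonized = []
    for token in tokenize(student_text):
        if token in benchmark_vocab:
            harmonized.append(token)
            continue
        match = difflib.get_close_matches(
            token, benchmark_vocab, n=1, cutoff=cutoff
        )
        harmonized.append(match[0] if match else token)
    return harmonized


def word_overlap(tokens, benchmark_text):
    """Fraction of tokens that also occur in the benchmark text."""
    benchmark_vocab = set(tokenize(benchmark_text))
    if not tokens:
        return 0.0
    return sum(t in benchmark_vocab for t in tokens) / len(tokens)


# Hypothetical benchmark sentence and misspelled student input.
benchmark = "Red blood cells carry oxygen to the body"
student = "red blod cels cary oxigen"

raw_score = word_overlap(tokenize(student), benchmark)
clean_score = word_overlap(harmonize(student, benchmark), benchmark)
print(raw_score, clean_score)
```

In this toy case the misspelled tokens contribute nothing to the word-overlap score until they are harmonized to their benchmark forms. Because candidates come only from the benchmark, no dictionary lookup is needed, which is consistent with the light processing requirements described above.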
|Original language||English (US)|
|Title of host publication||Applied Natural Language Processing|
|Subtitle of host publication||Identification, Investigation and Resolution|
|Number of pages||19|
|State||Published - Dec 1 2011|
ASJC Scopus subject areas
- Computer Science (all)