Applications of text analysis tools for spoken response grading

Scott Crossley, Danielle McNamara

Research output: Contribution to journal › Article › peer-review

53 Scopus citations

Abstract

This study explores the potential for automated indices related to speech delivery, language use, and topic development to model human judgments of TOEFL speaking proficiency in second language (L2) speech samples. For this study, 244 transcribed TOEFL speech samples taken from 244 L2 learners were analyzed using automated indices taken from Coh-Metrix, CPIDR, and LIWC. A stepwise linear regression was used to explain the variance in human judgments of independent speaking ability and overall speaking proficiency. Automated indices related to word type counts, causal cohesion, and lexical diversity predicted 52% of the variance in human ratings for the independent speech samples. Automated indices related to word type counts and word frequency predicted 61% of the variance of the human scores of overall speaking proficiency. These analyses demonstrate that, even in the absence of indices related to pronunciation and prosody (e.g., phonological accuracy, intonation, and stress), automated indices related to vocabulary size, causality, and word frequency can predict a significant amount of the variance in human ratings of speaking proficiency. These findings have important implications for understanding the construct of speaking proficiency and for the development of automatic scoring techniques.
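As a rough illustration of the stepwise regression mentioned in the abstract, the sketch below runs a simple forward selection over candidate predictors and reports the share of variance explained (R²). It is not the authors' procedure: the data are synthetic, the feature names are hypothetical labels that only echo the index families named above, and the entry rule (a minimum R² gain, `min_gain`) is an assumed stand-in for the significance-based criteria that statistical packages typically apply.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X (intercept added)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def forward_stepwise(X, y, names, min_gain=0.01):
    """Greedily add the predictor that raises R^2 most; stop when the gain is small."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        # Score every remaining predictor when added to the current model.
        scores = [(r_squared(X[:, selected + [j]], y), j) for j in remaining]
        r2, j = max(scores)
        if r2 - best_r2 < min_gain:  # assumed stopping rule, not from the paper
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = r2
    return [names[j] for j in selected], best_r2

# Synthetic stand-in data: 244 samples, matching only the sample size in the abstract.
rng = np.random.default_rng(0)
n = 244
names = ["word_types", "causal_cohesion", "lexical_diversity", "noise_a", "noise_b"]
X = rng.normal(size=(n, len(names)))
# The simulated "human score" depends on the first two features plus noise.
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=1.0, size=n)

chosen, r2 = forward_stepwise(X, y, names)
print("selected predictors:", chosen)
print("variance explained (R^2):", round(r2, 2))
```

The final R² plays the same role as the 52% and 61% figures reported above: it is the proportion of variance in the scores that the selected indices account for.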

Original language: English (US)
Pages (from-to): 171-192
Number of pages: 22
Journal: Language Learning and Technology
Volume: 17
Issue number: 2
State: Published - Jun 2013

Keywords

  • Computational linguistics
  • Corpus linguistics
  • Language testing
  • Machine learning
  • Speaking proficiency

ASJC Scopus subject areas

  • Education
  • Language and Linguistics
  • Linguistics and Language
  • Computer Science Applications
