Predicting math performance using natural language processing tools

Scott Crossley, Ran Liu, Danielle McNamara

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Citations (Scopus)

Abstract

A number of studies have demonstrated links between linguistic knowledge and performance in math. Studies examining these links in first language speakers of English have traditionally relied on correlational analyses between linguistic knowledge tests and standardized math tests. For second language (L2) speakers, the majority of studies have compared math performance between proficient and non-proficient speakers of English. In this study, we take a novel approach and examine the linguistic features of student language while they are engaged in collaborative problem solving within an on-line math tutoring system. We transcribe the students' speech and use natural language processing tools to extract linguistic information related to text cohesion, lexical sophistication, and sentiment. Our criterion variables are individuals' pretest and posttest math performance scores. In addition to examining relations between linguistic features of student language production and math scores, we also control for a number of non-linguistic factors including gender, age, grade, school, and content focus (procedural versus conceptual). Linear mixed effect modeling indicates that non-linguistic factors are not predictive of math scores. However, linguistic features related to cohesion, affect, and lexical proficiency explained approximately 30% of the variance (R² = .303) in the math scores.

Original language: English (US)
Title of host publication: LAK 2017 Conference Proceedings - 7th International Learning Analytics and Knowledge Conference: Understanding, Informing and Improving Learning with Data
Publisher: Association for Computing Machinery
Pages: 339-347
Number of pages: 9
Volume: Part F126742
ISBN (Electronic): 9781450348706
DOIs: https://doi.org/10.1145/3027385.3027399
State: Published - Mar 13 2017
Event: 7th International Conference on Learning Analytics and Knowledge, LAK 2017 - Vancouver, Canada
Duration: Mar 13 2017 to Mar 17 2017

Other

Other: 7th International Conference on Learning Analytics and Knowledge, LAK 2017
Country: Canada
City: Vancouver
Period: 3/13/17 to 3/17/17

Keywords

  • Educational data mining
  • Natural language processing
  • On-line tutoring systems
  • Predictive analytics
  • Sentiment analysis

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Crossley, S., Liu, R., & McNamara, D. (2017). Predicting math performance using natural language processing tools. In LAK 2017 Conference Proceedings - 7th International Learning Analytics and Knowledge Conference: Understanding, Informing and Improving Learning with Data (Vol. Part F126742, pp. 339-347). Association for Computing Machinery. https://doi.org/10.1145/3027385.3027399

@inproceedings{f74a00beb9c64d9b922d5fa96d5a7020,
title = "Predicting math performance using natural language processing tools",
abstract = "A number of studies have demonstrated links between linguistic knowledge and performance in math. Studies examining these links in first language speakers of English have traditionally relied on correlational analyses between linguistic knowledge tests and standardized math tests. For second language (L2) speakers, the majority of studies have compared math performance between proficient and non-proficient speakers of English. In this study, we take a novel approach and examine the linguistic features of student language while they are engaged in collaborative problem solving within an on-line math tutoring system. We transcribe the students' speech and use natural language processing tools to extract linguistic information related to text cohesion, lexical sophistication, and sentiment. Our criterion variables are individuals' pretest and posttest math performance scores. In addition to examining relations between linguistic features of student language production and math scores, we also control for a number of non-linguistic factors including gender, age, grade, school, and content focus (procedural versus conceptual). Linear mixed effect modeling indicates that non-linguistic factors are not predictive of math scores. However, linguistic features related to cohesion, affect, and lexical proficiency explained approximately 30{\%} of the variance ($R^2$ = .303) in the math scores.",
keywords = "Educational data mining, Natural language processing, On-line tutoring systems, Predictive analytics, Sentiment analysis",
author = "Scott Crossley and Ran Liu and Danielle McNamara",
year = "2017",
month = "3",
day = "13",
doi = "10.1145/3027385.3027399",
language = "English (US)",
volume = "Part F126742",
pages = "339--347",
booktitle = "LAK 2017 Conference Proceedings - 7th International Learning Analytics and Knowledge Conference: Understanding, Informing and Improving Learning with Data",
publisher = "Association for Computing Machinery",

}

TY - GEN

T1 - Predicting math performance using natural language processing tools

AU - Crossley, Scott

AU - Liu, Ran

AU - McNamara, Danielle

PY - 2017/3/13

Y1 - 2017/3/13

AB - A number of studies have demonstrated links between linguistic knowledge and performance in math. Studies examining these links in first language speakers of English have traditionally relied on correlational analyses between linguistic knowledge tests and standardized math tests. For second language (L2) speakers, the majority of studies have compared math performance between proficient and non-proficient speakers of English. In this study, we take a novel approach and examine the linguistic features of student language while they are engaged in collaborative problem solving within an on-line math tutoring system. We transcribe the students' speech and use natural language processing tools to extract linguistic information related to text cohesion, lexical sophistication, and sentiment. Our criterion variables are individuals' pretest and posttest math performance scores. In addition to examining relations between linguistic features of student language production and math scores, we also control for a number of non-linguistic factors including gender, age, grade, school, and content focus (procedural versus conceptual). Linear mixed effect modeling indicates that non-linguistic factors are not predictive of math scores. However, linguistic features related to cohesion, affect, and lexical proficiency explained approximately 30% of the variance (R2 = .303) in the math scores.

KW - Educational data mining

KW - Natural language processing

KW - On-line tutoring systems

KW - Predictive analytics

KW - Sentiment analysis

UR - http://www.scopus.com/inward/record.url?scp=85016478620&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85016478620&partnerID=8YFLogxK

U2 - 10.1145/3027385.3027399

DO - 10.1145/3027385.3027399

M3 - Conference contribution

VL - Part F126742

SP - 339

EP - 347

BT - LAK 2017 Conference Proceedings - 7th International Learning Analytics and Knowledge Conference: Understanding, Informing and Improving Learning with Data

PB - Association for Computing Machinery

ER -