TY - GEN
T1 - Predicting math performance using natural language processing tools
AU - Crossley, Scott
AU - Liu, Ran
AU - McNamara, Danielle
N1 - Publisher Copyright: © 2017 ACM.
PY - 2017/3/13
Y1 - 2017/3/13
AB - A number of studies have demonstrated links between linguistic knowledge and performance in math. Studies examining these links in first language speakers of English have traditionally relied on correlational analyses between linguistic knowledge tests and standardized math tests. For second language (L2) speakers, the majority of studies have compared math performance between proficient and non-proficient speakers of English. In this study, we take a novel approach and examine the linguistic features of student language while they are engaged in collaborative problem solving within an on-line math tutoring system. We transcribe the students' speech and use natural language processing tools to extract linguistic information related to text cohesion, lexical sophistication, and sentiment. Our criterion variables are individuals' pretest and posttest math performance scores. In addition to examining relations between linguistic features of student language production and math scores, we also control for a number of non-linguistic factors including gender, age, grade, school, and content focus (procedural versus conceptual). Linear mixed effect modeling indicates that non-linguistic factors are not predictive of math scores. However, linguistic features related to cohesion, affect, and lexical proficiency explained approximately 30% of the variance (R² = .303) in the math scores.
KW - Educational data mining
KW - Natural language processing
KW - On-line tutoring systems
KW - Predictive analytics
KW - Sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=85016478620&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85016478620&partnerID=8YFLogxK
U2 - 10.1145/3027385.3027399
DO - 10.1145/3027385.3027399
M3 - Conference contribution
AN - SCOPUS:85016478620
T3 - ACM International Conference Proceeding Series
SP - 339
EP - 347
BT - LAK 2017 Conference Proceedings - 7th International Learning Analytics and Knowledge Conference
PB - Association for Computing Machinery
T2 - 7th International Conference on Learning Analytics and Knowledge, LAK 2017
Y2 - 13 March 2017 through 17 March 2017
ER -