Computational assessment of lexical differences in L1 and L2 writing

Scott A. Crossley; Danielle S. McNamara

doi:10.1016/j.jslw.2009.02.002

Research output: Contribution to journal › Article › peer-review

133 Scopus citations

Abstract

The purpose of this paper is to provide a detailed analysis of how lexical differences related to cohesion and connectionist models can distinguish first language (L1) writers of English from second language (L2) writers of English. Key to this analysis is the use of the computational tool Coh-Metrix, which measures cohesion and text difficulty at various levels of language, discourse, and conceptual analysis, and a statistical method known as discriminant function analysis. Results show that L1 and L2 written texts vary in several dimensions related to the writer's use of lexical choices. These dimensions correlate to lexical depth of knowledge, variation, and sophistication. These findings, together with the relevance of the new computational tools for the text analysis used in the study, are discussed.

Original language	English (US)
Pages (from-to)	119-135
Number of pages	17
Journal	Journal of Second Language Writing
Volume	18
Issue number	2
DOIs	https://doi.org/10.1016/j.jslw.2009.02.002
State	Published - Jun 2009
Externally published	Yes

Keywords

Cohesion
Computational linguistics
Corpus linguistics
Lexical networks
Lexical proficiency
Second language writing

ASJC Scopus subject areas

Language and Linguistics
Education
Linguistics and Language

Cite this

@article{67a155f886ae44768cfe2d5f8dfb83d8,
  title = "Computational assessment of lexical differences in L1 and L2 writing",
  abstract = "The purpose of this paper is to provide a detailed analysis of how lexical differences related to cohesion and connectionist models can distinguish first language (L1) writers of English from second language (L2) writers of English. Key to this analysis is the use of the computational tool Coh-Metrix, which measures cohesion and text difficulty at various levels of language, discourse, and conceptual analysis, and a statistical method known as discriminant function analysis. Results show that L1 and L2 written texts vary in several dimensions related to the writer's use of lexical choices. These dimensions correlate to lexical depth of knowledge, variation, and sophistication. These findings, together with the relevance of the new computational tools for the text analysis used in the study, are discussed.",
  keywords = "Cohesion, Computational linguistics, Corpus linguistics, Lexical networks, Lexical proficiency, Second language writing",
  author = "Crossley, {Scott A.} and McNamara, {Danielle S.}",
  note = "Funding Information: The research was supported in part by the Institute for Education Sciences (IES R3056020018-02 and R305A080589). Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the IES. The authors are indebted to Dr. Philip McCarthy of the Institute for Intelligent Systems for his assistance with the statistical analyses reported in this paper.",
  year = "2009",
  month = jun,
  doi = "10.1016/j.jslw.2009.02.002",
  language = "English (US)",
  volume = "18",
  pages = "119--135",
  journal = "Journal of Second Language Writing",
  issn = "1060-3743",
  publisher = "Elsevier Limited",
  number = "2",
}

TY  - JOUR
T1  - Computational assessment of lexical differences in L1 and L2 writing
AU  - Crossley, Scott A.
AU  - McNamara, Danielle S.
N1  - Funding Information: The research was supported in part by the Institute for Education Sciences (IES R3056020018-02 and R305A080589). Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the IES. The authors are indebted to Dr. Philip McCarthy of the Institute for Intelligent Systems for his assistance with the statistical analyses reported in this paper.
PY  - 2009/6
Y1  - 2009/6
N2  - The purpose of this paper is to provide a detailed analysis of how lexical differences related to cohesion and connectionist models can distinguish first language (L1) writers of English from second language (L2) writers of English. Key to this analysis is the use of the computational tool Coh-Metrix, which measures cohesion and text difficulty at various levels of language, discourse, and conceptual analysis, and a statistical method known as discriminant function analysis. Results show that L1 and L2 written texts vary in several dimensions related to the writer's use of lexical choices. These dimensions correlate to lexical depth of knowledge, variation, and sophistication. These findings, together with the relevance of the new computational tools for the text analysis used in the study, are discussed.
AB  - The purpose of this paper is to provide a detailed analysis of how lexical differences related to cohesion and connectionist models can distinguish first language (L1) writers of English from second language (L2) writers of English. Key to this analysis is the use of the computational tool Coh-Metrix, which measures cohesion and text difficulty at various levels of language, discourse, and conceptual analysis, and a statistical method known as discriminant function analysis. Results show that L1 and L2 written texts vary in several dimensions related to the writer's use of lexical choices. These dimensions correlate to lexical depth of knowledge, variation, and sophistication. These findings, together with the relevance of the new computational tools for the text analysis used in the study, are discussed.
KW  - Cohesion
KW  - Computational linguistics
KW  - Corpus linguistics
KW  - Lexical networks
KW  - Lexical proficiency
KW  - Second language writing
UR  - http://www.scopus.com/inward/record.url?scp=67349241371&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=67349241371&partnerID=8YFLogxK
U2  - 10.1016/j.jslw.2009.02.002
DO  - 10.1016/j.jslw.2009.02.002
M3  - Article
AN  - SCOPUS:67349241371
SN  - 1060-3743
VL  - 18
SP  - 119
EP  - 135
JO  - Journal of Second Language Writing
JF  - Journal of Second Language Writing
IS  - 2
ER  -
