Source inclusion in synthesis writing: an NLP approach to understanding argumentation, sourcing, and essay quality

Scott Crossley, Qian Wan, Laura Allen, Danielle McNamara

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


Synthesis writing is widely taught across domains and serves as an important means of assessing writing ability, text comprehension, and content learning. Synthesis writing differs from other types of writing in terms of both cognitive and task demands because it requires writers to integrate information across source materials. However, little is known about how integration of source material may influence overall writing quality for synthesis tasks. This study examined approximately 900 source-based essays written in response to four different synthesis prompts, which instructed writers to use information from the sources to illustrate and support their arguments and to clearly indicate from which sources they were drawing (i.e., citation use). The essays were then scored by expert raters for holistic quality, argumentation, and source use/inferencing. Hand-crafted natural language processing (NLP) features and pre-existing NLP tools were used to examine semantic and keyword overlap between the essays and the source texts, plagiarism from the source texts, and instances of source citation and quoting. These variables, along with text length and prompt, were then used to predict essay scores. The analyses yielded strong models for predicting human ratings that explained between 47% and 52% of the variance in scores. The results indicate that text length was the strongest predictor of score but also that more successful writers include stronger, semantically related information from the source, provide more citations and do so later in the text, and copy less from the text. This work introduces the use of NLP techniques to assess source integration, provides details on the types of source integration used by writers, and highlights the effects of source integration on writing quality.
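As a rough illustration of the kind of measures the abstract describes, the sketch below computes a keyword overlap score and a copied n-gram proportion between an essay and a source text. It is not the authors' feature set or tooling: the function names, the tiny stopword list, and the choice of trigrams are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "that", "it"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_overlap(essay, source):
    """Share of the essay's non-stopword word types that also occur in the source."""
    essay_words = set(tokenize(essay)) - STOPWORDS
    source_words = set(tokenize(source)) - STOPWORDS
    if not essay_words:
        return 0.0
    return len(essay_words & source_words) / len(essay_words)


def ngrams(tokens, n):
    """All contiguous n-token sequences, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def copied_ngram_ratio(essay, source, n=3):
    """Share of the essay's n-grams that appear verbatim in the source
    (a crude stand-in for a copying/plagiarism measure)."""
    essay_ngrams = ngrams(tokenize(essay), n)
    source_ngrams = set(ngrams(tokenize(source), n))
    if not essay_ngrams:
        return 0.0
    copied = sum(1 for gram in essay_ngrams if gram in source_ngrams)
    return copied / len(essay_ngrams)


if __name__ == "__main__":
    source = "Solar power reduces carbon emissions in cities."
    essay = "Solar power reduces carbon emissions, so cities should invest."
    print(keyword_overlap(essay, source))
    print(copied_ngram_ratio(essay, source))
```

Scores like these could serve as predictors alongside text length and prompt in a regression on human ratings, although the study's actual semantic overlap and citation features are more elaborate than this lexical sketch.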

Original language: English (US)
Journal: Reading and Writing
State: Accepted/In press - 2021
Externally published: Yes


  • Corpus linguistics
  • Natural language processing
  • Synthesis writing

ASJC Scopus subject areas

  • Neuropsychology and Physiological Psychology
  • Education
  • Linguistics and Language
  • Speech and Hearing

