TY - GEN
T1 - Multitask Summary Scoring with Longformers
AU - Botarleanu, Robert Mihai
AU - Dascalu, Mihai
AU - Allen, Laura K.
AU - Crossley, Scott Andrew
AU - McNamara, Danielle S.
N1 - Funding Information:
Acknowledgments. This research was supported by a grant from the Romanian National Authority for Scientific Research and Innovation, CNCS – UEFISCDI, project number TE 70 PN-III-P1-1.1-TE-2019-2209, ATES – “Automated Text Evaluation and Simplification”, the Institute of Education Sciences (R305A180144 and R305A180261), and the Office of Naval Research (N00014-17-1-2300; N00014-20-1-2623; N00014-19-1-2424, N00014-20-1-2627). The opinions expressed are those of the authors and do not represent the views of the IES or ONR.
Publisher Copyright:
© 2022, Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Automated scoring of student language is a challenging task that requires systems to emulate complex and multi-faceted human evaluation criteria. Summary scoring brings an additional layer of complexity to automated scoring because it involves two texts of differing lengths that must be compared. In this study, we present our approach to automating summary scoring by evaluating a corpus of approximately 5,000 summaries based on 103 source texts, each summary being scored on a 4-point Likert scale for seven different evaluation criteria. We train and evaluate a series of Machine Learning models that use a combination of independent textual complexity indices from the ReaderBench framework and Deep Learning models based on the Transformer architecture in a multitask setup to concurrently predict all criteria. Our models achieve significantly lower errors than previous work using a similar dataset, with MAE ranging from 0.10 to 0.16 and corresponding R2 values of up to 0.64. Our findings indicate that Longformer-based [1] models are adequate for contextualizing longer text sequences and effectively scoring summaries according to a variety of human-defined evaluation criteria using a single Neural Network.
AB - Automated scoring of student language is a challenging task that requires systems to emulate complex and multi-faceted human evaluation criteria. Summary scoring brings an additional layer of complexity to automated scoring because it involves two texts of differing lengths that must be compared. In this study, we present our approach to automating summary scoring by evaluating a corpus of approximately 5,000 summaries based on 103 source texts, each summary being scored on a 4-point Likert scale for seven different evaluation criteria. We train and evaluate a series of Machine Learning models that use a combination of independent textual complexity indices from the ReaderBench framework and Deep Learning models based on the Transformer architecture in a multitask setup to concurrently predict all criteria. Our models achieve significantly lower errors than previous work using a similar dataset, with MAE ranging from 0.10 to 0.16 and corresponding R2 values of up to 0.64. Our findings indicate that Longformer-based [1] models are adequate for contextualizing longer text sequences and effectively scoring summaries according to a variety of human-defined evaluation criteria using a single Neural Network.
KW - Automated summary scoring
KW - Multitask learning
KW - Natural language processing
KW - Text summarization
UR - http://www.scopus.com/inward/record.url?scp=85135889420&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85135889420&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-11644-5_79
DO - 10.1007/978-3-031-11644-5_79
M3 - Conference contribution
AN - SCOPUS:85135889420
SN - 9783031116438
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 756
EP - 761
BT - Artificial Intelligence in Education - 23rd International Conference, AIED 2022, Proceedings
A2 - Rodrigo, Maria Mercedes
A2 - Matsuda, Noboru
A2 - Cristea, Alexandra I.
A2 - Dimitrova, Vania
PB - Springer Science and Business Media Deutschland GmbH
T2 - 23rd International Conference on Artificial Intelligence in Education, AIED 2022
Y2 - 27 July 2022 through 31 July 2022
ER -