TY - GEN
T1 - Automated Summary Scoring with ReaderBench
AU - Botarleanu, Robert Mihai
AU - Dascalu, Mihai
AU - Allen, Laura K.
AU - Crossley, Scott Andrew
AU - McNamara, Danielle S.
N1 - Funding Information:
Acknowledgments. The work was funded by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS – UEFISCDI, project number TE 70 PN-III-P1-1.1-TE-2019-2209, ATES – “Automated Text Evaluation and Simplification”. This research was also supported in part by the Institute of Education Sciences (R305A190063) and the Office of Naval Research (N00014-17-1-2300 and N00014-19-1-2424). The opinions expressed are those of the authors and do not represent views of the IES or ONR.
Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Text summarization is an effective reading comprehension strategy. However, summary evaluation is complex and must account for various factors including the summary and the reference text. This study examines a corpus of approximately 3,000 summaries based on 87 reference texts, with each summary being manually scored on a 4-point Likert scale. Machine learning models leveraging Natural Language Processing (NLP) techniques were trained to predict the extent to which summaries capture the main idea of the target text. The NLP models combined both domain- and language-independent textual complexity indices from the ReaderBench framework, as well as state-of-the-art language models and deep learning architectures to provide semantic contextualization. The models achieve low errors – normalized MAE ranging from 0.13 to 0.17 with corresponding R2 values of up to 0.46. Our approach consistently outperforms baselines that use TF-IDF vectors and linear models, as well as Transformer-based regression using BERT. These results indicate that NLP algorithms that combine linguistic and semantic indices are accurate and robust, while ensuring generalizability to a wide array of topics.
KW - Automated scoring
KW - Natural language processing
KW - Text summarization
UR - http://www.scopus.com/inward/record.url?scp=85112277902&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85112277902&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-80421-3_35
DO - 10.1007/978-3-030-80421-3_35
M3 - Conference contribution
AN - SCOPUS:85112277902
SN - 9783030804206
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 321
EP - 332
BT - Intelligent Tutoring Systems - 17th International Conference, ITS 2021, Proceedings
A2 - Cristea, Alexandra I.
A2 - Troussas, Christos
PB - Springer Science and Business Media Deutschland GmbH
T2 - 17th International Conference on Intelligent Tutoring Systems, ITS 2021
Y2 - 7 June 2021 through 11 June 2021
ER -