TY - GEN
T1 - Sequence-to-Sequence Models for Automated Text Simplification
AU - Botarleanu, Robert Mihai
AU - Dascalu, Mihai
AU - Crossley, Scott Andrew
AU - McNamara, Danielle S.
N1 - Funding Information:
This work was supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS-UEFISCDI, project number PN-III 54PCCDI/2018, INTELLIT "Prezervarea și valorificarea patrimoniului literar românesc folosind soluții digitale inteligente pentru extragerea și sistematizarea de cunoștințe". This research was also supported in part by the Institute of Education Sciences (R305A190063) and the Office of Naval Research (N00014-17-1-2300 and N00014-19-1-2424). The opinions expressed are those of the authors and do not represent views of the IES or ONR.
PY - 2020
Y1 - 2020
N2 - A key writing skill is the capability to clearly convey desired meaning using available linguistic knowledge. Consequently, writers must select from a large array of idioms, vocabulary terms that are semantically equivalent, and discourse features that simultaneously reflect content and allow readers to grasp meaning. In many cases, a simplified version of a text is needed to ensure comprehension on the part of a targeted audience (e.g., second language learners). To address this need, we propose an automated method to simplify texts based on paraphrasing. Specifically, we explore the potential for a deep learning model, previously used for machine translation, to learn a simplified version of the English language within the context of short phrases. The best model, based on a Universal Transformer architecture, achieved a BLEU score of 66.01. We also evaluated this model's capability to perform similar transformations on texts that were simplified by human experts at different levels.
AB - A key writing skill is the capability to clearly convey desired meaning using available linguistic knowledge. Consequently, writers must select from a large array of idioms, vocabulary terms that are semantically equivalent, and discourse features that simultaneously reflect content and allow readers to grasp meaning. In many cases, a simplified version of a text is needed to ensure comprehension on the part of a targeted audience (e.g., second language learners). To address this need, we propose an automated method to simplify texts based on paraphrasing. Specifically, we explore the potential for a deep learning model, previously used for machine translation, to learn a simplified version of the English language within the context of short phrases. The best model, based on a Universal Transformer architecture, achieved a BLEU score of 66.01. We also evaluated this model's capability to perform similar transformations on texts that were simplified by human experts at different levels.
KW - Natural language processing
KW - Paraphrasing
KW - Sequence-to-sequence model
KW - Text simplification
UR - http://www.scopus.com/inward/record.url?scp=85088556291&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85088556291&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-52240-7_6
DO - 10.1007/978-3-030-52240-7_6
M3 - Conference contribution
AN - SCOPUS:85088556291
SN - 9783030522391
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 31
EP - 36
BT - Artificial Intelligence in Education - 21st International Conference, AIED 2020, Proceedings
A2 - Bittencourt, Ig Ibert
A2 - Cukurova, Mutlu
A2 - Luckin, Rose
A2 - Muldner, Kasia
A2 - Millán, Eva
PB - Springer
T2 - 21st International Conference on Artificial Intelligence in Education, AIED 2020
Y2 - 6 July 2020 through 10 July 2020
ER -