“Let’s Eat Grandma”: Does Punctuation Matter in Sentence Representation?

Mansooreh Karami, Ahmadreza Mosallanezhad, Michelle V. Mancenido, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Neural network-based embeddings have been the mainstream approach for creating vector representations of text that capture lexical and semantic similarities and dissimilarities. In general, existing encoding methods dismiss punctuation as insignificant information; consequently, punctuation marks are routinely treated as a predefined token/word or eliminated in the pre-processing phase. However, punctuation can play a significant role in the semantics of a sentence, as in “Let’s eat, grandma” and “Let’s eat grandma”. We hypothesize that a punctuation-aware representation model would affect the performance of downstream tasks. Therefore, we propose a model-agnostic method that incorporates both syntactic and contextual information to improve the performance of the sentiment classification task. We corroborate our findings by conducting experiments on publicly available datasets and provide case studies showing that our model generates representations that reflect the punctuation in the sentence.
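The pre-processing issue described in the abstract can be illustrated with a minimal sketch. This is not the authors' method; the function names `tokenize_dropping_punctuation` and `tokenize_keeping_punctuation` are hypothetical and use only the Python standard library. The sketch shows that stripping punctuation maps the two example sentences to the same token sequence, while keeping punctuation marks as tokens preserves the difference.

```python
import re
import string


def tokenize_dropping_punctuation(sentence):
    """Hypothetical pre-processing step that eliminates punctuation."""
    # ASCII punctuation plus the curly quotes used in the example sentences.
    to_drop = string.punctuation + "’“”"
    cleaned = sentence.translate(str.maketrans("", "", to_drop))
    return cleaned.lower().split()


def tokenize_keeping_punctuation(sentence):
    """Hypothetical tokenizer that keeps each punctuation mark as a token."""
    # Match either a run of word characters or a single non-space symbol.
    return re.findall(r"\w+|[^\w\s]", sentence.lower())


with_comma = "Let’s eat, grandma"
without_comma = "Let’s eat grandma"

# Dropping punctuation makes the two sentences indistinguishable.
same_after_dropping = (
    tokenize_dropping_punctuation(with_comma)
    == tokenize_dropping_punctuation(without_comma)
)

# Keeping punctuation preserves the comma that separates the two meanings.
same_after_keeping = (
    tokenize_keeping_punctuation(with_comma)
    == tokenize_keeping_punctuation(without_comma)
)

print(same_after_dropping, same_after_keeping)
```

Any encoder fed the punctuation-free token sequences receives identical input for both sentences, so it cannot assign them different representations regardless of its architecture.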

Original language: English (US)
Title of host publication: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2022, Proceedings
Editors: Massih-Reza Amini, Stéphane Canu, Asja Fischer, Tias Guns, Petra Kralj Novak, Grigorios Tsoumakas
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 588-604
Number of pages: 17
ISBN (Print): 9783031263897
DOIs
State: Published - 2023
Event: 22nd Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2022 - Grenoble, France
Duration: Sep 19 2022 to Sep 23 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13714 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 22nd Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2022
Country/Territory: France
City: Grenoble
Period: 9/19/22 to 9/23/22

Keywords

  • Punctuation
  • Representation learning
  • Sentiment analysis
  • Structural embedding

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
