Incorporating emoji descriptions improves tweet classification

Abhishek Singh, Eduardo Blanco, Wei Jin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

44 Scopus citations

Abstract

Tweets are short messages that often include specialized language such as hashtags and emojis. In this paper, we present a simple strategy to process emojis: replace them with their natural language descriptions and use pretrained word embeddings, as is normally done with standard words. We show that this strategy is more effective than using pretrained emoji embeddings for tweet classification. Specifically, we obtain new state-of-the-art results in irony detection and sentiment analysis even though our neural network is simpler than previous proposals.
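The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes that Unicode character names (via Python's standard `unicodedata` module) are an acceptable stand-in for emoji descriptions, and it uses the Unicode category "So" (Symbol, other) as a rough emoji test, which also matches a few non-emoji symbols such as the copyright sign. The function name `describe_emojis` is hypothetical.

```python
import unicodedata


def describe_emojis(text: str) -> str:
    """Replace each emoji-like symbol with its lowercase Unicode name.

    Assumption: Unicode character names approximate the natural language
    descriptions mentioned in the abstract. The "So" category check is a
    heuristic, not an exact emoji detector.
    """
    pieces = []
    for ch in text:
        if ch in ("\ufe0f", "\u200d"):
            # Drop the variation selector and zero-width joiner that
            # often accompany emojis but carry no description.
            continue
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "")
            # Pad with spaces so the description splits into ordinary words.
            pieces.append(f" {name.lower()} " if name else ch)
        else:
            pieces.append(ch)
    # Collapse the extra whitespace introduced by the padding.
    return " ".join("".join(pieces).split())


if __name__ == "__main__":
    print(describe_emojis("great day \U0001F602"))
```

After this step, every token in the tweet is an ordinary word, so it could be looked up in whatever pretrained word embedding table is already used for the rest of the text, with no separate emoji embeddings required.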

Original language: English (US)
Title of host publication: Long and Short Papers
Publisher: Association for Computational Linguistics (ACL)
Pages: 2096-2101
Number of pages: 6
ISBN (Electronic): 9781950737130
State: Published - 2019
Externally published: Yes
Event: 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019 - Minneapolis, United States
Duration: Jun 2 2019 to Jun 7 2019

Publication series

Name: NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
Volume: 1

Conference

Conference: 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019
Country/Territory: United States
City: Minneapolis
Period: 6/2/19 to 6/7/19

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Linguistics and Language
