Twitter vs. printed English

An information-theoretic comparison

Emma Glennon, Lalitha Sankar, H. Vincent Poor

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

The popular social networking and microblogging service Twitter contains language that is very different from what is considered proper. This paper quantifies those linguistic differences between printed English and Tweetspeak using information-theoretic concepts. Letter-based n-gram entropies are calculated and compared to analogous data from two corpora of printed English to demonstrate that 1) Twitter's entropy is overall higher than that of printed English, and 2) individual users' entropies are on average higher the less conventional their language use is. The implications for digitally-mediated communication in general are also discussed.
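The abstract does not state which estimator or text preprocessing the authors used. As a rough illustration only, letter-based n-gram entropies can be sketched with plug-in (maximum-likelihood) frequency estimates over overlapping character n-grams, together with the standard conditional form F_n = H_n - H_(n-1). The function names and the two sample strings below are hypothetical and are not taken from the paper or its corpora.

```python
import math
from collections import Counter


def block_entropy(text: str, n: int) -> float:
    """Plug-in Shannon entropy, in bits, of overlapping character n-grams."""
    if n <= 0:
        return 0.0
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    counts = Counter(grams)
    total = len(grams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(text: str, n: int) -> float:
    """F_n = H_n - H_(n-1): uncertainty of a letter given the n-1 before it."""
    return block_entropy(text, n) - block_entropy(text, n - 1)


if __name__ == "__main__":
    # Illustrative strings only (assumptions), not data from the paper.
    printed_like = "the quick brown fox jumps over the lazy dog"
    tweet_like = "omg thx 4 the RT!! c u l8r @ the gig #fun"
    for n in (1, 2, 3):
        print(n,
              round(conditional_entropy(printed_like, n), 3),
              round(conditional_entropy(tweet_like, n), 3))
```

Plug-in estimates are biased downward on short texts, so strings this small only show the mechanics; a meaningful comparison needs large corpora and a consistent alphabet and normalization for both sources.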

Original language: English (US)
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Pages: 3069-3072
Number of pages: 4
DOI: https://doi.org/10.1109/ICASSP.2012.6288563
State: Published - 2012
Externally published: Yes
Event: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Kyoto, Japan
Duration: Mar 25 2012 to Mar 30 2012

Fingerprint

Entropy
Linguistics
Communication

Keywords

  • computer mediated communication
  • information entropy
  • information theory
  • redundancy
  • Twitter

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Electrical and Electronic Engineering

Cite this

Glennon, E., Sankar, L., & Poor, H. V. (2012). Twitter vs. printed English: An information-theoretic comparison. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (pp. 3069-3072). [6288563] https://doi.org/10.1109/ICASSP.2012.6288563

@inproceedings{13573e23827a4584b5b1bd24683f5cb7,
title = "Twitter vs. printed English: An information-theoretic comparison",
abstract = "The popular social networking and microblogging service Twitter contains language that is very different from what is considered proper. This paper quantifies those linguistic differences between printed English and Tweetspeak using information-theoretic concepts. Letter-based n-gram entropies are calculated and compared to analogous data from two corpora of printed English to demonstrate that 1) Twitter's entropy is overall higher than that of printed English, and 2) individual users' entropies are on average higher the less conventional their language use is. The implications for digitally-mediated communication in general are also discussed.",
keywords = "computer mediated communication, information entropy, information theory, redundancy, Twitter",
author = "Emma Glennon and Lalitha Sankar and Poor, {H. Vincent}",
year = "2012",
doi = "10.1109/ICASSP.2012.6288563",
language = "English (US)",
isbn = "9781467300469",
pages = "3069--3072",
booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

}

TY - GEN

T1 - Twitter vs. printed English

T2 - An information-theoretic comparison

AU - Glennon, Emma

AU - Sankar, Lalitha

AU - Poor, H. Vincent

PY - 2012

Y1 - 2012

N2 - The popular social networking and microblogging service Twitter contains language that is very different from what is considered proper. This paper quantifies those linguistic differences between printed English and Tweetspeak using information-theoretic concepts. Letter-based n-gram entropies are calculated and compared to analogous data from two corpora of printed English to demonstrate that 1) Twitter's entropy is overall higher than that of printed English, and 2) individual users' entropies are on average higher the less conventional their language use is. The implications for digitally-mediated communication in general are also discussed.

KW - computer mediated communication

KW - information entropy

KW - information theory

KW - redundancy

KW - Twitter

UR - http://www.scopus.com/inward/record.url?scp=84867596031&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867596031&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2012.6288563

DO - 10.1109/ICASSP.2012.6288563

M3 - Conference contribution

SN - 9781467300469

SP - 3069

EP - 3072

BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

ER -