Dude, srsly?

The surprisingly formal nature of Twitter's language

Yuheng Hu, Kartik Talamadupula, Subbarao Kambhampati

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

39 Citations (Scopus)

Abstract

Twitter has become the de facto information sharing and communication platform. Given the factors that influence language on Twitter - size limitation as well as communication and content-sharing mechanisms - there is a continuing debate about the position of Twitter's language in the spectrum of language on various established mediums. These include SMS and chat on the one hand (size limitations) and email (communication), blogs and newspapers (content sharing) on the other. To provide a way of determining this, we propose a computational framework that offers insights into the linguistic style of all these mediums. Our framework consists of two parts. The first part builds upon a set of linguistic features to quantify the language of a given medium. The second part introduces a flexible factorization framework, SOCLIN, which conducts a psycholinguistic analysis of a given medium with the help of an external cognitive and affective knowledge base. Applying this analytical framework to various corpora from several major mediums, we gather statistics in order to compare the linguistics of Twitter with these other mediums via a quantitative comparative study. We present several key insights: (1) Twitter's language is surprisingly more conservative, and less informal than SMS and online chat; (2) Twitter users appear to be developing linguistically unique styles; (3) Twitter's usage of temporal references is similar to SMS and chat; and (4) Twitter has less variation of affect than other more formal mediums. The language of Twitter can thus be seen as a projection of a more formal register into a size-restricted space.

Original language: English (US)
Title of host publication: Proceedings of the 7th International Conference on Weblogs and Social Media, ICWSM 2013
Publisher: AAAI press
Pages: 244-253
Number of pages: 10
State: Published - 2013
Event: 7th International AAAI Conference on Weblogs and Social Media, ICWSM 2013 - Cambridge, MA, United States
Duration: Jul 8 2013 to Jul 11 2013

Fingerprint

Linguistics
Communication
Online conferencing
Blogs
Electronic mail
Factorization
Statistics

ASJC Scopus subject areas

  • Media Technology

Cite this

Hu, Y., Talamadupula, K., & Kambhampati, S. (2013). Dude, srsly? The surprisingly formal nature of Twitter's language. In Proceedings of the 7th International Conference on Weblogs and Social Media, ICWSM 2013 (pp. 244-253). AAAI press.

Dude, srsly? The surprisingly formal nature of Twitter's language. / Hu, Yuheng; Talamadupula, Kartik; Kambhampati, Subbarao.

Proceedings of the 7th International Conference on Weblogs and Social Media, ICWSM 2013. AAAI press, 2013. p. 244-253.

Hu, Y, Talamadupula, K & Kambhampati, S 2013, Dude, srsly? The surprisingly formal nature of Twitter's language. in Proceedings of the 7th International Conference on Weblogs and Social Media, ICWSM 2013. AAAI press, pp. 244-253, 7th International AAAI Conference on Weblogs and Social Media, ICWSM 2013, Cambridge, MA, United States, 7/8/13.
Hu Y, Talamadupula K, Kambhampati S. Dude, srsly? The surprisingly formal nature of Twitter's language. In Proceedings of the 7th International Conference on Weblogs and Social Media, ICWSM 2013. AAAI press. 2013. p. 244-253
Hu, Yuheng ; Talamadupula, Kartik ; Kambhampati, Subbarao. / Dude, srsly? The surprisingly formal nature of Twitter's language. Proceedings of the 7th International Conference on Weblogs and Social Media, ICWSM 2013. AAAI press, 2013. pp. 244-253
@inproceedings{43b63d630d4d466fb249818bf47ee405,
title = "Dude, srsly?: The surprisingly formal nature of Twitter's language",
abstract = "Twitter has become the de facto information sharing and communication platform. Given the factors that influence language on Twitter - size limitation as well as communication and content-sharing mechanisms - there is a continuing debate about the position of Twitter's language in the spectrum of language on various established mediums. These include SMS and chat on the one hand (size limitations) and email (communication), blogs and newspapers (content sharing) on the other. To provide a way of determining this, we propose a computational framework that offers insights into the linguistic style of all these mediums. Our framework consists of two parts. The first part builds upon a set of linguistic features to quantify the language of a given medium. The second part introduces a flexible factorization framework, SOCLIN, which conducts a psycholinguistic analysis of a given medium with the help of an external cognitive and affective knowledge base. Applying this analytical framework to various corpora from several major mediums, we gather statistics in order to compare the linguistics of Twitter with these other mediums via a quantitative comparative study. We present several key insights: (1) Twitter's language is surprisingly more conservative, and less informal than SMS and online chat; (2) Twitter users appear to be developing linguistically unique styles; (3) Twitter's usage of temporal references is similar to SMS and chat; and (4) Twitter has less variation of affect than other more formal mediums. The language of Twitter can thus be seen as a projection of a more formal register into a size-restricted space.",
author = "Yuheng Hu and Kartik Talamadupula and Subbarao Kambhampati",
year = "2013",
language = "English (US)",
pages = "244--253",
booktitle = "Proceedings of the 7th International Conference on Weblogs and Social Media, ICWSM 2013",
publisher = "AAAI press",

}

TY - GEN

T1 - Dude, srsly?

T2 - The surprisingly formal nature of Twitter's language

AU - Hu, Yuheng

AU - Talamadupula, Kartik

AU - Kambhampati, Subbarao

PY - 2013

Y1 - 2013

AB - Twitter has become the de facto information sharing and communication platform. Given the factors that influence language on Twitter - size limitation as well as communication and content-sharing mechanisms - there is a continuing debate about the position of Twitter's language in the spectrum of language on various established mediums. These include SMS and chat on the one hand (size limitations) and email (communication), blogs and newspapers (content sharing) on the other. To provide a way of determining this, we propose a computational framework that offers insights into the linguistic style of all these mediums. Our framework consists of two parts. The first part builds upon a set of linguistic features to quantify the language of a given medium. The second part introduces a flexible factorization framework, SOCLIN, which conducts a psycholinguistic analysis of a given medium with the help of an external cognitive and affective knowledge base. Applying this analytical framework to various corpora from several major mediums, we gather statistics in order to compare the linguistics of Twitter with these other mediums via a quantitative comparative study. We present several key insights: (1) Twitter's language is surprisingly more conservative, and less informal than SMS and online chat; (2) Twitter users appear to be developing linguistically unique styles; (3) Twitter's usage of temporal references is similar to SMS and chat; and (4) Twitter has less variation of affect than other more formal mediums. The language of Twitter can thus be seen as a projection of a more formal register into a size-restricted space.

UR - http://www.scopus.com/inward/record.url?scp=84900445786&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84900445786&partnerID=8YFLogxK

M3 - Conference contribution

SP - 244

EP - 253

BT - Proceedings of the 7th International Conference on Weblogs and Social Media, ICWSM 2013

PB - AAAI press

ER -