When is it biased? Assessing the representativeness of Twitter's streaming API

Fred Morstatter, Jürgen Pfeffer, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

89 Scopus citations

Abstract

Twitter shares a free 1% sample of its tweets through the "Streaming API". Recently, research has pointed to evidence of bias in this source. The methodologies proposed in previous work rely on the restrictive and expensive Firehose to find the bias in the Streaming API data. We tackle the problem of finding sample bias without costly and restrictive Firehose data. We propose a solution that focuses on using an open data source to find bias in the Streaming API.

Original language: English (US)
Title of host publication: WWW 2014 Companion - Proceedings of the 23rd International Conference on World Wide Web
Publisher: Association for Computing Machinery, Inc
Pages: 555-556
Number of pages: 2
ISBN (Electronic): 9781450327459
DOIs
State: Published - Apr 7 2014
Event: 23rd International Conference on World Wide Web, WWW 2014 - Seoul, Korea, Republic of
Duration: Apr 7 2014 to Apr 11 2014

Publication series

Name: WWW 2014 Companion - Proceedings of the 23rd International Conference on World Wide Web

Other

Other: 23rd International Conference on World Wide Web, WWW 2014
Country/Territory: Korea, Republic of
City: Seoul
Period: 4/7/14 to 4/11/14

Keywords

  • Big data
  • Data sampling
  • Sampling bias
  • Twitter analysis

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software
