Abstract

Big data is ubiquitous and can only become bigger, which challenges traditional data mining and machine learning methods. Social media is a new source of data that is significantly different from conventional sources. Social media data are mostly user-generated, and are big, linked, and heterogeneous. We present the good, the bad, and the ugly associated with multi-faceted social media data and illustrate the importance of some original problems with real-world examples. We discuss bias in social media data, the evaluation dilemma, data reduction, inferring invisible information, and the big-data paradox. We illuminate new opportunities for developing novel algorithms and tools for data science. In our endeavor to employ the good to tame the bad with the help of the ugly, we deepen the understanding of ever-growing and continuously evolving data and create innovative solutions through interdisciplinary and collaborative research in data science.

Original language: English (US)
Pages (from-to): 137-143
Number of pages: 7
Journal: International Journal of Data Science and Analytics
Volume: 1
Issue number: 3-4
State: Published - Nov 1 2016

Keywords

  • Big-data paradox
  • Data analytics
  • Data mining
  • Evaluation
  • Social media

ASJC Scopus subject areas

  • Information Systems
  • Modeling and Simulation
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Applied Mathematics

Title: The good, the bad, and the ugly: uncovering novel research opportunities in social media mining