Abstract

Drug use and abuse is a serious societal problem. The rapid development and adoption of social media and smart mobile devices in recent years have brought new opportunities for advancing computer-based strategies for understanding and intervening in drug-related behaviors. However, the existing literature still lacks principled ways of building computational models that support effective analysis of large-scale, often unstructured social media data. Part of the challenge stems from the difficulty of obtaining the so-called ground-truth data that are typically required for training computational models. This paper presents a progressive semi-supervised learning approach to identifying tweets that are related to personal and recreational use of marijuana. Starting from a small labeled dataset, the proposed approach first learns an optimal mapping of raw tweet features for classification, using a weakly hierarchical lasso method. The learned feature model is then used to support unsupervised clustering of Web-scale data. Experiments on realistic data crawled from Twitter validate the proposed approach and demonstrate its effectiveness.
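
The two-stage idea in the abstract (learn a sparse feature model from a few labeled tweets, then cluster unlabeled tweets in that feature space) can be illustrated with a minimal sketch. This is not the authors' implementation: a plain lasso fitted by proximal gradient descent stands in for the weakly hierarchical lasso (it has no interaction terms), and the toy tweets, vocabulary, penalty, and step size are all assumptions chosen for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def bag_of_words(texts, vocabulary):
    """Map each text to a vector of term counts over a fixed vocabulary."""
    return [[text.lower().split().count(term) for term in vocabulary]
            for text in texts]


def fit_lasso(rows, targets, penalty=0.1, step=0.05, iterations=500):
    """Plain lasso via proximal gradient descent (a stand-in, see lead-in)."""
    n_samples, n_features = len(rows), len(rows[0])
    weights = [0.0] * n_features
    for _ in range(iterations):
        # Residuals are computed once per iteration from the current weights.
        residuals = [sum(w * x for w, x in zip(weights, row)) - t
                     for row, t in zip(rows, targets)]
        for j in range(n_features):
            gradient = sum(r * row[j] for r, row in zip(residuals, rows)) / n_samples
            weights[j] = soft_threshold(weights[j] - step * gradient, step * penalty)
    return weights


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Basic k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        assignments = [min(range(k), key=lambda c: squared_distance(point, centers[c]))
                       for point in points]
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


# Stage 1: small labeled set (hypothetical toy data; +1 = personal use, -1 = other).
vocabulary = ["smoking", "weed", "joint", "high", "news", "policy", "law", "marijuana"]
labeled_tweets = [
    "smoking weed tonight",
    "got high on weed",
    "smoking a joint now",
    "marijuana law news today",
    "new marijuana policy vote",
    "news on drug policy",
]
labels = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
weights = fit_lasso(bag_of_words(labeled_tweets, vocabulary), labels)
selected_terms = [term for term, w in zip(vocabulary, weights) if w != 0.0]

# Stage 2: cluster unlabeled tweets using only the terms the sparse model kept.
unlabeled_tweets = [
    "weed and a joint",
    "smoking again tonight",
    "policy news roundup",
    "marijuana law debate",
]
clusters = kmeans(bag_of_words(unlabeled_tweets, selected_terms), k=2)
print(selected_terms)
print(clusters)
```

The L1 penalty drives uninformative term weights to exactly zero, so the clustering stage works in a smaller space shaped by the labeled examples. A faithful reproduction would additionally need the interaction terms and hierarchy constraint of the weakly hierarchical lasso, which this sketch omits.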

Original language: English (US)
Title of host publication: Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 447-452
Number of pages: 6
ISBN (Electronic): 9781509028467
State: Published - Nov 21 2016
Event: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016 - San Francisco, United States
Duration: Aug 18 2016 to Aug 21 2016

Other

Other: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016
Country: United States
City: San Francisco
Period: 8/18/16 to 8/21/16

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Sociology and Political Science
  • Communication

Title: Finding needles of interested tweets in the haystack of Twitter network
