Abstract

Drug use and abuse is a serious societal problem. The fast development and adoption of social media and smart mobile devices in recent years bring about new opportunities for advancing computer-based strategies for understanding and intervention of drug-related behaviors. However, the existing literature still lacks principled ways of building computational models for supporting effective analysis of large-scale, often unstructured social media data. Part of the challenge stems from the difficulty of obtaining so-called ground-truth data that are typically required for training computational models. This paper presents a progressive semi-supervised learning approach to identifying Twitter tweets that are related to personal and recreational use of marijuana. Based on a small, labeled dataset, the proposed approach first learns optimal mapping of raw features from the tweets for classification, using a method of weakly hierarchical lasso. The learned feature model is then used to support unsupervised clustering of Web-scale data. Experiments with realistic data crawled from Twitter are used to validate the proposed approach, demonstrating its effectiveness.

Original languageEnglish (US)
Title of host publicationProceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages447-452
Number of pages6
ISBN (Electronic)9781509028467
DOIs
StatePublished - Nov 21 2016
Event2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016 - San Francisco, United States
Duration: Aug 18 2016Aug 21 2016

Other

Other2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016
CountryUnited States
CitySan Francisco
Period8/18/168/21/16

Fingerprint

twitter
Needles
social media
Supervised learning
Mobile devices
drug abuse
drug use
drug
lack
experiment
Experiments
learning

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Sociology and Political Science
  • Communication

Cite this

Tian, Q., Lagisetty, J., & Li, B. (2016). Finding needles of interested tweets in the haystack of Twitter network. In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016 (pp. 447-452). [7752273] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ASONAM.2016.7752273

Finding needles of interested tweets in the haystack of Twitter network. / Tian, Qiongjie; Lagisetty, Jashmi; Li, Baoxin.

Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016. Institute of Electrical and Electronics Engineers Inc., 2016. p. 447-452 7752273.

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Tian, Q, Lagisetty, J & Li, B 2016, Finding needles of interested tweets in the haystack of Twitter network. in Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016., 7752273, Institute of Electrical and Electronics Engineers Inc., pp. 447-452, 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016, San Francisco, United States, 8/18/16. https://doi.org/10.1109/ASONAM.2016.7752273
Tian Q, Lagisetty J, Li B. Finding needles of interested tweets in the haystack of Twitter network. In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016. Institute of Electrical and Electronics Engineers Inc. 2016. p. 447-452. 7752273 https://doi.org/10.1109/ASONAM.2016.7752273
Tian, Qiongjie ; Lagisetty, Jashmi ; Li, Baoxin. / Finding needles of interested tweets in the haystack of Twitter network. Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 447-452
@inproceedings{651f929abbe84399b272e4918d69f6ee,
title = "Finding needles of interested tweets in the haystack of Twitter network",
abstract = "Drug use and abuse is a serious societal problem. The fast development and adoption of social media and smart mobile devices in recent years bring about new opportunities for advancing computer-based strategies for understanding and intervention of drug-related behaviors. However, the existing literature still lacks principled ways of building computational models for supporting effective analysis of large-scale, often unstructured social media data. Part of the challenge stems from the difficulty of obtaining so-called ground-truth data that are typically required for training computational models. This paper presents a progressive semi-supervised learning approach to identifying Twitter tweets that are related to personal and recreational use of marijuana. Based on a small, labeled dataset, the proposed approach first learns optimal mapping of raw features from the tweets for classification, using a method of weakly hierarchical lasso. The learned feature model is then used to support unsupervised clustering of Web-scale data. Experiments with realistic data crawled from Twitter are used to validate the proposed approach, demonstrating its effectiveness.",
author = "Qiongjie Tian and Jashmi Lagisetty and Baoxin Li",
year = "2016",
month = "11",
day = "21",
doi = "10.1109/ASONAM.2016.7752273",
language = "English (US)",
pages = "447--452",
booktitle = "Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - Finding needles of interested tweets in the haystack of Twitter network

AU - Tian, Qiongjie

AU - Lagisetty, Jashmi

AU - Li, Baoxin

PY - 2016/11/21

Y1 - 2016/11/21

N2 - Drug use and abuse is a serious societal problem. The fast development and adoption of social media and smart mobile devices in recent years bring about new opportunities for advancing computer-based strategies for understanding and intervention of drug-related behaviors. However, the existing literature still lacks principled ways of building computational models for supporting effective analysis of large-scale, often unstructured social media data. Part of the challenge stems from the difficulty of obtaining so-called ground-truth data that are typically required for training computational models. This paper presents a progressive semi-supervised learning approach to identifying Twitter tweets that are related to personal and recreational use of marijuana. Based on a small, labeled dataset, the proposed approach first learns optimal mapping of raw features from the tweets for classification, using a method of weakly hierarchical lasso. The learned feature model is then used to support unsupervised clustering of Web-scale data. Experiments with realistic data crawled from Twitter are used to validate the proposed approach, demonstrating its effectiveness.

AB - Drug use and abuse is a serious societal problem. The fast development and adoption of social media and smart mobile devices in recent years bring about new opportunities for advancing computer-based strategies for understanding and intervention of drug-related behaviors. However, the existing literature still lacks principled ways of building computational models for supporting effective analysis of large-scale, often unstructured social media data. Part of the challenge stems from the difficulty of obtaining so-called ground-truth data that are typically required for training computational models. This paper presents a progressive semi-supervised learning approach to identifying Twitter tweets that are related to personal and recreational use of marijuana. Based on a small, labeled dataset, the proposed approach first learns optimal mapping of raw features from the tweets for classification, using a method of weakly hierarchical lasso. The learned feature model is then used to support unsupervised clustering of Web-scale data. Experiments with realistic data crawled from Twitter are used to validate the proposed approach, demonstrating its effectiveness.

UR - http://www.scopus.com/inward/record.url?scp=85006716950&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85006716950&partnerID=8YFLogxK

U2 - 10.1109/ASONAM.2016.7752273

DO - 10.1109/ASONAM.2016.7752273

M3 - Conference contribution

SP - 447

EP - 452

BT - Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016

PB - Institute of Electrical and Electronics Engineers Inc.

ER -