Abstract

When a piece of information (microblog, photograph, video, link, etc.) starts to spread in a social network, an important question arises: will it spread to "viral" proportions, where "viral" is defined as an order-of-magnitude increase? However, several previous studies have established that cascade size and frequency are related through a power law, which leads to a severe imbalance in this classification problem. In this paper, we devise a suite of measurements based on "structural diversity": the variety of social contexts (communities) in which individuals partaking in a given cascade engage. We demonstrate that these measures are able to distinguish viral from non-viral cascades, despite the severe imbalance of the data for this problem. Further, we leverage these measurements as features in a classification approach, successfully predicting microblogs that grow from 50 to 500 reposts with precision of 0.69 and recall of 0.52 for the viral class, despite this class comprising under 2% of samples. This significantly outperforms our baseline approach as well as the current state of the art. Our work also demonstrates how we can trade off between precision and recall.
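
As a rough illustration of why precision and recall on the viral class matter under this kind of imbalance, and of how moving a decision threshold trades one for the other, here is a minimal Python sketch. The labels, scores, thresholds, and the helper `precision_recall` are hypothetical and are not taken from the paper or its data.

```python
from typing import Sequence, Tuple


def precision_recall(labels: Sequence[int], scores: Sequence[float],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall for the positive (viral) class at a score threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: 4 viral cascades among 250 (1.6%), made up for illustration.
labels = [1, 1, 1, 1] + [0] * 246
# Made-up classifier scores: higher means "more likely to go viral".
scores = [0.9, 0.7, 0.4, 0.2] + [0.8, 0.5, 0.5] + [0.1] * 243

# Always predicting "non-viral" is highly accurate yet finds no viral cascade.
majority_accuracy = labels.count(0) / len(labels)

strict = precision_recall(labels, scores, threshold=0.6)   # fewer, surer alarms
lenient = precision_recall(labels, scores, threshold=0.3)  # more alarms, more hits

print(f"majority-class accuracy: {majority_accuracy:.3f}")
print(f"threshold 0.6: precision={strict[0]:.2f}, recall={strict[1]:.2f}")
print(f"threshold 0.3: precision={lenient[0]:.2f}, recall={lenient[1]:.2f}")
```

In this toy setup, lowering the threshold raises recall at the cost of precision, which is the kind of trade-off the abstract refers to.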

Original language: English (US)
Title of host publication: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
Publisher: Association for Computing Machinery, Inc
Pages: 1610-1613
Number of pages: 4
ISBN (Print): 9781450338547
DOI: https://doi.org/10.1145/2808797.2809358
State: Published - Aug 25 2015
Event: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015 - Paris, France
Duration: Aug 25 2015 to Aug 28 2015

Other

Other: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
Country: France
City: Paris
Period: 8/25/15 to 8/28/15

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Networks and Communications

Cite this

Guo, R., Shaabani, E., Bhatnagar, A., & Shakarian, P. (2015). Toward order-of-magnitude cascade prediction. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015 (pp. 1610-1613). Association for Computing Machinery, Inc. https://doi.org/10.1145/2808797.2809358

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

@inproceedings{f701ca20599b4cde800325bb09f0aa7d,
title = "Toward order-of-magnitude cascade prediction",
abstract = "When a piece of information (microblog, photograph, video, link, etc.) starts to spread in a social network, an important question arises: will it spread to {"}viral{"} proportions, where {"}viral{"} is defined as an order-of-magnitude increase? However, several previous studies have established that cascade size and frequency are related through a power law, which leads to a severe imbalance in this classification problem. In this paper, we devise a suite of measurements based on {"}structural diversity{"}: the variety of social contexts (communities) in which individuals partaking in a given cascade engage. We demonstrate that these measures are able to distinguish viral from non-viral cascades, despite the severe imbalance of the data for this problem. Further, we leverage these measurements as features in a classification approach, successfully predicting microblogs that grow from 50 to 500 reposts with precision of 0.69 and recall of 0.52 for the viral class, despite this class comprising under 2{\%} of samples. This significantly outperforms our baseline approach as well as the current state of the art. Our work also demonstrates how we can trade off between precision and recall.",
author = "Ruocheng Guo and Elham Shaabani and Abhinav Bhatnagar and Paulo Shakarian",
year = "2015",
month = "8",
day = "25",
doi = "10.1145/2808797.2809358",
language = "English (US)",
isbn = "9781450338547",
pages = "1610--1613",
booktitle = "Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015",
publisher = "Association for Computing Machinery, Inc",
}

TY  - GEN
T1  - Toward order-of-magnitude cascade prediction
AU  - Guo, Ruocheng
AU  - Shaabani, Elham
AU  - Bhatnagar, Abhinav
AU  - Shakarian, Paulo
PY  - 2015/8/25
Y1  - 2015/8/25
AB  - When a piece of information (microblog, photograph, video, link, etc.) starts to spread in a social network, an important question arises: will it spread to "viral" proportions, where "viral" is defined as an order-of-magnitude increase? However, several previous studies have established that cascade size and frequency are related through a power law, which leads to a severe imbalance in this classification problem. In this paper, we devise a suite of measurements based on "structural diversity": the variety of social contexts (communities) in which individuals partaking in a given cascade engage. We demonstrate that these measures are able to distinguish viral from non-viral cascades, despite the severe imbalance of the data for this problem. Further, we leverage these measurements as features in a classification approach, successfully predicting microblogs that grow from 50 to 500 reposts with precision of 0.69 and recall of 0.52 for the viral class, despite this class comprising under 2% of samples. This significantly outperforms our baseline approach as well as the current state of the art. Our work also demonstrates how we can trade off between precision and recall.
UR  - http://www.scopus.com/inward/record.url?scp=84962582467&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84962582467&partnerID=8YFLogxK
U2  - 10.1145/2808797.2809358
DO  - 10.1145/2808797.2809358
M3  - Conference contribution
AN  - SCOPUS:84962582467
SN  - 9781450338547
SP  - 1610
EP  - 1613
BT  - Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
PB  - Association for Computing Machinery, Inc
ER  -