Bot detection

Will focusing on recall cause overall performance deterioration?

Tahora H. Nazer, Matthew Davis, Mansooreh Karami, Leman Akoglu, David Koelle, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Social bots are an effective tool in the arsenal of malicious actors who manipulate discussions on social media. Bots help spread misinformation, promote political propaganda, and inflate the popularity of users and content. Hence, it is necessary to differentiate bot accounts from human users. Several bot detection methods address this problem. Conventional methods either focus on precision regardless of overall performance or optimize overall performance, say F1, without monitoring the effect on precision or recall. Focusing on precision means that users marked as bots are more likely than not to be bots, but a large portion of the bots could remain undetected. From a user’s perspective, however, it is more desirable to have less interaction with bots, even at the cost of some precision. This can be achieved by a detection method with higher recall. A trivial but useless way to obtain high recall is to classify every account (human or bot) as a bot, which results in poor overall performance. In this work, we investigate whether a method can focus on recall without considerable loss in overall performance. Extensive experiments on the recall-precision trade-off suggest that high recall can be achieved without much deterioration in overall performance. This research leads to a recall-focused approach to bot detection, REFOCUS, along with some lessons learned and future directions.
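
The trade-off described in the abstract can be illustrated with a small sketch. The Python below is not the REFOCUS method; the labels, scores, and thresholds are hypothetical toy values, chosen only to show how precision, recall, and F1 respond when a bot-score threshold is lowered, and why labeling every account as a bot gives perfect recall but poor F1.

```python
from typing import List, Tuple


def precision_recall_f1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for binary labels where 1 = bot, 0 = human."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def predict_at_threshold(scores: List[float], threshold: float) -> List[int]:
    """Mark an account as a bot when its score is at or above the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical toy data: 2 bots among 10 accounts, with made-up bot scores.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1, 0.1, 0.05, 0.2, 0.3]

# A stricter threshold misses one bot; a looser one catches both.
strict = precision_recall_f1(y_true, predict_at_threshold(scores, 0.5))
loose = precision_recall_f1(y_true, predict_at_threshold(scores, 0.35))

# The trivial solution: threshold 0.0 labels every account as a bot.
trivial = precision_recall_f1(y_true, predict_at_threshold(scores, 0.0))

for name, (p, r, f) in [("strict", strict), ("loose", loose), ("trivial", trivial)]:
    print(f"{name:8s} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

On this toy data, lowering the threshold from 0.5 to 0.35 raises recall without hurting F1, whereas the all-bot classifier keeps recall at 1.0 while F1 collapses. Real data need not behave this way; the sketch only makes the quantities in the abstract concrete.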

Original language: English (US)
Title of host publication: Social, Cultural, and Behavioral Modeling - 12th International Conference, SBP-BRiMS 2019, Proceedings
Editors: Robert Thomson, Christopher Dancy, Ayaz Hyder, Halil Bisgin
Publisher: Springer Verlag
Pages: 39-49
Number of pages: 11
ISBN (Print): 9783030217402
DOIs: https://doi.org/10.1007/978-3-030-21741-9_5
State: Published - Jan 1 2019
Event: 12th International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction and Behavior Representation in Modeling and Simulation, SBP-BRiMS 2019 - Washington D.C., United States
Duration: Jul 9 2019 to Jul 12 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11549 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction and Behavior Representation in Modeling and Simulation, SBP-BRiMS 2019
Country: United States
City: Washington D.C.
Period: 7/9/19 to 7/12/19

Keywords

  • Bot detection
  • Recall
  • Social bots
  • Social media
  • Twitter

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

H. Nazer, T., Davis, M., Karami, M., Akoglu, L., Koelle, D., & Liu, H. (2019). Bot detection: Will focusing on recall cause overall performance deterioration? In R. Thomson, C. Dancy, A. Hyder, & H. Bisgin (Eds.), Social, Cultural, and Behavioral Modeling - 12th International Conference, SBP-BRiMS 2019, Proceedings (pp. 39-49). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11549 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-21741-9_5

Bot detection : Will focusing on recall cause overall performance deterioration? / H. Nazer, Tahora; Davis, Matthew; Karami, Mansooreh; Akoglu, Leman; Koelle, David; Liu, Huan.

Social, Cultural, and Behavioral Modeling - 12th International Conference, SBP-BRiMS 2019, Proceedings. ed. / Robert Thomson; Christopher Dancy; Ayaz Hyder; Halil Bisgin. Springer Verlag, 2019. p. 39-49 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11549 LNCS).

H. Nazer, T, Davis, M, Karami, M, Akoglu, L, Koelle, D & Liu, H 2019, Bot detection: Will focusing on recall cause overall performance deterioration? in R Thomson, C Dancy, A Hyder & H Bisgin (eds), Social, Cultural, and Behavioral Modeling - 12th International Conference, SBP-BRiMS 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11549 LNCS, Springer Verlag, pp. 39-49, 12th International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction and Behavior Representation in Modeling and Simulation, SBP-BRiMS 2019, Washington D.C., United States, 7/9/19. https://doi.org/10.1007/978-3-030-21741-9_5
H. Nazer T, Davis M, Karami M, Akoglu L, Koelle D, Liu H. Bot detection: Will focusing on recall cause overall performance deterioration? In Thomson R, Dancy C, Hyder A, Bisgin H, editors, Social, Cultural, and Behavioral Modeling - 12th International Conference, SBP-BRiMS 2019, Proceedings. Springer Verlag. 2019. p. 39-49. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-21741-9_5
H. Nazer, Tahora ; Davis, Matthew ; Karami, Mansooreh ; Akoglu, Leman ; Koelle, David ; Liu, Huan. / Bot detection : Will focusing on recall cause overall performance deterioration?. Social, Cultural, and Behavioral Modeling - 12th International Conference, SBP-BRiMS 2019, Proceedings. editor / Robert Thomson ; Christopher Dancy ; Ayaz Hyder ; Halil Bisgin. Springer Verlag, 2019. pp. 39-49 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{4db4bb5b728e46509560238c77b4f627,
title = "Bot detection: Will focusing on recall cause overall performance deterioration?",
abstract = "Social bots are an effective tool in the arsenal of malicious actors who manipulate discussions on social media. Bots help spread misinformation, promote political propaganda, and inflate the popularity of users and content. Hence, it is necessary to differentiate bot accounts and human users. There are several bot detection methods that approach this problem. Conventional methods either focus on precision regardless of the overall performance or optimize overall performance, say F1, without monitoring its effect on precision or recall. Focusing on precision means that those users marked as bots are more likely than not bots but a large portion of the bots could remain undetected. From a user’s perspective, however, it is more desirable to have less interaction with bots, even if it would incur a loss in precision. This can be achieved by a detection method with higher recall. A trivial, but useless, solution for high recall is to classify every account (human or bot) as bot, hence, resulting in poor overall performance. In this work, we investigate if it is feasible for a method to focus on recall without considerable loss in overall performance. Extensive experiments with recall and precision trade-off suggest that high recall can be achieved without much overall performance deterioration. This research leads to a recall-focused approach to bot detection, REFOCUS, with some lessons learned and future directions.",
keywords = "Bot detection, Recall, Social bots, Social media, Twitter",
author = "{H. Nazer}, Tahora and Matthew Davis and Mansooreh Karami and Leman Akoglu and David Koelle and Huan Liu",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/978-3-030-21741-9_5",
language = "English (US)",
isbn = "9783030217402",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "39--49",
editor = "Robert Thomson and Christopher Dancy and Ayaz Hyder and Halil Bisgin",
booktitle = "Social, Cultural, and Behavioral Modeling - 12th International Conference, SBP-BRiMS 2019, Proceedings",

}

TY - GEN

T1 - Bot detection

T2 - Will focusing on recall cause overall performance deterioration?

AU - H. Nazer, Tahora

AU - Davis, Matthew

AU - Karami, Mansooreh

AU - Akoglu, Leman

AU - Koelle, David

AU - Liu, Huan

PY - 2019/1/1

Y1 - 2019/1/1

AB - Social bots are an effective tool in the arsenal of malicious actors who manipulate discussions on social media. Bots help spread misinformation, promote political propaganda, and inflate the popularity of users and content. Hence, it is necessary to differentiate bot accounts and human users. There are several bot detection methods that approach this problem. Conventional methods either focus on precision regardless of the overall performance or optimize overall performance, say F1, without monitoring its effect on precision or recall. Focusing on precision means that those users marked as bots are more likely than not bots but a large portion of the bots could remain undetected. From a user’s perspective, however, it is more desirable to have less interaction with bots, even if it would incur a loss in precision. This can be achieved by a detection method with higher recall. A trivial, but useless, solution for high recall is to classify every account (human or bot) as bot, hence, resulting in poor overall performance. In this work, we investigate if it is feasible for a method to focus on recall without considerable loss in overall performance. Extensive experiments with recall and precision trade-off suggest that high recall can be achieved without much overall performance deterioration. This research leads to a recall-focused approach to bot detection, REFOCUS, with some lessons learned and future directions.

KW - Bot detection

KW - Recall

KW - Social bots

KW - Social media

KW - Twitter

UR - http://www.scopus.com/inward/record.url?scp=85068154080&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85068154080&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-21741-9_5

DO - 10.1007/978-3-030-21741-9_5

M3 - Conference contribution

SN - 9783030217402

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 39

EP - 49

BT - Social, Cultural, and Behavioral Modeling - 12th International Conference, SBP-BRiMS 2019, Proceedings

A2 - Thomson, Robert

A2 - Dancy, Christopher

A2 - Hyder, Ayaz

A2 - Bisgin, Halil

PB - Springer Verlag

ER -