Bias Mitigation for Toxicity Detection via Sequential Decisions

Lu Cheng, Ahmadreza Mosallanezhad, Yasin N. Silva, Deborah Hall, Huan Liu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Increased social media use has contributed to the greater prevalence of abusive, rude, and offensive textual comments. Machine learning models have been developed to detect toxic comments online, yet these models tend to show biases against users with marginalized or minority identities (e.g., females and African Americans). Established research in debiasing toxicity classifiers often (1) takes a static or batch approach, assuming that all information is available and then making a one-time decision; and (2) uses a generic strategy to mitigate different biases (e.g., gender and racial biases) that assumes the biases are independent of one another. However, in real scenarios, the input typically arrives as a sequence of comments/words over time instead of all at once. Thus, decisions based on partial information must be made while additional input is arriving. Moreover, social bias is complex by nature. Each type of bias is defined within its unique context, which, consistent with intersectionality theory within the social sciences, might be correlated with the contexts of other forms of bias. In this work, we consider debiasing toxicity detection as a sequential decision-making process where different biases can be interdependent. In particular, we study debiasing toxicity detection with two aims: (1) to examine whether different biases tend to correlate with each other; and (2) to investigate how to jointly mitigate these correlated biases in an interactive manner to minimize the total amount of bias. At the core of our approach is a framework built upon theories of sequential Markov Decision Processes that seeks to maximize the prediction accuracy and minimize the bias measures tailored to individual biases. Evaluations on two benchmark datasets empirically validate the hypothesis that biases tend to be correlated and corroborate the effectiveness of the proposed sequential debiasing strategy.

Original languageEnglish (US)
Title of host publicationSIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
PublisherAssociation for Computing Machinery, Inc
Pages1750-1760
Number of pages11
ISBN (Electronic)9781450387323
DOIs
StatePublished - Jul 6 2022
Event45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022 - Madrid, Spain
Duration: Jul 11 2022Jul 15 2022

Publication series

NameSIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022
Country/TerritorySpain
CityMadrid
Period7/11/227/15/22

Keywords

  • sequential decision-making
  • social media
  • toxicity detection
  • unintended bias

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Information Systems
  • Software

Fingerprint

Dive into the research topics of 'Bias Mitigation for Toxicity Detection via Sequential Decisions'. Together they form a unique fingerprint.

Cite this