To improve the quality of questions asked in community-based question answering forums, we create a new dataset from the website Stack Overflow. The dataset contains three components: (1) context: the text features of questions; (2) treatment: categories of revision suggestions; and (3) outcome: a measure of question quality (e.g., the number of questions, upvotes, or clicks). This dataset helps researchers develop causal inference models for two problems: (i) estimating the causal effects of the aforementioned treatments on the outcome and (ii) finding the optimal treatment for the questions. Empirically, we performed experiments with three state-of-the-art causal effect estimation methods on the contributed dataset. In particular, we evaluated the optimal treatments recommended by these approaches by comparing them with the ground truth labels, that is, the treatments (suggestions) provided by experts.
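
The two problems named in the abstract can be illustrated with a minimal sketch. It is not one of the three estimation methods evaluated in the paper: it assumes a single discrete context feature, no unobserved confounding within a context bucket, and toy records whose field names, treatment names, and values are all hypothetical. Within each context bucket it compares mean outcomes across treatments (problem i) and then picks the treatment with the highest estimated outcome (problem ii).

```python
from collections import defaultdict

# Hypothetical toy records: (context_bucket, treatment, outcome).
# Names and values are illustrative only, not taken from the dataset.
records = [
    ("short", "add_code", 3.0),
    ("short", "add_code", 5.0),
    ("short", "fix_title", 1.0),
    ("short", "fix_title", 2.0),
    ("long", "add_code", 2.0),
    ("long", "fix_title", 4.0),
    ("long", "fix_title", 6.0),
]


def mean_outcomes(rows):
    """Average outcome for each (context, treatment) cell."""
    sums = defaultdict(lambda: [0.0, 0])
    for context, treatment, outcome in rows:
        cell = sums[(context, treatment)]
        cell[0] += outcome
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def effect(means, context, treated, baseline):
    """Within-context difference in mean outcome between two treatments."""
    return means[(context, treated)] - means[(context, baseline)]


def best_treatment(means, context):
    """Treatment with the highest estimated mean outcome in this context."""
    candidates = {t: m for (c, t), m in means.items() if c == context}
    return max(candidates, key=candidates.get)


means = mean_outcomes(records)
```

Stratifying on context is the simplest way to keep the comparison between treatments from being driven by differences between questions; with real text features, a model-based estimator would take the place of the per-bucket averages.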

Original language: English (US)
Title of host publication: Benchmarking, Measuring, and Optimizing - 2nd BenchCouncil International Symposium, Bench 2019, Revised Selected Papers
Editors: Wanling Gao, Jianfeng Zhan, Geoffrey Fox, Xiaoyi Lu, Dan Stanzione
Number of pages: 11
ISBN (Print): 9783030495558
State: Published - 2020
Event: 2nd International Symposium on Benchmarking, Measuring, and Optimization, Bench 2019 - Denver, United States
Duration: Nov 14 2019 – Nov 16 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12093 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 2nd International Symposium on Benchmarking, Measuring, and Optimization, Bench 2019
Country/Territory: United States

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

