Abstract

Crowdsourcing is now commonly used to collect label information both effectively and efficiently. One major challenge in crowdsourcing is the varying quality of workers, which determines the accuracy of the labels they provide. Motivated by the observation that, on many crowdsourcing platforms, the same set of workers typically works on the same set of tasks, we propose to model the diverse worker quality by studying worker behavior across multiple related tasks. To this end, we propose an optimization framework named MultiC2 for learning from task and worker dual heterogeneity. It uses a weight tensor to represent the workers' behaviors across multiple tasks and seeks the optimal solution of the tensor by exploiting its structured information. We then propose an iterative algorithm to solve the optimization problem and analyze its computational complexity. To infer the true label of an example, we construct a worker ensemble based on the estimated tensor, whose decisions are weighted using a set of entropy weights. Finally, we test the performance of MultiC2 on various data sets and demonstrate its superiority over state-of-the-art crowdsourcing techniques.

Original language: English (US)
Title of host publication: Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017
Editors: Nitesh Chawla, Wei Wang
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 579-587
Number of pages: 9
ISBN (Electronic): 9781611974874
DOI: https://doi.org/10.1137/1.9781611974973.65
State: Published - Jan 1 2017
Event: 17th SIAM International Conference on Data Mining, SDM 2017 - Houston, United States
Duration: Apr 27 2017 to Apr 29 2017

Publication series

Name: Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017

Other

Other: 17th SIAM International Conference on Data Mining, SDM 2017
Country: United States
City: Houston
Period: 4/27/17 to 4/29/17

Keywords

  • Crowdsourcing
  • Multi-task learning
  • Tensor representation

ASJC Scopus subject areas

  • Software
  • Computer Science Applications


  • Cite this

    Zhou, Y., Ying, L., & He, J. (2017). MultiC2: An Optimization framework for learning from task and worker dual heterogeneity. In N. Chawla, & W. Wang (Eds.), Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017 (pp. 579-587). (Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017). Society for Industrial and Applied Mathematics Publications. https://doi.org/10.1137/1.9781611974973.65