### Abstract

Crowdsourcing is now commonly used to collect label information effectively and efficiently. One major challenge in crowdsourcing is the varying quality of workers, which determines the accuracy of the labels they provide. Motivated by the observation that, on many crowdsourcing platforms, the same set of workers typically works on the same set of tasks, we propose to model worker quality by studying worker behavior across multiple related tasks. To this end, we propose an optimization framework named MultiC^{2} for learning from task and worker dual heterogeneity. It uses a weight tensor to represent the workers' behaviors across multiple tasks and seeks the optimal solution of the tensor by exploiting its structured information. We then propose an iterative algorithm to solve the optimization problem and analyze its computational complexity. To infer the true label of an example, we construct a worker ensemble based on the estimated tensor, whose decisions are weighted using a set of entropy weights. Finally, we test the performance of MultiC^{2} on various data sets and demonstrate its superiority over state-of-the-art crowdsourcing techniques.
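The abstract does not spell out how the entropy weights are computed, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes a weight tensor `W` of shape (features, tasks, workers), where each worker's slice acts as a linear classifier, and it assumes a worker's vote is weighted by one minus the binary entropy of that worker's predicted probability. The function names, the sigmoid link, and the weighting rule are all hypothetical.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Binary entropy in bits; inputs are clipped away from 0 and 1."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))


def ensemble_predict(W, task, x):
    """Entropy-weighted vote of all workers for one example.

    W    : array of shape (n_features, n_tasks, n_workers), assumed layout
    task : index of the task the example belongs to
    x    : feature vector of shape (n_features,)
    Returns a label in {-1, +1}.
    """
    scores = x @ W[:, task, :]               # one linear score per worker
    probs = 1.0 / (1.0 + np.exp(-scores))    # assumed sigmoid link
    weights = 1.0 - binary_entropy(probs)    # confident workers weigh more
    votes = np.where(probs >= 0.5, 1.0, -1.0)
    return 1 if float(weights @ votes) >= 0.0 else -1
```

Under this assumed rule, a single confident worker can outvote several near-random workers, which is the behavior that distinguishes an entropy-weighted ensemble from plain majority voting.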

Original language | English (US) |
---|---|
Title of host publication | Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017 |
Publisher | Society for Industrial and Applied Mathematics Publications |
Pages | 579-587 |
Number of pages | 9 |
ISBN (Electronic) | 9781611974874 |
State | Published - 2017 |
Event | 17th SIAM International Conference on Data Mining, SDM 2017 - Houston, United States; Duration: Apr 27 2017 → Apr 29 2017 |

### Other

Other | 17th SIAM International Conference on Data Mining, SDM 2017 |
---|---|
Country | United States |
City | Houston |
Period | 4/27/17 → 4/29/17 |

### Keywords

- Crowdsourcing
- Multi-task learning
- Tensor representation

### ASJC Scopus subject areas

- Software
- Computer Science Applications

### Cite this

**MultiC^{2}: An Optimization framework for learning from task and worker dual heterogeneity.** / Zhou, Yao; Ying, Lei; He, Jingrui.

*Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017.* Society for Industrial and Applied Mathematics Publications, 2017. pp. 579-587.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TY - GEN

T1 - MultiC2

T2 - An Optimization framework for learning from task and worker dual heterogeneity

AU - Zhou, Yao

AU - Ying, Lei

AU - He, Jingrui

PY - 2017

Y1 - 2017

KW - Crowdsourcing

KW - Multi-task learning

KW - Tensor representation

UR - http://www.scopus.com/inward/record.url?scp=85027831142&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85027831142&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85027831142

SP - 579

EP - 587

BT - Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017

PB - Society for Industrial and Applied Mathematics Publications

ER -