Computing crowd consensus with partial agreement

Nguyen Quoc Viet Hung, Viet Huynh, Nguyen Thanh Tam, Matthias Weidlich, Hongzhi Yin, Xiaofang Zhou

Research output: Contribution to journal › Article › Research › peer-review

14 Citations (Scopus)

Abstract

Crowdsourcing has been widely established as a means to enable human computation at large scale, in particular for tasks that require manual labelling of large sets of data items. Answers obtained from heterogeneous crowd workers are aggregated to obtain a robust result. However, existing methods for answer aggregation are designed for discrete tasks, where answers are given as a single label per item. In this paper, we consider partial-agreement tasks that are common in many applications such as image tagging and document annotation, where items are assigned sets of labels. Common approaches for the aggregation of partial-agreement answers either (i) reduce the problem to several instances of an aggregation problem for discrete tasks or (ii) consider each label independently. Going beyond the state of the art, we propose a novel Bayesian nonparametric model to aggregate the partial-agreement answers in a generic way. This model enables us to compute the consensus of partially-sound and partially-complete worker answers, while taking into account mutual relationships in labels and different answer sets. We also show how this model is instantiated for incremental learning, incorporating new answers from crowd workers as they arrive. An evaluation of our method using real-world datasets reveals that it consistently outperforms the state of the art in terms of precision, recall, and robustness against faulty workers and data sparsity.
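For orientation, baseline (ii) from the abstract, where each label is considered independently, can be sketched as a per-label majority vote. This is only an illustrative sketch of that baseline, not the Bayesian nonparametric model proposed in the paper. The function name `aggregate_per_label`, the `threshold` parameter, and the sample data are all assumptions made for the example.

```python
from collections import Counter


def aggregate_per_label(answers, threshold=0.5):
    """Per-label majority vote (hypothetical helper, not the paper's method).

    answers: dict mapping an item id to a list of label sets, one per worker.
    A label is kept for an item if strictly more than `threshold` of the
    workers who answered that item included it.
    """
    consensus = {}
    for item, label_sets in answers.items():
        n_workers = len(label_sets)
        # Count, for each label, how many workers assigned it to this item.
        votes = Counter(label for labels in label_sets for label in set(labels))
        consensus[item] = {
            label for label, count in votes.items()
            if count / n_workers > threshold
        }
    return consensus


# Made-up example data: three workers tag img1, four workers tag img2.
answers = {
    "img1": [{"cat", "animal"}, {"cat"}, {"dog", "animal"}],
    "img2": [{"car"}, {"car", "road"}, {"road"}, {"tree"}],
}
result = aggregate_per_label(answers)
print(result)
```

Because every label is voted on in isolation, this sketch ignores the relationships between labels and between answer sets that the abstract says the proposed model takes into account. In the sample data, no label for `img2` reaches a strict majority, so the vote returns an empty set for that item.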

Original language: English
Pages (from-to): 1-14
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 30
Issue number: 1
Publication status: Published - Jan 2018
Externally published: Yes

Keywords

  • Answer aggregation
  • Bayesian models
  • Crowdsourcing
  • Nonparametric models
