Dual-View Learning from Crowds

Huan Zhang, Liangxiao Jiang, Wenjun Zhang, Geoffrey I. Webb

Research output: Contribution to journal › Article › Research › peer-review

2 Citations (Scopus)

Abstract

Crowdsourcing services provide a fast and cheap way to obtain substantial amounts of labeled data by employing crowd workers on the Internet. In crowdsourcing learning, two-stage methods are widely used: they first infer an integrated label for each instance and then build a learning model from the instances with their integrated labels. However, existing two-stage methods focus mainly on inferring more accurate integrated labels. Most of them then treat the integrated labels directly as class labels when building the learning model, which discards the detailed per-worker labeling information in the multiple noisy labels and thus results in sub-optimal model accuracy. To solve this problem, we take the multiple noisy labels of each instance as its attribute value vector to construct a second view alongside the original attribute view, and propose a novel two-stage method called dual-view learning from crowds (DVLFC). In DVLFC, we first select workers with a sufficient number of labels and augment the multiple noisy label set of each instance; we then build a supervised learning model in each view; finally, we fuse their class-membership probabilities to obtain the final classification result. Extensive experiments on both real-world and artificial crowdsourced datasets demonstrate the effectiveness of DVLFC.
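The dual-view idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifiers, the fusion rule, the worker-selection threshold, or the label-augmentation procedure, so the sketch assumes majority voting for label integration, a nearest-centroid model for the attribute view, a categorical naive Bayes model for the noisy-label view, and simple averaging of the two probability vectors. Worker selection and label augmentation are omitted. All function names, the `MISSING` encoding, and the toy data are hypothetical.

```python
from collections import Counter
import math

MISSING = -1  # assumed encoding: this worker did not label the instance


def majority_vote(noisy_labels):
    """Integrate one instance's noisy labels (assumed rule; ties go to the smallest label)."""
    counts = Counter(l for l in noisy_labels if l != MISSING)
    best = max(counts.values())
    return min(l for l, c in counts.items() if c == best)


def fit_centroids(X, y, classes):
    """Attribute view: per-class mean vector, a stand-in for any probabilistic classifier."""
    cents = {}
    for c in classes:
        rows = [x for x, t in zip(X, y) if t == c]
        cents[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents


def attr_view_proba(cents, x, classes):
    """Turn squared distances to the centroids into class-membership probabilities."""
    d = {c: sum((a - b) ** 2 for a, b in zip(x, cents[c])) for c in classes}
    m = min(d.values())  # shift by the minimum for numerical stability
    w = {c: math.exp(-(d[c] - m)) for c in classes}
    z = sum(w.values())
    return {c: v / z for c, v in w.items()}


def fit_label_nb(L, y, classes):
    """Noisy-label view: each worker's label is treated as a categorical attribute."""
    k = len(classes)
    prior = {c: (sum(t == c for t in y) + 1) / (len(y) + k) for c in classes}
    cond = {}  # cond[(worker, class)][label] = Laplace-smoothed probability
    for w in range(len(L[0])):
        for c in classes:
            cnt = Counter(r[w] for r, t in zip(L, y) if t == c and r[w] != MISSING)
            total = sum(cnt.values())
            cond[(w, c)] = {l: (cnt[l] + 1) / (total + k) for l in classes}
    return prior, cond


def label_view_proba(model, row, classes):
    """Naive Bayes posterior over classes, skipping workers with no label."""
    prior, cond = model
    scores = {}
    for c in classes:
        s = prior[c]
        for w, l in enumerate(row):
            if l != MISSING:
                s *= cond[(w, c)][l]
        scores[c] = s
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}


def fuse(p_attr, p_label):
    """Assumed fusion rule: unweighted average of the two views' probabilities."""
    return {c: 0.5 * (p_attr[c] + p_label[c]) for c in p_attr}


# Toy data (hypothetical): 4 instances, 2 attributes, 3 workers, 2 classes.
classes = [0, 1]
X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
L = [[0, 0, 1], [0, MISSING, 0], [1, 1, 0], [1, 1, 1]]

# Stage 1: infer integrated labels. Stage 2: one model per view, then fuse.
y_hat = [majority_vote(row) for row in L]
attr_model = fit_centroids(X, y_hat, classes)
label_model = fit_label_nb(L, y_hat, classes)

x_new, l_new = [0.1, 0.2], [0, 1, 0]
fused = fuse(attr_view_proba(attr_model, x_new, classes),
             label_view_proba(label_model, l_new, classes))
prediction = max(fused, key=fused.get)
print(fused, prediction)
```

The point of the second view is visible in `fit_label_nb`: instead of collapsing each row of worker labels into a single integrated label and discarding the rest, the model learns how each individual worker's answers relate to the class, so that per-worker information still contributes to the final fused prediction.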

Original language: English
Article number: 61
Number of pages: 21
Journal: ACM Transactions on Knowledge Discovery from Data
Volume: 19
Issue number: 3
Publication status: Published - 20 Feb 2025

Keywords

  • crowdsourcing
  • dual-view learning
  • label integration
  • model accuracy
