Annotator expertise and information quality in annotation-based retrieval

Wern Han Lim, Mark James Carman

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research


This paper investigates the annotation-based retrieval (AR) of World Wide Web (WWW) resources that have been annotated by users on Collaborative Tagging (CT) platforms as a form of user-generated content (UGC). Previous approaches have simply weighted the WWW resources according to their popularity, in order to leverage the inherent wisdom of the crowd (WotC). In this paper, we argue that popularity alone is not a sufficient indicator of quality since (1) some users are better annotators than others; (2) resource popularity can be easily inflated by malicious users; and (3) high-quality but highly specific resources may exhibit lower popularity than more general ones. Thus, we investigate indicators of information quality for WWW resources, particularly the user annotations that describe them. This research estimates the expertise of content annotators in order to infer the information quality of their contributions, by exploring the various signals available on social bookmarking platforms, such as the temporal information of annotations. The retrieval evaluation on social bookmarking data shows significant improvements when using the estimated user expertise and inferred information quality.
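To illustrate the core idea, a minimal sketch (not the paper's actual model) of how resources might be ranked by the expertise of their annotators rather than by raw popularity. The data triples, the expertise scores, and the `rank_resources` function are all hypothetical; in practice the expertise values would be estimated from platform signals such as the timing of a user's annotations.

```python
# Hypothetical sketch: expertise-weighted ranking vs. raw popularity.
from collections import defaultdict

# Toy bookmarking data as (user, resource, tag) triples -- illustrative only.
annotations = [
    ("alice", "doc1", "ml"), ("alice", "doc2", "ml"),
    ("bob",   "doc1", "ml"), ("bob",   "doc1", "ai"),
    ("carol", "doc3", "ml"),
]

# Assumed expertise scores in [0, 1], e.g. estimated from temporal signals
# such as how early a user annotates resources that later become popular.
expertise = {"alice": 0.9, "bob": 0.2, "carol": 0.7}

def rank_resources(annotations, expertise):
    """Score each resource by the summed expertise of its annotators,
    so one expert annotation can outweigh several low-expertise ones."""
    score = defaultdict(float)
    for user, resource, _tag in annotations:
        score[resource] += expertise.get(user, 0.0)
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)

print(rank_resources(annotations, expertise))
```

Under this weighting, `doc2` (one annotation by a high-expertise user) outranks `doc3` despite equal popularity, and repeated annotations by a low-expertise user contribute little, which is the intended defence against inflated popularity.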

Original language: English
Title of host publication: ADCS 2017
Subtitle of host publication: Proceedings of the 22nd Australasian Document Computing Symposium
Editors: Bevan Koopman, Guido Zuccon, Mark Carman
Place of publication: New York NY USA
Publisher: Association for Computing Machinery (ACM)
Number of pages: 8
ISBN (Print): 9781450363914
Publication status: Published - 7 Dec 2017
Event: Australasian Document Computing Symposium 2017 - Brisbane, Australia
Duration: 7 Dec 2017 – 8 Dec 2017
Conference number: 22nd


Conference: Australasian Document Computing Symposium 2017
Abbreviated title: ADCS 2017


Keywords:
  • Information quality
  • Information retrieval
  • User annotation
  • User expertise
