Tag recommendation in software information sites

Xin Xia, David Lo, Xinyu Wang, Bo Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

120 Citations (Scopus)


Software engineers use a variety of online media to search for and learn about new and interesting technologies, and to learn from and help one another. We refer to online media that help software engineers improve their performance in software development, maintenance, and testing as software information sites. Tags are common in software information sites, and many sites allow users to tag various objects with their own words. Users increasingly use tags to describe the most important features of their posted content or projects. In this paper, we propose TagCombine, an automatic tag recommendation method that analyzes objects in software information sites. TagCombine has three components: (1) a multi-label ranking component, which treats tag recommendation as a multi-label learning problem; (2) a similarity-based ranking component, which recommends tags from similar objects; and (3) a tag-term based ranking component, which considers the relationship between terms and tags and recommends tags after analyzing the terms in the objects. We evaluate TagCombine on two software information sites, StackOverflow and Freecode, which contain 47,668 and 39,231 text documents and 437 and 243 tags, respectively. Experimental results show that on StackOverflow, TagCombine achieves recall@5 and recall@10 scores of 0.5964 and 0.7239, respectively; on Freecode, it achieves recall@5 and recall@10 scores of 0.6391 and 0.7773, respectively. Moreover, averaged over the StackOverflow and Freecode results, TagCombine improves on TagRec, proposed by Al-Kofahi et al., by 22.65% and 14.95%, and on the tag recommendation method proposed by Zangerle et al. by 18.5% and 7.35%, for recall@5 and recall@10, respectively.
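To make the evaluation measure concrete, the following is a minimal sketch of how per-tag scores from several ranking components might be combined and how recall@k could be computed. The abstract does not state how the three components are merged or give the exact recall@k formula, so the weighted sum in `combine_scores`, the tie-breaking in `top_k`, the per-object averaging in `recall_at_k`, and all function names, tags, and weights are assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Sequence, Set


def combine_scores(component_scores: Sequence[Dict[str, float]],
                   weights: Sequence[float]) -> Dict[str, float]:
    """Weighted sum of per-tag scores from several components (assumed scheme)."""
    combined: Dict[str, float] = {}
    for scores, weight in zip(component_scores, weights):
        for tag, score in scores.items():
            combined[tag] = combined.get(tag, 0.0) + weight * score
    return combined


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring tags; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]


def recall_at_k(recommended: Sequence[Sequence[str]],
                actual: Sequence[Set[str]],
                k: int) -> float:
    """Average, over objects, of the fraction of true tags found in the top k."""
    per_object = []
    for ranked, truth in zip(recommended, actual):
        if not truth:
            continue  # skip objects without ground-truth tags (assumption)
        hits = len(set(ranked[:k]) & truth)
        per_object.append(hits / len(truth))
    return sum(per_object) / len(per_object) if per_object else 0.0


# Hypothetical scores for one object from the three kinds of components.
multi_label = {"java": 0.9, "python": 0.2}
similarity = {"java": 0.5, "swing": 0.6}
tag_term = {"swing": 0.4, "gui": 0.3}

combined = combine_scores([multi_label, similarity, tag_term], [1.0, 1.0, 1.0])
suggested = top_k(combined, 2)
score = recall_at_k([suggested], [{"java", "gui"}], 2)
```

With these made-up scores, the two recommended tags cover one of the object's two true tags, so recall@2 for that object is one half.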

Original language: English
Title of host publication: Proceedings - 2013 10th Working Conference on Mining Software Repositories, MSR 2013
Subtitle of host publication: May 18–19, 2013 San Francisco, CA, USA
Editors: Massimiliano Di Penta, Sunghun Kim
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Number of pages: 10
ISBN (Electronic): 21601860
ISBN (Print): 9781467329361, 21601852
Publication status: Published - 2013
Externally published: Yes
Event: IEEE International Working Conference on Mining Software Repositories 2013 - San Francisco, United States of America
Duration: 18 May 2013 to 19 May 2013
Conference number: 10th
https://ieeexplore.ieee.org/xpl/conhome/6597024/proceeding (Proceedings)


Conference: IEEE International Working Conference on Mining Software Repositories 2013
Abbreviated title: MSR 2013
Country/Territory: United States of America
City: San Francisco


  • Online media
  • Software information sites
  • Tag recommendation
  • TagCombine
