Scalable relevant project recommendation on GitHub

Wenyuan Xu, Xiaobing Sun, Xin Xia, Xiang Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

4 Citations (Scopus)


GitHub, one of the largest social coding platforms, fosters a flexible and collaborative development process. In practice, developers on this open source platform need to find projects relevant to their own work so that they can reuse functionality, explore ideas for possible features, or analyze requirements for their projects. Recommending relevant projects to a developer is difficult because millions of projects are hosted on GitHub and different developers have different requirements for what counts as relevant. In this paper, we propose a scalable and personalized approach that recommends projects by leveraging both developers' behaviors and project features. Based on the features of the projects that developers have created and on their behaviors toward other projects, our approach automatically recommends the top-N most relevant software projects to each developer. Moreover, to improve scalability, we implement our approach on a parallel processing framework (i.e., Apache Spark) so that large-scale GitHub data can be analyzed for efficient recommendation. We perform an empirical study on data crawled from GitHub, and the results show that our approach can efficiently recommend relevant software projects with relatively high precision that fit developers' interests.
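The abstract does not spell out how project features and developer behaviors are combined, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes each project is described by a bag of terms, that a developer profile is the sum of the term counts of the projects they created, and that behaviors are summarized as a hypothetical per-project score in [0, 1]. The names `recommend`, `behavior_scores`, and `alpha` are illustrative and do not come from the paper.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(own_projects, behavior_scores, candidates, n=2, alpha=0.5):
    """Return the names of the top-n candidate projects.

    own_projects:    list of term lists, one per project the developer created
    behavior_scores: hypothetical behavior-derived score per candidate (0..1)
    candidates:      mapping of candidate project name -> term list
    alpha:           assumed weight of feature similarity vs. behavior score
    """
    # Developer profile: summed term counts over the developer's own projects.
    profile = Counter()
    for terms in own_projects:
        profile.update(terms)

    scores = {}
    for name, terms in candidates.items():
        content = cosine(profile, Counter(terms))
        behavior = behavior_scores.get(name, 0.0)
        scores[name] = alpha * content + (1 - alpha) * behavior

    # Highest score first; ties are broken by name for a stable order.
    return sorted(scores, key=lambda k: (-scores[k], k))[:n]


own = [["spark", "recommendation"], ["github", "crawler"]]
behaviors = {"repo-b": 1.0}
cands = {
    "repo-a": ["spark", "recommendation", "github"],
    "repo-b": ["game", "engine"],
    "repo-c": ["css", "theme"],
}
top = recommend(own, behaviors, cands, n=2)
```

Each candidate is scored independently of the others, which is the property that would let the scoring loop be distributed across workers in a parallel framework such as Apache Spark. The sketch stays in plain Python to remain self-contained.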

Original language: English
Title of host publication: Internetware 2017 - 9th Asia-Pacific Symposium on Internetware
Subtitle of host publication: September 23, 2017, Shanghai, China
Editors: Hong Mei, Jian Lyu, Zhi Jin, Wenyun Zhao
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Number of pages: 10
ISBN (Electronic): 9781450353137
Publication status: Published - 2017
Externally published: Yes
Event: Asia-Pacific Symposium on Internetware 2017 - Shanghai, China
Duration: 23 Sep 2017 to 23 Sep 2017
Conference number: 9th


Conference: Asia-Pacific Symposium on Internetware 2017
Abbreviated title: Internetware 2017


Keywords:

  • GitHub
  • Parallel processing frame
  • Software recommendation
