AndroZooOpen: collecting large-scale open source Android apps for the research community

Pei Liu, Li Li, Yanjie Zhao, Xiaoyu Sun, John Grundy

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

23 Citations (Scopus)


It is critical for research to have an open, well-curated, representative set of apps for analysis. We present a collection of open-source Android apps gathered from several sources, including GitHub. Our dataset, AndroZooOpen, currently contains over 45,000 app artefacts, a representative picture of GitHub-hosted Android apps. For apps released on Google Play, metadata, including categories, ratings and user reviews, is also stored. We share this new dataset as part of our ongoing research to better support and enable new research topics involving Android app artefact analysis, and as a supplementary dataset for AndroZoo, a well-known collection of closed-source Android apps.

Original language: English
Title of host publication: Proceedings - 2020 IEEE/ACM 17th International Conference on Mining Software Repositories, MSR 2020
Editors: Georgios Gousios, Sarah Nadi
Place of Publication: New York NY USA
Publisher: Association for Computing Machinery (ACM)
Number of pages: 5
ISBN (Electronic): 9781450379571
Publication status: Published - 2020
Event: IEEE International Working Conference on Mining Software Repositories 2020 - Seoul, Korea, South
Duration: 29 Jun 2020 to 30 Jun 2020
Conference number: 17th


Conference: IEEE International Working Conference on Mining Software Repositories 2020
Abbreviated title: MSR 2020
Country/Territory: Korea, South


  • Android
  • AndroZoo
  • AndroZooOpen
  • Open-source
