Picaso: Enhancing API Recommendations with Relevant Stack Overflow Posts

  • Ivana Clairine Irsan
  • Ting Zhang
  • Ferdian Thung
  • Kisub Kim
  • David Lo

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

While having options can be liberating, too many options can lead to a sub-optimal choice. The software engineering domain is no exception. APIs have become indispensable in making software developers' lives easier: they help developers implement functionality faster and more efficiently. However, given the large number of open-source libraries available, choosing the right APIs is not a simple task. Previous studies on API recommendation leverage natural language queries to identify which API would suit a given task. However, these studies consider only one source of input, i.e., GitHub or Stack Overflow, independently. No existing approach utilizes Stack Overflow to help generate better API sequence recommendations from queries obtained from GitHub. Therefore, in this study, we aim to provide a framework that improves API sequence recommendation by leveraging information from Stack Overflow. We propose Picaso, which leverages contrastive learning to train a sentence embedding model and a cross-encoder model, building a classification model that finds a semantically similar Stack Overflow post given an annotation (i.e., a code comment). Picaso then uses the Stack Overflow post's title as a query expansion and uses the expanded queries to fine-tune CodeBERT, resulting in an API sequence generation model. In our experiments, incorporating the Stack Overflow information into CodeBERT improved the BLEU-4 score of API sequence generation by 10.8%.
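
The query-expansion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: Picaso matches annotations to posts with a contrastively trained sentence embedding model and a cross-encoder, whereas this sketch swaps in a plain bag-of-words cosine similarity so that it runs without any model. The function names, the `threshold` value, and the `" | "` separator are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def expand_query(annotation, so_titles, threshold=0.3, sep=" | "):
    """Append the most similar Stack Overflow title to the annotation.

    Stand-in for Picaso's learned matcher (assumption): similarity here is
    bag-of-words cosine, and `threshold` and `sep` are hypothetical choices.
    The annotation is returned unchanged when no title is similar enough.
    """
    query_vec = Counter(tokenize(annotation))
    best_title, best_score = None, 0.0
    for title in so_titles:
        score = cosine(query_vec, Counter(tokenize(title)))
        if score > best_score:
            best_title, best_score = title, score
    if best_title is None or best_score < threshold:
        return annotation
    return annotation + sep + best_title


titles = [
    "How to read a file line by line into a list?",
    "How do I sort a dictionary by value?",
]
expanded = expand_query("read a text file line by line", titles)
print(expanded)
```

In the pipeline the abstract describes, expanded queries of this kind (annotation plus matched post title) are the inputs used to fine-tune CodeBERT for API sequence generation.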

Original language: English
Title of host publication: Proceedings - 2023 IEEE/ACM 20th International Conference on Mining Software Repositories, MSR 2023
Editors: Daniel Alencar Da Costa, Gema Rodríguez-Pérez
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 92-103
Number of pages: 12
ISBN (Electronic): 9798350311846
ISBN (Print): 9798350311853
Publication status: Published - 2023
Externally published: Yes
Event: IEEE/ACM International Conference on Mining Software Repositories 2023 - Melbourne, Australia
Duration: 15 May 2023 to 16 May 2023
Conference number: 20th
https://ieeexplore.ieee.org/xpl/conhome/10173934/proceeding (Proceedings)
https://conf.researchr.org/home/msr-2023 (Website)

Conference

Conference: IEEE/ACM International Conference on Mining Software Repositories 2023
Abbreviated title: MSR 2023
Country/Territory: Australia
City: Melbourne
Period: 15/05/23 to 16/05/23

Keywords

  • API recommendation
  • Multi-source analytics
  • Pre-trained Models
  • Query Expansion
  • Sequence Generation
  • Stack Overflow
