Tell them apart: distilling technology differences from crowd-scale comparison discussions

Yi Huang, Chunyang Chen, Zhenchang Xing, Tian Lin, Yang Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

4 Citations (Scopus)

Abstract

Developers can choose among different technologies for many software development tasks. However, when several technologies offer comparable functionality, selecting the most appropriate one is not easy, because comparing technologies by trial and error is time-consuming. Developers can instead resort to expert articles, official documentation, or Q&A sites for technology comparisons, but finding a comprehensive comparison this way is a matter of chance, as online information is often fragmented or contradictory. To overcome these limitations, we propose the diffTech system, which exploits crowdsourced discussions from Stack Overflow and assists technology comparison with an informative summary of different comparison aspects. We first build a large database of comparable software technologies by mining tags in Stack Overflow, and locate comparative sentences about comparable technologies with NLP methods. We then mine prominent comparison aspects by clustering similar comparative sentences and representing each cluster with its keywords. The evaluation demonstrates both the accuracy and usefulness of our model, and we implement a practical website for public use.
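The last two steps of the abstract (locating comparative sentences, then grouping them by comparison aspect) can be illustrated with a minimal sketch. The abstract does not name the NLP or clustering methods, so this is not the diffTech implementation: a regular expression stands in for comparative-sentence detection, and a small hand-written keyword lexicon stands in for clustering. The names COMPARATIVE, ASPECTS, comparative_sentences and group_by_aspect, as well as the sample sentences, are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a crude comparative cue ("faster than", "more readable than").
# This regex is a stand-in for the unspecified NLP methods in the abstract.
COMPARATIVE = re.compile(
    r"\b(?:more|less|better|worse|\w+er)\s+(?:\w+\s+)?than\b", re.IGNORECASE
)

# Assumption: a hypothetical aspect lexicon, used here in place of clustering.
ASPECTS = {
    "performance": {"faster", "slower", "speed", "performance"},
    "usability": {"easier", "harder", "simpler", "readable"},
}


def comparative_sentences(sentences, tech_a, tech_b):
    """Keep sentences that mention both technologies and contain a comparative cue."""
    kept = []
    for sentence in sentences:
        lower = sentence.lower()
        if tech_a in lower and tech_b in lower and COMPARATIVE.search(sentence):
            kept.append(sentence)
    return kept


def group_by_aspect(sentences):
    """Group sentences under the first aspect whose keywords they contain."""
    groups = defaultdict(list)
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        label = next((a for a, kw in ASPECTS.items() if words & kw), "other")
        groups[label].append(sentence)
    return dict(groups)


sample = [
    "PostgreSQL is faster than MySQL for complex joins.",
    "MySQL is easier than PostgreSQL to set up.",
    "I use MySQL and PostgreSQL at work.",
    "Python is more readable than Perl.",
]
found = comparative_sentences(sample, "mysql", "postgresql")
summary = group_by_aspect(found)
```

In this sketch the third sample sentence is dropped because it contains no comparative cue, and the fourth because it does not mention both technologies. A real system would need more robust sentence analysis than a single pattern.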

Original language: English
Title of host publication: ASE'18 - Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering
Subtitle of host publication: September 3–7, 2018 Montpellier, France
Editors: Gordon Fraser, Christian Kästner
Place of Publication: New York NY USA
Publisher: Association for Computing Machinery (ACM)
Pages: 214-224
Number of pages: 11
ISBN (Electronic): 9781450359375
DOI: https://doi.org/10.1145/3238147.3238208
Publication status: Published - 2018
Event: Automated Software Engineering Conference 2018 - Corum Conference Center, Montpellier, France
Duration: 3 Sep 2018 to 7 Sep 2018
Conference number: 33rd
http://www.ase2018.com/

Conference

Conference: Automated Software Engineering Conference 2018
Abbreviated title: ASE 2018
Country: France
City: Montpellier
Period: 3/09/18 to 7/09/18

Keywords

  • Differencing similar technology
  • NLP
  • Stack Overflow

Cite this

Huang, Y., Chen, C., Xing, Z., Lin, T., & Liu, Y. (2018). Tell them apart: distilling technology differences from crowd-scale comparison discussions. In G. Fraser, & C. Kästner (Eds.), ASE'18 - Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering: September 3–7, 2018 Montpellier, France (pp. 214-224). Association for Computing Machinery (ACM). https://doi.org/10.1145/3238147.3238208