Object Detection for Graphical User Interface: old fashioned or deep learning or a combination?

Jieshan Chen, Mulong Xie, Zhenchang Xing, Chunyang Chen, Xiwei Xu, Liming Zhu, Guoqiang Li

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

3 Citations (Scopus)

Abstract

Detecting Graphical User Interface (GUI) elements in GUI images is a domain-specific object detection task. It supports many software engineering tasks, such as GUI animation and testing, GUI search and code generation. Existing studies for GUI element detection directly borrow mature methods from the computer vision (CV) domain, including old-fashioned ones that rely on traditional image processing features (e.g., Canny edges, contours), and deep learning models that learn to detect from large-scale GUI data. Unfortunately, these CV methods were not originally designed with awareness of the unique characteristics of GUIs and GUI elements, or of the high localization accuracy that the GUI element detection task requires. We conduct the first large-scale empirical study of seven representative GUI element detection methods on over 50k GUI images to understand the capabilities, limitations and effective designs of these methods. This study not only sheds light on the technical challenges to be addressed but also informs the design of new GUI element detection methods. We accordingly design a new GUI-specific old-fashioned method for non-text GUI element detection, which adopts a novel top-down coarse-to-fine strategy, and combine it with a mature deep learning model for GUI text detection. Our evaluation on 25,000 GUI images shows that our method significantly advances the state-of-the-art performance in GUI element detection.

Original language: English
Title of host publication: ESEC/FSE'20 - Proceedings of the 28th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Editors: Prem Devanbu, Myra Cohen, Thomas Zimmermann
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 1202-1214
Number of pages: 13
ISBN (Electronic): 9781450370431
Publication status: Published - 2020
Event: Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering 2020 - Virtual, United States of America
Duration: 8 Nov 2020 to 13 Nov 2020
Conference number: 28th
https://dl.acm.org/doi/proceedings/10.1145/3368089 (Proceedings)
https://2020.esec-fse.org (Website)

Conference

Conference: Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering 2020
Abbreviated title: ESEC/FSE 2020
Country: United States of America
City: Virtual
Period: 8/11/20 to 13/11/20

Keywords

  • Computer Vision
  • Deep Learning
  • Object Detection
  • User Interface
