Scene graph generation with external knowledge and image reconstruction

Jiuxiang Gu, Handong Zhao, Zhe Lin, Sheng Li, Jianfei Cai, Mingyang Ling

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

223 Citations (Scopus)


Scene graph generation has received growing attention with advances in image understanding tasks such as object detection, attribute prediction, and relationship prediction. However, existing datasets are biased in terms of object and relationship labels, or often come with noisy and missing annotations, which makes the development of a reliable scene graph prediction model very challenging. In this paper, we propose a novel scene graph generation algorithm with external knowledge and an image reconstruction loss to overcome these dataset issues. In particular, we extract commonsense knowledge from an external knowledge base to refine object and phrase features, improving generalizability in scene graph generation. To address the bias of noisy object annotations, we introduce an auxiliary image reconstruction path to regularize the scene graph generation network. Extensive experiments show that our framework can generate better scene graphs, achieving state-of-the-art performance on two benchmark datasets: Visual Relationship Detection and Visual Genome.
Original language: English
Title of host publication: Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Editors: Abhinav Gupta, Derek Hoiem, Gang Hua, Zhuowen Tu
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Number of pages: 10
ISBN (Electronic): 9781728132938
ISBN (Print): 9781728132945
Publication status: Published - 2019
Event: IEEE Conference on Computer Vision and Pattern Recognition 2019 - Long Beach, United States of America
Duration: 16 Jun 2019 to 20 Jun 2019
Conference number: 32nd (Proceedings)


Conference: IEEE Conference on Computer Vision and Pattern Recognition 2019
Abbreviated title: CVPR 2019
Country/Territory: United States of America
City: Long Beach


  • Vision + Language
  • Visual Reasoning
