Abstract
Scene graph generation has received growing attention with advances in image understanding tasks such as object detection and attribute and relationship prediction. However, existing datasets are biased in their object and relationship labels, or often come with noisy and missing annotations, which makes developing a reliable scene graph prediction model very challenging. In this paper, we propose a novel scene graph generation algorithm that uses external knowledge and an image reconstruction loss to overcome these dataset issues. In particular, we extract commonsense knowledge from an external knowledge base to refine object and phrase features, improving generalizability in scene graph generation. To address the bias of noisy object annotations, we introduce an auxiliary image reconstruction path to regularize the scene graph generation network. Extensive experiments show that our framework generates better scene graphs, achieving state-of-the-art performance on two benchmark datasets: Visual Relationship Detection and Visual Genome.
Original language | English |
---|---|
Title of host publication | Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 |
Editors | Abhinav Gupta, Derek Hoiem, Gang Hua, Zhuowen Tu |
Place of Publication | Piscataway NJ USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 1969-1978 |
Number of pages | 10 |
ISBN (Electronic) | 9781728132938 |
ISBN (Print) | 9781728132945 |
Publication status | Published - 2019 |
Event | IEEE Conference on Computer Vision and Pattern Recognition 2019 - Long Beach, United States of America; Duration: 16 Jun 2019 → 20 Jun 2019; Conference number: 32nd; http://cvpr2019.thecvf.com/; https://ieeexplore.ieee.org/xpl/conhome/8938205/proceeding (Proceedings) |
Conference
Conference | IEEE Conference on Computer Vision and Pattern Recognition 2019 |
---|---|
Abbreviated title | CVPR 2019 |
Country | United States of America |
City | Long Beach |
Period | 16/06/19 → 20/06/19 |
Keywords
- Vision + Language
- Visual Reasoning