Shuffle-then-assemble: learning object-agnostic visual relationship features

Xu Yang, Hanwang Zhang, Jianfei Cai

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

13 Citations (Scopus)

Abstract

Due to the fact that it is prohibitively expensive to completely annotate visual relationships, i.e., the (obj1, rel, obj2) triplets, relationship models are inevitably biased to object classes of limited pairwise patterns, leading to poor generalization to rare or unseen object combinations. Therefore, we are interested in learning object-agnostic visual features for more generalizable relationship models. By “agnostic”, we mean that the feature is less likely biased to the classes of paired objects. To alleviate the bias, we propose a novel Shuffle-Then-Assemble pre-training strategy. First, we discard all the triplet relationship annotations in an image, leaving two unpaired object domains without obj1-obj2 alignment. Then, our feature learning is to recover possible obj1-obj2 pairs. In particular, we design a cycle of residual transformations between the two domains, to capture shared but not object-specific visual patterns. Extensive experiments on two visual relationship benchmarks show that by using our pre-trained features, naive relationship models can be consistently improved and even outperform other state-of-the-art relationship models. Code has been made available at: https://github.com/yangxuntu/vrd.
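
The shuffle step and the cycle of residual transformations described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). The abstract does not specify the networks or losses, so everything here is an assumption made for illustration: the single-layer `residual_transform`, the mean-squared `cycle_consistency_loss` (borrowed from common unpaired domain-translation practice), the feature dimension, and all variable names.

```python
import numpy as np


def residual_transform(x, w):
    """Hypothetical residual map: the input plus a learned correction."""
    return x + np.tanh(x @ w)


def cycle_consistency_loss(subj, obj, w_s2o, w_o2s):
    """Assumed penalty: mean squared error after a round trip through both domains."""
    subj_cycle = residual_transform(residual_transform(subj, w_s2o), w_o2s)
    obj_cycle = residual_transform(residual_transform(obj, w_o2s), w_s2o)
    return float(np.mean((subj_cycle - subj) ** 2) + np.mean((obj_cycle - obj) ** 2))


rng = np.random.default_rng(0)
dim = 8  # illustrative feature size, not taken from the paper
subj = rng.normal(size=(5, dim))  # stand-in features for the obj1 domain
obj = rng.normal(size=(5, dim))   # stand-in features for the obj2 domain

# "Shuffle": discard the obj1-obj2 alignment by permuting one domain.
obj_shuffled = obj[rng.permutation(len(obj))]

# With zero weights each residual map is the identity, so the round trip is exact.
w_zero = np.zeros((dim, dim))
loss_identity = cycle_consistency_loss(subj, obj_shuffled, w_zero, w_zero)

# With arbitrary weights the round trip no longer returns to the start.
w_s2o = rng.normal(scale=0.1, size=(dim, dim))
w_o2s = rng.normal(scale=0.1, size=(dim, dim))
loss_random = cycle_consistency_loss(subj, obj_shuffled, w_s2o, w_o2s)
loss_aligned = cycle_consistency_loss(subj, obj, w_s2o, w_o2s)

print(loss_identity, loss_random, loss_aligned)
```

Because this assumed loss averages over each domain separately, it does not depend on which obj1 is paired with which obj2, so it can be computed after the alignment has been discarded. Recovering plausible pairs (the "assemble" step) would need an additional objective that this sketch does not cover.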

Original language: English
Title of host publication: Computer Vision – ECCV 2018
Subtitle of host publication: 15th European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part XII
Editors: Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu, Yair Weiss
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 38-54
Number of pages: 17
ISBN (Electronic): 9783030012588
ISBN (Print): 9783030012571
Publication status: Published - 2018
Externally published: Yes
Event: European Conference on Computer Vision 2018 - Munich, Germany
Duration: 8 Sep 2018 to 14 Sep 2018
Conference number: 15th
https://eccv2018.org/
https://link.springer.com/book/10.1007/978-3-030-01246-5 (Proceedings)

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 11216
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Computer Vision 2018
Abbreviated title: ECCV 2018
Country/Territory: Germany
City: Munich
Period: 8/09/18 to 14/09/18