Key-word-aware network for referring expression image segmentation

Hengcan Shi, Hongliang Li, Fanman Meng, Qingbo Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

72 Citations (Scopus)


Referring expression image segmentation aims to segment the object referred to by a natural language query expression. Existing works usually approach this task without considering the specific properties of visual and textual information: they directly feed a foreground/background classifier with cascaded image and text features, which are extracted from each image region and from the whole query, respectively. On the one hand, they ignore that each word in a query expression contributes differently to identifying the desired object, which calls for differential treatment when extracting text features. On the other hand, they do not consider the relationships among different image regions either, even though these relationships are very important for eliminating undesired foreground objects according to the specific query. To address these issues, we propose a key-word-aware network, which contains a query attention model and a key-word-aware visual context model. When extracting text features, the query attention model assigns higher weights to the words that are more important for identifying the object. Meanwhile, the key-word-aware visual context model describes the relationships among different image regions according to the corresponding query. Our proposed method outperforms state-of-the-art methods on two referring expression image segmentation databases.
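As a rough illustration of the word-weighting idea in the abstract, the following minimal sketch pools per-word features with softmax attention weights. It is not the authors' model: the function names (`softmax`, `attend_query`), the linear scoring vector, and the toy two-dimensional word features are all hypothetical and chosen only to show how more informative words can receive higher weights when a query feature is extracted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_query(word_features, scoring_vector):
    """Return (weights, pooled) for a list of per-word feature vectors.

    Each word gets a scalar score (dot product with a hypothetical
    scoring vector), the scores are normalised with softmax, and the
    query feature is the weighted sum of the word features.
    """
    scores = [
        sum(f * w for f, w in zip(feat, scoring_vector))
        for feat in word_features
    ]
    weights = softmax(scores)
    dim = len(word_features[0])
    pooled = [
        sum(weights[i] * word_features[i][d] for i in range(len(word_features)))
        for d in range(dim)
    ]
    return weights, pooled


# Toy query with made-up features (assumption, for illustration only).
words = ["the", "man", "on", "the", "left"]
features = [[0.1, 0.0], [0.9, 0.2], [0.1, 0.1], [0.1, 0.0], [0.3, 1.0]]
scoring_vector = [2.0, 2.0]

weights, pooled = attend_query(features, scoring_vector)
for word, weight in zip(words, weights):
    print(f"{word:>5s}  {weight:.3f}")
```

In this toy setup the content words "man" and "left" receive larger weights than the function words, so they dominate the pooled query feature. In a trained network the scores would come from learned parameters rather than a fixed vector.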

Original language: English
Title of host publication: Computer Vision – ECCV 2018
Subtitle of host publication: 15th European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part VI
Editors: Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu, Yair Weiss
Place of Publication: Cham, Switzerland
Number of pages: 17
ISBN (Electronic): 9783030012311
ISBN (Print): 9783030012304
Publication status: Published - 2018
Externally published: Yes
Event: European Conference on Computer Vision 2018 - Munich, Germany
Duration: 8 Sept 2018 → 14 Sept 2018
Conference number: 15th (Proceedings)

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: European Conference on Computer Vision 2018
Abbreviated title: ECCV 2018

  • Key word extraction
  • Key-word-aware visual context
  • Query attention
  • Referring expression image segmentation
