Abstract
Change captioning is the task of describing the difference between images in natural language. Most existing methods treat this problem as a difference judgment and assume that no distractors, such as viewpoint changes, are present. In practice, however, viewpoint changes occur often and can overwhelm the semantic difference to be described. In this paper, we propose a novel visual encoder that explicitly distinguishes viewpoint changes from semantic changes in the change captioning task. We also simulate the attention preference of humans and propose a novel reinforcement learning process that fine-tunes the attention directly with language evaluation rewards. Extensive experimental results show that our method outperforms state-of-the-art approaches by a large margin on both the Spot-the-Diff and CLEVR-Change datasets.
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2020 |
Subtitle of host publication | 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XIV |
Editors | Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm |
Place of Publication | Cham, Switzerland |
Publisher | Springer |
Pages | 574-590 |
Number of pages | 17 |
ISBN (Electronic) | 9783030585686 |
ISBN (Print) | 9783030585679 |
Publication status | Published - 2020 |
Event | European Conference on Computer Vision 2020 - Glasgow, United Kingdom; Duration: 23 Aug 2020 → 28 Aug 2020; Conference number: 16th; Proceedings: https://link.springer.com/book/10.1007/978-3-030-58452-8; Website: https://eccv2020.eu |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 12359 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Conference on Computer Vision 2020 |
---|---|
Abbreviated title | ECCV 2020 |
Country | United Kingdom |
City | Glasgow |
Period | 23/08/20 → 28/08/20 |
Keywords
- Attention
- Change captioning
- Image captioning
- Reinforcement learning