Abstract
Recent studies have shown that aggregating context information from proposals in different frames can clearly enhance the performance of video object detection. However, these approaches mainly exploit proposal relations within a single video, while ignoring proposal relations among different videos, which can provide important discriminative cues for recognizing confusing objects. To address this limitation, we propose a novel Inter-Video Proposal Relation module. Based on a concise multi-level triplet selection scheme, this module learns effective object representations by modeling relations of hard proposals among different videos. Moreover, we design a Hierarchical Video Relation Network (HVR-Net) that integrates intra-video and inter-video proposal relations in a hierarchical fashion. This design progressively exploits both intra-video and inter-video contexts to boost video object detection. We evaluate our method on the large-scale video object detection benchmark ImageNet VID, where HVR-Net achieves state-of-the-art results. Code and models are available at https://github.com/youthHan/HVRNet.
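
The idea of selecting hard triplets among proposals can be illustrated with a minimal sketch. It is not the HVR-Net implementation and does not reproduce the multi-level scheme: it assumes each proposal is a plain feature vector with a class label, uses Euclidean distance, and picks, for each anchor, the farthest same-class proposal and the nearest other-class proposal. The function names (`hardest_triplets`, `triplet_loss`) and the margin value are hypothetical choices made for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hardest_triplets(features, labels):
    """For each anchor, select its hardest positive and hardest negative.

    Hardest positive: the farthest proposal with the same label.
    Hardest negative: the nearest proposal with a different label.
    Anchors lacking either a positive or a negative are skipped.
    """
    triplets = []
    for a, (fa, la) in enumerate(zip(features, labels)):
        positives = [i for i, l in enumerate(labels) if l == la and i != a]
        negatives = [i for i, l in enumerate(labels) if l != la]
        if not positives or not negatives:
            continue
        pos = max(positives, key=lambda i: euclidean(fa, features[i]))
        neg = min(negatives, key=lambda i: euclidean(fa, features[i]))
        triplets.append((a, pos, neg))
    return triplets


def triplet_loss(features, triplets, margin=1.0):
    """Mean hinge loss max(0, d(a, p) - d(a, n) + margin) over the triplets."""
    if not triplets:
        return 0.0
    losses = [
        max(0.0, euclidean(features[a], features[p])
            - euclidean(features[a], features[n]) + margin)
        for a, p, n in triplets
    ]
    return sum(losses) / len(losses)


# Toy proposals: two classes laid out on a line (hypothetical data).
features = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
labels = [0, 0, 1, 1]
triplets = hardest_triplets(features, labels)
loss = triplet_loss(features, triplets, margin=2.5)
```

In this toy setting the loss is zero once every negative lies at least `margin` farther from its anchor than the positive does, which is the kind of separation a triplet objective pushes confusing proposals toward.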
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2020 |
Subtitle of host publication | 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXI |
Editors | Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm |
Place of Publication | Cham, Switzerland |
Publisher | Springer |
Pages | 431-446 |
Number of pages | 16 |
ISBN (Electronic) | 9783030585891 |
ISBN (Print) | 9783030585884 |
DOIs | |
Publication status | Published - 2020 |
Event | European Conference on Computer Vision 2020, Glasgow, United Kingdom; Duration: 23 Aug 2020 → 28 Aug 2020; Conference number: 16th; https://link.springer.com/book/10.1007/978-3-030-58452-8 (Proceedings); https://eccv2020.eu (Website) |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 12366 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Conference on Computer Vision 2020 |
---|---|
Abbreviated title | ECCV 2020 |
Country | United Kingdom |
City | Glasgow |
Period | 23/08/20 → 28/08/20 |
Keywords
- Hierarchical Video Relation Network
- Inter-Video Proposal Relation
- Multi-level triplet selection
- Video object detection