Abstract
We aim to search for a target person from a gallery of whole scene images for which the annotations of pedestrian bounding boxes are unavailable. Previous approaches to this problem have relied on a pedestrian proposal net, which may generate redundant proposals and increase the computational burden. In this paper, we address this problem by training relational context-aware agents which learn the actions to localize the target person from the gallery of whole scene images. We incorporate the relational spatial and temporal contexts into the framework. Specifically, we propose to use the target person as the query in the query-dependent relational network. The agent determines the best action to take at each time step by simultaneously considering the local visual information, the relational and temporal contexts, together with the target person. To validate the performance of our approach, we conduct extensive experiments on the large-scale Person Search benchmark dataset and achieve significant improvements over the compared approaches. It is also worth noting that the proposed model even performs better than traditional methods with perfect pedestrian detectors.
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2018 |
Subtitle of host publication | 15th European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part IX |
Editors | Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu, Yair Weiss |
Place of Publication | Cham, Switzerland |
Publisher | Springer |
Pages | 86-102 |
Number of pages | 17 |
ISBN (Electronic) | 9783030012403 |
ISBN (Print) | 9783030012397 |
Publication status | Published - 2018 |
Externally published | Yes |
Event | European Conference on Computer Vision 2018, Munich, Germany; Duration: 8 Sept 2018 → 14 Sept 2018; Conference number: 15th; Website: https://eccv2018.org/ ; Proceedings: https://link.springer.com/book/10.1007/978-3-030-01246-5 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 11213 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Conference on Computer Vision 2018 |
---|---|
Abbreviated title | ECCV 2018 |
Country/Territory | Germany |
City | Munich |
Period | 8/09/18 → 14/09/18 |
Keywords
- Person search
- Relational network