TY - JOUR
T1 - Learning Context Flexible Attention Model for Long-Term Visual Place Recognition
AU - Chen, Zetao
AU - Liu, Lingqiao
AU - Sa, Inkyu
AU - Ge, Zongyuan
AU - Chli, Margarita
PY - 2018/10/1
Y1 - 2018/10/1
N2 - Identifying regions of interest in an image has long been of great importance in a wide range of tasks, including place recognition. In this letter, we propose a novel attention mechanism with flexible context, which can be incorporated into existing feedforward network architectures to learn image representations for long-term place recognition. In particular, to focus on regions that contribute positively to place recognition, we introduce a multiscale context-flexible network to estimate the importance of each spatial region in the feature map. Our model is trained end-to-end for place recognition and can detect regions of interest of arbitrary shape. Extensive experiments verify the effectiveness of our approach, and the results demonstrate that our model achieves consistently better performance than the state of the art on standard benchmark datasets. Finally, we visualize the learned attention maps to gain insight into what the network has learned to attend to.
AB - Identifying regions of interest in an image has long been of great importance in a wide range of tasks, including place recognition. In this letter, we propose a novel attention mechanism with flexible context, which can be incorporated into existing feedforward network architectures to learn image representations for long-term place recognition. In particular, to focus on regions that contribute positively to place recognition, we introduce a multiscale context-flexible network to estimate the importance of each spatial region in the feature map. Our model is trained end-to-end for place recognition and can detect regions of interest of arbitrary shape. Extensive experiments verify the effectiveness of our approach, and the results demonstrate that our model achieves consistently better performance than the state of the art on standard benchmark datasets. Finally, we visualize the learned attention maps to gain insight into what the network has learned to attend to.
KW - deep learning in robotics and automation
KW - localization
KW - visual-based navigation
UR - http://www.scopus.com/inward/record.url?scp=85060562544&partnerID=8YFLogxK
U2 - 10.1109/LRA.2018.2859916
DO - 10.1109/LRA.2018.2859916
M3 - Article
AN - SCOPUS:85060562544
VL - 3
SP - 4015
EP - 4022
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
SN - 2377-3766
IS - 4
M1 - 8421024
ER -