CATS: co-saliency activated tracklet selection for video co-localization

Koteswar Rao Jerripothula, Jianfei Cai, Junsong Yuan

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

11 Citations (Scopus)

Abstract

Video co-localization is the task of jointly localizing common objects across videos. Because object appearance varies both across videos and within a single video, identifying and tracking these objects without any supervision is challenging. In contrast to previous joint frameworks that use bounding box proposals to attack the problem, we propose to leverage co-saliency activated tracklets. To identify the common visual object, we first exploit inter-video commonness, intra-video commonness, and motion saliency to generate co-saliency maps. Object proposals with high objectness and co-saliency scores are then tracked across short video intervals to build tracklets. The best tube for a video is obtained by selecting one tracklet from each of these intervals, based on tracklet confidence and on smoothness between adjacent tracklets, using dynamic programming. Experimental results on the benchmark YouTube Object dataset show that the proposed method outperforms state-of-the-art methods.
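The tracklet selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract does not define the confidence or smoothness terms, so the sketch assumes that each tracklet carries a precomputed scalar confidence, that smoothness is the intersection-over-union (IoU) between the last box of one tracklet and the first box of the next, and that every interval contains at least one tracklet. The names `select_tracklets`, `iou`, and `smooth_weight` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_tracklets(
    confidence: Sequence[Sequence[float]],
    start_boxes: Sequence[Sequence[Box]],
    end_boxes: Sequence[Sequence[Box]],
    smooth_weight: float = 1.0,
) -> List[int]:
    """Pick one tracklet per interval with a Viterbi-style dynamic program.

    confidence[t][i]  : assumed scalar score of tracklet i in interval t
    start_boxes[t][i] : first box of that tracklet
    end_boxes[t][i]   : last box of that tracklet
    Maximizes the sum of confidences plus smooth_weight times the IoU
    between consecutive chosen tracklets (an assumed smoothness term).
    """
    n = len(confidence)
    if n == 0:
        return []
    best = [list(confidence[0])]  # best[t][j]: best total score ending at j
    back: List[List[int]] = []    # back[t-1][j]: best predecessor of j
    for t in range(1, n):
        row, ptr = [], []
        for j, conf_j in enumerate(confidence[t]):
            # Score of reaching tracklet j from each tracklet i of interval t-1.
            cands = [
                best[t - 1][i]
                + smooth_weight * iou(end_boxes[t - 1][i], start_boxes[t][j])
                for i in range(len(confidence[t - 1]))
            ]
            i_star = max(range(len(cands)), key=cands.__getitem__)
            row.append(cands[i_star] + conf_j)
            ptr.append(i_star)
        best.append(row)
        back.append(ptr)
    # Backtrack from the best final tracklet to recover the whole path.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return path
```

Under these assumptions, the returned list gives the index of the chosen tracklet in each interval; concatenating those tracklets yields a tube. Setting `smooth_weight` to 0 reduces the selection to picking the most confident tracklet in each interval independently.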

Original language: English
Title of host publication: Computer Vision – ECCV 2016
Subtitle of host publication: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part VII
Editors: Bastian Leibe, Jiri Matas, Nicu Sebe, Max Welling
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 187-202
Number of pages: 16
ISBN (Electronic): 9783319464787
ISBN (Print): 9783319464770
DOIs: https://doi.org/10.1007/978-3-319-46478-7_12
Publication status: Published - 2016
Externally published: Yes
Event: European Conference on Computer Vision 2016 - Amsterdam, Netherlands
Duration: 8 Oct 2016 to 16 Oct 2016
Conference number: 14th
http://www.eccv2016.org/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 9911
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Computer Vision 2016
Abbreviated title: ECCV 2016
Country: Netherlands
City: Amsterdam
Period: 8/10/16 to 16/10/16
Internet address: http://www.eccv2016.org/

Keywords

  • CATS
  • Co-detection
  • Co-localization
  • Co-saliency
  • Tracklet
  • Video

Cite this

Jerripothula, K. R., Cai, J., & Yuan, J. (2016). CATS: co-saliency activated tracklet selection for video co-localization. In B. Leibe, J. Matas, N. Sebe, & M. Welling (Eds.), Computer Vision – ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part VII (pp. 187-202). (Lecture Notes in Computer Science; Vol. 9911). Springer. https://doi.org/10.1007/978-3-319-46478-7_12