Knowledge driven temporal activity localization

Changlin Li, Zhihui Li, Zongyuan Ge, Mingjie Li

Research output: Contribution to journal, Article, Research, peer-review

1 Citation (Scopus)

Abstract

In this paper, we focus on the problem of temporal activity detection, which aims to directly predict the temporal boundaries of actions. Most existing temporal activity detection algorithms classify each action proposal separately and neglect vital semantic correlations between the actions in a video. This degrades classification performance in long-tail scenarios, where only a handful of examples are available for uncommon actions. To solve this problem, we propose to incorporate knowledge to reason over large-scale action classes and maintain semantic coherence within a video. Specifically, we employ an implicit knowledge reasoning module and an explicit knowledge reasoning module, which impose knowledge constraints to facilitate temporal activity localization. We evaluate the proposed method on two large-scale action detection datasets, ActivityNet and THUMOS’14, and the experimental results demonstrate its superiority. Code and models will be released once this paper is accepted.

Original language: English
Article number: 102628
Number of pages: 7
Journal: Journal of Visual Communication and Image Representation
Volume: 64
Publication status: Published - Oct 2019

Keywords

  • Knowledge constraints
  • Reasoning module
  • Temporal activity detection
