Abstract
Video data differ from images in having an extra temporal dimension, which introduces more content dependencies from various perspectives and makes it harder to learn representations for diverse video actions. Existing methods mainly focus on the dependency under one specific perspective, which is insufficient for categorizing complex video actions. This paper proposes a novel selective dependency aggregation (SDA) module, which adaptively exploits multiple types of video dependencies to refine the features. Specifically, we empirically investigate various long-range and short-range dependencies obtained through multi-direction, multi-scale feature squeeze and dependency excitation. Query-structured attention is then adopted to fuse them selectively, fully accounting for the diversity of videos' dependency preferences. Moreover, a channel reduction mechanism is incorporated into SDA to keep the additional computation cost low. Finally, we show that the SDA module can be easily plugged into different backbones to form SDA-Nets, and we demonstrate its effectiveness, efficiency, and robustness through extensive experiments on several video benchmarks for action classification. The code and models will be available at https://github.com/ty-97/SDA.
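The pipeline described in the abstract (channel reduction, directional feature squeeze, dependency excitation, and query-driven selective fusion) can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the function name `sda_sketch`, the three pooling branches, the random stand-in weights, the softmax branch scoring, and the final sigmoid gating are all assumptions made for illustration, and the multi-scale pooling mentioned in the abstract is omitted for brevity.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def sda_sketch(x, reduction=4, seed=0):
    """Hypothetical sketch. x has shape (C, T, H, W).

    Returns the refined features and the per-branch fusion weights.
    """
    rng = np.random.default_rng(seed)
    c = x.shape[0]
    c_red = max(c // reduction, 1)

    # Random stand-ins for weights that would be learned in a real model.
    w_reduce = rng.standard_normal((c_red, c)) / np.sqrt(c)
    w_query = rng.standard_normal((3, c_red)) / np.sqrt(c_red)
    w_excite = rng.standard_normal((3, c, c_red)) / np.sqrt(c_red)

    # Channel reduction: (C, T, H, W) -> (C_red, T, H, W).
    z = np.einsum("rc,cthw->rthw", w_reduce, x)

    # Feature squeeze along different directions (assumed branch set).
    squeezed = [
        z.mean(axis=(1, 2, 3), keepdims=True),  # global: (C_red, 1, 1, 1)
        z.mean(axis=(2, 3), keepdims=True),     # per frame: (C_red, T, 1, 1)
        z.mean(axis=1, keepdims=True),          # per location: (C_red, 1, H, W)
    ]

    # Dependency excitation: map each squeezed descriptor back to C channels.
    excited = [
        np.einsum("cr,rthw->cthw", w_excite[i], s)
        for i, s in enumerate(squeezed)
    ]

    # Query from a global descriptor -> one score per branch -> softmax.
    query = z.mean(axis=(1, 2, 3))
    weights = softmax(w_query @ query)

    # Selective fusion; broadcasting expands every branch to (C, T, H, W).
    fused = sum(w * e for w, e in zip(weights, excited))

    # Refine the input features with a sigmoid gate (assumed gating form).
    return x * sigmoid(fused), weights


x = np.random.default_rng(1).standard_normal((8, 4, 5, 5))
refined, branch_weights = sda_sketch(x)
```

In this sketch the branch weights depend on the input clip, which is one way to read the abstract's point that different videos prefer different dependencies; working in the reduced channel space keeps the pooling and excitation cost small relative to the backbone.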
Original language | English |
---|---|
Title of host publication | Proceedings of the 29th ACM International Conference on Multimedia |
Editors | Liqiang Nie, Qianru Sun, Peng Cui |
Place of Publication | New York NY USA |
Publisher | Association for Computing Machinery (ACM) |
Pages | 592-601 |
Number of pages | 10 |
ISBN (Electronic) | 9781450386517 |
Publication status | Published - 2021 |
Externally published | Yes |
Event | ACM International Conference on Multimedia 2021 - Chengdu, China; Duration: 20 Oct 2021 → 24 Oct 2021; Conference number: 29th; Proceedings: https://dl.acm.org/doi/proceedings/10.1145/3474085; Website: https://2021.acmmm.org/ |
Conference
Conference | ACM International Conference on Multimedia 2021 |
---|---|
Abbreviated title | MM 2021 |
Country/Territory | China |
City | Chengdu |
Period | 20/10/21 → 24/10/21 |
Keywords
- action classification
- selective dependency aggregation
- video content dependency