Abstract
Video Semantic Segmentation (VSS) involves assigning a semantic label to each pixel in a video sequence. Prior work in this field has demonstrated promising results by extending image semantic segmentation models to exploit temporal relationships across video frames; however, these approaches often incur significant computational costs. In this paper, we propose an efficient mask propagation framework for VSS, called MPVSS. Our approach first employs a strong query-based image segmentor on sparse key frames to generate accurate binary masks and class predictions. We then design a flow estimation module that uses the learned queries to generate a set of segment-aware flow maps, each associated with a mask prediction from the key frame. Finally, each mask is warped with its associated flow map to serve as the mask prediction for the non-key frames. By reusing predictions from key frames, we circumvent the need to process a large volume of video frames individually with resource-intensive segmentors, alleviating temporal redundancy and significantly reducing computational costs. Extensive experiments on VSPW and Cityscapes demonstrate that our mask propagation framework achieves state-of-the-art (SOTA) accuracy-efficiency trade-offs. For instance, our best model with a Swin-L backbone outperforms the SOTA MRCFA with MiT-B5 by 4.0% mIoU while requiring only 26% of its FLOPs on the VSPW dataset. Moreover, our framework reduces FLOPs by up to 4× compared to the per-frame Mask2Former baseline, with at most 2% mIoU degradation on the Cityscapes validation set. Code is available at https://github.com/ziplab/MPVSS.
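The propagation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a minimal pure-Python illustration under stated assumptions: flow maps are given rather than predicted from queries, warping is nearest-neighbour backward sampling, and pixels are labelled by summing class probability times warped mask over segments. The names `warp_mask` and `propagate` are hypothetical.

```python
def warp_mask(mask, flow):
    """Backward-warp one key-frame mask to a non-key frame.

    Assumption: flow[y][x] = (dx, dy) points from pixel (x, y) of the
    non-key frame to the key-frame location to sample. Sampling is
    nearest-neighbour; locations outside the frame yield 0.
    """
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx, sy = round(x + dx), round(y + dy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = mask[sy][sx]
    return out


def propagate(masks, flows, class_probs):
    """Warp each key-frame mask with its own flow map, then label pixels.

    masks:       N key-frame masks, each H x W with values in [0, 1]
    flows:       N flow maps, one per mask (the "mask-flow pairs")
    class_probs: N rows of per-class probabilities, one row per mask
    Returns an H x W map of class indices for the non-key frame.
    """
    warped = [warp_mask(m, f) for m, f in zip(masks, flows)]
    h, w = len(masks[0]), len(masks[0][0])
    num_classes = len(class_probs[0])
    labels = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Per-class score: sum over segments of class prob * mask value.
            scores = [
                sum(class_probs[n][c] * warped[n][y][x] for n in range(len(warped)))
                for c in range(num_classes)
            ]
            labels[y][x] = max(range(num_classes), key=scores.__getitem__)
    return labels


# Toy 1 x 4 frame with two segments (all values are made up for illustration).
masks = [[[1, 1, 0, 0]], [[0, 0, 1, 1]]]
flows = [
    [[(-1, 0)] * 4],  # segment 0 moved one pixel to the right
    [[(0, 0)] * 4],   # segment 1 is static
]
class_probs = [[0.9, 0.1], [0.2, 0.8]]
labels = propagate(masks, flows, class_probs)
```

Giving each segment its own flow map lets the two segments move independently, which a single shared flow field could not express in this toy case. The expensive segmentor runs only on the key frame; the non-key frame costs only the warp and the per-pixel scoring.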
Original language | English |
---|---|
Title of host publication | Advances in Neural Information Processing Systems 36 (NeurIPS 2023) |
Editors | A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, S. Levine |
Place of Publication | San Diego CA USA |
Publisher | Neural Information Processing Systems (NIPS) |
Number of pages | 14 |
Publication status | Published - 2023 |
Event | Advances in Neural Information Processing Systems 2023, Ernest N. Morial Convention Center, New Orleans, United States of America; Duration: 10 Dec 2023 → 16 Dec 2023; Conference number: 37th; https://openreview.net/group?id=NeurIPS.cc/2023/Conference#tab-accept-oral; Website: https://neurips.cc/; Proceedings: https://papers.nips.cc/paper_files/paper/2023 |
Publication series
Name | Advances in Neural Information Processing Systems |
---|---|
Publisher | Neural Information Processing Systems (NIPS) |
Volume | 36 |
ISSN (Print) | 1049-5258 |
Conference
Conference | Advances in Neural Information Processing Systems 2023 |
---|---|
Abbreviated title | NeurIPS 2023 |
Country/Territory | United States of America |
City | New Orleans |
Period | 10/12/23 → 16/12/23 |