Spatial-Temporal Transformer for 3D point cloud sequences

Yimin Wei, Hao Liu, Tingting Xie, Qiuhong Ke, Yulan Guo

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

4 Citations (Scopus)

Abstract

Effective learning of spatial-temporal information within a point cloud sequence is highly important for many downstream tasks such as 4D semantic segmentation and 3D action recognition. In this paper, we propose a novel framework named Point Spatial-Temporal Transformer (PST2) to learn spatial-temporal representations from dynamic 3D point cloud sequences. Our PST2 consists of two major modules: a Spatio-Temporal Self-Attention (STSA) module and a Resolution Embedding (RE) module. The STSA module captures spatial-temporal context information across adjacent frames, while the RE module aggregates features across neighbors to enhance the resolution of feature maps. We test the effectiveness of our PST2 on two different tasks on point cloud sequences, i.e., 4D semantic segmentation and 3D action recognition. Extensive experiments on three benchmarks show that our PST2 outperforms existing methods on all datasets. The effectiveness of our STSA and RE modules has also been validated with ablation experiments.
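
As a rough illustration of the idea of attending across adjacent frames, the sketch below applies generic scaled dot-product self-attention to per-point features, where each frame's points attend to the points of that frame and its temporal neighbors. This is not the paper's STSA module: the function name `adjacent_frame_self_attention`, the `window` parameter, the random projection matrices, and the tensor shapes are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def adjacent_frame_self_attention(feats, w_q, w_k, w_v, window=1):
    """Generic attention over a point cloud sequence (illustrative only).

    feats: array of shape (T, N, C), features for T frames of N points.
    w_q, w_k, w_v: projection matrices of shape (C, D) (hypothetical).
    window: number of neighboring frames on each side used as context.
    Returns an array of shape (T, N, D).
    """
    T, N, C = feats.shape
    out = []
    for t in range(T):
        # Context = points of frame t and its adjacent frames.
        lo, hi = max(0, t - window), min(T, t + window + 1)
        context = feats[lo:hi].reshape(-1, C)        # (M, C)
        q = feats[t] @ w_q                           # (N, D)
        k = context @ w_k                            # (M, D)
        v = context @ w_v                            # (M, D)
        scores = q @ k.T / np.sqrt(q.shape[-1])      # (N, M)
        weights = softmax(scores, axis=-1)           # rows sum to 1
        out.append(weights @ v)                      # (N, D)
    return np.stack(out)


rng = np.random.default_rng(0)
T, N, C, D = 4, 16, 8, 8
feats = rng.standard_normal((T, N, C))
w_q, w_k, w_v = (rng.standard_normal((C, D)) for _ in range(3))
out = adjacent_frame_self_attention(feats, w_q, w_k, w_v)
print(out.shape)  # (4, 16, 8)
```

With `window=1`, an interior frame attends over three frames' worth of points, while the first and last frames attend over two; a real model would typically learn the projections and restrict attention to local spatial neighborhoods to control cost.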

Original language: English
Title of host publication: Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022
Editors: Saket Anand, Ryan Farrell, Richard Souvenir, Catherine Zhao
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 657-666
Number of pages: 10
ISBN (Electronic): 9781665409155
ISBN (Print): 9781665409162
Publication status: Published - 2022
Externally published: Yes
Event: IEEE Winter Conference on Applications of Computer Vision 2022 - Waikoloa, United States of America
Duration: 4 Jan 2022 to 8 Jan 2022
https://wacv2022.thecvf.com/home

Publication series

Name: Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022
Publisher: IEEE, Institute of Electrical and Electronics Engineers
ISSN (Print): 2472-6737
ISSN (Electronic): 2642-9381

Conference

Conference: IEEE Winter Conference on Applications of Computer Vision 2022
Abbreviated title: WACV 2022
Country/Territory: United States of America
City: Waikoloa
Period: 4/01/22 to 8/01/22

Keywords

  • 3D Computer Vision
  • Segmentation, Grouping and Shape
