Abstract
We present an image-classification-based approach to large-scale action recognition from 3D skeleton videos. First, we map the 3D skeleton videos to color images; the resulting action images are translation- and scale-invariant and dataset-independent. Second, we propose a multi-scale deep convolutional neural network (CNN) for the image classification task, which could enhance the temporal frequency adjustment of our model. Even though the action images are very different from natural images, fine-tuning still works well. Finally, we exploit various data augmentation methods to improve the generalization ability of the network. Experimental results on the NTU RGB-D dataset, the largest and most challenging benchmark, show that our method achieves state-of-the-art performance and outperforms other methods by a large margin.
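As a rough illustration of the skeleton-to-image step, the sketch below treats the x, y, z joint coordinates as R, G, B channels, with joints along the rows and frames along the columns. The abstract does not give the exact mapping, so everything here is an assumption: the function name `skeleton_to_image`, the `(frames, joints, 3)` input layout, and the normalization (per-axis minimum subtracted, then division by the largest per-axis range) are hypothetical choices that happen to make the output unchanged under translation and uniform scaling.

```python
import numpy as np


def skeleton_to_image(seq):
    """Map a skeleton sequence (frames, joints, 3) to a uint8 image (joints, frames, 3).

    Hypothetical sketch, not the paper's exact mapping:
    x, y, z become the R, G, B channels after normalization.
    """
    seq = np.asarray(seq, dtype=np.float64)
    if seq.ndim != 3 or seq.shape[2] != 3:
        raise ValueError("expected an array of shape (frames, joints, 3)")

    # Per-axis minimum over all frames and joints: subtracting it
    # cancels any global translation of the skeleton.
    lo = seq.min(axis=(0, 1))
    # One shared range for all three axes: dividing by it cancels
    # uniform scaling while preserving the aspect ratio.
    span = (seq.max(axis=(0, 1)) - lo).max()
    if span == 0:
        # Degenerate case: every joint sits at the same point in every frame.
        return np.zeros((seq.shape[1], seq.shape[0], 3), dtype=np.uint8)

    norm = (seq - lo) / span                      # values in [0, 1]
    img = np.rint(norm * 255).astype(np.uint8)    # quantize to 8-bit color
    return img.transpose(1, 0, 2)                 # rows = joints, columns = frames
```

An image produced this way could then be passed to any standard image classifier; the choice of rows for joints and columns for frames is arbitrary in this sketch.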
Original language | English |
---|---|
Title of host publication | 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017 |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 601-604 |
Number of pages | 4 |
ISBN (Electronic) | 9781538605608 |
Publication status | Published - 2017 |
Event | IEEE International Conference on Multimedia and Expo Workshops 2017, Hong Kong, Hong Kong. Duration: 10 Jul 2017 → 14 Jul 2017. Proceedings: https://ieeexplore.ieee.org/xpl/conhome/8014334/proceeding |
Conference
Conference | IEEE International Conference on Multimedia and Expo Workshops 2017 |
---|---|
Abbreviated title | ICMEW 2017 |
Country/Territory | Hong Kong |
City | Hong Kong |
Period | 10/07/17 → 14/07/17 |
Keywords
- 3D skeleton
- action recognition
- CNN
- image mapping