Abstract
Interaction prediction has a wide range of applications, such as robot control and the prevention of dangerous events. In this paper, we introduce a new method to capture deep temporal information in videos for human interaction prediction. We propose to use flow coding images to represent the low-level motion information in videos and to extract deep temporal features using a deep convolutional neural network architecture. We tested our method on the UT-Interaction dataset and the challenging TV Human Interaction dataset, and demonstrated the advantages of the proposed deep temporal features based on flow coding images. The proposed method, though using only temporal information, outperforms state-of-the-art methods for human interaction prediction.
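The abstract does not say how a flow coding image is built, so the following is only a minimal sketch of one common way to encode a dense optical-flow field as a colour image: motion direction is mapped to hue and motion magnitude to saturation. The function name `flow_to_coding_image`, the `max_magnitude` parameter, and the hue/saturation scheme itself are assumptions made for illustration, not the authors' implementation. Images of this kind could then be fed to a convolutional network in place of raw frames.

```python
import colorsys
import math


def flow_to_coding_image(flow, max_magnitude=None):
    """Encode a dense flow field as an RGB image (hypothetical helper).

    flow: list of rows, each row a list of (u, v) displacement tuples.
    Returns a list of rows of (r, g, b) integer tuples in 0..255.
    Assumed scheme: hue = direction, saturation = normalised speed.
    """
    magnitudes = [[math.hypot(u, v) for (u, v) in row] for row in flow]
    if max_magnitude is None:
        # Normalise by the largest displacement found in this field.
        max_magnitude = max((m for row in magnitudes for m in row), default=0.0)
    scale = max_magnitude if max_magnitude > 0 else 1.0

    image = []
    for row, mag_row in zip(flow, magnitudes):
        out_row = []
        for (u, v), mag in zip(row, mag_row):
            # Direction in [0, 1): a full turn around the colour wheel.
            hue = (math.atan2(v, u) / (2 * math.pi)) % 1.0
            # Faster motion gives a more saturated colour; no motion is white.
            sat = min(mag / scale, 1.0)
            r, g, b = colorsys.hsv_to_rgb(hue, sat, 1.0)
            out_row.append((round(r * 255), round(g * 255), round(b * 255)))
        image.append(out_row)
    return image


if __name__ == "__main__":
    # Toy 2x2 flow field: still, rightward, leftward, and slow downward motion.
    toy_flow = [[(0.0, 0.0), (1.0, 0.0)], [(-1.0, 0.0), (0.0, 0.5)]]
    for row in flow_to_coding_image(toy_flow):
        print(row)
```

In this sketch, static pixels come out white and opposite motion directions receive complementary colours, so motion patterns become visible as image structure.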
| Original language | English |
|---|---|
| Title of host publication | Computer Vision – ECCV 2016 Workshops, Proceedings |
| Editors | Gang Hua, Herve Jegou |
| Publisher | Springer-Verlag London Ltd. |
| Pages | 403-414 |
| Number of pages | 12 |
| ISBN (Print) | 9783319488806 |
| Publication status | Published - 2016 |
| Externally published | Yes |
| Event | European Conference on Computer Vision 2016, Amsterdam, Netherlands; Duration: 11 Oct 2016 → 14 Oct 2016; Conference number: 14th; http://www.eccv2016.org/; https://link.springer.com/book/10.1007/978-3-319-46448-0 (Proceedings) |
Publication series
| Name | Lecture Notes in Computer Science |
|---|---|
| Volume | 9914 |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | European Conference on Computer Vision 2016 |
|---|---|
| Abbreviated title | ECCV 2016 |
| Country/Territory | Netherlands |
| City | Amsterdam |
| Period | 11/10/16 → 14/10/16 |
Keywords
- CNN
- Interaction prediction
- Temporal convolution