Abstract
The lack of labeled exemplars is a major factor that makes multimedia event detection (MED) complicated and challenging. Using manually selected and labeled external sources is an effective way to improve MED performance. However, building such datasets usually requires professional human annotators, and the procedure is too time-consuming and costly to scale. In this paper, we propose a new robust dictionary learning framework for complex event detection that handles both labeled videos and easy-to-get unlabeled web videos by sharing the same dictionary. By employing an lq-norm based loss jointly with structured sparsity based regularization, our model shows strong robustness against the substantial number of noisy and outlier videos from open sources. We exploit an effective optimization algorithm to solve the proposed highly non-smooth and non-convex problem. Extensive experimental results on the standard TRECVID MEDTest 2013 and TRECVID MEDTest 2014 datasets demonstrate the effectiveness and superiority of the proposed framework for complex event detection.
Original language | English |
---|---|
Title of host publication | Proceedings of the 26th International Joint Conference on Artificial Intelligence |
Editors | Carles Sierra |
Place of Publication | Marina del Rey CA USA |
Publisher | Association for the Advancement of Artificial Intelligence (AAAI) |
Pages | 4040-4046 |
Number of pages | 7 |
ISBN (Electronic) | 9780999241103 |
ISBN (Print) | 9780999241110 |
Publication status | Published - 2017 |
Externally published | Yes |
Event | International Joint Conference on Artificial Intelligence 2017 - Melbourne, Australia; Duration: 19 Aug 2017 → 25 Aug 2017; Conference number: 26th; https://ijcai-17.org/; https://www.ijcai.org/Proceedings/2017/ (Proceedings) |
Conference
Conference | International Joint Conference on Artificial Intelligence 2017 |
---|---|
Abbreviated title | IJCAI 2017 |
Country/Territory | Australia |
City | Melbourne |
Period | 19/08/17 → 25/08/17 |
Keywords
- Natural Language Processing
- Information Retrieval
- Robotics and Vision
- Vision and Perception