How unlabeled web videos help complex event detection?

Huan Liu, Qinghua Zheng, Minnan Luo, Dingwen Zhang, Xiaojun Chang, Cheng Deng

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

2 Citations (Scopus)


The lack of labeled exemplars is a major factor that makes multimedia event detection (MED) complicated and challenging. Using manually selected and labeled external sources is an effective way to improve MED performance. However, building such data usually requires professional human annotators, and the procedure is too time-consuming and costly to scale. In this paper, we propose a new robust dictionary learning framework for complex event detection that handles both labeled videos and easy-to-get unlabeled web videos by sharing the same dictionary. By employing an ℓq-norm based loss jointly with structured sparsity based regularization, our model shows strong robustness against the substantial number of noisy and outlier videos from open sources. We exploit an effective optimization algorithm to solve the proposed highly non-smooth and non-convex problem. Extensive experimental results on the standard TRECVID MEDTest 2013 and TRECVID MEDTest 2014 datasets demonstrate the effectiveness and superiority of the proposed framework for complex event detection.

Original language: English
Title of host publication: Proceedings of the 26th International Joint Conference on Artificial Intelligence
Editors: Carles Sierra
Place of Publication: Marina del Rey CA USA
Publisher: Association for the Advancement of Artificial Intelligence (AAAI)
Number of pages: 7
ISBN (Electronic): 9780999241103
ISBN (Print): 9780999241110
Publication status: Published - 2017
Externally published: Yes
Event: International Joint Conference on Artificial Intelligence 2017 - Melbourne, Australia
Duration: 19 Aug 2017 to 25 Aug 2017
Conference number: 26th (Proceedings)


Conference: International Joint Conference on Artificial Intelligence 2017
Abbreviated title: IJCAI 2017


  • Natural Language Processing
  • Information Retrieval
  • Robotics and Vision
  • Vision and Perception
