Image set classification finds its applications in a number of real-life scenarios such as classification from surveillance videos, multi-view camera networks and personal albums. Compared with single image based classification, it offers more promises and has therefore attracted significant research attention in recent years. Unlike many existing methods which assume images of a set to lie on a certain geometric surface, this paper introduces a deep learning framework which makes no such prior assumptions and can automatically discover the underlying geometric structure. Specifically, a Template Deep Reconstruction Model (TDRM) is defined whose parameters are initialized by performing unsupervised pre-training in a layer-wise fashion using Gaussian Restricted Boltzmann Machines (GRBMs). The initialized TDRM is then separately trained for images of each class and class-specific DRMs are learnt. Based on the minimum reconstruction errors from the learnt class-specific models, three different voting strategies are devised for classification. Extensive experiments are performed to demonstrate the efficacy of the proposed framework for the tasks of face and object recognition from image sets. Experimental results show that the proposed method consistently outperforms the existing state of the art methods.
|Number of pages||15|
|Journal||IEEE Transactions on Pattern Analysis and Machine Intelligence|
|Publication status||Published - 1 Apr 2015|
- deep learning
- Image set classification
- object recognition
- video based face recognition