Recent studies have demonstrated the advantages of fusing multi-modal features for improving the accuracy of visual object classification. However, for complex classification tasks with a large number of categories, previous approaches to multiple feature fusion are prone to failure caused by class ambiguity. In this paper, we address this issue by allowing k (k ≥ 2) guesses at the top, instead of considering only the class with the largest prediction score, within the framework of multi-view learning. This strategy relaxes the penalty for making an error within the top-k predictions, which mitigates the challenge of class ambiguity to some extent. To fuse multiple features effectively, we introduce an adaptive weight for each view and exploit an efficient alternating optimization algorithm to jointly learn the optimal classifiers and their corresponding weights. Extensive experiments on several benchmark datasets illustrate the effectiveness and superiority of the proposed model over state-of-the-art approaches.
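As a rough illustration of the top-k relaxation described above (a minimal evaluation sketch, not the paper's actual top-k SVM objective), the following code counts a prediction as correct when the true class appears among the k largest fused scores; `fuse_views` and the uniform weights are hypothetical stand-ins for the adaptive per-view weights learned by alternating optimization:

```python
import numpy as np

def fuse_views(view_scores, weights):
    """Weighted sum of per-view score matrices (hypothetical fusion rule)."""
    return sum(w * s for w, s in zip(weights, view_scores))

def topk_accuracy(scores, labels, k=2):
    """Fraction of samples whose true class is among the k highest scores."""
    topk = np.argsort(scores, axis=1)[:, -k:]  # indices of k largest scores per sample
    hits = [labels[i] in topk[i] for i in range(len(labels))]
    return float(np.mean(hits))

# Two views producing scores for 2 samples over 3 classes; uniform weights as a placeholder.
view1 = np.array([[0.1, 0.6, 0.3], [0.2, 0.3, 0.5]])
view2 = np.array([[0.1, 0.4, 0.5], [0.4, 0.1, 0.5]])
fused = fuse_views([view1, view2], [0.5, 0.5])
labels = np.array([2, 0])

acc1 = topk_accuracy(fused, labels, k=1)  # strict top-1 accuracy
acc2 = topk_accuracy(fused, labels, k=2)  # relaxed top-2 accuracy
```

With the ambiguous scores above, top-1 accuracy is 0.0 while the relaxed top-2 accuracy is 1.0, which is the kind of gap the top-k strategy exploits.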
- Multi-class visual classification
- Multiple feature learning
- Top-k SVM