Learning sparse latent representation and distance metric for image retrieval

Tu Dinh Nguyen, Truyen Tran, Dinh Phung, Svetha Venkatesh

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

8 Citations (Scopus)


The performance of image retrieval depends critically on the semantic representation and the distance function used to estimate the similarity of two images. A good representation should integrate multiple visual and textual (e.g., tag) features and offer a step closer to the true semantics of interest (e.g., concepts). Because the distance function operates on the representation, the two are interdependent and should be addressed at the same time. We propose a probabilistic solution to learn both the representation from multiple feature types and modalities and the distance metric from data. The learning is regularised so that the learned representation and information-theoretic metric will (i) preserve the regularities of the visual/textual spaces, (ii) enhance structured sparsity, (iii) encourage small intra-concept distances, and (iv) keep inter-concept images separated. We demonstrate the capacity of our method on the NUS-WIDE data. For the well-studied 13-animal subset, our method outperforms state-of-the-art rivals. On the subset of single-concept images, we gain a 79.5% improvement over the standard nearest neighbours approach on the MAP score, and 45.7% on the NDCG.
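For readers unfamiliar with the MAP and NDCG scores reported above, the following is a minimal sketch of how these two ranking measures are commonly computed for binary relevance. It is an illustration only, not the paper's evaluation code: the function names are hypothetical, the average precision is normalised by the number of relevant items appearing in the ranked list, and the NDCG uses a log2 discount with the ideal ordering taken from the same list. The exact protocol used in the paper may differ.

```python
import math


def average_precision(relevance):
    """Average precision for one ranked list of binary flags (1 = relevant).

    Assumption: the list contains every relevant item for the query, so the
    sum of precisions is divided by the number of relevant items in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists):
    """MAP: the mean of the average precision over all queries."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


def ndcg(relevance, k=None):
    """Normalised discounted cumulative gain with a log2 rank discount."""
    if k is None:
        k = len(relevance)

    def dcg(rels):
        return sum(r / math.log2(i + 1) for i, r in enumerate(rels[:k], start=1))

    ideal = dcg(sorted(relevance, reverse=True))  # best possible ordering
    return dcg(relevance) / ideal if ideal > 0 else 0.0
```

Each input list holds the relevance of the retrieved images in ranked order for one query image; a retrieval method that places same-concept images earlier in the list scores higher on both measures.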

Original language: English
Title of host publication: 2013 IEEE International Conference on Multimedia and Expo, ICME 2013
Publication status: Published - 21 Oct 2013
Externally published: Yes
Event: IEEE International Conference on Multimedia and Expo 2013 - Fairmont Hotel, San Jose, United States of America
Duration: 15 Jul 2013 to 19 Jul 2013
http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6596168 (IEEE Conference Proceedings)

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X


Conference: IEEE International Conference on Multimedia and Expo 2013
Abbreviated title: ICME 2013
Country/Territory: United States of America
City: San Jose


  • Image retrieval
  • Metric learning
  • Mixed-Variate
  • Restricted Boltzmann Machines
  • Sparsity
