Unsupervised Deep Metric Learning via Orthogonality Based Probabilistic Loss

Ujjal Kr Dutta, Mehrtash Harandi, Chellu Chandra Sekhar

Research output: Contribution to journal › Article › Research › peer-review

18 Citations (Scopus)

Abstract

Metric learning is an important problem in machine learning. It aims to group similar observations together. Existing state-of-the-art metric learning approaches require class labels to induce a metric, but collecting these labels is sometimes expensive or impossible. In this paper, we propose an unsupervised approach that learns a metric without making use of class labels. The lack of class labels is compensated for by obtaining pseudo-labels for the data using a graph-based clustering approach. The pseudo-labels are used to form triplets of examples, which guide the metric learning process. We propose a probabilistic loss function that minimizes the chance of each triplet violating an angular constraint. A weight function and an orthogonality constraint in the objective speed up convergence and avoid model collapse. We also provide a stochastic formulation of our method so that it scales to large datasets. Our studies demonstrate the competitiveness of our approach against state-of-the-art methods.

Impact Statement: Adequately detecting the similarity between two observations is the essence of many Artificial Intelligence (AI) algorithms and, to an extent, affects their success. Similarity, and hence distance, depends on how we represent observations. Existing AI algorithms that automatically learn a good representation of data require substantial manual effort in the form of annotations. However, in many crucial applications it is not possible to obtain a large amount of manually annotated data. For example, some applications with significant technological and economic impact, such as invasive medical imaging, insurance, and computer security, produce a huge amount of unlabeled data. We present an algorithm for this problem that, when compared to a rival, increased recall by between 1.4% and 6.4%.
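The pipeline described in the abstract (pseudo-labels, then triplets, then a probabilistic angular loss with an orthogonality term) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the choice of measuring the angle at the negative example, the sigmoid used to turn an angle into a violation probability, the threshold `alpha`, the sharpness `beta`, and the Frobenius-norm penalty on a hypothetical projection matrix `W`. The pseudo-labels are passed in directly; the graph-based clustering step that would produce them is not shown.

```python
import math
import random


def form_triplets(pseudo_labels, rng):
    """Build (anchor, positive, negative) index triplets from pseudo-labels.

    Hypothetical sampling scheme: one random positive and one random
    negative per anchor.
    """
    by_label = {}
    for idx, lab in enumerate(pseudo_labels):
        by_label.setdefault(lab, []).append(idx)
    triplets = []
    for anchor, lab in enumerate(pseudo_labels):
        positives = [i for i in by_label[lab] if i != anchor]
        negatives = [i for i, l in enumerate(pseudo_labels) if l != lab]
        if positives and negatives:
            triplets.append((anchor, rng.choice(positives), rng.choice(negatives)))
    return triplets


def angle_at_negative(a, p, n):
    """Angle (radians) at n in the triangle (a, p, n).

    Assumes n differs from both a and p, so neither edge has zero length.
    """
    u = [ai - ni for ai, ni in zip(a, n)]
    v = [pi - ni for pi, ni in zip(p, n)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.acos(cos)


def violation_probability(angle, alpha, beta=10.0):
    """Assumed sigmoid model: probability that angle exceeds threshold alpha."""
    return 1.0 / (1.0 + math.exp(-beta * (angle - alpha)))


def orthogonality_penalty(W):
    """Squared Frobenius norm of (W^T W - I) for W given as a list of rows."""
    rows, cols = len(W), len(W[0])
    total = 0.0
    for i in range(cols):
        for j in range(cols):
            gram = sum(W[k][i] * W[k][j] for k in range(rows))
            total += (gram - (1.0 if i == j else 0.0)) ** 2
    return total
```

Under these assumptions, a per-triplet loss could be taken as `-math.log(1.0 - violation_probability(angle, alpha))`, summed over triplets and added to a weighted `orthogonality_penalty(W)`. The weight function mentioned in the abstract is not specified there and is left out of this sketch.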

Original language: English
Pages (from-to): 74-84
Number of pages: 11
Journal: IEEE Transactions on Artificial Intelligence
Volume: 1
Issue number: 1
Publication status: Published - Aug 2020

Keywords

  • Clustering
  • distance learning
  • machine learning
  • similarity learning
