A comprehensive look at coding techniques on Riemannian manifolds

Masoud Faraki, Mehrtash T. Harandi, Fatih Porikli

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Core to many learning pipelines is visual recognition such as image and video classification. In such applications, having a compact yet rich and informative representation plays a pivotal role. An underlying assumption in traditional coding schemes [e.g., sparse coding (SC)] is that the data geometrically comply with the Euclidean space. In other words, the data are presented to the algorithm in vector form and Euclidean axioms are fulfilled. This is of course restrictive in machine learning, computer vision, and signal processing, as shown by a large number of recent studies. This paper takes a further step and provides a comprehensive mathematical framework to perform coding in curved and non-Euclidean spaces, i.e., Riemannian manifolds. To this end, we start by the simplest form of coding, namely, bag of words. Then, inspired by the success of vector of locally aggregated descriptors in addressing computer vision problems, we will introduce its Riemannian extensions. Finally, we study Riemannian form of SC, locality-constrained linear coding, and collaborative coding. Through rigorous tests, we demonstrate the superior performance of our Riemannian coding schemes against the state-of-the-art methods on several visual classification tasks, including head pose classification, video-based face recognition, and dynamic scene recognition.

Original language: English
Pages (from-to): 5701-5712
Number of pages: 12
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 29
Issue number: 11
DOI: 10.1109/TNNLS.2018.2812799
Publication status: Published - Nov 2018
Externally published: Yes

Keywords

  • Australia
  • Bag of words (BoW)
  • collaborative coding (CC)
  • Encoding
  • Geometry
  • Level measurement
  • locality-constrained linear coding (LLC)
  • Manifolds
  • Riemannian geometry
  • sparse coding (SC)
  • Task analysis
  • vector of locally aggregated descriptors (VLADs).

Cite this

@article{32877ee04e6e4eb5935184a32e254b69,
title = "A comprehensive look at coding techniques on Riemannian manifolds",
abstract = "Core to many learning pipelines is visual recognition such as image and video classification. In such applications, having a compact yet rich and informative representation plays a pivotal role. An underlying assumption in traditional coding schemes [e.g., sparse coding (SC)] is that the data geometrically comply with the Euclidean space. In other words, the data are presented to the algorithm in vector form and Euclidean axioms are fulfilled. This is of course restrictive in machine learning, computer vision, and signal processing, as shown by a large number of recent studies. This paper takes a further step and provides a comprehensive mathematical framework to perform coding in curved and non-Euclidean spaces, i.e., Riemannian manifolds. To this end, we start by the simplest form of coding, namely, bag of words. Then, inspired by the success of vector of locally aggregated descriptors in addressing computer vision problems, we will introduce its Riemannian extensions. Finally, we study Riemannian form of SC, locality-constrained linear coding, and collaborative coding. Through rigorous tests, we demonstrate the superior performance of our Riemannian coding schemes against the state-of-the-art methods on several visual classification tasks, including head pose classification, video-based face recognition, and dynamic scene recognition.",
keywords = "Australia, Bag of words (BoW), collaborative coding (CC), Encoding, Geometry, Level measurement, locality-constrained linear coding (LLC), Manifolds, Riemannian geometry, sparse coding (SC), Task analysis, vector of locally aggregated descriptors (VLADs).",
author = "Faraki, Masoud and Harandi, {Mehrtash T.} and Porikli, Fatih",
year = "2018",
month = "11",
doi = "10.1109/TNNLS.2018.2812799",
language = "English",
volume = "29",
pages = "5701--5712",
journal = "IEEE Transactions on Neural Networks and Learning Systems",
issn = "2162-237X",
number = "11",
}

A comprehensive look at coding techniques on Riemannian manifolds. / Faraki, Masoud; Harandi, Mehrtash T.; Porikli, Fatih.

In: IEEE Transactions on Neural Networks and Learning Systems, Vol. 29, No. 11, Nov 2018, p. 5701-5712.

TY - JOUR

T1 - A comprehensive look at coding techniques on Riemannian manifolds

AU - Faraki, Masoud

AU - Harandi, Mehrtash T.

AU - Porikli, Fatih

PY - 2018/11

Y1 - 2018/11

AB - Core to many learning pipelines is visual recognition such as image and video classification. In such applications, having a compact yet rich and informative representation plays a pivotal role. An underlying assumption in traditional coding schemes [e.g., sparse coding (SC)] is that the data geometrically comply with the Euclidean space. In other words, the data are presented to the algorithm in vector form and Euclidean axioms are fulfilled. This is of course restrictive in machine learning, computer vision, and signal processing, as shown by a large number of recent studies. This paper takes a further step and provides a comprehensive mathematical framework to perform coding in curved and non-Euclidean spaces, i.e., Riemannian manifolds. To this end, we start by the simplest form of coding, namely, bag of words. Then, inspired by the success of vector of locally aggregated descriptors in addressing computer vision problems, we will introduce its Riemannian extensions. Finally, we study Riemannian form of SC, locality-constrained linear coding, and collaborative coding. Through rigorous tests, we demonstrate the superior performance of our Riemannian coding schemes against the state-of-the-art methods on several visual classification tasks, including head pose classification, video-based face recognition, and dynamic scene recognition.

KW - Australia

KW - Bag of words (BoW)

KW - collaborative coding (CC)

KW - Encoding

KW - Geometry

KW - Level measurement

KW - locality-constrained linear coding (LLC)

KW - Manifolds

KW - Riemannian geometry

KW - sparse coding (SC)

KW - Task analysis

KW - vector of locally aggregated descriptors (VLADs).

UR - http://www.scopus.com/inward/record.url?scp=85044859453&partnerID=8YFLogxK

U2 - 10.1109/TNNLS.2018.2812799

DO - 10.1109/TNNLS.2018.2812799

M3 - Article

VL - 29

SP - 5701

EP - 5712

JO - IEEE Transactions on Neural Networks and Learning Systems

JF - IEEE Transactions on Neural Networks and Learning Systems

SN - 2162-237X

IS - 11

ER -