Contrastive Learning-Based Multi-Level Knowledge Distillation

Lin Li, Jianping Gou, Weihua Ou, Wenbai Chen, Lan Du

Research output: Contribution to journal › Article › Research › peer-review

Abstract

With the increasing constraints of hardware devices, there is a growing demand for compact models that can be deployed on device endpoints. Knowledge distillation, a widely used technique for model compression and knowledge transfer, has gained significant attention in recent years. However, traditional distillation approaches compare the knowledge of individual samples only indirectly through class prototypes, overlooking the structural relationships between samples. Although recent distillation methods based on contrastive learning can capture relational knowledge, their relational constraints often distort the positional information of the samples, leading to compromised performance in the distilled model. To address these challenges and further enhance the performance of compact models, we propose a novel approach, termed contrastive learning-based multi-level knowledge distillation (CLMKD). The CLMKD framework introduces three key modules: class-guided contrastive distillation, gradient relation contrastive distillation, and semantic similarity distillation. These modules are integrated into a unified framework that extracts feature knowledge at multiple levels, capturing not only the representational consistency of individual samples but also their higher-order structure and semantic similarity. We evaluate the proposed CLMKD method on multiple image classification datasets, and the results demonstrate its superior performance compared to state-of-the-art knowledge distillation methods.
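To illustrate the contrastive-distillation idea the abstract builds on, the sketch below implements a generic InfoNCE-style contrastive loss between student and teacher features. This is a minimal illustration under stated assumptions, not the authors' CLMKD objective; the function name, temperature value, and feature shapes are hypothetical.

```python
import numpy as np

def contrastive_distill_loss(student, teacher, tau=0.1):
    """Generic InfoNCE-style contrastive distillation loss (a sketch,
    not the CLMKD objective from the paper). Each student embedding is
    pulled toward its own teacher embedding (the positive) and pushed
    away from the teacher embeddings of other samples (the negatives)."""
    # L2-normalize both feature sets so dot products are cosine similarities.
    s = student / np.linalg.norm(student, axis=1, keepdims=True)
    t = teacher / np.linalg.norm(teacher, axis=1, keepdims=True)
    logits = s @ t.T / tau                       # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # positives on the diagonal

# Toy check: a student aligned with the teacher should score a lower loss
# than an unrelated student (synthetic random features, illustrative only).
rng = np.random.default_rng(0)
t_feat = rng.normal(size=(8, 16))
s_good = t_feat + 0.01 * rng.normal(size=(8, 16))
s_bad = rng.normal(size=(8, 16))
```

A distillation pipeline would typically add such a term, weighted by a hyperparameter, to the standard cross-entropy and soft-label losses during student training.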

Original language: English
Pages (from-to): 1478-1488
Number of pages: 11
Journal: CAAI Transactions on Intelligence Technology
Volume: 10
Issue number: 5
DOIs
Publication status: Published - Oct 2025

Keywords

  • contrastive distillation
  • contrastive learning
  • knowledge distillation
