TY - JOUR
T1 - Contrastive Learning-Based Multi-Level Knowledge Distillation
AU - Li, Lin
AU - Gou, Jianping
AU - Ou, Weihua
AU - Chen, Wenbai
AU - Du, Lan
N1 - Publisher Copyright:
© 2025 The Author(s). CAAI Transactions on Intelligence Technology published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology and Chongqing University of Technology.
PY - 2025/10
Y1 - 2025/10
N2 - With the increasing constraints of hardware devices, there is a growing demand for compact models that can be deployed on device endpoints. Knowledge distillation, a widely used technique for model compression and knowledge transfer, has gained significant attention in recent years. However, traditional distillation approaches compare the knowledge of individual samples indirectly through class prototypes, overlooking the structural relationships between samples. Although recent distillation methods based on contrastive learning can capture relational knowledge, their relational constraints often distort the positional information of the samples, leading to compromised performance in the distilled model. To address these challenges and further enhance the performance of compact models, we propose a novel approach, termed contrastive learning-based multi-level knowledge distillation (CLMKD). The CLMKD framework introduces three key modules: class-guided contrastive distillation, gradient relation contrastive distillation, and semantic similarity distillation. These modules are effectively integrated into a unified framework to extract feature knowledge at multiple levels, capturing not only the representational consistency of individual samples but also their higher-order structure and semantic similarity. We evaluate the proposed CLMKD method on multiple image classification datasets, and the results demonstrate its superior performance compared to state-of-the-art knowledge distillation methods.
AB - With the increasing constraints of hardware devices, there is a growing demand for compact models that can be deployed on device endpoints. Knowledge distillation, a widely used technique for model compression and knowledge transfer, has gained significant attention in recent years. However, traditional distillation approaches compare the knowledge of individual samples indirectly through class prototypes, overlooking the structural relationships between samples. Although recent distillation methods based on contrastive learning can capture relational knowledge, their relational constraints often distort the positional information of the samples, leading to compromised performance in the distilled model. To address these challenges and further enhance the performance of compact models, we propose a novel approach, termed contrastive learning-based multi-level knowledge distillation (CLMKD). The CLMKD framework introduces three key modules: class-guided contrastive distillation, gradient relation contrastive distillation, and semantic similarity distillation. These modules are effectively integrated into a unified framework to extract feature knowledge at multiple levels, capturing not only the representational consistency of individual samples but also their higher-order structure and semantic similarity. We evaluate the proposed CLMKD method on multiple image classification datasets, and the results demonstrate its superior performance compared to state-of-the-art knowledge distillation methods.
KW - contrastive distillation
KW - contrastive learning
KW - knowledge distillation
UR - https://www.scopus.com/pages/publications/105009841336
U2 - 10.1049/cit2.70036
DO - 10.1049/cit2.70036
M3 - Article
AN - SCOPUS:105009841336
SN - 2468-6557
VL - 10
SP - 1478
EP - 1488
JO - CAAI Transactions on Intelligence Technology
JF - CAAI Transactions on Intelligence Technology
IS - 5
ER -