TY - JOUR
T1 - Feature fusion-based collaborative learning for knowledge distillation
AU - Li, Yiting
AU - Sun, Liyuan
AU - Gou, Jianping
AU - Du, Lan
AU - Ou, Weihua
N1 - Funding Information:
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work was, in part, supported by the National Natural Science Foundation of China (grant nos 61976107 and 61502208) and the Postgraduate Research and Practice Innovation Program of Jiangsu Province (grant no. KYCX20_3085).
Publisher Copyright:
© The Author(s) 2021.
PY - 2021/11/1
Y1 - 2021/11/1
AB - Deep neural networks have achieved great success in a variety of applications, such as self-driving cars and intelligent robotics. Meanwhile, knowledge distillation has received increasing attention as an effective model compression technique for training efficient deep models. The performance of the student network obtained through knowledge distillation depends heavily on whether the transfer of the teacher’s knowledge can effectively guide the student's training. However, most existing knowledge distillation schemes require a large teacher network pre-trained on large-scale data sets, which can make knowledge distillation harder to apply across different applications. In this article, we propose a feature fusion-based collaborative learning method for knowledge distillation. Specifically, during knowledge distillation, the method enables networks to learn from each other using the feature-based and response-based knowledge in different network layers. We concatenate the features learned by the teacher and the student networks to obtain a more representative feature map for knowledge transfer. In addition, we introduce a network regularization method that further improves model performance by providing positive knowledge during training. Experiments and ablation studies on two widely used data sets demonstrate that the proposed method, feature fusion-based collaborative learning, significantly outperforms recent state-of-the-art knowledge distillation methods.
KW - collaborative learning
KW - deep learning
KW - feature fusion
KW - knowledge distillation
KW - model compression
UR - http://www.scopus.com/inward/record.url?scp=85120463661&partnerID=8YFLogxK
U2 - 10.1177/15501477211057037
DO - 10.1177/15501477211057037
M3 - Article
AN - SCOPUS:85120463661
SN - 1550-1329
VL - 17
SP - 1
EP - 11
JO - International Journal of Distributed Sensor Networks
JF - International Journal of Distributed Sensor Networks
IS - 11
ER -