Discrimination-aware network pruning for deep model compression

Jing Liu, Bohan Zhuang, Zhuangwei Zhuang, Yong Guo, Junzhou Huang, Jinhui Zhu, Mingkui Tan

Research output: Contribution to journal › Article › Research › peer-review

Abstract

We study network pruning, which aims to remove redundant channels/kernels and accelerate the inference of deep networks. Existing pruning methods either train from scratch with sparsity constraints or minimize the reconstruction error between the feature maps of the pre-trained models and the compressed ones. Both strategies have limitations: the former is computationally expensive and slow to converge, while the latter optimizes the reconstruction error but ignores the discriminative power of channels. In this paper, we propose a discrimination-aware channel pruning (DCP) method to choose the channels that actually contribute to the discriminative power. Based on DCP, we further propose several techniques to improve the optimization efficiency. Note that the parameters of a channel (a 3D tensor) may contain redundant kernels (each a 2D matrix). To address this, we propose a discrimination-aware kernel pruning (DKP) method to select the kernels with promising discriminative power. Experiments on image classification and face recognition demonstrate the effectiveness of our methods. For example, on ILSVRC-12, the resultant ResNet-50 with a 30% reduction in channels even outperforms the baseline model by 0.36% in Top-1 accuracy. The pruned MobileNetV1 and MobileNetV2 achieve 1.93x and 1.42x inference acceleration on a mobile device, respectively, with negligible performance degradation.
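The core idea of channel pruning described above can be illustrated with a minimal sketch (not the authors' code): assign each channel an importance score (in DCP, derived from an added discriminative loss rather than reconstruction error alone) and keep only the top-ranked fraction. The function and variable names below are hypothetical.

```python
# Hypothetical sketch of discrimination-aware channel selection.
# Each channel gets an importance score; we keep the top-k channels
# and prune (remove) the rest.

def select_channels(scores, keep_ratio):
    """Return sorted indices of channels to keep, ranked by score.

    scores: per-channel importance values (e.g., from a discriminative
            loss, as in DCP); keep_ratio is the fraction of channels kept.
    """
    k = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

# Toy example: 8 channels, prune 50% (keep the 4 highest-scoring ones).
scores = [0.9, 0.1, 0.7, 0.05, 0.3, 0.8, 0.2, 0.6]
kept = select_channels(scores, 0.5)  # -> [0, 2, 5, 7]
```

DKP applies the same ranking one level lower, scoring the individual 2D kernels inside each channel's 3D weight tensor instead of whole channels.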

Original language: English
Number of pages: 15
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication status: Accepted/In press - 23 Mar 2021
Externally published: Yes

Keywords

  • Acceleration
  • Adaptation models
  • Channel Pruning
  • Computational modeling
  • Deep Neural Networks
  • Kernel
  • Kernel Pruning
  • Network Compression
  • Quantization (signal)
  • Redundancy
  • Training
