Block-wisely Supervised Neural Architecture Search with Knowledge Distillation

Changlin Li, Jiefeng Peng, Liuchun Yuan, Guangrun Wang, Xiaodan Liang, Liang Lin, Xiaojun Chang

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

3 Citations (Scopus)

Abstract

Neural Architecture Search (NAS), which aims at automatically designing network architectures by machine, is expected to bring about a new revolution in machine learning. Despite these high expectations, the effectiveness and efficiency of existing NAS solutions are unclear, with some recent works going so far as to suggest that many existing NAS solutions are no better than random architecture selection. The ineffectiveness of NAS solutions may be attributed to inaccurate architecture evaluation. Specifically, to speed up NAS, recent works have proposed under-training different candidate architectures in a large search space concurrently by using shared network parameters; however, this has resulted in incorrect architecture ratings and compounded the ineffectiveness of NAS. In this work, we propose to modularize the large search space of NAS into blocks to ensure that the potential candidate architectures are fully trained; this reduces the representation shift caused by the shared parameters and leads to the correct rating of the candidates. Thanks to the block-wise search, we can also evaluate all of the candidate architectures within each block. Moreover, we find that the knowledge of a network model lies not only in the network parameters but also in the network architecture. Therefore, we propose to distill the neural architecture (DNA) knowledge from a teacher model to supervise our block-wise architecture search, which significantly improves the effectiveness of NAS. Remarkably, the performance of our searched architectures has exceeded that of the teacher model, demonstrating the practicability of our method. Finally, our method achieves a state-of-the-art 78.4% top-1 accuracy on ImageNet in a mobile setting. All of our searched models along with the evaluation code are available at https://github.com/changlin31/DNA.
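The block-wise supervision described above can be illustrated with a toy sketch. This is a minimal, hypothetical illustration (toy linear "blocks" and made-up shapes, not the paper's actual models or training loop): each candidate operation within a block is rated by how closely its output matches the corresponding teacher block's feature map, while the input to every block is taken from the teacher's previous stage so blocks can be searched independently.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_blocks = 8, 3
x = rng.standard_normal((16, dim))  # a toy batch of inputs

# Fixed "pretrained" teacher: one linear map standing in for each stage.
teacher_Ws = [rng.standard_normal((dim, dim)) for _ in range(n_blocks)]

def block_forward(h, W):
    # Stand-in for one network block (e.g. a stage of conv layers).
    return np.tanh(h @ W)

# Two hypothetical candidate operations per block: one matching the
# teacher exactly, one perturbed.
candidates = [[W, W + 0.5 * rng.standard_normal(W.shape)] for W in teacher_Ws]

def search_blockwise(x, teacher_Ws, candidates):
    """Rate every candidate independently within each block.

    The input to block i is the teacher's feature map from block i-1,
    so each block is searched in isolation (the block-wise supervision
    idea); the score is the MSE to the teacher's block-i output.
    """
    chosen, h = [], x
    for W_t, cands in zip(teacher_Ws, candidates):
        target = block_forward(h, W_t)  # teacher supervision signal
        losses = [float(np.mean((block_forward(h, W_c) - target) ** 2))
                  for W_c in cands]
        chosen.append(int(np.argmin(losses)))
        h = target  # next block sees the teacher's features, not the student's
    return chosen

print(search_blockwise(x, teacher_Ws, candidates))
```

Because every block receives the teacher's features rather than the student's, rating errors do not accumulate across blocks, which is what allows all candidates within a block to be evaluated fairly; here the candidate that exactly matches the teacher incurs zero distillation loss and is selected in every block.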

Original language: English
Title of host publication: Proceedings - 33rd IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2020
Editors: Ce Liu, Greg Mori, Kate Saenko, Silvio Savarese
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 1986-1995
Number of pages: 10
ISBN (Electronic): 9781728171685
ISBN (Print): 9781728171692
DOIs
Publication status: Published - 2020
Event: IEEE Conference on Computer Vision and Pattern Recognition 2020 - Virtual, China
Duration: 14 Jun 2020 - 19 Jun 2020
http://cvpr2020.thecvf.com (Website)
https://openaccess.thecvf.com/CVPR2020 (Proceedings)
https://ieeexplore.ieee.org/xpl/conhome/9142308/proceeding (Proceedings)

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publisher: IEEE, Institute of Electrical and Electronics Engineers
ISSN (Print): 1063-6919
ISSN (Electronic): 2575-7075

Conference

Conference: IEEE Conference on Computer Vision and Pattern Recognition 2020
Abbreviated title: CVPR 2020
Country: China
City: Virtual
Period: 14/06/20 - 19/06/20
