Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization

Haocheng Luo, Tuan Truong, Tung Pham, Mehrtash Harandi, Dinh Phung, Trung Le

Research output: Chapter in Book/Report/Conference proceeding; Conference Paper; Research; peer-review

Abstract

Sharpness-Aware Minimization (SAM) has attracted significant attention for its effectiveness in improving generalization across various tasks. However, its underlying principles remain poorly understood. In this work, we analyze SAM's training dynamics using the maximum eigenvalue of the Hessian as a measure of sharpness and propose a third-order stochastic differential equation (SDE), which reveals that the dynamics are driven by a complex mixture of second- and third-order terms. We show that alignment between the perturbation vector and the top eigenvector is crucial for SAM's effectiveness in regularizing sharpness, but find that this alignment is often inadequate in practice, which limits SAM's efficiency. Building on these insights, we introduce Eigen-SAM, an algorithm that explicitly aims to regularize the top Hessian eigenvalue by aligning the perturbation vector with the leading eigenvector. We validate the effectiveness of our theory and the practical advantages of our proposed approach through comprehensive experiments. Code is available at https://github.com/RitianLuo/EigenSAM.
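
As a rough illustration of the two quantities the abstract refers to, here is a minimal sketch; it is not the authors' implementation (the linked repository has that). It assumes a toy quadratic loss with a known positive semidefinite Hessian `H`, estimates the top Hessian eigenvalue by power iteration on Hessian-vector products, and measures how closely the normalized gradient (the direction standard SAM perturbs along) aligns with the top eigenvector. The names `top_eigenpair`, `hvp`, `H`, and `w` are hypothetical and chosen only for this example.

```python
import numpy as np


def top_eigenpair(hvp, dim, iters=200, seed=0):
    """Estimate the dominant eigenpair using only Hessian-vector products.

    Power iteration converges to the eigenvalue of largest magnitude, which
    equals the maximum eigenvalue when the Hessian is positive semidefinite
    (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = hvp(v)
        v = w / np.linalg.norm(w)
    lam = float(v @ hvp(v))  # Rayleigh quotient at the converged direction
    return lam, v


# Toy quadratic loss L(w) = 0.5 * w^T H w, so the Hessian is H everywhere.
H = np.diag([10.0, 1.0, 0.1])
w = np.array([0.1, 1.0, 1.0])

grad = H @ w                               # gradient of the toy loss at w
perturb_dir = grad / np.linalg.norm(grad)  # SAM-style perturbation direction

lam, v = top_eigenpair(lambda u: H @ u, dim=3)

# Absolute cosine between the perturbation direction and the top eigenvector;
# a value of 1 would mean perfect alignment.
alignment = abs(float(perturb_dir @ v))

print(f"top eigenvalue estimate: {lam:.4f}")
print(f"alignment: {alignment:.4f}")
```

In this toy setting the gradient is only partially aligned with the sharpest direction, which is the kind of gap the abstract describes. In a real network the Hessian-vector product would come from automatic differentiation rather than an explicit matrix.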

Original language: English
Title of host publication: NeurIPS Proceedings - Advances in Neural Information Processing Systems 37 (NeurIPS 2024)
Editors: A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, C. Zhang
Place of Publication: San Diego CA USA
Publisher: Neural Information Processing Systems (NIPS)
Number of pages: 30
ISBN (Electronic): 9798331314385
Publication status: Published - 2024
Event: Advances in Neural Information Processing Systems 2024 - Vancouver, Canada
Duration: 10 Dec 2024 to 15 Dec 2024
Conference number: 38th
https://neurips.cc/ (Website)
https://openreview.net/group?id=NeurIPS.cc/2024/Conference#tab-accept-oral (Peer Reviews)
https://proceedings.neurips.cc/paper_files/paper/2024 (Proceedings - NeurIPS Proceedings)

Publication series

Name: Advances in Neural Information Processing Systems
Publisher: NeurIPS Proceedings
Volume: 37
ISSN (Print): 1049-5258

Conference

Conference: Advances in Neural Information Processing Systems 2024
Abbreviated title: NeurIPS 2024
Country/Territory: Canada
City: Vancouver
Period: 10/12/24 to 15/12/24
