Abstract
Sharpness-Aware Minimization (SAM) has attracted significant attention for its effectiveness in improving generalization across various tasks. However, its underlying principles remain poorly understood. In this work, we analyze SAM's training dynamics using the maximum eigenvalue of the Hessian as a measure of sharpness and propose a third-order stochastic differential equation (SDE), which reveals that the dynamics are driven by a complex mixture of second- and third-order terms. We show that alignment between the perturbation vector and the top eigenvector is crucial for SAM's effectiveness in regularizing sharpness, but find that this alignment is often inadequate in practice, which limits SAM's efficiency. Building on these insights, we introduce Eigen-SAM, an algorithm that explicitly aims to regularize the top Hessian eigenvalue by aligning the perturbation vector with the leading eigenvector. We validate the effectiveness of our theory and the practical advantages of our proposed approach through comprehensive experiments. Code is available at https://github.com/RitianLuo/EigenSAM.
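The alignment the abstract refers to can be illustrated on a toy problem. The following is a minimal sketch and not the authors' Eigen-SAM implementation (that is in the linked repository). It assumes a quadratic loss L(w) = 0.5 * w^T H w with a hand-picked 2x2 Hessian, uses the standard SAM perturbation rho * g / ||g||, and estimates the top eigenvector by power iteration. All function names, the matrix `H`, the point `w`, and `rho` are hypothetical choices made for illustration.

```python
import math

def matvec(H, v):
    # Matrix-vector product H @ v for a list-of-lists matrix.
    return [sum(H[i][j] * v[j] for j in range(len(v))) for i in range(len(H))]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def sam_perturbation(H, w, rho):
    # Standard SAM ascent step: rho * g / ||g||, where g = H w is the
    # gradient of the assumed quadratic loss 0.5 * w^T H w.
    g = matvec(H, w)
    n = norm(g)
    return [rho * x / n for x in g]

def top_eigenvector(H, iters=200):
    # Power iteration; assumes H is symmetric with a unique largest eigenvalue.
    v = [1.0] * len(H)
    for _ in range(iters):
        v = matvec(H, v)
        n = norm(v)
        v = [x / n for x in v]
    return v

def alignment(eps, v):
    # Absolute cosine of the angle between the perturbation and the eigenvector.
    dot = sum(a * b for a, b in zip(eps, v))
    return abs(dot) / (norm(eps) * norm(v))

# Toy Hessian (assumption): sharpness, i.e. the top eigenvalue, is 10 along axis 0.
H = [[10.0, 0.0], [0.0, 1.0]]
w = [0.1, 1.0]

eps = sam_perturbation(H, w, rho=0.05)
v_top = top_eigenvector(H)
align = alignment(eps, v_top)
```

At this point the gradient is (1, 1), so the SAM perturbation sits at 45 degrees to the top eigenvector and `align` is about 0.707 rather than 1. This is the kind of imperfect alignment the abstract says limits SAM's efficiency. How Eigen-SAM corrects it is specified in the paper, not in this sketch.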
Original language | English |
---|---|
Title of host publication | NeurIPS Proceedings - Advances in Neural Information Processing Systems 37 (NeurIPS 2024) |
Editors | A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, C. Zhang |
Place of Publication | San Diego CA USA |
Publisher | Neural Information Processing Systems (NIPS) |
Number of pages | 30 |
ISBN (Electronic) | 9798331314385 |
Publication status | Published - 2024 |
Event | Advances in Neural Information Processing Systems 2024, Vancouver, Canada; Duration: 10 Dec 2024 → 15 Dec 2024; Conference number: 38th; Website: https://neurips.cc/; Peer Reviews: https://openreview.net/group?id=NeurIPS.cc/2024/Conference#tab-accept-oral; Proceedings (NeurIPS Proceedings): https://proceedings.neurips.cc/paper_files/paper/2024 |
Publication series
Name | Advances in Neural Information Processing Systems |
---|---|
Publisher | NeurIPS Proceedings |
Volume | 37 |
ISSN (Print) | 1049-5258 |
Conference
Conference | Advances in Neural Information Processing Systems 2024 |
---|---|
Abbreviated title | NeurIPS 2024 |
Country/Territory | Canada |
City | Vancouver |
Period | 10/12/24 → 15/12/24 |
Projects
- 1 Active
- Exploiting Geometries of Learning for Fast, Adaptive and Robust AI
  Phung, D. (Primary Chief Investigator (PCI)), Tafazzoli Harandi, M. (Chief Investigator (CI)), Hartley, R. I. (Chief Investigator (CI)), Le, T. (Chief Investigator (CI)) & Koniusz, P. (Partner Investigator (PI))
  ARC - Australian Research Council
  8/05/23 → 7/05/26
  Project: Research