TY - JOUR
T1 - Optimizing Variational Physics-Informed Neural Networks Using Least Squares
AU - Uriarte, Carlos
AU - Bastidas, Manuela
AU - Pardo, David
AU - Taylor, Jamie M.
AU - Rojas, Sergio
N1 - Publisher Copyright:
© 2025 Elsevier Ltd
PY - 2025/5/1
Y1 - 2025/5/1
AB - Variational Physics-Informed Neural Networks often suffer from poor convergence when using stochastic gradient-descent-based optimizers. By introducing a least-squares solver for the weights of the last layer of the neural network, we improve the convergence of the loss during training in most practical scenarios. This work analyzes the computational cost of the resulting hybrid least-squares/gradient-descent optimizer and explains how to implement it efficiently. In particular, we show that a traditional implementation based on backward-mode automatic differentiation leads to a prohibitively expensive algorithm. To remedy this, we propose using either forward-mode automatic differentiation or an ultraweak-type scheme that avoids differentiating the trial functions in the discrete weak formulation. The proposed alternatives are up to one hundred times faster than the traditional implementation, recovering a computational cost per iteration similar to that of a conventional gradient-descent-based optimizer alone. To support our analysis, we derive computational estimates and conduct numerical experiments on one- and two-dimensional problems.
KW - Gradient-descent optimization
KW - Least squares
KW - Neural networks
KW - Variational problems
UR - https://www.scopus.com/pages/publications/85218980893
DO - 10.1016/j.camwa.2025.02.022
M3 - Article
AN - SCOPUS:85218980893
SN - 0898-1221
VL - 185
SP - 76
EP - 93
JO - Computers & Mathematics with Applications
JF - Computers & Mathematics with Applications
ER -