36. Convergence Analysis of Nonlinear Parabolic PDE Models with Neural Network Terms Trained with Gradient Descent
Invited abstract in session TB-10: First order methods: new perspectives for machine learning, stream Large scale optimization: methods and algorithms.
Tuesday, 10:30-12:30, Room B100/8011
Authors (first author is the speaker)
1. Konstantin Riedl, Mathematical Institute, University of Oxford
2. Justin Sirignano, Mathematical Institute, University of Oxford
3. Konstantinos Spiliopoulos, Department of Mathematics and Statistics, Boston University
Abstract
Many engineering and scientific fields have recently become interested in modeling terms in partial differential equations (PDEs) with neural networks (NNs). The resulting PDE model, a function of the NN parameters, can be calibrated to available data by gradient descent over the PDE, where the gradient is evaluated by solving an adjoint PDE. In this talk, we discuss the convergence of this adjoint optimization method for training NN-PDE models in the limit where both the number of hidden units and the number of training steps tend to infinity. Specifically, for a general class of nonlinear parabolic PDEs, we prove convergence of the NN-PDE solution to the target data (i.e., to a global minimizer). The global convergence proof must address several technical challenges, since the PDE system is both nonlinear and non-local. Although the adjoint PDE is linear, the NN training dynamics involve a non-local kernel operator in the infinite-width hidden-layer limit, and this kernel lacks a spectral gap in its eigenvalues. This poses a unique mathematical challenge that is not encountered in finite-dimensional NN convergence analysis. We establish convergence by proving that an appropriate quadratic functional of the adjoint is globally Lipschitz and then applying a cycle-of-stopping-times argument to show that the adjoint solution converges weakly to zero. By the definition of the adjoint PDE, this in turn yields global convergence of the original NN-PDE model.
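To fix notation for the adjoint optimization method described above, the following is a minimal illustrative sketch, assuming a scalar parabolic model with constant diffusion coefficient $\nu$, an NN closure term $f(\cdot;\theta)$, target data $u_d$, and a quadratic calibration objective $J$; these specific choices are assumptions for illustration and are not taken from the talk:
\[
\begin{aligned}
&\text{forward PDE:} && \partial_t u = \nu\,\Delta u + f(u;\theta), \qquad u(0,\cdot)=u_0,\\
&\text{objective:} && J(\theta) = \tfrac12 \int_0^T\!\!\int_\Omega \bigl(u(t,x;\theta)-u_d(t,x)\bigr)^2 \,dx\,dt,\\
&\text{adjoint PDE:} && -\partial_t p = \nu\,\Delta p + \partial_u f(u;\theta)\,p - (u-u_d), \qquad p(T,\cdot)=0,\\
&\text{gradient:} && \nabla_\theta J(\theta) = -\int_0^T\!\!\int_\Omega \nabla_\theta f(u;\theta)\,p \,dx\,dt,\\
&\text{descent step:} && \theta^{k+1} = \theta^{k} - \eta\,\nabla_\theta J(\theta^{k}).
\end{aligned}
\]
Note that the adjoint equation is linear in $p$ and is solved backward in time from the terminal condition $p(T,\cdot)=0$, so each gradient-descent step requires one forward PDE solve followed by one adjoint solve.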
Keywords
- Optimization in industry, business and finance
- First-order optimization
Status: accepted