264. Optimal sampling for stochastic and natural gradient descent
Invited abstract in session WF-6: Stochastic Gradient Methods: Bridging Theory and Practice, stream Challenges in nonlinear programming.
Wednesday, 16:20 - 18:00, Room: M:H
Authors (first author is the speaker)
| 1. | Robert Gruhlke | FU Berlin |
| 2. | Philipp Trunschke | Centrale Nantes & Nantes Université |
| 3. | Anthony Nouy | Centrale Nantes & Nantes Université |
Abstract
We consider the problem of optimising the expected value of a loss functional over a nonlinear model class of functions, assuming that we only have access to realisations of the gradient of the loss.
This is a classical task in statistics, machine learning and physics-informed machine learning.
A straightforward solution is to replace the exact objective with a Monte Carlo estimate before employing standard first-order methods like gradient descent, which yields the classical stochastic gradient descent method.
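For orientation, a minimal sketch (not taken from the paper) of this baseline: SGD on a Monte Carlo estimate of an expected squared loss, with a hypothetical sine target and a simple polynomial model; all names, targets and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(np.pi * x)                        # hypothetical target function
features = lambda x: np.stack([x**k for k in range(5)], axis=-1)

theta, step, batch = np.zeros(5), 0.1, 32              # model: u(x; theta) = features(x) @ theta
for it in range(500):
    x = rng.uniform(-1.0, 1.0, batch)                  # fresh realisations of the random input
    residual = features(x) @ theta - f(x)
    grad = 2.0 * features(x).T @ residual / batch      # Monte Carlo estimate of the gradient
    theta -= step * grad                               # standard first-order (SGD) update
```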
However, replacing the true objective with an estimate incurs a "generalisation error".
Rigorous bounds for this error typically require strong compactness and Lipschitz continuity assumptions, yet only guarantee a slow decay of the error with the sample size.
We propose a different optimisation strategy, relying on natural gradient descent, in which the true gradient is approximated in local linearisations of the model class via (quasi-)projections based on optimal sampling methods.
Under classical assumptions on the loss and the nonlinear model class, we prove that this scheme converges monotonically and almost surely to a stationary point of the true objective, and we provide convergence rates.
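The following sketch illustrates, under simplifying assumptions and without reproducing the authors' method, one step of a scheme of this flavour: the function-space gradient of a least-squares loss is quasi-projected onto the local linearisation (tangent space) of a toy nonlinear model, with samples drawn from a Christoffel-type optimal density and re-weighted accordingly. The model, target, sampling pool and step size are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.sin(np.pi * x)                        # hypothetical target function

def model(x, theta):                                   # toy nonlinear model class
    return theta[2] * np.tanh(theta[0] * x + theta[1])

def tangent_basis(x, theta, eps=1e-6):                 # columns: d model / d theta_j at x
    cols = []
    for j in range(theta.size):
        e = np.zeros_like(theta); e[j] = eps
        cols.append((model(x, theta + e) - model(x, theta - e)) / (2 * eps))
    return np.stack(cols, axis=-1)                     # shape (len(x), p)

theta, step, n = np.array([1.0, 0.1, 0.5]), 0.25, 40
pool = rng.uniform(-1.0, 1.0, 4000)                    # discrete surrogate for the input law mu

for it in range(200):
    B = tangent_basis(pool, theta)                     # local linearisation of the model class
    L = np.linalg.cholesky(B.T @ B / pool.size)        # Gram factor of the tangent basis w.r.t. mu
    Q = B @ np.linalg.inv(L).T                         # L2(mu)-orthonormal tangent basis
    k = np.sum(Q**2, axis=1) / Q.shape[1]              # Christoffel-type function of the tangent space
    prob = k / k.sum()                                 # optimal sampling density on the pool
    idx = rng.choice(pool.size, size=n, p=prob)        # draw samples from the optimal density
    w = 1.0 / k[idx]                                   # importance weights dmu/dnu
    g = 2.0 * (model(pool[idx], theta) - f(pool[idx])) # pointwise gradient of the L2 loss
    c = Q[idx].T @ (w * g) / n                         # quasi-projection onto the tangent space
    d = np.linalg.solve(L.T, c)                        # projected gradient in parameter coordinates
    theta -= step * d                                  # natural-gradient-type update
```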
Keywords
- Linear and nonlinear optimization
- Complexity and efficiency of optimization algorithms
- Optimization for learning and data analysis
Status: accepted