1111. Conservation laws for gradient flows
Invited abstract in session TD-32: Algorithms for machine learning and inverse problems: optimisation for neural networks, stream Advances in large scale nonlinear optimization.
Tuesday, 14:30-16:00, Room: 41 (building: 303A)
Authors (first author is the speaker)
1. Sibylle Marcotte, DMA, ENS
Abstract
Understanding the geometric properties of gradient descent dynamics is a key ingredient in deciphering the recent success of very large machine learning models. A striking observation is that trained over-parameterized models retain some properties of their initialization. This “implicit bias” is believed to be responsible for some favorable properties of the trained models and could explain their good generalization. In this talk, I will first rigorously present the definition and basic properties of “conservation laws”, which define quantities conserved during gradient flows of a given model (e.g. of a ReLU network with a given architecture) for any training data and any loss. Then I will explain how to find the exact number of independent conservation laws by performing finite-dimensional algebraic manipulations. In the specific case of linear and ReLU networks, this procedure recovers the conservation laws known in the literature and shows that there are no other laws.
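As a concrete illustration, a minimal NumPy sketch of one conservation law known in the literature for two-layer linear networks: under gradient flow on f(x) = W2 W1 x, the “balancedness” matrix W1 W1^T - W2^T W2 stays constant. The toy setup, dimensions, and variable names below are assumptions for illustration, not taken from the talk; gradient flow is approximated by gradient descent with a very small step.

```python
import numpy as np

# Numerical check of a known conservation law for two-layer linear
# networks f(x) = W2 @ W1 @ x trained by gradient flow on a squared loss:
# the matrix W1 @ W1.T - W2.T @ W2 is conserved along the flow.

rng = np.random.default_rng(0)
d, h, k, n = 5, 4, 3, 50           # input dim, hidden dim, output dim, samples
X = rng.standard_normal((d, n))
Y = rng.standard_normal((k, n))
W1 = rng.standard_normal((h, d))
W2 = rng.standard_normal((k, h))

def conserved(W1, W2):
    return W1 @ W1.T - W2.T @ W2   # the "balancedness" matrix

C0 = conserved(W1, W2)
lr = 1e-4                          # small step approximates continuous time
for _ in range(20000):
    R = W2 @ W1 @ X - Y            # residual of the squared loss
    G = R @ X.T / n                # gradient w.r.t. the product W2 @ W1
    g1, g2 = W2.T @ G, G @ W1.T    # chain rule: dL/dW1, dL/dW2
    W1 -= lr * g1
    W2 -= lr * g2

drift = np.max(np.abs(conserved(W1, W2) - C0))
print(f"max drift of conserved matrix: {drift:.2e}")  # ~0 up to O(lr)
```

The drift vanishes as the step size goes to zero, consistent with exact conservation in the continuous-time gradient flow.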
Keywords
- Machine Learning
- Control Theory
- Dynamical Systems
Status: accepted