EURO 2025 Leeds
Abstract Submission

1708. Bridging Theory and Practice in Deep Learning Optimization

Invited abstract in session TB-35: Optimization for machine learning and inverse problems, stream Continuous and mixed-integer nonlinear programming: theory and algorithms.

Tuesday, 10:30-12:00
Room: Michael Sadler LG15

Authors (first author is the speaker)

1. Kevin Scaman
Inria Paris

Abstract

Despite the high-dimensional and non-convex nature of deep learning optimization, empirical evidence suggests that standard training procedures often succeed in minimizing the associated loss function efficiently. However, existing theoretical analyses either rely on overparameterization or make very restrictive assumptions on the loss function and network structure. In this talk, I will show that a large class of deep learning architectures allows for efficient training beyond the overparameterized regime. In particular, the analysis will rely on a natural extension of the Kurdyka-Łojasiewicz condition, one of the standard assumptions required for non-convex optimization to reach a global optimum; this extended condition is verified for a wide range of objective losses and neural network architectures.
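
For background, a standard formulation of the Kurdyka-Łojasiewicz inequality is sketched below in LaTeX; this is general context only, as the abstract does not spell out the talk's extended condition or its desingularizing function.

% Kurdyka-Lojasiewicz (KL) inequality for a differentiable loss f
% with infimum f^*: there exists a concave, increasing desingularizing
% function \varphi with \varphi(0) = 0 such that, near the minimizers,
\[
  \varphi'\bigl(f(x) - f^{*}\bigr)\,\bigl\|\nabla f(x)\bigr\| \;\ge\; 1 .
\]
% Choosing \varphi(t) = \sqrt{2t/\mu} recovers the Polyak-Lojasiewicz
% (PL) special case, under which gradient descent converges linearly
% to a global minimum:
\[
  \tfrac{1}{2}\,\bigl\|\nabla f(x)\bigr\|^{2} \;\ge\; \mu\,\bigl(f(x) - f^{*}\bigr).
\]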

Keywords

Status: accepted

