EUROPT 2025
Abstract Submission

500. Exploring Step Size Adaptation in Large-Scale Deep Learning Optimization

Invited abstract in session MD-2: Optimization in Machine Learning, stream Nonsmooth and Nonconvex Optimization.

Monday, 16:30-18:30
Room: B100/7011

Authors (first author is the speaker)

1. Lorenzo Ciarpaglini
Department of Computer, Control, and Management Engineering A. Ruberti, Sapienza University of Rome
2. Laura Palagi
Department of Computer, Control, and Management Engineering A. Ruberti, Sapienza University of Rome
3. Diego Scuppa
Department of Computer, Control, and Management Engineering A. Ruberti, Sapienza University of Rome
4. Marco Sciandrone
Department of Computer, Control, and Management Engineering A. Ruberti (DIAG), Sapienza University of Rome

Abstract

Scaling deep learning optimization remains a fundamental challenge, particularly in the choice of step size within first-order methods. In this work, we explore adaptive strategies for learning rate selection designed to improve both convergence behavior and robustness across different architectures, tasks, and datasets. Rather than relying on fixed schedules or heuristic tuning, our approach aims to incorporate information from the optimization landscape to guide parameter updates. The resulting methods are flexible and can be integrated into standard training pipelines with minimal overhead. An empirical study assesses the performance of different adaptive step size strategies, with a focus on their stability and efficiency.
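
The abstract does not name the specific adaptive rules studied. One classical example of a step size informed by the optimization landscape is the Barzilai-Borwein rule, which estimates local curvature from successive iterates; the minimal sketch below illustrates the general idea on a toy quadratic and is not the authors' method (all names and constants are hypothetical).

    import numpy as np

    # Toy quadratic f(x) = 0.5 x^T A x - b^T x, with gradient A x - b.
    A = np.diag([1.0, 10.0, 100.0])   # ill-conditioned Hessian (hypothetical example)
    b = np.ones(3)

    def grad(x):
        return A @ x - b

    x = np.zeros(3)
    g = grad(x)
    alpha = 1e-3  # conservative initial step size

    for _ in range(100):
        x_new = x - alpha * g
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g   # displacement and gradient-change pair
        sy = s @ y
        if sy > 1e-12:                # guard against non-positive curvature
            alpha = (s @ s) / sy      # Barzilai-Borwein (BB1) step size
        x, g = x_new, g_new

    print("final gradient norm:", np.linalg.norm(g))

In a deep learning setting, a rule of this kind would typically be applied per step or per parameter block with safeguards, which is the kind of low-overhead integration into standard training pipelines that the abstract alludes to.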

Keywords

Status: accepted

