500. Exploring Step Size Adaptation in Large-Scale Deep Learning Optimization
Invited abstract in session MD-2: Optimization in Machine Learning, stream Nonsmooth and Nonconvex Optimization.
Monday, 16:30-18:30, Room B100/7011
Authors (first author is the speaker)
1. Lorenzo Ciarpaglini, Department of Computer, Control, and Management Engineering A. Ruberti, Sapienza University of Rome
2. Laura Palagi, Department of Computer, Control, and Management Engineering A. Ruberti, Sapienza University of Rome
3. Diego Scuppa, Department of Computer, Control and Management Engineering, Sapienza University of Rome
4. Marco Sciandrone, DIAG, Sapienza Università di Roma
Abstract
Scaling deep learning optimization remains a fundamental challenge, especially in relation to the choice of step size within first-order methods. In this work, we explore adaptive strategies for learning rate selection designed to improve both convergence behavior and robustness across different architectures, tasks, and datasets. Rather than relying on fixed schedules or heuristic tuning, our approach aims to incorporate meaningful information from the optimization landscape to guide parameter updates. The resulting methods are flexible and can be integrated into standard training procedures with minimal overhead. An empirical study is conducted to assess the performance of different adaptive step size strategies, with a focus on their stability, efficiency, and integration within standard training pipelines.
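For context only, the sketch below illustrates the general family of adaptive step size rules the abstract alludes to, using the classical Polyak step size on a deterministic least-squares problem. It is an assumed, illustrative example and not the method presented in the talk; the problem data, the choice of the Polyak rule, and the known optimal value f_star = 0 are all assumptions made for the sake of a self-contained demonstration.

```python
import numpy as np

# Minimal sketch (assumed example, not the authors' method): gradient descent
# with a Polyak-type adaptive step size, alpha_k = (f(x_k) - f_star) / ||g_k||^2.
# Because the linear system below is consistent, f_star = 0 is a valid optimal value.

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))
x_true = rng.standard_normal(50)
b = A @ x_true                          # consistent system, so min f = 0

def f(x):
    r = A @ x - b
    return 0.5 * r @ r

def grad(x):
    return A.T @ (A @ x - b)

x = np.zeros(50)
f_star = 0.0                            # assumed known lower bound for this toy problem
for k in range(500):
    g = grad(x)
    gnorm2 = g @ g
    if gnorm2 < 1e-16:                  # (near-)stationary point reached
        break
    alpha = (f(x) - f_star) / gnorm2    # step size adapts to the local landscape,
    x -= alpha * g                      # no hand-tuned schedule is required

print(f"final objective: {f(x):.3e}")
```

The point of the sketch is only that the step size is computed from quantities already available during the iteration (objective value and gradient norm), so it can be dropped into a standard training loop with negligible overhead.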
Keywords
- First-order optimization
- Large-scale optimization
- Linear and nonlinear optimization
Status: accepted