EURO 2024 Copenhagen

3256. On stochastic first order optimization methods for deep learning applications

Invited abstract in session MC-34: Optimization and learning for data science and imaging (Part III), stream Advances in large scale nonlinear optimization.

Monday, 12:30-14:00
Room: 43 (building: 303A)

Authors (first author is the speaker)

1. Federica Porta
Università di Modena e Reggio Emilia
2. Giorgia Franchini
UNIMORE
3. Valeria Ruggiero
Università di Ferrara
4. Ilaria Trombini
Università di Ferrara
5. Luca Zanni
FIM, University of Modena and Reggio Emilia

Abstract

First-order stochastic optimization methods are effective tools for the minimization problems arising in deep learning applications. In this work we study a stochastic first-order algorithm that increases the mini-batch size in a predefined fashion and automatically adjusts the learning rate by means of a monotone or non-monotone line search procedure. The mini-batch size is increased at a suitable a priori rate throughout the iterative process, so that the variance of the stochastic gradients is progressively reduced. This a priori rate is not subject to restrictive assumptions, allowing for a slow increase of the mini-batch size. On the other hand, the learning rate can vary non-monotonically along the iterations, as long as it remains appropriately bounded. Convergence results for the proposed method are provided for both convex and non-convex objective functions. The low per-iteration cost, the limited memory requirements and the robustness with respect to the hyperparameter setting make the suggested approach well suited for implementation within the deep learning framework, also on GPGPU-equipped architectures. Numerical results on training deep neural networks for multi-class image classification show promising behaviour of the proposed scheme compared with similar state-of-the-art competitors.
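To make the two ingredients concrete, the following is a minimal sketch (not the authors' code) of one way to combine a predefined, geometrically growing mini-batch size with a non-monotone Armijo-type line search on the mini-batch loss. All function names, the growth rule, and the default constants are illustrative assumptions; the abstract does not specify these choices.

```python
"""Illustrative sketch: SGD with a priori mini-batch growth and a
non-monotone backtracking line search. Hypothetical, not the paper's code."""
import numpy as np

def sgd_growing_batch(loss_grad, x, data, max_iters=200,
                      batch0=16, growth=1.05, lr0=1.0,
                      memory=5, c=1e-4, shrink=0.5, lr_min=1e-8):
    # loss_grad(x, batch) -> (loss, grad): mini-batch loss and its gradient.
    rng = np.random.default_rng(0)
    n = len(data)
    batch_size = float(batch0)
    recent = []                          # last `memory` mini-batch losses
    for k in range(max_iters):
        idx = rng.choice(n, size=min(int(batch_size), n), replace=False)
        batch = data[idx]
        f, g = loss_grad(x, batch)
        recent = (recent + [f])[-memory:]
        f_ref = max(recent)              # non-monotone reference value
        lr = lr0
        # Backtrack until a non-monotone Armijo condition holds on the
        # same mini-batch: f(x - lr*g) <= f_ref - c*lr*||g||^2.
        while lr > lr_min:
            f_new, _ = loss_grad(x - lr * g, batch)
            if f_new <= f_ref - c * lr * np.dot(g, g):
                break
            lr *= shrink
        x = x - lr * g
        batch_size *= growth             # predefined (a priori) growth rate
    return x
```

With `memory=1` the rule reduces to a standard monotone Armijo line search, while larger values let the learning rate, and the sampled loss, vary non-monotonically across iterations; the geometric factor `growth` stands in for the a priori increase of the mini-batch size described in the abstract.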

Status: accepted

