3256. On stochastic first order optimization methods for deep learning applications
Invited abstract in session MC-34: Optimization and learning for data science and imaging (Part III), stream Advances in large scale nonlinear optimization.
Monday, 12:30-14:00, Room: 43 (building: 303A)
Authors (first author is the speaker)
1. Federica Porta, Università di Modena e Reggio Emilia
2. Giorgia Franchini, UNIMORE
3. Valeria Ruggiero, Università di Ferrara
4. Ilaria Trombini, Università di Ferrara
5. Luca Zanni, FIM, University of Modena and Reggio Emilia
Abstract
First-order stochastic optimization methods are effective tools for the minimization problems arising in deep learning applications. In this work we study a stochastic first-order algorithm which increases the mini-batch size in a predefined fashion and automatically adjusts the learning rate by means of a monotone or non-monotone line search procedure. The mini-batch size is incremented at a suitable a priori rate throughout the iterative process, so that the variance of the stochastic gradients is progressively reduced. The a priori rate is not subject to restrictive assumptions, allowing for a slow increase of the mini-batch size. On the other hand, the learning rate can vary non-monotonically along the iterations, as long as it remains appropriately bounded. Convergence results for the proposed method are provided for both convex and non-convex objective functions. The low per-iteration cost, the limited memory requirements and the robustness with respect to the hyperparameter setting make the suggested approach well suited for implementation within the deep learning framework, also on GPGPU-equipped architectures. Numerical results on training deep neural networks for multi-class image classification show a promising behaviour of the proposed scheme compared with similar state-of-the-art competitors.
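The two ingredients described above, an a priori growth rule for the mini-batch size and a backtracking (Armijo-type) line search that adapts the learning rate on the sampled batch, can be sketched as follows. This is only a minimal illustrative sketch on a toy least-squares problem, not the authors' actual algorithm: the geometric growth rate, the monotone Armijo condition, and all parameter values are assumptions made here for illustration (the paper allows more general growth rules and a non-monotone line search).

```python
import numpy as np

def minibatch_sgd_linesearch(loss, grad, theta, X, y, max_iter=60,
                             b0=4, growth=1.15, alpha0=1.0,
                             c=1e-4, shrink=0.5):
    """Stochastic gradient method with a predefined mini-batch growth
    rule and a backtracking line search for the learning rate.
    All hyperparameter values here are illustrative assumptions."""
    rng = np.random.default_rng(0)
    n = len(y)
    for k in range(max_iter):
        # a priori mini-batch growth: slow geometric increase, capped at n,
        # so the variance of the stochastic gradient shrinks over time
        b = min(n, int(np.ceil(b0 * growth ** k)))
        idx = rng.choice(n, size=b, replace=False)
        Xb, yb = X[idx], y[idx]
        g = grad(theta, Xb, yb)
        f0 = loss(theta, Xb, yb)
        # backtracking: shrink the learning rate until the (monotone)
        # Armijo sufficient-decrease condition holds on the sampled batch
        alpha = alpha0
        while loss(theta - alpha * g, Xb, yb) > f0 - c * alpha * (g @ g):
            alpha *= shrink
            if alpha < 1e-10:
                break
        theta = theta - alpha * g
    return theta

# toy least-squares instance (stand-in for a deep learning loss)
def loss(theta, X, y):
    r = X @ theta - y
    return 0.5 * (r @ r) / len(y)

def grad(theta, X, y):
    return X.T @ (X @ theta - y) / len(y)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta
theta = minibatch_sgd_linesearch(loss, grad, np.zeros(3), X, y)
```

Because the mini-batch eventually covers the whole sample while the line search keeps the step size bounded, the iterates settle near the minimizer; in the paper this interplay is what yields convergence guarantees without restrictive assumptions on the growth rate.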
Keywords
- Non-smooth Optimization
- Machine Learning
Status: accepted