81. Methods for Convex (L0,L1)-Smooth Optimization: Clipping, Acceleration, and Adaptivity
Invited abstract in session MD-5: Relaxed Smoothness and Convexity Assumptions in Optimization for Machine Learning, stream: Optimization for Machine Learning.
Monday, 16:30-18:30, Room: B100/4013
Authors (first author is the speaker)
1. Eduard Gorbunov, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Abstract
Because many optimization problems in Machine Learning are not smooth in the classical sense, generalized smoothness assumptions have attracted significant attention in recent years. One of the most popular assumptions of this type is (L0,L1)-smoothness (Zhang et al., 2020). In this talk, we focus on the class of (strongly) convex (L0,L1)-smooth functions and discuss new convergence guarantees for several existing methods. In particular, we discuss improved convergence rates for Gradient Descent with (Smoothed) Gradient Clipping and for Gradient Descent with Polyak Stepsizes. We also extend these results to the stochastic case under the over-parameterization assumption, propose a new accelerated method for convex (L0,L1)-smooth optimization, and derive new convergence rates for Adaptive Gradient Descent (Malitsky and Mishchenko, 2020).
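For readers unfamiliar with the clipping idea, the following is a minimal sketch of gradient descent with a smoothed-clipping stepsize eta_k = 1 / (L0 + L1 ||grad f(x_k)||), a standard choice under (L0,L1)-smoothness. The stepsize rule, the test function, and the constants below are illustrative assumptions, not necessarily the exact methods or rates presented in the talk.

```python
import numpy as np

def smoothed_clipped_gd(grad, x0, L0, L1, n_iters=1000):
    """Gradient descent with the smoothed-clipping stepsize
    eta_k = 1 / (L0 + L1 * ||grad(x_k)||).

    Illustrative sketch under (L0, L1)-smoothness; the exact stepsize
    rules and constants analyzed in the talk may differ.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad(x)
        eta = 1.0 / (L0 + L1 * np.linalg.norm(g))
        x = x - eta * g
    return x

# Toy convex (L0, L1)-smooth example (assumed, not from the abstract):
# f(x) = ||x||^4 has ||Hessian(x)|| = 12||x||^2 <= 12 + 3 * ||grad f(x)||,
# so (L0, L1) = (12, 3) works globally even though f is not L-smooth.
grad_f = lambda x: 4.0 * np.linalg.norm(x) ** 2 * x
x_final = smoothed_clipped_gd(grad_f, x0=np.ones(5), L0=12.0, L1=3.0)
print(np.linalg.norm(x_final))  # approaches 0, the minimizer of f
```

The stepsize automatically shrinks where the gradient is large, the same mechanism that hard clipping with min(1, c / ||grad f(x)||) implements, which is why such rules suit functions whose curvature grows with the gradient norm.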
Keywords
- Optimization for learning and data analysis
- Non-smooth optimization
- First-order optimization
Status: accepted