333. Variance reduction techniques for stochastic proximal point algorithms
Invited abstract in session TD-7: Accelerated methods in modern optimization, stream Optimization for Inverse Problems and Machine Learning.
Thursday, 14:10 - 15:50, Room: M:I
Authors (first author is the speaker)
1. Cheik Traoré, University of Genoa
Abstract
In the context of finite-sum minimization, variance reduction techniques are widely used to improve the performance of state-of-the-art stochastic gradient methods. Their practical impact is clear, as are their theoretical properties. Stochastic proximal point algorithms have been studied as an alternative to stochastic gradient algorithms because they are more stable with respect to the choice of the stepsize, but their variance-reduced versions are not as well studied as the gradient ones. In this work, we propose the first unified study of variance reduction techniques for stochastic proximal point algorithms. We introduce a generic stochastic proximal algorithm that can be specialized to obtain the proximal versions of SVRG, SAGA, and some of their variants for smooth and convex functions. We provide several convergence results for the iterates and the objective function values. In addition, under the Polyak-Łojasiewicz (PL) condition, we obtain linear convergence rates for the iterates and the function values. Our numerical experiments demonstrate the advantages of the proximal variance reduction methods over their gradient counterparts, especially regarding stability with respect to the choice of the stepsize.
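To give a concrete flavor of such methods, here is a minimal sketch, assuming an SVRG-style control variate combined with the proximal operator of a single component on a least-squares objective. The function names (svrg_prox, prox_quadratic) and the specific update rule are illustrative assumptions, not necessarily the exact scheme analyzed in the talk.

```python
import numpy as np

def prox_quadratic(v, a, b, gamma):
    # Closed-form proximal operator of f_i(x) = 0.5 * (a @ x - b)**2:
    # argmin_u f_i(u) + ||u - v||^2 / (2 * gamma)
    return v - gamma * (a @ v - b) / (1.0 + gamma * (a @ a)) * a

def svrg_prox(A, y, gamma=0.1, n_epochs=20, seed=0):
    # Illustrative variance-reduced stochastic proximal point method on
    # F(x) = (1/N) * sum_i 0.5 * (A[i] @ x - y[i])**2 (SVRG-style snapshots).
    rng = np.random.default_rng(seed)
    N, d = A.shape
    x = np.zeros(d)
    for _ in range(n_epochs):
        x_ref = x.copy()
        full_grad = A.T @ (A @ x_ref - y) / N          # full gradient at the snapshot
        for _ in range(N):
            i = rng.integers(N)
            grad_i_ref = (A[i] @ x_ref - y[i]) * A[i]  # component gradient at the snapshot
            correction = full_grad - grad_i_ref         # control variate (variance reduction)
            # proximal point step on the sampled component, applied to the corrected point
            x = prox_quadratic(x - gamma * correction, A[i], y[i], gamma)
    return x

# Toy usage on a synthetic least-squares problem
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
x_true = rng.standard_normal(5)
y = A @ x_true
print(np.linalg.norm(svrg_prox(A, y) - x_true))
```

The key design point this sketch illustrates is that the stepsize gamma enters through a proximal step on the sampled component rather than an explicit gradient step, which is the source of the stability with respect to the stepsize mentioned in the abstract.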
Keywords
- Large- and Huge-scale optimization
- Optimization for learning and data analysis
Status: accepted