223. On the convergence of stochastic Bregman proximal gradient algorithms with biased gradient estimators
Invited abstract in session WB-1: Advances in stochastic and non-Euclidean first order methods, stream Zeroth and first-order optimization methods.
Wednesday, 10:30-12:30, Room B100/1001
Authors (first author is the speaker)
1. Thomas Guilmeau, Université Paris-Saclay, CentraleSupélec, INRIA
2. Emilie Chouzenoux, Université Paris-Est Marne-La-Vallée
3. Víctor Elvira, School of Mathematics, University of Edinburgh
Abstract
Bregman proximal-gradient algorithms generalize proximal-gradient algorithms by measuring the proximity between iterates with a Bregman divergence instead of the Euclidean distance. The Bregman divergence can be chosen to match the geometry of the objective function, making Bregman proximal-gradient algorithms well suited to optimization problems with ill-behaved curvature. The convergence of deterministic Bregman proximal-gradient algorithms can be established, as in the Euclidean case, by leveraging relative smoothness and relative strong convexity, which generalize the Euclidean notions of smoothness and strong convexity. The stochastic setting is more challenging, however, because the variance of gradient estimators is hard to control under a Bregman geometry. As a consequence, most convergence results for stochastic Bregman-based methods assume (Euclidean) strong convexity of the Bregman function. Moreover, the convergence of Bregman-based methods with biased gradient estimators is seldom studied. Motivated by applications in computational statistics (variational inference, adaptive importance sampling) where these assumptions fail, we present asymptotic and non-asymptotic convergence results for a stochastic Bregman proximal-gradient algorithm with biased gradient estimators, without assuming strong convexity of the Bregman function. This presentation is based on results from https://arxiv.org/abs/2211.04776.
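For illustration, a minimal sketch of one closed-form instance of such a scheme: the Burg entropy as Bregman function on the positive orthant, driven by a noisy gradient oracle. The function names, step size, and toy objective are illustrative assumptions, not the algorithm or applications of the paper.

import numpy as np

def burg_bregman_step(x, grad, step_size):
    """One Bregman proximal-gradient step with the Burg entropy
    phi(x) = -sum_i log(x_i) on the positive orthant.

    The associated Bregman divergence is the Itakura-Saito divergence,
    and phi is not strongly convex in the Euclidean sense (its Hessian
    diag(1/x_i^2) vanishes as x_i grows), matching the regime of the
    talk. With no nonsmooth term, the mirror update
        grad_phi(x_next) = grad_phi(x) - step_size * grad
    has the closed form below.
    """
    denom = 1.0 + step_size * grad * x
    assert np.all(denom > 0), "step size too large for this iterate"
    return x / denom

def stochastic_bregman_pg(grad_estimator, x0, step_size=0.1, n_iter=2000, seed=0):
    """Run the scheme with a (possibly biased) stochastic gradient oracle."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        g = grad_estimator(x, rng)   # noisy, possibly biased gradient estimate
        x = burg_bregman_step(x, g, step_size)
    return x

# Toy usage: f(x) = sum_i (x_i - b_i log x_i) is smooth *relative* to the
# Burg entropy even though its curvature blows up near 0; its minimizer is b.
if __name__ == "__main__":
    b = np.array([2.0, 0.5, 1.5])
    noisy_grad = lambda x, rng: (1.0 - b / x) + 0.01 * rng.standard_normal(x.size)
    print(stochastic_bregman_pg(noisy_grad, np.ones(3)))  # approaches b

Here the toy objective has unbounded Euclidean curvature near zero, yet is relatively smooth with respect to the Burg entropy, which is what makes the non-Euclidean step natural in this setting.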
Keywords
- Large-scale optimization
Status: accepted