EUROPT 2024
Abstract Submission

198. Analysis of a class of stochastic component-wise soft-clipping schemes

Invited abstract in session TD-6: Stochastic methods, stream Methods for non-/monotone inclusions and their applications.

Thursday, 14:10 - 15:50
Room: M:H

Authors (first author is the speaker)

1. Tony Stillfjord
Centre for Mathematical Sciences, Lund University
2. Måns Williamson
Centre for Mathematical Sciences, Lund University
3. Monika Eisenmann
Centre for Mathematical Sciences, Lund University

Abstract

Clipping techniques are often used when solving optimization problems that arise in training machine learning models. They make gradient-based optimization methods more robust by rescaling the gradients so that excessively large steps are avoided. This is particularly useful in stochastic optimization, where the stochasticity leads to extra variability in the size of the gradients. Despite this, so-called soft-clipping methods have received little attention in the literature, and a rigorous mathematical analysis is lacking in the general, nonlinear case.
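As a rough illustration of the mechanism (a minimal sketch, not the specific class analyzed in the talk; the function names, the threshold tau, and the scale gamma are hypothetical), hard clipping rescales the whole gradient once its norm exceeds a threshold, while a common soft-clipping rule damps each component smoothly:

    import numpy as np

    def hard_clip_step(x, grad, lr, tau):
        # Hard clipping: rescale the full gradient when its norm exceeds tau.
        norm = np.linalg.norm(grad)
        if norm > tau:
            grad = grad * (tau / norm)
        return x - lr * grad

    def soft_clip_step(x, grad, lr, gamma):
        # Component-wise soft clipping: each coordinate is damped smoothly,
        # so every coordinate of the update stays below gamma in absolute value.
        return x - lr * grad / (1.0 + (lr / gamma) * np.abs(grad))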

In this talk, I will therefore present recent results on the convergence analysis of a large class of stochastic, component-wise soft-clipping schemes. This class covers several existing schemes and suggests a whole range of new schemes worth further study; an illustrative parametrization is sketched below. Our numerical experiments indicate that many of them perform on par with state-of-the-art methods without significant tuning.
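One hypothetical way such a family can be parametrized (an assumption for illustration, not necessarily the class from the talk) is to interpolate between smooth damping and hard component-wise clipping:

    def soft_clip_family_step(x, grad, lr, gamma, p):
        # Illustrative one-parameter family: p = 1 recovers the smooth rule
        # above, while p -> infinity approaches hard component-wise clipping
        # of each coordinate's step at level gamma.
        damp = (1.0 + (lr * np.abs(grad) / gamma) ** p) ** (1.0 / p)
        return x - lr * grad / damp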

The main contribution of this work is the unifying convergence analysis. Under standard assumptions, such as Lipschitz continuity of the gradient of the objective function, we give rigorous proofs of convergence in expectation, including rates for both convex and non-convex problems. In addition, we prove almost sure convergence to a stationary point in the non-convex case.
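For orientation only, guarantees of this kind typically take the following shapes; these are standard templates for SGD-type methods under Lipschitz-gradient assumptions, not the talk's specific theorems or constants:

    \min_{k \le K} \mathbb{E}\,\|\nabla F(x_k)\|^2 = O(K^{-1/2})   % non-convex rate
    \qquad
    \mathbb{E}\,[F(\bar{x}_K) - F^*] = O(K^{-1/2})                 % convex rate
    \qquad
    \lim_{k \to \infty} \|\nabla F(x_k)\| = 0 \quad \text{a.s.}    % almost sure stationarity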

Keywords

Status: accepted
