322. Accelerated cubic regularized quasi-Newton methods
Invited abstract in session WE-6: Higher-order Methods in Mathematical Programming I, stream Challenges in nonlinear programming.
Wednesday, 14:10 - 15:50
Room: M:H
Authors (first author is the speaker)
| 1. | Dmitry Kamzolov | Machine Learning, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) |
Abstract
In this paper, we propose the first Quasi-Newton method with a global convergence rate of $O(\frac{L_1R^2}{k})$ for general convex functions. Quasi-Newton methods, such as BFGS and SR-1, are well known for their impressive practical performance. However, they are theoretically slower than gradient descent for general convex functions. This gap between impressive practical performance and poor theoretical guarantees has been an open question for a long time. In this paper, we make a significant step toward closing this gap. We improve upon the existing rate and propose the Cubic Regularized Quasi-Newton Method with a convergence rate of $O(\frac{L_1R^2}{k})$. The key to achieving this improvement is to use the Cubic Regularized Newton Method instead of the Damped Newton Method as the outer method, where the Quasi-Newton update is an inexact Hessian approximation. Using this approach, we propose the first Accelerated Quasi-Newton method with a global convergence rate of $O(\frac{L_1R^2}{k^2})$ for general convex functions. In special cases where we have access to additional computations, for example Hessian-vector products, we can improve the inexact Hessian approximation and achieve a global convergence rate of $O(\frac{L_2R^3}{k^{3}})$, which makes it an intermediate second-order method. To make these methods practical, we introduce the Adaptive Inexact Cubic Regularized Newton Method and its accelerated version, which provide real-time control of the approximation error. We show that the p
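As a rough illustration of the idea described in the abstract (a cubic regularized outer step in which a Quasi-Newton matrix stands in for the Hessian), the following Python sketch may help. It is not the authors' algorithm: the function names `cubic_step`, `bfgs_update`, and `cubic_quasi_newton` are hypothetical, the model $g^\top s + \frac{1}{2}s^\top B s + \frac{M}{6}\|s\|^3$ is the standard cubic model with the Hessian replaced by a BFGS approximation $B$, and the rule that doubles $M$ until the objective decreases is a simple stand-in for the adaptive control of the approximation error mentioned above. The sketch covers only the non-accelerated variant.

```python
import numpy as np


def cubic_step(g, B, M, tol=1e-12, max_iter=200):
    """Minimize g@s + 0.5*s@B@s + (M/6)*||s||**3 over s.

    Assumes B is symmetric positive semidefinite and M > 0, so the
    minimizer solves (B + 0.5*M*r*I) s = -g with r = ||s||.
    """
    if np.linalg.norm(g) == 0.0:
        return np.zeros_like(g)
    eye = np.eye(g.shape[0])

    def s_of(r):
        return np.linalg.solve(B + 0.5 * M * r * eye, -g)

    # Bracket the root of ||s(r)|| - r, which is decreasing in r.
    lo, hi = 0.0, 1.0
    while np.linalg.norm(s_of(hi)) > hi:
        lo, hi = hi, 2.0 * hi
    # Bisection on the scalar r.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(s_of(mid)) > mid:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * hi:
            break
    return s_of(hi)


def bfgs_update(B, s, y, eps=1e-12):
    """Standard BFGS update of the Hessian approximation B.

    The update is skipped when the curvature y@s is not safely positive,
    which keeps B positive definite.
    """
    if y @ s <= eps * np.linalg.norm(s) * np.linalg.norm(y):
        return B
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)


def cubic_quasi_newton(f, grad, x0, M=1.0, iters=100, tol=1e-8):
    """Illustrative loop: cubic regularized steps with a BFGS matrix.

    Assumption (not from the abstract): M is doubled until the step
    strictly decreases f, as a crude safeguard against the error of B.
    """
    x = x0.astype(float)
    B = np.eye(x.shape[0])
    g = grad(x)
    for _k in range(iters):
        if np.linalg.norm(g) < tol:
            break
        for _j in range(60):
            s = cubic_step(g, B, M)
            if f(x + s) < f(x):
                break
            M *= 2.0
        else:
            break  # no decreasing step found; stop
        g_new = grad(x + s)
        B = bfgs_update(B, s, g_new - g)
        x, g = x + s, g_new
    return x
```

Calling `cubic_quasi_newton(f, grad, x0)` with a smooth convex `f` and its gradient returns the last iterate. Because a step is accepted only when it lowers `f`, the objective values are non-increasing, but the sketch carries none of the convergence-rate guarantees stated in the abstract.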
Keywords
- Complexity and efficiency of optimization algorithms
- Convex and non-smooth optimization
- Large- and Huge-scale optimization
Status: accepted