368. A Third-Order Perspective on Newton’s Method and its Application in Federated Learning
Invited abstract in session MD-5: Relaxed Smoothness and Convexity Assumptions in Optimization for Machine Learning, stream Optimization for machine learning.
Monday, 16:30-18:30, Room: B100/4013
Authors (first author is the speaker)
| 1. | Slavomír Hanzely | Machine Learning, MBZUAI |
Abstract
This paper investigates the global convergence of stepsized Newton methods for convex functions with Hölder continuous Hessians or third derivatives, motivated by federated learning. Focusing on the single-node setup, we propose several simple stepsize schedules with fast global convergence guarantees, up to $\mathcal{O}(1/k^3)$. For cases with multiple plausible smoothness parameterizations or an unknown smoothness constant, we introduce a stepsize linesearch and a backtracking procedure whose convergence guarantees match those obtained when the optimal smoothness parameters are known in advance. Additionally, we present strong convergence guarantees for the practically popular Newton method with exact linesearch.
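To illustrate the kind of method the abstract refers to, below is a minimal Python sketch of a stepsized (damped) Newton step combined with a generic backtracking linesearch. The specific stepsize schedules, backtracking rule, and smoothness-dependent constants analyzed in the paper are not given in this abstract, so the Armijo-style rule, parameter names, and test function here are illustrative assumptions only.

```python
import numpy as np

def newton_step_with_backtracking(f, grad, hess, x, alpha0=1.0, beta=0.5,
                                  c=1e-4, max_backtracks=30):
    """One stepsized Newton step with a simple backtracking linesearch.

    Illustrative sketch only: the stepsize schedules and backtracking
    procedure analyzed in the paper are not specified in the abstract.
    """
    g = grad(x)
    H = hess(x)
    # Newton direction; assumes the Hessian is nonsingular at x.
    d = -np.linalg.solve(H, g)

    alpha = alpha0
    fx = f(x)
    for _ in range(max_backtracks):
        # Armijo-type sufficient-decrease check on the damped step.
        if f(x + alpha * d) <= fx + c * alpha * (g @ d):
            break
        alpha *= beta  # shrink the stepsize and retry
    return x + alpha * d

# Usage example: minimize a smooth convex quartic in R^2.
f = lambda x: 0.25 * np.sum(x**4) + 0.5 * np.sum(x**2)
grad = lambda x: x**3 + x
hess = lambda x: np.diag(3 * x**2 + 1)

x = np.array([2.0, -1.5])
for k in range(10):
    x = newton_step_with_backtracking(f, grad, hess, x)
print(x)  # approaches the minimizer at the origin
```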
Keywords
- Second- and higher-order optimization
Status: accepted