EURO 2024 Copenhagen
Abstract Submission

2575. Parallel Neural Network Training via Nonlinearly Preconditioned Trust-Region Method

Invited abstract in session MD-34: Preconditioning for Large Scale Nonlinear Optimization, stream Advances in large scale nonlinear optimization.

Monday, 14:30-16:00
Room: 43 (building: 303A)

Authors (first author is the speaker)

1. Samuel Cruz
Euler Institute, Università della Svizzera italiana

Abstract

Neural networks and the data sets used to train them continue to grow in size. Traditional training methods such as SGD and Adam require extensive hyperparameter tuning, which is increasingly at odds with the demands of current applications, where methods should ideally work without such tuning. With this in mind, we investigate additive domain decomposition methods for neural network training. Due to their underlying additive decomposition, our methods are parallelizable, and owing to a trust-region strategy, they largely obviate the need for hyperparameter tuning. Our test results suggest that the investigated methods are promising candidates for efficient neural network training.
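The abstract describes the method only at a high level. As a rough illustration, here is a minimal NumPy sketch of the general idea, assuming a plain gradient-based local solve on each parameter subdomain and a standard trust-region acceptance test; the function name asptr_step, the block partition, and all constants are hypothetical and are not taken from the paper.

    import numpy as np

    def asptr_step(loss, grad, w, blocks, delta):
        # One additive trust-region step (hypothetical sketch, not the
        # authors' algorithm): each subdomain computes a local correction
        # independently; the corrections are summed into a global trial
        # step, which is accepted or rejected via the trust-region ratio.
        g = grad(w)
        step = np.zeros_like(w)
        for idx in blocks:           # local solves: independent, hence parallelizable
            s = -g[idx]              # steepest-descent direction on this block
            norm = np.linalg.norm(s)
            if norm > delta:         # keep each local step inside the trust region
                s *= delta / norm
            step[idx] = s
        pred = -g @ step             # decrease predicted by the linear model
        rho = (loss(w) - loss(w + step)) / max(pred, 1e-12)
        if rho > 0.1:                # accept; enlarge the radius on a good ratio
            return w + step, (2.0 * delta if rho > 0.75 else delta)
        return w, 0.5 * delta        # reject the step and shrink the radius

    # Toy usage: least-squares loss, parameters split into two subdomains.
    rng = np.random.default_rng(0)
    A, b = rng.standard_normal((20, 6)), rng.standard_normal(20)
    f = lambda w: 0.5 * np.sum((A @ w - b) ** 2)
    df = lambda w: A.T @ (A @ w - b)
    w, delta = np.zeros(6), 1.0
    for _ in range(100):
        w, delta = asptr_step(f, df, w, [np.arange(3), np.arange(3, 6)], delta)
    print(f(w))                      # loss decreases toward the least-squares optimum

The loop over blocks is embarrassingly parallel, which is where the parallelism claimed in the abstract would come from; a practical implementation would use richer local models and a distributed backend rather than this sequential toy loop.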

Status: accepted

