524. Inexact derivative-free methods for constrained bilevel optimization with applications to machine learning
Invited abstract in session TC-3: Theoretical and algorithmic advances in large scale nonlinear optimization and applications Part 2, stream Large scale optimization: methods and algorithms.
Tuesday, 14:00-16:00, Room: B100/4011
Authors (first author is the speaker)
1. Marco Viola, School of Mathematical Sciences, Dublin City University
2. Matteo Pernini, University of Padua
3. Gabriele Sanguin, University of Padua
4. Francesco Rinaldi, Dipartimento di Matematica "Tullio Levi-Civita", Università di Padova
Abstract
Bilevel optimization (BO) is a powerful framework for addressing complex machine learning challenges such as hyperparameter tuning, meta-learning, data distillation, and adversarial training.
Traditional gradient-based strategies for solving BO problems involve computing a hypergradient of the upper-level function.
This can be computationally demanding, since a correct evaluation of the hypergradient requires not only accurate solutions of the lower-level problem but also the storage of large Jacobian matrices, making such strategies impractical for large-scale problems.
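For concreteness, in the usual smooth setting where the upper-level objective is F(x) = f(x, y*(x)) with y*(x) the lower-level minimizer of g(x, ·), implicit differentiation gives the standard hypergradient formula (notation introduced here only for illustration, not taken from the abstract):

$$\nabla F(x) \;=\; \nabla_x f(x, y^\ast(x)) \;-\; \nabla^2_{xy} g(x, y^\ast(x))\,\big[\nabla^2_{yy} g(x, y^\ast(x))\big]^{-1} \nabla_y f(x, y^\ast(x)),$$

which shows why accurate lower-level solutions and large second-order cross-derivative blocks enter the computation.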
This work investigates the development and application of inexact derivative-free methods for solving constrained BO problems arising in machine learning, with a particular focus on problems subject to polyhedral constraints at the upper level.
In particular, starting from the inexact direct-search method introduced in [Diouane et al., COAP 2024] for the unconstrained setting, we propose a new method for constrained problems in which the construction of feasible directions is inspired by zeroth-order Frank-Wolfe methods [Dzahini et al., 2024].
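As an informal illustration only (not the authors' algorithm; all function and parameter names below are assumptions chosen for the example), the following sketch shows how a feasible direction over a polyhedron {x : Ax <= b} might be built in a zeroth-order Frank-Wolfe fashion: the gradient of the upper-level function is estimated from (possibly inexact) function values by finite differences, and a linear program over the polyhedron supplies a feasible direction.

```python
import numpy as np
from scipy.optimize import linprog

def zo_frank_wolfe_direction(F, x, A, b, h=1e-6):
    """Feasible direction d = v - x for the polyhedron {v : A v <= b},
    built from a forward finite-difference estimate of the gradient of F,
    where F is available only through (possibly inexact) evaluations."""
    n = x.size
    f0 = F(x)
    g = np.zeros(n)
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        g[i] = (F(x + e) - f0) / h          # zeroth-order gradient estimate
    # Linear minimization oracle: v in argmin { g^T v : A v <= b }
    res = linprog(c=g, A_ub=A, b_ub=b, bounds=[(None, None)] * n, method="highs")
    v = res.x
    return v - x  # if x is feasible, x + t*(v - x) stays feasible for t in [0, 1]

# Toy usage on the polyhedron {x >= 0, x_1 + x_2 <= 1}
if __name__ == "__main__":
    A = np.vstack([-np.eye(2), np.ones((1, 2))])
    b = np.array([0.0, 0.0, 1.0])
    F = lambda z: (z[0] - 0.3) ** 2 + (z[1] - 0.9) ** 2  # stand-in for an (inexact) upper-level value
    x0 = np.array([0.5, 0.2])
    print(zo_frank_wolfe_direction(F, x0, A, b))
```

In an inexact direct-search scheme, directions of this kind would typically be combined with a sufficient-decrease test and progressively tighter lower-level accuracy; the specific combination used here is that of the proposed method and is not reproduced in this sketch.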
We discuss the theoretical properties of the proposed algorithm and test its effectiveness on both synthetic and real-life instances.
Keywords
- Multi-level optimization
- Derivative-free optimization
- Optimization for learning and data analysis
Status: accepted