3493. Risk-Averse Multiarmed Bandits with Switching Penalties
Invited abstract in session WB-35: Stochastic optimization: theory and applications, stream Stochastic, Robust and Distributionally Robust Optimization.
Wednesday, 10:30-12:00. Room: 44 (building: 303A)
Authors (first author is the speaker)
1. Milad Malekipirbazari, Computer Science and Engineering, Chalmers University of Technology
Abstract
This work explores the multiarmed bandit (MAB) problem augmented with a critical real-world consideration: the cost of switching decisions. The study integrates risk considerations into MAB scenarios in which decision-makers incur a penalty each time they transition between options. Such scenarios are not just theoretical constructs; they reflect numerous practical applications. Our work addresses the largely unexplored domain of risk-averse MAB problems compounded by switching penalties. Our contribution is threefold. First, we explore the qualitative aspects of optimal policies. Second, we present novel theoretical results, including the development of the Risk-Averse Switching Index (RASI), which addresses the dual challenges of risk aversion and switching costs, and we demonstrate its near-optimal performance. Finally, through numerical experiments, we validate our algorithm’s effectiveness and practical applicability.
Keywords
- Stochastic Optimization
- Artificial Intelligence
- Optimal Control
Status: accepted