EURO-Online login
- New to EURO? Create an account
- I forgot my username and/or my password.
- Help with cookies
(important for IE8 users)
2972. Index policy for multiarmed bandit problem with dynamic risk measures
Invited abstract in session TB-35: Risk Averse and Contextual Stochastic Optimization, stream Stochastic, Robust and Distributionally Robust Optimization.
Tuesday, 10:30-12:00Room: 44 (building: 303A)
Authors (first author is the speaker)
1. | Ozlem Cavus
|
Industrial Engineering, Bilkent University | |
2. | Milad Malekipirbazari
|
Computer Science and Engineering, Chalmers University of Technology |
Abstract
The multiarmed bandit problem (MAB) is a classic problem in which a finite amount of resources must be allocated among competing choices with the aim of identifying a policy that maximizes the expected total reward. The classical MAB makes the strong assumption that the decision maker is risk-neutral and indifferent to the variability of the outcome. However, in many real life applications, these assumptions are not met and decision makers are risk-averse. Motivated to resolve this, we study risk-averse control of the multiarmed bandit problem in regard to the concept of dynamic coherent risk measures to determine a policy with the best risk-adjusted total discounted return. In respect of this specific setting, we present a theoretical analysis based on Whittle’s retirement problem and propose a priority-index policy that reduces to the Gittins index when the level of risk-aversion converges to zero. We generalize the restart formulation of the Gittins index to effectively compute these risk-averse allocation indices. Numerical results exhibit the excellent performance of this heuristic approach. Our experimental studies suggest that there is no guarantee that an index-based optimal policy exists for the risk-averse problem. Nonetheless, our risk-averse allocation indices can achieve optimal or near-optimal policies which in some instances are easier to interpret compared to the exact optimal policy.
Keywords
- Stochastic Models
- Stochastic Optimization
Status: accepted
Back to the list of papers