EURO 2024 Copenhagen
Abstract Submission

2160. Learning Whittle and LP indices in Average-Reward Restless Multi-Armed Bandits

Invited abstract in session TB-40: Reinforcement Learning: Methods and Applications, stream Advances in Stochastic Modelling and Learning Methods.

Tuesday, 10:30-12:00
Room: 96 (building: 306)

Authors (first author is the speaker)

1. Konstantin Avrachenkov
INRIA

Abstract

Restless Multi-Armed Bandits (RMABs) are widely used in scheduling, resource allocation,
marketing, and clinical trials, to name a few application areas. RMABs are Markov Decision Processes
with two actions (active and passive) for each arm and a constraint on the number of active arms
per time slot. Since RMABs are in general PSPACE-complete, several heuristics, such as the Whittle index
and the LP index, have been proposed. In this talk, I present reinforcement learning schemes for both
Whittle and LP indices, with almost sure convergence guarantees in the tabular setting, together with
empirically efficient Deep Q-learning variants. Several examples, including scheduling in queueing
systems, will be presented. This talk is based on joint work with V.S. Borkar and P. Shah from IIT Bombay.
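
For readers unfamiliar with index heuristics, the following is a minimal sketch of how an index policy is commonly applied once index values are available; it is not the speaker's implementation and does not show the learning schemes themselves. At each time slot, the arms whose current states carry the largest index values are made active, up to the activation budget. The function name `index_policy`, the per-arm table layout, and the numbers are hypothetical, and activating exactly `budget` arms per slot is an assumption.

```python
def index_policy(index_tables, states, budget):
    """Pick which arms to activate in one time slot.

    index_tables[arm][state] is the (assumed, already computed or learned)
    index of `arm` in `state`; states[arm] is the arm's current state.
    Returns a list of 0/1 actions with exactly `budget` arms set to 1.
    """
    # Look up the index of each arm in its current state.
    current = [index_tables[arm][state] for arm, state in enumerate(states)]
    # Rank arms by decreasing index; sorted() is stable, so ties go to
    # the arm with the lower position in the list.
    ranked = sorted(range(len(states)), key=lambda arm: -current[arm])
    active = set(ranked[:budget])
    return [1 if arm in active else 0 for arm in range(len(states))]


# Hypothetical example: three arms with two states each, budget of two.
tables = [[0.1, 0.9], [0.5, 0.2], [0.3, 0.7]]
actions = index_policy(tables, states=[1, 0, 0], budget=2)
```

Here the current indices are 0.9, 0.5 and 0.3, so the first two arms are activated and the third stays passive.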

Status: accepted
