Treat abstract

2160. Learning Whittle and LP indices in Average-Reward Restless Multi-Armed Bandits

Invited abstract in session TB-40: Reinforcement Learning: Methods and Applications, stream Advances in Stochastic Modelling and Learning Methods.

Tuesday, 10:30-12:00
Room: 96 (building: 306)

Authors (first author is the speaker)

1.	Konstantin Avrachenkov
	INRIA

Abstract

Restless Multi-Armed Bandits (RMABs) are widely used in scheduling, resource allocation,
marketing, and clinical trials, to name just a few application areas. An RMAB is a Markov Decision Process
in which each arm has two actions (active and passive modes) and the number of active arms
per time slot is constrained. Since finding an optimal policy for RMABs is, in general, PSPACE-hard,
several heuristics, such as the Whittle index and the LP index, have been proposed. In this talk, I present
reinforcement learning schemes for both the Whittle and LP indices: schemes with almost sure convergence
guarantees in the tabular setting, and empirically efficient Deep Q-learning variants. Several examples,
including scheduling in queueing systems, will be presented.
This talk is based on joint work with V.S. Borkar and P. Shah from IIT Bombay.
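
As a rough illustration of how an index heuristic of the kind mentioned above is typically applied once the indices are known, the sketch below activates, in each time slot, the arms whose current-state index is largest, up to the activation budget. It is a minimal sketch and not the learning scheme from the talk: the function name index_policy, the index_table values, and the tie-breaking rule are all hypothetical and chosen only for illustration.

```python
def index_policy(states, index_table, budget):
    """Return a 0/1 action list that activates the `budget` arms with the largest index.

    states      -- current state of each arm (one integer per arm)
    index_table -- index_table[arm][state] is the (assumed known) index value
    budget      -- maximum number of arms that may be active in one time slot
    """
    # Look up the index of each arm in its current state.
    current = [index_table[arm][state] for arm, state in enumerate(states)]
    # Rank arms by decreasing index; sorted() is stable, so ties go to the lower arm number.
    ranked = sorted(range(len(states)), key=lambda arm: -current[arm])
    active = set(ranked[:budget])
    return [1 if arm in active else 0 for arm in range(len(states))]


# Hypothetical example: 4 arms with 3 states each, at most 2 active arms per slot.
index_table = [
    [0.1, 0.5, 0.9],
    [0.2, 0.4, 0.6],
    [0.0, 0.3, 0.8],
    [0.7, 0.2, 0.1],
]
actions = index_policy([2, 0, 1, 0], index_table, budget=2)
```

In this made-up instance the current indices are 0.9, 0.2, 0.3 and 0.7, so arms 0 and 3 are activated. A learning scheme would replace the fixed index_table with estimates that are updated from observed transitions and rewards.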

Keywords

Machine Learning
Optimal Control
Stochastic Models

Status: accepted
