EURO-Online login
- New to EURO? Create an account
- I forgot my username and/or my password.
- Help with cookies
(important for IE8 users)
3504. Robust Markov decision processes under parametric transition distributions
Invited abstract in session WC-35: Robust Optimization: Theory and Applications, stream Stochastic, Robust and Distributionally Robust Optimization.
Wednesday, 12:30-14:00Room: 44 (building: 303A)
Authors (first author is the speaker)
1. | Christopher Kirkbride
|
The Management School, Lancaster University | |
2. | Ben Black
|
STOR-i Centre for Doctoral Training, Lancaster University | |
3. | Trivikram Dokka
|
Management Science Department, Lancaster University |
Abstract
We consider a robust Markov decision process for which we assume that the true transition distribution can be uniquely specified by a parametric distribution. We enforce this by the construction of a parametric ambiguity set that ensures the worst-case distribution is from the same parametric family of distributions. To conduct robust value iteration, we formulate the Bellman update as a linear program by discretising the ambiguity set. This scales poorly with problem size and requires significant pre-computation. We develop two algorithms for solving the update. Firstly, a cutting surface algorithm that reduces the time to solve the linear program but still requires the pre-computation. Secondly, a projection-based bisection search algorithm that removes the need for discretisation and pre-computation. Using a multi-period Newsvendor problem as an example, we compare these methods with non-parametric phi-divergence based methods from the literature. We show that the projection-based algorithm completes robust value iteration significantly faster than the other parametric algorithms, and is faster than the phi-divergence equivalent.
Keywords
- Robust Optimization
- Programming, Dynamic
Status: accepted
Back to the list of papers