23rd Conference of the International Federation of Operational Research Societies
Abstract Submission

691. Stochastic Shortest Path Interdiction

Invited abstract in session HE-22: Combinatorial Optimization & Game Theory , cluster Game Theory and Operations Management.

Thursday, 16:15-17:45
Room: FENH202

Authors (first author is the speaker)

1. Natalia Trigo
Industrias, Universidad de Chile
2. Denis Saure
Industrial Engineering, University of Chile
3. Juan Borrero
Industrial Engineering and Management, Oklahoma State University

Abstract

Motivated by recent work in sequential bilevel problems under uncertainty, in this work we study sequential shortest path interdiction in settings where arc costs form an iid sequence of random vector, drawn from a common distribution known by the evader, but initially unknown by the interdictor.
We model this problem of sequential decision-making under parameter uncertainty using the multi-armed bandit framework. In a first contribution, we extend the techniques used to find a fundamental bound on policy performance for the classic bandit problem and adapt them to obtain an asymptotic performance lower bound in this setting. The regret, a measure of performance degradation due to the initial lack of information commonly used in the bandit literature, is proportional to order log(T), where T denotes the time horizon, and to a constant depending non-trivialy on the combinatorial structure of the underlying full-information interdiction problem. The constant is the solution to a lower bound problem optimally searching for sufficient information to guarantee the optimality of the full information solution, unobtainable in finite time by observing the evader's reaction to different interdiction actions.
We use the insight gained by lower-bound results to develop efficient policies mimicking the combinatorial structure of the asymptotic result, and we obtain asymptotic optimality. We test the performance of the proposed policies in exhaustive numerical experiments, where we contrast their performance with relevant benchmarks arising from more naive approaches to the problem. Our results provide key insight on the difficulty of the setting and should serve to close the gap between practice and theory on sequential interdiction problems – typically ignoring the difficulty associated with learning parameters in real time.

Keywords

Status: accepted


Back to the list of papers