Operations Research 2025
Abstract Submission

2216. Learning State-Dependent Policy Parametrizations for Dynamic Technician Routing with Rework

Invited abstract in session WC-10: Tour Planning Problems, stream Mobility, Transportation, and Traffic.

Wednesday, 13:30-15:00
Room: H16

Authors (first author is the speaker)

1. Marlin Wolf Ulmer
Management Science, Otto-von-Guericke-Universität Magdeburg
2. Jonas Stein
Otto-von-Guericke-Universität Magdeburg
3. Florentin Hildebrandt
Faculty of Economics and Management, Otto-von-Guericke-Universität Magdeburg
4. Barrett Thomas
Management Sciences, University of Iowa

Abstract

Home repair and installation services require technicians to visit customers and resolve tasks of varying complexity. Technicians often have heterogeneous skills. The geographical spread of customers makes it impractical to achieve only "ideal" matches between technician skills and task requirements. Additionally, technicians are regularly absent, e.g., due to sickness. Under non-ideal assignments of technicians to tasks, some tasks may remain unresolved and require a revisit and rework on a later day, leading to delayed service. For this sequential decision problem, we iteratively build tours every day by adding "important" customers. Importance is based on analytical considerations and integrates urgency of service, routing efficiency, and risk of rework. We propose a state-dependent balance of these factors via reinforcement learning. We rely on proximal policy optimization (PPO) tailored to the problem specifics and analyze the implications of specific algorithmic augmentations. A comprehensive study shows that accepting a few non-ideal assignments can be quite beneficial for overall service quality. Furthermore, in states where more technicians are sick and many customers have overdue service deadlines, prioritizing service urgency is crucial. Conversely, in states with fewer sick technicians and fewer customers with overdue deadlines, routing efficiency should take precedence. We further demonstrate the value provided by a state-dependent parametrization via PPO.
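
The tour-building idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the linear scoring formula, the field names (`days_waiting`, `rework_risk`), and the fixed weight tuple are all assumptions made for illustration. In the abstract, the balance between the three factors depends on the state and is learned via PPO; here the weights are simply passed in by hand.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Customer:
    name: str
    x: float
    y: float
    days_waiting: int    # hypothetical proxy for urgency of service
    rework_risk: float   # hypothetical probability in [0, 1] that the task needs rework


def importance(customer, position, weights):
    """Hypothetical linear score: higher means more important to add next."""
    w_urgency, w_routing, w_rework = weights
    # Straight-line distance from the technician's current position.
    detour = hypot(customer.x - position[0], customer.y - position[1])
    return (w_urgency * customer.days_waiting
            - w_routing * detour
            - w_rework * customer.rework_risk)


def build_tour(customers, depot, weights, max_stops):
    """Greedily append the currently most important customer to one tour."""
    tour, position, remaining = [], depot, list(customers)
    while remaining and len(tour) < max_stops:
        best = max(remaining, key=lambda c: importance(c, position, weights))
        tour.append(best.name)
        position = (best.x, best.y)
        remaining.remove(best)
    return tour


customers = [
    Customer("A", 1.0, 0.0, days_waiting=0, rework_risk=0.0),
    Customer("B", 10.0, 0.0, days_waiting=5, rework_risk=0.0),
]
urgency_first = build_tour(customers, (0.0, 0.0), (1.0, 0.0, 0.0), max_stops=2)
routing_first = build_tour(customers, (0.0, 0.0), (0.0, 1.0, 0.0), max_stops=2)
```

With urgency-heavy weights the distant but long-waiting customer B is visited first, while routing-heavy weights favor the nearby customer A. A state-dependent parametrization would choose such weights as a function of the current state, for example the number of sick technicians and overdue customers.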

Status: accepted

