2562. Harnessing the Power of Trained Reinforcement Learning Agents in Job Shop Scheduling Problems
Invited abstract in session TB-3: Machine Learning in Applied Optimization, stream Data Science Meets Optimization.
Tuesday, 10:30-12:00, Room: 1005 (building: 202)
Authors (first author is the speaker)
1. Constantin Waubert de Puiseau, Institute for Technologies and Management of Digital Transformation, University of Wuppertal
2. Hasan Tercan, Institute for Technologies and Management of Digital Transformation, University of Wuppertal
3. Tobias Meisen, Institute for Technologies and Management of Digital Transformation, University of Wuppertal
Abstract
The Job Shop Scheduling Problem (JSSP) has been extensively studied in operations research for decades, resulting in the development of various solution methods. Recently, deep reinforcement learning (DRL) has emerged as a promising approach to automatically learn generalized construction heuristics from simulations. Construction heuristics iteratively generate solution sequences in which operations are integrated into a schedule. The neural networks of DRL agents predict, for each operation, the probability that integrating it next into the sequence will lead to the shortest schedule. Often, multiple solutions are sampled stochastically from the predictions of trained agents. However, due to the symmetry of the JSSP, many of these sequences result in the same makespan or even the same schedule. This motivates the use of more sophisticated search strategies that cover a wider range of solutions and utilize trained agents effectively.
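The construction procedure described above can be sketched on a toy instance. The three-job instance and the uniform "policy" below are illustrative placeholders, not the authors' trained DRL agent: in practice the policy would be a neural network scoring eligible operations, and each sampled sequence is decoded into a schedule whose makespan is recorded.

```python
import random

# Toy JSSP instance: each job is a list of (machine, duration) operations.
JOBS = [
    [(0, 3), (1, 2), (2, 2)],
    [(0, 2), (2, 1), (1, 4)],
    [(1, 4), (2, 3), (0, 1)],
]

def policy(eligible, state):
    # Placeholder for a trained agent: uniform probabilities over the
    # jobs whose next operation may currently be scheduled.
    return {j: 1.0 / len(eligible) for j in eligible}

def rollout(jobs, rng):
    """Construct one schedule by repeatedly sampling the next job from the policy."""
    next_op = [0] * len(jobs)       # index of each job's next unscheduled operation
    job_ready = [0] * len(jobs)     # time at which each job's last operation finishes
    mach_ready = {}                 # time at which each machine becomes free
    while True:
        eligible = [j for j, ops in enumerate(jobs) if next_op[j] < len(ops)]
        if not eligible:
            break
        probs = policy(eligible, (next_op, job_ready, mach_ready))
        j = rng.choices(list(probs), weights=list(probs.values()))[0]
        machine, dur = jobs[j][next_op[j]]
        start = max(job_ready[j], mach_ready.get(machine, 0))
        job_ready[j] = start + dur
        mach_ready[machine] = start + dur
        next_op[j] += 1
    return max(job_ready)           # makespan of the constructed schedule

# Stochastic sampling: draw many sequences, keep the shortest makespan.
rng = random.Random(0)
best = min(rollout(JOBS, rng) for _ in range(64))
print(best)
```

Because every sampled sequence yields a feasible schedule, increasing the number of rollouts can only improve (or retain) the best makespan found, which is why sampling parallelizes so naturally.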
This study compares theoretical and practical aspects of integrating learned priors into depth-wise search strategies, such as stochastic sampling and Monte-Carlo tree search, with the aim of minimizing the makespan within limited computational time. While predictions for sampling are most efficiently parallelized, other methods effectively prune the search tree and therefore require fewer predictions in total. Our results with state-of-the-art DRL agents indicate that variations of stochastic sampling perform best under realistic time and hardware constraints.
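One common way to integrate learned priors into Monte-Carlo tree search is an AlphaZero-style PUCT selection rule, where the agent's predicted probabilities scale the exploration bonus. The sketch below is a generic illustration of that rule with made-up numbers, not the specific method evaluated in the paper:

```python
import math

def puct_score(value_sum, visits, prior, parent_visits, c_puct=1.0):
    """Mean node value plus an exploration bonus scaled by the policy prior."""
    q = value_sum / visits if visits else 0.0
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + visits)
    return q + u

# Illustrative children of one search node: a child strongly favored by the
# agent (high prior) but not yet visited gets a large exploration bonus.
children = [
    {"value_sum": 0.0, "visits": 0, "prior": 0.6},
    {"value_sum": 1.2, "visits": 3, "prior": 0.3},
    {"value_sum": 0.2, "visits": 1, "prior": 0.1},
]
parent_visits = 4
best_child = max(
    children,
    key=lambda c: puct_score(c["value_sum"], c["visits"], c["prior"], parent_visits),
)
print(best_child["prior"])  # the unvisited, high-prior child is selected
```

Each selection step requires a fresh prediction at the expanded node, which is the trade-off the abstract alludes to: tree search prunes aggressively but issues many small, hard-to-batch prediction calls, whereas stochastic sampling batches predictions efficiently.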
Keywords
- Artificial Intelligence
- Scheduling
- Expert Systems and Neural Networks
Status: accepted