2271. Reinforcement Learning for Teaching LLMs to Derive Linear Programs
Invited abstract in session TA-12: AI for Optimization Modeling, stream Artificial Intelligence, Machine Learning and Optimization.
Thursday, 8:45-10:15, Room: H10
Authors (first author is the speaker)
1. Florian Roland Breda, University of Siegen
2. Ulf Lorenz, Chair of Technology Management, Universitaet Siegen
Abstract
Large Language Models (LLMs) have the potential to assist humans in solving complex problems objectively and efficiently, even when users lack formal knowledge of mathematical optimization. However, LLMs do not yet possess an inherent understanding of linear programs (LPs). Enabling LLMs to autonomously derive LPs from textual problem descriptions requires a structured learning approach. A reinforcement learning environment provides rewards based on the quality of generated LP formulations, guiding the learning process. The quality is assessed based on factors such as solvability, solution time, and the clarity of the formulation’s explanation. Small, specialized learning environments could be particularly useful for tackling specific problem domains.
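The sketch below illustrates, in minimal form, how such a reward signal could be computed from an LLM's output. It is an illustrative assumption, not the authors' implementation: it assumes the model emits its formulation as JSON in standard form (keys "c", "A_ub", "b_ub"), scores only solvability and solution time (the abstract's clarity criterion is omitted), and uses placeholder weights and a placeholder time budget.

```python
"""Minimal sketch of a reward signal for LLM-generated LP formulations.

Assumptions (not from the abstract): the LLM emits a JSON object with keys
"c", "A_ub", "b_ub" describing a standard-form minimisation; the reward
combines solvability and solution time; weights and time budget are
illustrative placeholders.
"""
import json
import time

from scipy.optimize import linprog


def lp_reward(llm_output: str, time_budget: float = 1.0) -> float:
    # Parse the model's answer; malformed output earns no reward.
    try:
        lp = json.loads(llm_output)
        c, A_ub, b_ub = lp["c"], lp["A_ub"], lp["b_ub"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return 0.0

    start = time.perf_counter()
    try:
        result = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")
    except ValueError:  # e.g. dimension mismatch between c, A_ub, b_ub
        return 0.0
    elapsed = time.perf_counter() - start

    if not result.success:  # infeasible or unbounded formulation
        return 0.1          # small credit for a parseable but unsolvable LP
    # Solvable: base reward plus a bonus for solving within the time budget.
    speed_bonus = max(0.0, 1.0 - elapsed / time_budget)
    return 0.5 + 0.5 * speed_bonus


# Example: a tiny feasible LP (minimise x + y subject to x + y >= 1, x, y >= 0)
example = json.dumps({"c": [1, 1], "A_ub": [[-1, -1]], "b_ub": [-1]})
print(lp_reward(example))
```

In an actual environment, the clarity of the accompanying explanation mentioned in the abstract would require a separate assessment (for example, a learned or rubric-based judge) that this numerical sketch does not cover.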
Keywords
- Artificial Intelligence
- Linear Programming
Status: accepted