EURO 2025 Leeds
Abstract Submission

3109. Resolving the curse of poor data quality: Optimization perspective on data collection and validation

Invited abstract in session MC-12: Advanced analytics in manufacturing, stream Scheduling and Project Management.

Monday, 12:30-14:00
Room: Clarendon SR 1.02

Authors (first author is the speaker)

1. Alena Otto
Technical University of Munich
2. Benedikt Finnah
University of Duisburg-Essen
3. Jochen Gönsch
Mercator School of Management, University of Duisburg-Essen

Abstract

Insufficient data quality prevents decision support systems (DSS) from using data in many areas of business. One example is data on precedence relations between tasks, which is relevant, for instance, in project scheduling and assembly line balancing. Inaccurate data on which precedence relations are unnecessary cannot be used, because the recommendations of the DSS may otherwise become infeasible. As a result, unnecessary relations must also be satisfied, which shrinks the solution space of the baseline problem and worsens the business result. Experts can validate the data, but their time is limited. We apply an optimization lens and formulate the data validation problem (DVP). Restricted by the available time budget, an expert dynamically receives queries about specific data entries and corrects or validates them. The DVP searches for an interview policy that poses queries to the expert, each using up some of the time budget, to maximize the (weighted) number of removed precedence relations. We model the DVP as a dynamic program, derive optimal policies for several important special cases, and design a heuristic interview policy, LSTD. In a case study of an automobile manufacturer, this policy substantially reduces the stations' idle time after selectively addressing about 8% of the data entries. We show theoretically and numerically that data validation by experts can lead to significant savings.
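To make the idea of an interview policy under a time budget concrete, the following is a minimal Python sketch of a toy variant. It is not the authors' model and not the LSTD heuristic. It assumes that each data entry has a known query cost, a known probability that the expert declares the relation unnecessary, and a weight, and that the entries are independent of each other. The entry names, numbers, and the function `value` are hypothetical and chosen only for illustration.

```python
from functools import lru_cache

# Hypothetical toy data, one item per precedence relation to be queried:
# (time cost of the query,
#  assumed probability that the expert confirms the relation is unnecessary,
#  weight gained if the relation is removed).
ENTRIES = {
    "a": (2, 0.9, 1.0),
    "b": (3, 0.5, 4.0),
    "c": (1, 0.2, 1.0),
}


@lru_cache(maxsize=None)
def value(budget, remaining):
    """Return (best expected removed weight, best next query or None).

    State: remaining time budget and the frozenset of entries not yet queried.
    """
    best = (0.0, None)  # stopping the interview is always allowed
    for name in sorted(remaining):
        cost, p_remove, weight = ENTRIES[name]
        if cost > budget:
            continue  # this query no longer fits into the budget
        future, _ = value(budget - cost, remaining - {name})
        # Expectation over the expert's answer. In this toy the future value
        # does not depend on the answer; a full model would not assume that.
        expected = p_remove * (weight + future) + (1 - p_remove) * future
        if expected > best[0]:
            best = (expected, name)
    return best


best_value, first_query = value(4, frozenset(ENTRIES))
print(best_value, first_query)
```

Because the toy entries are independent, the recursion reduces to a knapsack over expected gains, and the order of queries does not matter. In a setting where an answer changes the value of later queries, for example through transitive precedence relations, the state would also have to record the answers received so far.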

Keywords

Status: accepted

