2723. Simulating Data Envelopment Analysis with Machine Learning: A Clustering-Based Data Preprocessing Technique for Training Set Selection
Invited abstract in session MB-48: DEA and Machine Learning, stream Data Envelopment Analysis and its Application.
Monday, 10:30-12:00, Room: 60 (building: 324)
Authors (first author is the speaker)
1. Barbara Kaminska
   Department of Management Systems and Organization Development, Wroclaw University of Science and Technology
2. Dimitrios-Georgios Sotiros
   Department of Operations Research and Business Intelligence, Wroclaw University of Science and Technology
Abstract
Data Envelopment Analysis (DEA) is a non-parametric technique for measuring the relative efficiency of a set of decision making units (DMUs) on the basis of multiple inputs and multiple outputs. A typical DEA analysis requires solving a series of linear programs, one for each DMU. DEA therefore suffers from the curse of dimensionality, i.e., on big data the computational load is very high. This issue is commonly addressed in the literature by adopting Machine Learning (ML) algorithms. Nevertheless, even though the selection of the training dataset is of crucial importance in such algorithms, this factor is neglected in the DEA literature and all methods rely on random sampling. In this paper, we build on the existing literature and introduce a clustering-based data preprocessing technique that selects the training dataset so that it represents the entire dataset as closely as possible. We use simulated data to test this new technique against random sampling under different ML algorithms, numbers of netputs and standard DEA models. We further test it on a network DEA model for two-stage series structures, in which the efficiency scores are represented as a two-dimensional vector. In all cases, the results show that the proposed technique increases the accuracy of the ML algorithms and may even decrease the required computational load.
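The idea of choosing a training set by clustering rather than by random sampling can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or selection rule the authors use, so everything below is an assumption made for illustration: a plain k-means on the netput vectors, followed by picking the DMU closest to each centroid. The function names `kmeans` and `select_training_set` and the simulated data are hypothetical and are not taken from the paper.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means (an assumed choice, not the authors' stated method)."""
    rng = random.Random(seed)
    # Start from k distinct observations drawn at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, new_labels


def select_training_set(points, k, seed=0):
    """Return indices of the DMUs closest to each cluster centroid."""
    centroids, labels = kmeans(points, k, seed=seed)
    chosen = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            chosen.append(
                min(members, key=lambda i: math.dist(points[i], centroids[c]))
            )
    return sorted(chosen)


# Simulated DMUs with three netputs each (hypothetical data).
rng = random.Random(1)
dmus = [tuple(rng.uniform(1.0, 10.0) for _ in range(3)) for _ in range(200)]

train_idx = select_training_set(dmus, k=20)
print(len(train_idx), "DMUs selected for training")
```

Under this sketch, only the selected DMUs would need their efficiency scores computed by solving the DEA linear programs; those scores would then serve as training labels for the ML model, which predicts the scores of the remaining DMUs.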
Keywords
- Data Envelopment Analysis
- Efficiency Analysis
- Machine Learning
Status: accepted