1466. Predictive performance and interpretability of machine learning algorithms compared to conventional mixed logit models in analyzing stated choice data
Invited abstract in session MB-4: Interpretable Optimization Methods and Applications, stream Data Science meets Optimization.
Monday, 10:30-12:00, Room: Rupert Beckett LT
Authors (first author is the speaker)
| 1. | Christoph Herrmann | School of Economics and Business, Martin Luther University Halle-Wittenberg |
| 2. | Katharina Friederike Sträter | Economics, Martin-Luther-University Halle-Wittenberg |
Abstract
Discrete choice experiments (DCEs) have become a popular tool for collecting stated choice data. These structurally diverse data are traditionally analyzed with theory-based choice models to elicit the preferences behind a choice. In recent years, however, there have been several calls to also apply data-driven methods such as machine learning algorithms. Against this background, this paper investigates the added value of applying selected machine learning algorithms (random forest (RF), gradient boosting machine (GBM), neural network (NN), and naïve Bayes (NB)) to two structurally different examples of DCE-derived data, considering different types of data preparation. It focuses on (a) the predictive performance of the machine learning algorithms compared both with each other and with a mixed multinomial logit (MMNL) model commonly used to analyze DCE-derived data, and (b) whether information about behavioral indicators can be extracted from the machine learning algorithms' outputs. The results indicate that the machine learning algorithms perform slightly better in terms of prediction when the data structure is rather simple. However, they offer only limited possibilities for interpretation, although variable importance (GBM and RF), SHAP values (GBM and RF), and NB-specific conditional probabilities deliver important information that can help refine classical choice models. In addition, pitfalls that can arise from DCE-specific data structures are discussed.
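For readers unfamiliar with the MMNL benchmark named above, the following is a minimal sketch of how mixed logit choice probabilities are commonly approximated by simulation: taste coefficients are drawn from a distribution, a standard logit probability is computed for each draw, and the results are averaged. It is not the authors' model specification. The function name `simulated_mmnl_probs`, the independent normal coefficients, and the attribute values are all illustrative assumptions.

```python
import math
import random


def simulated_mmnl_probs(alternatives, beta_mean, beta_sd, n_draws=2000, seed=0):
    """Approximate mixed logit choice probabilities for one choice set.

    alternatives: list of attribute vectors, one per alternative.
    beta_mean, beta_sd: means and standard deviations of the taste
        coefficients, assumed here to be independent normals.
    """
    rng = random.Random(seed)
    n_alt = len(alternatives)
    totals = [0.0] * n_alt
    for _ in range(n_draws):
        # One draw of individual-specific taste coefficients.
        beta = [rng.gauss(m, s) for m, s in zip(beta_mean, beta_sd)]
        utils = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
        # Subtract the maximum utility for numerical stability.
        top = max(utils)
        expu = [math.exp(u - top) for u in utils]
        denom = sum(expu)
        for j in range(n_alt):
            totals[j] += expu[j] / denom
    # Average the per-draw logit probabilities over all draws.
    return [t / n_draws for t in totals]


# Hypothetical choice set: two alternatives described by (price, quality)
# plus an opt-out alternative with all attributes set to zero.
choice_set = [[1.0, 2.0], [0.5, 3.0], [0.0, 0.0]]
probs = simulated_mmnl_probs(choice_set, beta_mean=[-0.8, 0.4], beta_sd=[0.3, 0.2])
```

Setting every entry of `beta_sd` to zero removes the taste heterogeneity, so the function then returns plain multinomial logit probabilities.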
Keywords
- Analytics and Data Science
- Machine Learning
- Decision Analysis
Status: accepted