================================================================================================================================

Integrating Stochastic Optimization and Machine Learning via Residuals

Speaker: Prof Güzin Bayraksan
Professor and Associate Chair for Research in the Integrated Systems Engineering Department
Affiliated faculty member of the Sustainability Institute and the Translational Data Analytics Institute at the Ohio State University
The Ohio State University
USA

Abstract: We consider data-driven approaches that integrate a machine learning prediction model within stochastic optimization, given joint observations of uncertain parameters and covariates. Given a new covariate observation, the goal is to choose a decision that minimizes the expected cost conditioned on this observation. We first present a Sample Average Approximation (SAA) approach for approximating this problem that incorporates residuals from the learning step. Then, in the limited-data regime, we consider Distributionally Robust Optimization (DRO) variants of these models. Our framework is flexible in the sense that it can accommodate a variety of learning setups and DRO ambiguity sets. We investigate the asymptotic and finite sample properties of SAA and the DRO variants obtained using Wasserstein, sample robust optimization, and phi-divergence-based ambiguity sets. We discuss extensions to decision-dependent settings and present applications using real-world data.

A predict-and-optimize approach to profit-driven churn prevention

Speaker 1: Nuria Gómez-Vargas
PhD student at Department of Statistics and Operations Research
University of Seville
Spain

Abstract: In this work, we introduce a novel predict-and-optimize method for profit-driven churn prevention. We frame the task of targeting customers for a retention campaign as a regret minimization problem.
The main objective is to leverage individual customer lifetime values (CLVs) to ensure that only the most valuable customers are targeted. In contrast, many profit-driven strategies focus on churn probabilities while considering average CLVs. This often results in significant information loss due to data aggregation. Our proposed model aligns with the guidelines of Predict-and-Optimize (PnO) frameworks and can be efficiently solved using stochastic gradient descent methods.

Mixed-Integer Quadratic Optimization and Iterative Clustering Techniques for Semi-Supervised Support Vector Machines

Abstract: Among the most famous algorithms for solving classification problems are support vector machines (SVMs), which find a separating hyperplane for a set of labeled data points. In some applications, however, labels are only available for a subset of points. Furthermore, this subset can be non-representative, e.g., due to self-selection in a survey. Semi-supervised SVMs tackle the setting of labeled and unlabeled data and can often improve the reliability of the results. Moreover, additional information about the size of the classes can be available from undisclosed sources. We propose a mixed-integer quadratic optimization (MIQP) model that covers the setting of labeled and unlabeled data points as well as the overall number of points in each class. Since the MIQP's solution time grows rapidly as the number of variables increases, we introduce an iterative clustering approach to reduce the model's size. Moreover, we present an update rule for the required big-M values, prove the correctness of the iterative clustering method, and derive tailored dimension-reduction and warm-starting techniques. Our numerical results show that our approach achieves accuracy and precision similar to those of the MIQP formulation but at much lower computational cost. Thus, we can solve larger problems.
Compared with the original SVM formulation, we observe that our approach has even better accuracy and precision for biased samples.

Reliable Data-driven Decision Making

Speaker 3: Bahar Taskesen
Ph.D. candidate
Risk Analytics and Optimization Lab
École Polytechnique Fédérale de Lausanne (EPFL)
Switzerland

Abstract: We are witnessing a remarkable surge in data availability across various domains, including medicine, education, policy-making, marketing, civics, and many more. This data deluge has created opportunities for developing intelligent systems capable of implementing highly precise and personalized decisions at unprecedented scales. Simultaneously, the application of machine learning in areas such as criminal justice and health care, which carry significant consequences for individuals, has prompted inquiries into the appropriate design of these systems to ensure alignment with our societal values. In this talk, I will use optimal transport (OT), which seeks the most efficient way of morphing one distribution into another, as a tool to model and audit data-driven decision-making systems. First, we will see how OT gives rise to a rich class of data-driven distributionally robust optimization (DRO) models, which study worst-case risk minimization problems under distributional ambiguity. We will then shift our focus to an auditing perspective and see how OT can naturally facilitate a statistical test for the algorithmic fairness of pre-trained machine learning models. A significant yet unexplored aspect of OT is its computational complexity. Addressing this gap, we will examine the computational complexity of generic OT problems. Later, we will see that even though generic OT problems are computationally hard, we can develop reliable data-driven decision-making models that are tractable in static and dynamic environments and come with out-of-sample guarantees.
In particular, we will see the optimality of linear policies in OT-based robust linear-quadratic control problems with imperfect state observations, and we will show that these policies can be computed efficiently using dynamic programming, Kalman filtering, and automatic differentiation.

Online learning and decision-making for renewables participating in electricity markets

Speaker: Prof Pierre Pinson
Chair of Data-Centric Design Engineering at Imperial College London, Dyson School of Design Engineering, United Kingdom
Chief Scientist at Halfspace, Denmark
Technical University of Denmark, Department of Technology, Management and Economics, Denmark

Abstract: There is extensive literature on the analytics involved in the participation of renewable energy producers in electricity markets, covering both forecasting and decision-making. In their simplest form, participation strategies are to be seen as newsvendor problems (taking a decision-making perspective), or quantile regression problems (if taking a forecasting perspective instead). We will therefore explore recent advances at the interface between learning, forecasting and stochastic optimisation of relevance to renewable energy producers participating in electricity markets. This will cover online learning and decision-making, as well as distributionally robust optimisation.

April 22, 2024, 16.30 – 17.30 (CET)
April 29, 2024, 16.30 – 17.30 (CET)
May 6, 2024, 16.30 – 17.30 (CET)