To receive news from us, please register for the mailing list. You will receive information about our upcoming seminars and the link to participate. Our seminars take place on Mondays, 16.30-17.30 CET, and we typically send the link on Monday morning.
EURO – The Association of European Operational Research Societies has a new instrument: the EURO Online Seminar Series (EURO OSS). This is the page of the EURO OSS on Operational Research and Machine Learning; its website is https://euroorml.euro-online.org/, and you can register for the mailing list at https://forms.gle/YWLb6EPKRQnQert68.
The EURO OSS on Operational Research and Machine Learning is an online seminar series whose goal is to highlight the role of Operational Research in Artificial Intelligence. The format is a weekly session that takes place every Monday, 16.30-17.30 (CET). It is 100% online, with leading speakers from Operational Research and neighbouring areas covering important topics such as explainability, fairness, fraud, and privacy. We also have the YOUNG EURO OSS on Operational Research and Machine Learning: in each YOUNG session, three junior academics present their latest results in this burgeoning area. All talks are recorded, and the videos are uploaded to the website.
There are also joint sessions consisting of a “Meet the …” talk plus a lecture. The “Meet the …” talks present either a EURO instrument or another institution relevant to the theme of the EURO OSS. We approached EURO itself, and its President, Prof Anita Schöbel, said a few words about EURO and its instruments at the opening of the OSS in October. We have also organized “Meet the …” talks with the EURO WISDOM Forum, the EURO Working Group on Continuous Optimization, the EURO Working Group Data Science Meets Optimization, the EURO Journal on Computational Optimization, the European Journal of Operational Research, and the INFORMS Journal on Data Science.
The EURO OSS on Operational Research and Machine Learning is organized by Emilio Carrizosa (IMUS – Instituto de Matemáticas de la Universidad de Sevilla) and Dolores Romero Morales (CBS – Copenhagen Business School), with the collaboration of PhD students Nuria Gómez-Vargas (IMUS) and Thomas Halskov (CBS). The Online Seminar Series is free of charge thanks to the support of EURO, Universidad de Sevilla (US) and Copenhagen Business School (CBS), which is gratefully acknowledged.
For the academic year 2024/25, we have confirmed the participation of the following speakers (in alphabetical order):
YOUNG EURO OSS on Operational Research and Machine Learning speakers
The EURO OSS on Operational Research and Machine Learning is the sequel to a weekly online event that we ran from January 2021 to May 2024, https://congreso.us.es/mlneedsmo/. We had more than 100 speakers, mainly from Europe and North America, and more than 2,000 colleagues registered to receive updates. Some of our speakers drew more than 200 live attendees, and quite a few colleagues instead watch the recordings on our YouTube channel. This channel has more than 1,000 subscribers, and some of the talks in our Online Seminar Series have more than 1,000 views so far.
Using AI for Fraud Detection: Recent Research Insights and Emerging Opportunities
Abstract:
Typically, organizations lose around five percent of their revenue to fraud. In this presentation, we explore advanced AI techniques to address this issue. Drawing on our recent research, we begin by examining cost-sensitive fraud detection methods, such as CS-Logit, which integrates the economic imbalances inherent in fraud detection into the optimization of AI models. We then move on to data engineering strategies that enhance the predictive capabilities of both the data and AI models through intelligent instance and feature engineering. We also delve into network data, showcasing our innovative research methods like Gotcha and CATCHM for effective data featurization. A significant focus is placed on Explainable AI (XAI), which demystifies high-performance AI models used in fraud detection, aiding in the development of effective fraud prevention strategies. We provide practical examples from various sectors including credit card fraud, anti-money laundering, insurance fraud, tax evasion, and payment transaction fraud. Furthermore, we discuss the overarching issue of model risk, which encompasses everything from data input to AI model deployment. Throughout the presentation, the speaker will thoroughly discuss his recent research, conducted in partnership with leading global financial institutions such as BNP Paribas Fortis, Allianz, ING, and Ageas.
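To give a flavour of the cost-sensitive idea behind methods like CS-Logit, here is a minimal sketch of an instance-dependent expected-cost loss for a logistic scorer; the cost structure (zero cost for correct decisions), the toy data, and all parameter values are illustrative assumptions, not the exact model from the talk:

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def average_expected_cost(w, X, y, c_fn, c_fp):
    """Instance-dependent expected cost of a logistic scorer: missing
    fraud case i costs c_fn[i] (e.g. the transaction amount), a false
    alarm costs c_fp[i] (e.g. a flat investigation fee); correct
    decisions are assumed to cost nothing."""
    p = sigmoid(X @ w)                         # predicted fraud probability
    return (y * (1.0 - p) * c_fn + (1.0 - y) * p * c_fp).mean()

# Toy data: 200 transactions, 5 features, roughly 5% fraud.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (rng.random(200) < 0.05).astype(float)
c_fn = rng.uniform(50.0, 500.0, size=200)      # loss if fraud is missed
c_fp = np.full(200, 5.0)                       # flat cost of a false alarm

res = minimize(average_expected_cost, x0=np.zeros(5),
               args=(X, y, c_fn, c_fp), method="BFGS")
print("cost-sensitive weights:", res.x)
```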
Verifying message-passing neural networks via topology-based bounds tightening
Speaker: Prof Ruth Misener
BASF / Royal Academy of Engineering Research Chair in Data Driven Optimisation
Department of Computing
Imperial College London
Abstract:
Since graph neural networks (GNNs) are often vulnerable to attack, we need to know when we can trust them. We develop a computationally effective approach towards providing robust certificates for message-passing neural networks (MPNNs) using a Rectified Linear Unit (ReLU) activation function. Because our work builds on mixed-integer optimization, it encodes a wide variety of subproblems; for example, it admits (i) both adding and removing edges, (ii) both global and local budgets, and (iii) both topological perturbations and feature modifications. Our key technology, topology-based bounds tightening, uses graph structure to tighten bounds. We also experiment with aggressive bounds tightening to dynamically change the optimization constraints by tightening variable bounds. To demonstrate the effectiveness of these strategies, we implement an extension to the open-source branch-and-cut solver SCIP. We test on both node and graph classification problems and consider topological attacks that both add and remove edges.
This work is joint with Christopher Hojny, Shiqiang Zhang, and Juan Campos.
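Bound tightening is central here: in a big-M MIP encoding of a ReLU, the big-M constants are exactly the pre-activation bounds, so tighter bounds mean a tighter formulation. Below is a minimal, generic interval-arithmetic sketch for a single dense ReLU layer (plain interval propagation, not the talk's topology-based technique; the weights and input box are toy assumptions):

```python
import numpy as np

def relu_layer_bounds(W, b, lo, hi):
    """Propagate box bounds lo <= x <= hi through x -> ReLU(W x + b).

    Interval arithmetic: positive weights pick up the upper input bound,
    negative weights the lower one. The pre-activation bounds are the
    big-M constants a MIP encoding of this ReLU layer needs, so any
    tightening of [lo, hi] directly tightens the MIP formulation.
    """
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    pre_lo = W_pos @ lo + W_neg @ hi + b
    pre_hi = W_pos @ hi + W_neg @ lo + b
    return np.maximum(pre_lo, 0.0), np.maximum(pre_hi, 0.0)

# Toy usage: a 2-neuron ReLU layer on the unit box in R^3.
W = np.array([[1.0, -2.0, 0.5],
              [0.0,  1.0, -1.0]])
b = np.array([0.1, -0.2])
print(relu_layer_bounds(W, b, lo=np.zeros(3), hi=np.ones(3)))
```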
I YOUNG Session
K-anonymous counterfactual explanations
Speaker 1: Sofie Goethals
Postdoctoral researcher at Columbia Business School, USA
University of Antwerp, Belgium
Abstract:
Black-box machine learning models are used in an increasing number of high-stakes domains, and this creates a growing need for Explainable AI (XAI). However, the use of XAI in machine learning introduces privacy risks, which currently remain largely unnoticed. Therefore, we explore the possibility of an explanation linkage attack, which can occur when deploying instance-based strategies to find counterfactual explanations. To counter such an attack, we propose k-anonymous counterfactual explanations and introduce pureness as a metric to evaluate the validity of these k-anonymous counterfactual explanations. Our results show that making the explanations, rather than the whole dataset, k-anonymous, is beneficial for the explanation quality.
Neur2BiLO: Neural Bilevel Optimization
Speaker 2: Esther Julien
PhD Student at TU Delft, The Netherlands
Abstract:
Bilevel optimization deals with nested problems in which a leader takes the first decision to minimize their objective function while accounting for a follower’s best-response reaction. Constrained bilevel problems with integer variables are particularly notorious for their hardness. While exact solvers have been proposed for mixed-integer linear bilevel optimization, they tend to scale poorly with problem size and are hard to generalize to the non-linear case. On the other hand, problem-specific algorithms (exact and heuristic) are limited in scope. Under a data-driven setting in which similar instances of a bilevel problem are solved routinely, our proposed framework, Neur2BiLO, embeds a neural network approximation of the leader’s or follower’s value function, trained via supervised regression, into an easy-to-solve mixed-integer program. Neur2BiLO serves as a heuristic that produces high-quality solutions extremely fast for linear/non-linear, integer/mixed-integer bilevel problems.
On constrained mixed-integer DR-submodular minimization
Speaker 3: Qimeng (Kim) Yu
Department of Computer Science and Operations Research (DIRO), Université de Montréal
Abstract:
Submodular set functions play an important role in integer programming and combinatorial optimization. Increasingly, applications call for generalized submodular functions defined over general integer or continuous domains. Diminishing returns (DR)–submodular functions are one such generalization, encompassing a broad class of functions that are generally nonconvex and nonconcave. We study the problem of minimizing any DR-submodular function with continuous and general integer variables under box constraints and, possibly, additional monotonicity constraints. We propose valid linear inequalities for the epigraph of any DR-submodular function under these constraints. We further provide the complete convex hull of such an epigraph, which, surprisingly, turns out to be polyhedral. We propose a polynomial-time exact separation algorithm for our valid inequalities, which allows us to establish, for the first time, the polynomial-time solvability of this class of mixed-integer nonlinear optimization problems.
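For reference, the diminishing-returns condition over an integer box reads as follows (a standard definition; e_i denotes the i-th unit vector, and the notation is ours):

```latex
\[
  f(x + e_i) - f(x) \;\ge\; f(y + e_i) - f(y)
  \qquad \text{for all } x \le y \text{ (componentwise) and all } i,
\]
% whenever both x + e_i and y + e_i remain in the box.
```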
Need to relax - but perhaps later? Reflections on modeling sparsity and mixed-binary nonconvex problems
Abstract:
In some ML communities, researchers claim that obtaining locally optimal solutions is often sufficient to provide a meaningful and accurate data model in real-world analytics. However, this is simply incorrect and sometimes dangerously misleading, particularly when it comes to highly structured problems involving non-convexity, such as discrete decisions (binary variables). This talk will advocate the necessity of research efforts in the quest for global solutions and strong rigorous bounds for quality guarantees, showcased on one of today's most popular domains: cardinality-constrained models. These models aim to achieve fairness, transparency and explainability in AI applications, ranging from mathematical finance and economics to the social and life sciences.
From a computational viewpoint, it may be tempting to replace the zero-norm (the number of nonzero variables) with surrogates for the benefit of tractability. We argue that these relaxations come too early. Instead, we propose to incorporate the true zero-norm into the base model and treat it either by MILP relaxations or by lifting to tractable conic optimization models. Both in practice and in theory, these have proved to achieve much stronger bounds than the usual LP-based ones, and therefore they can assess the quality of proposals coming from other techniques more reliably, based upon exact arguments. With some effort invested in the theory (aka later relaxations), the resulting models are still scalable and guarantee computational performance closer to reality and/or optimality.
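As a concrete illustration, the usual MILP treatment of the true zero-norm under a cardinality budget k can be sketched as follows (notation ours; M is any valid bound on |x_i| over the feasible set):

```latex
\[
  \|x\|_0 \le k
  \quad \text{is modeled by} \quad
  -M z_i \le x_i \le M z_i,\qquad
  z_i \in \{0,1\},\qquad
  \sum_{i=1}^{n} z_i \le k .
\]
```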
Large-scale Minimum Sum of Squares Clustering with optimality guarantees
Abstract:
Minimum Sum of Squares Clustering (MSSC) is one of the most important problems in unsupervised learning; it aims to partition n observations into k clusters so as to minimize the sum of squared distances from the points to the centroid of their cluster. The MSSC arises in a wide range of disciplines and applications, and since it is an unsupervised task it is vital to certify the quality of the produced solution. Recently, exact approaches able to solve small- to medium-size instances have been proposed, but they become impractical as the instance size grows. In this work, we exploit the ability to solve small instances to certify the quality of heuristic solutions for large-scale datasets. The fundamental observation is that the minimum MSSC of the union of disjoint subsets is greater than or equal to the sum of the minimum MSSC of the individual subsets. This implies that summing up the optimal values of the MSSC on disjoint subsets of a dataset provides a lower bound on the optimal value for the entire dataset. The quality of the lower bound depends strongly on how the dataset is partitioned into disjoint subsets. To improve this lower bound, we aim to maximize the minimum MSSC of each subset by creating subsets of points with high dissimilarity. This approach, known in the literature as anticlustering or the maximally diverse grouping problem, seeks to form highly heterogeneous partitions of a given dataset. By approximately solving an anticlustering problem, we develop a certification process to validate clustering solutions obtained using a heuristic algorithm. We test our method on large-scale instances with datasets ranging from 2,000 to 10,000 data points and containing 2 to 500 features. Our procedure consistently achieves a gap between the clustering solution and the lower bound ranging from 1% to 5%.
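The certification bound stated above can be written compactly as follows (notation ours; MSSC(·) denotes the optimal k-cluster sum-of-squares value):

```latex
\[
  \mathrm{MSSC}(X) \;=\; \mathrm{MSSC}\Bigl(\bigcup_{j=1}^{m} S_j\Bigr)
  \;\ge\; \sum_{j=1}^{m} \mathrm{MSSC}(S_j)
  \qquad \text{for disjoint } S_1,\dots,S_m .
\]
```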
Learning to Branch and to Cut: Enhancing Non-Linear Optimization with Machine Learning
Speaker: Prof Bissan Ghaddar
Associate Professor, Management Science & Sustainability
John M. Thompson Chair in Engineering Leadership & Innovation
Ivey Business School, Canada
Abstract:
The use of machine learning techniques to improve the performance of branch-and-bound optimization algorithms is a very active area in the context of mixed integer linear problems, but little has been done for non-linear optimization. To bridge this gap, we develop a learning framework for spatial branching and show its efficacy in the context of the Reformulation-Linearization Technique for polynomial optimization problems. The proposed learning is performed offline, based on instance-specific features and with no computational overhead when solving new instances. Novel graph-based features are introduced, which turn out to play an important role in the learning. Experiments on different benchmark instances from MINLPLib and QPLIB show that the learning-based branching selection and learning-based constraint generation significantly outperform the standard approaches.
Pragmatic OR: Solving large-scale optimization problems in fast-moving environments
Speaker: Prof Rubén Ruiz
Principal Applied Scientist at Amazon Web Services (AWS)
Professor of Statistics and Operations Research, Universitat Politècnica de València (on leave)
Abstract:
This talk examines the gap between academic Operations Research and real-world industrial applications, particularly in environments like Amazon and AWS where sheer scale and delivery speed are important factors to consider. While academic research often prioritizes complex algorithms and optimal solutions, large-scale industrial problems demand more pragmatic approaches. These real-world scenarios frequently involve multiple objectives, soft constraints, and rapidly evolving business requirements that require flexibility and quick adaptation.
We argue for the use of heuristic solvers and simplified modeling techniques that prioritize speed, adaptability, and ease of implementation over strict optimality or complex approaches. This angle is particularly valuable when dealing with estimated input data, where pursuing optimality may be less meaningful.
The presentation will showcase various examples, including classical routing and scheduling problems, as well as more complex scenarios like virtual machine placement in Amazon EC2. These cases illustrate how pragmatic methods can effectively address real-world challenges, offering robust and maintainable solutions that balance performance with operational efficiency. The goal is to demonstrate that in many industrial applications, a small optimality gap is an acceptable trade-off for significantly improved flexibility and reduced operational overhead.
From Data to Donations: Optimal Fundraising Campaigns for Non-Profit Organizations
Speaker: Prof Wolfram Wiesemann
Professor of Analytics & Operations
Imperial College Business School
Imperial College
UK
Abstract:
Non-profit organizations play an essential role in addressing global challenges, yet their financial sustainability often depends on raising funds through resource-intensive campaigns. We partner with a major international non-profit to develop and test data-driven approaches to enhance the efficiency of their fundraising efforts. The organization conducts multiple annually recurring campaigns, with thematic links between them. These connections enable us to predict a donor’s propensity to contribute to one campaign based on their behavior in others. This structure is common among non-profits but not readily utilized by conventional multi-armed bandit algorithms. To leverage these inter-campaign patterns, we design two algorithms that integrate concepts from the multi-armed bandit and clustering literature. We analyze the theoretical properties of both algorithms, and we empirically validate their effectiveness on both synthetic and real data from our partner organization.
The differentiable Feasibility Pump
Speaker: Prof Andrea Lodi
Andrew H. and Ann R. Tisch Professor
Jacobs Technion-Cornell Institute
Cornell University
USA
Abstract:
Although nearly 20 years have passed since its conception, the feasibility pump algorithm remains a widely used heuristic to find feasible primal solutions to mixed-integer linear problems. Many extensions of the initial algorithm have been proposed, yet its core remains centered around two key steps: solving the linear relaxation of the original problem to obtain a solution that respects the constraints, and rounding it to obtain an integer solution. This paper shows that the traditional feasibility pump and many of its follow-ups can be seen as gradient-descent algorithms with specific parameters. A central aspect of this reinterpretation is observing that the traditional algorithm differentiates the solution of the linear relaxation with respect to its cost. This reinterpretation opens many opportunities for improving the performance of the original algorithm. We study how to modify the gradient-update step as well as how to extend its loss function. We perform extensive experiments on MIPLIB instances and show that these modifications can substantially reduce the number of iterations needed to find a solution. (Joint work with M. Cacciola, A. Forel, A. Frangioni.)
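To make the two core steps concrete, here is a minimal sketch of the traditional feasibility pump for a 0-1 MIP (not the differentiable variant of the talk), written with scipy; the L1-distance LP, the random cycle-breaking flip, and the toy instance are illustrative simplifications of the original scheme:

```python
import numpy as np
from scipy.optimize import linprog

def feasibility_pump(A_ub, b_ub, int_idx, n, max_iter=100, seed=0):
    """Alternate between an LP-feasible point and its rounding: solve an
    LP minimizing the L1 distance (on the 0-1 variables only) to the
    current rounding, re-round, and flip one coordinate to break cycles.
    """
    rng = np.random.default_rng(seed)
    k = len(int_idx)
    bnds = [(0.0, 1.0) if j in set(int_idx) else (None, None) for j in range(n)]
    # Any LP-feasible starting point (zero objective).
    x = linprog(np.zeros(n), A_ub=A_ub, b_ub=b_ub, bounds=bnds).x
    target = np.round(x[int_idx])
    for _ in range(max_iter):
        # Variables [x, d]; minimize sum(d) with d_r >= |x_j - target_j|.
        c = np.concatenate([np.zeros(n), np.ones(k)])
        rows_pos = np.zeros((k, n + k))   #  x_j - d_r <=  target_j
        rows_neg = np.zeros((k, n + k))   # -x_j - d_r <= -target_j
        for r, j in enumerate(int_idx):
            rows_pos[r, j], rows_pos[r, n + r] = 1.0, -1.0
            rows_neg[r, j], rows_neg[r, n + r] = -1.0, -1.0
        A = np.vstack([np.hstack([A_ub, np.zeros((A_ub.shape[0], k))]),
                       rows_pos, rows_neg])
        rhs = np.concatenate([b_ub, target, -target])
        x = linprog(c, A_ub=A, b_ub=rhs,
                    bounds=bnds + [(0.0, None)] * k).x[:n]
        if np.allclose(x[int_idx], np.round(x[int_idx]), atol=1e-6):
            return x                                  # integer-feasible point
        new_target = np.round(x[int_idx])
        if np.array_equal(new_target, target):        # cycling: flip one entry
            j = rng.integers(k)
            new_target[j] = 1.0 - new_target[j]
        target = new_target
    return None

# Toy usage: find a 0-1 point with x0 + x1 >= 1 and x0 + x1 <= 1.5.
A = np.array([[-1.0, -1.0], [1.0, 1.0]])
b = np.array([-1.0, 1.5])
print(feasibility_pump(A, b, int_idx=[0, 1], n=2))
```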
Introducing the Cloud of Spheres Classification: An Alternative to Black-Box Models
Speaker: Prof Paula Alexandra Amaral
Associate Professor, Department of Mathematics & Nova Math
Nova School of Science & Technology
Universidade NOVA de Lisboa
Portugal
Abstract:
Machine learning models have been widely applied across various domains, often without critically examining the underlying mechanisms. Black-box models, such as Deep Neural Networks, pose significant challenges in terms of counterfactual analysis, interpretability, and explainability. In situations where understanding the rationale behind a model’s predictions is essential, exploring more transparent machine learning techniques becomes highly advantageous.
In this presentation, we introduce a novel binary classification model called a Cloud of Spheres. The model is formulated as a Mixed-Integer Nonlinear Programming (MINLP) problem that seeks to minimize the number of spheres required to classify data points. This approach is particularly well-suited for scenarios where the structure of the target class is highly non-linear and non-convex, but it can also adapt to cases with linear separability.
Unlike neural networks, this classification model retains data in its original feature space, eliminating the need for kernel functions or extensive hyperparameter tuning. Only one parameter may be required if the objective is to maximize the separation margin, similar to Support Vector Machines (SVMs). By avoiding black-box complexity, the model enhances transparency and interpretability.
A significant challenge with this method lies in finding the global optimum for large datasets. To address this, we propose a heuristic approach that has demonstrated good performance on a benchmark set. When compared to state-of-the-art algorithms, the heuristic delivers competitive results, showcasing the model’s effectiveness and practical applicability.
II YOUNG Session
A unified approach to extract interpretable rules from tree ensembles via Integer Programming
Speaker 1: Dr Lorenzo Bonasera
Research Scientist at the German Aerospace Center (DLR) – Institute for AI Safety and Security, Germany
Abstract:
Tree ensemble methods represent a popular machine learning model, known for their effectiveness in supervised classification and regression tasks. Their performance derives from aggregating predictions of multiple decision trees, which are renowned for their interpretability properties. However, tree ensemble methods do not reliably exhibit interpretable output. Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model that retains most of the predictive power of the full model. Our approach consists of solving a partitioning problem formulated through Integer Programming. The proposed method works with either tabular or time series data, for both classification and regression tasks, and it can include any kind of nonlinear loss function. Through rigorous computational experiments, we offer statistically significant evidence that our method is competitive with other rule extraction methods, and it excels in terms of fidelity towards the opaque model.
A decomposition algorithm for sparse and fair soft regression trees
Speaker 2: Dr Antonio Consolo
Postdoctoral researcher at Politecnico di Milano, Department of Electronics, Information and Bioengineering, Italy
Abstract:
In supervised Machine Learning (ML) models, achieving sparsity in input features is crucial not only for feature selection but also for enhancing model interpretability and potentially improving testing accuracy. Another crucial aspect is taking fairness measures into account, which allows one to guarantee comparable performance for individuals belonging to sensitive groups. This talk focuses on sparsity and fairness in decision trees for regression tasks, a widely used interpretable ML model.
We recently proposed a new model variant for Soft Regression Trees (SRTs), which are decision trees with probabilistic decision rules at each branch node and a linear regression at each leaf node. Unlike previous SRT approaches, in our variant each leaf node provides a prediction with its associated probability, but, for any input vector, the output of the SRT is given by the regression associated with a single leaf node, namely the one reached by following the branches with the highest probability. Our SRT variant exhibits the “conditional computation” property (only a small portion of the tree architecture is activated when generating a prediction for a given data point), provides posterior probabilities, and is well-suited for decomposition. The convergent decomposition training algorithm turned out to yield accurate soft trees in short CPU time.
In this work, we extend the SRT model variant and the decomposition training algorithm to take into account model sparsity and fairness measures. Experiments on datasets from the UCI and KEEL repositories indicate that our approach effectively induces sparsity. Computational results on different datasets show that the decomposition algorithm, adapted to take fairness constraints into account, yields accurate SRTs that guarantee comparable performance between sensitive groups without significantly affecting the computational load.
Evaluating Large Language Models for Real Estate Valuation: Insights into Performance and Interpretability
Speaker 3: Margot Geerts
PhD student at Research Centre for Information Systems Engineering (LIRIS), Faculty of Economics and Business, KU Leuven, Belgium
Abstract:
The real estate market is vital globally, yet stakeholders often struggle with information asymmetry in property valuation. This study explores how Large Language Models (LLMs) can improve transparency by accurately predicting house prices and providing insights into predictions. Evaluating GPT-4o-mini, Llama 3, and others across diverse datasets, we find that LLMs leverage hedonic variables effectively, achieving accuracy comparable to Gradient Boosted Trees and surpassing k-Nearest Neighbors. Tailored prompting strategies enhance their performance, with market-specific adjustments remaining crucial. While LLM explanations align with state-of-the-art models, their prediction intervals are often overconfident, revealing limitations in reliability and spatial reasoning. This research underscores LLMs’ potential to reduce knowledge gaps, offering accessible, data-driven insights for more transparent valuations.
Bridging smooth regression and mathematical optimization
Speaker: Prof Vanesa Guerrero
Associate Professor
Department of Statistics
Universidad Carlos III de Madrid
Spain
Abstract:
In today’s data-driven decision-making landscape, it is crucial to develop systems that integrate human knowledge and provide valuable support to the decision maker. This talk explores the intersection of statistical modeling and mathematical optimization to address the estimation of smooth functions that satisfy structural properties (e.g., on sign, monotonicity or curvature). By leveraging these shape-constrained functions, we construct surrogate models for complex functions, enabling their use in mixed-integer nonlinear programming (MINLP). This approach exploits separability to enhance tractability, offering a practical framework for tackling challenging optimization problems in real-world applications.
Counterfactual Token Generation in Large Language Models
Abstract:
“Sure, I am happy to generate a story for you: Captain Lyra stood at the helm of her trusty ship, the Maelstrom’s Fury, gazing out at the endless sea. […] Lyra’s eyes welled up with tears as she realized the bitter truth – she had sacrificed everything for fleeting riches, and lost the love of her crew, her family, and herself.” Although this story, generated by a large language model, is captivating, one may wonder: how would the story have unfolded if the model had chosen “Captain Maeve” as the protagonist instead? We cannot know. State-of-the-art large language models are stateless: they maintain no internal memory or state. Given a prompt, they generate a sequence of tokens as an output using an autoregressive process. As a consequence, they cannot reason about counterfactual alternatives to tokens they have generated in the past. In this work, our goal is to enhance them with this functionality. To this end, we develop a causal model of token generation that builds upon the Gumbel-Max structural causal model. Our model allows any large language model to perform counterfactual token generation at almost no cost in comparison with vanilla token generation, it is embarrassingly simple to implement, and it does not require any fine-tuning or prompt engineering. We implement our model on Llama 3 8B-Instruct and Ministral-8B-Instruct and conduct a qualitative and a quantitative analysis of counterfactually generated text. We conclude with a demonstrative application of counterfactual token generation for bias detection, unveiling interesting insights about the model of the world constructed by large language models.
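The noise-sharing idea at the heart of the Gumbel-Max construction can be illustrated in a few lines; the vocabulary size, logits, and perturbation below are toy assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def gumbel_max_sample(logits, noise):
    """Gumbel-Max sampling: argmax(logits + Gumbel noise) is an exact
    sample from softmax(logits). Fixing (recording) the noise fixes the
    randomness, so re-evaluating with counterfactually changed logits
    answers: which token would have been sampled under the same noise?"""
    return int(np.argmax(logits + noise))

vocab = 5
noise = rng.gumbel(size=vocab)                 # recorded at generation time

factual_logits = np.array([2.0, 0.5, 0.1, -1.0, 0.3])
counterfactual_logits = factual_logits.copy()
counterfactual_logits[0] -= 3.0                # e.g. an upstream token changed

print(gumbel_max_sample(factual_logits, noise))         # factual token id
print(gumbel_max_sample(counterfactual_logits, noise))  # counterfactual token id
```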
III YOUNG Session
Accelerating Benders decomposition for the p-median problem through variable aggregation
Speaker 1: Rick Willemsen
PhD candidate at Erasmus University Rotterdam, The Netherlands
Abstract:
The p-median problem is a classical location problem where the goal is to select p facilities while minimizing the sum of distances from each location to its nearest facility. Recent advancements in solving the p-median and related problems have successfully leveraged Benders decomposition methods. The current bottleneck is the large number of variables and Benders cuts that are needed. We consider variable aggregation to reduce the size of these models. We propose to partially aggregate the variables in the model based on a starting solution; aggregation occurs only when the corresponding locations are assigned to the same facility in the initial solution. In addition, we propose a set of valid inequalities tailored to these aggregated variables. Our computational experiments indicate that our model, post-initialization, provides a stronger lower bound, thereby accelerating the resolution of the root node. Furthermore, this approach seems to positively impact the branching procedure, leading to an overall faster Benders decomposition method.
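For reference, a classical p-median formulation that such Benders approaches start from (notation ours; d_ij is the distance from location i to candidate facility j, y_j opens facility j, and x_ij assigns i to j):

```latex
\[
  \min \sum_{i}\sum_{j} d_{ij}\, x_{ij}
  \quad \text{s.t.} \quad
  \sum_{j} x_{ij} = 1 \;\;\forall i, \qquad
  x_{ij} \le y_j \;\;\forall i,j, \qquad
  \sum_{j} y_j = p, \qquad
  x_{ij},\, y_j \in \{0,1\}.
\]
```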
Fair and Accurate Regression: Strong Formulations and Algorithms
Speaker 2: Anna Deza
IEOR PhD candidate at UC Berkeley, California, USA
Abstract:
We introduce mixed-integer optimization methods to solve regression problems with fairness metrics. To tackle this computationally hard problem, we study the polynomially-solvable single-factor and single-observation subproblems as building blocks and derive their closed convex hull descriptions. Strong formulations obtained for the general fair regression problem in this manner are utilized to solve the problem with a branch-and-bound algorithm exactly or as a relaxation to produce fair and accurate models rapidly. To handle large-scale instances, we develop a coordinate descent algorithm motivated by the convex-hull representation of the single-factor fair regression problem to improve a given solution efficiently. Numerical experiments show competitive statistical performance with state-of-the-art methods while significantly reducing training times.
A Model-Agnostic Framework for Collective and Sparse “Black-Box” Explanations
Speaker 3: Thomas Halskov
PhD Fellow, Copenhagen Business School, Denmark
Abstract:
The widespread adoption of “black-box” models in critical applications highlights the need for transparent and consistent explanations. While local explainer methods like LIME offer insights for individual predictions, they do not consider global constraints, such as continuity and sparsity. We develop a global explanation framework that enforces these constraints on top of local explanations. We illustrate on real-world datasets that we can generate stable explanations across multiple observations, while maintaining desirable global properties.
Data-Driven Algorithm Design and Verification for Parametric Convex Optimization
Speaker: Prof Bartolomeo Stellato
Assistant Professor
Department of Operations Research and Financial Engineering
Princeton University
USA
Abstract:
We present computational tools to analyze and design first-order methods in parametric convex optimization. First-order methods are popular for their low per-iteration cost and warm-starting capabilities. However, precisely quantifying the number of iterations needed to compute high-quality solutions remains a key challenge, especially in fast applications where real-time execution is crucial. In the first part of the talk, we introduce a numerical framework for verifying the worst-case performance of first-order methods in parametric quadratic optimization. We formulate the verification problem as a mixed-integer linear program where the objective is to maximize the infinity norm of the fixed-point residual after a given number of iterations. Our approach captures a wide range of gradient, projection, and proximal iterations through affine or piecewise affine constraints. We derive tight polyhedral convex hull formulations of the constraints representing the algorithm iterations. To improve scalability, we develop a custom bound tightening technique combining interval propagation, operator theory, and optimization-based bound tightening. Numerical examples show that our method closely matches the true worst-case performance, providing reductions of several orders of magnitude in the worst-case fixed-point residuals compared to standard convergence analyses. In the second part of the talk, we present a data-driven approach to analyze the performance of first-order methods using statistical learning theory. We build generalization guarantees for classical optimizers, using sample convergence bounds, and for learned optimizers, using the Probably Approximately Correct (PAC)-Bayes framework. We then use this framework to learn accelerated first-order methods by minimizing the PAC-Bayes bound directly over the key parameters of the algorithms (e.g., gradient steps and warm-starts). Numerical experiments show that our approach provides strong generalization guarantees for both classical and learned optimizers, with statistical bounds that are very close to the true out-of-sample performance.
Pareto sensitivity, most-changing sub-fronts, and knee solutions
Speaker: Prof Luis Nunes Vicente
Timothy J. Wilmott Endowed Chair Professor and Department Chair
Department of Industrial and Systems Engineering
Lehigh University
USA
Abstract:
When dealing with a multi-objective optimization problem, obtaining a comprehensive representation of the Pareto front can be computationally expensive. Furthermore, identifying the most representative Pareto solutions can be difficult and sometimes ambiguous. A popular choice is the so-called Pareto knee solutions, where a small improvement in any objective leads to a large deterioration in at least one other objective. In this talk, using Pareto sensitivity, we show how to compute Pareto knee solutions according to their verbal definition of least maximal change. We refer to the resulting approach as the sensitivity knee (snee) approach, and we apply it to unconstrained and constrained problems. Pareto sensitivity can also be used to compute the most-changing Pareto sub-fronts around a Pareto solution, where the points are distributed along directions of maximum change, which could be of interest in a decision-making process if one is willing to explore solutions around a current one. Our approach is still restricted to scalarization methods, in particular the weighted-sum and epsilon-constraint methods, and requires the computation or approximation of first- and second-order derivatives. We include numerical results from synthetic problems and multi-task learning instances that illustrate the benefits of our approach.
Concepts from cooperative game theory for feature selection
Speaker: Prof Marleen Balvert
Associate Professor of Operations Research & Machine Learning
Zero Hunger Lab, Tilburg University
The Netherlands
Abstract:
In classification problems, one is often interested in finding the features that determine the class of a sample. One approach is to first train a classification model – such as a random forest or a support vector machine – and then compute feature importance scores using e.g. SHAP or LIME. SHAP is a popular approach to compute feature importance scores, based on the concept of the Shapley value from cooperative game theory. Following the idea of using the Shapley value for feature importance scores, we look into the applicability of other metrics from cooperative game theory as feature importance scores.
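For reference, the Shapley value of feature i in a cooperative game with player set N and value function v, the quantity that SHAP approximates:

```latex
\[
  \phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}
  \Bigl( v\bigl(S \cup \{i\}\bigr) - v(S) \Bigr).
\]
```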
Offline Reinforcement Learning for Combinatorial Optimization: A Data-Driven End-to-End Approach to Job Shop Scheduling
Speaker: Prof Yingqian Zhang
Associate Professor of AI for Decision Making
Department of Industrial Engineering and Innovation Sciences
Eindhoven University of Technology
The Netherlands
Abstract:
The application of (deep) reinforcement learning (RL) to solve combinatorial optimization problems (COPs) has become a very active research field in AI and OR over the past five years. Most existing approaches rely on online RL methods, where an agent learns by interacting with a surrogate environment, typically implemented as a simulator or simply a function. While online RL methods have demonstrated their effectiveness for many COPs, they suffer from two major limitations: sample inefficiency, requiring numerous interactions to learn effective policies, and inability to leverage existing data, including (near-optimal) solutions generated by well-established algorithms. In contrast, Offline (or data-driven) RL learns directly from pre-existing data, offering a promising alternative. In this talk, I will present the first fully end-to-end offline RL method for solving (flexible) job shop scheduling problems (JSSPs). Our approach trains on data generated by applying various algorithms to JSSP instances. We show that, with only 100 training instances, our offline RL approach achieves comparable or better performance to its online RL counterparts and outperforms the algorithms used to generate the training data.
Linear and nonlinear learning of optimization problems
Speaker: Prof Coralia Cartis
Professor of Numerical Optimization
Mathematical Institute
University of Oxford
United Kingdom
Abstract:
We discuss various ways of improving scalability and performance of optimisation algorithms when special structure is present: from linear embeddings to nonlinear ones, from deterministic to stochastic approaches. Both local and global optimization methods will be addressed.
Feature selection in linear Support Vector Machines via a hard cardinality constraint: a scalable conic decomposition approach
Speaker: Prof Laura Palagi
Department of Computer, Control and Management Engineering A. Ruberti
Sapienza University of Rome
Italy
Abstract:
In this talk, we present the embedded feature selection problem in linear Support Vector Machines (SVMs), in which an explicit cardinality constraint is employed. The aim is to obtain an interpretable classification model.
The problem is NP-hard due to the presence of the cardinality constraint, even though the original linear SVM is solvable in polynomial time. To handle the hard problem, we introduce two mixed-integer formulations for which novel semidefinite relaxations are proposed. Exploiting the sparsity pattern of the relaxations, we decompose the problems and obtain equivalent relaxations in a much smaller cone, making the conic approaches scalable. We propose heuristics that use information from the optimal solutions of the relaxations. Moreover, an exact procedure is proposed, based on solving a sequence of mixed-integer decomposed semidefinite optimization problems. Numerical results on classical benchmarking datasets are reported, showing the efficiency and effectiveness of our approach.
Joint paper with Immanuel Bomze, Federico D’Onofrio, Bo Peng.
Deep generative models in stochastic optimization
Abstract:
Many important decisions are taken under uncertainty, since we do not know how various parameters will develop. In particular, the ongoing green transition requires large and urgent societal investments in new energy modes, infrastructure and technology. These decisions span a very long time horizon, and there is large uncertainty about energy prices, energy demand, and production from renewable sources.
Stochastic optimization methods can help us make better investment decisions. However, the results are limited by the quality of the scenarios we use as input.
Recently, deep generative models have been developed that can generate a plenitude of realistic scenarios, but current solvers frequently cannot handle more than a handful of scenarios before the model becomes too complex to solve.
In this talk we will show how to integrate deep generative models into the solution process in stochastic optimization, so that we obtain both faster solution times and better-founded decisions.
Computational results will be presented for some two-stage optimization problems, showing the benefits of the integrated approach. The studied problems include two-stage facility location problems and two-stage transportation problems.
The research is part of the ERC advanced grant project “DECIDE”.
IV YOUNG Session
Speaker 1: Marica Magagnini
PhD student, Università di Camerino, School of Science and Technology, Italy.
Abstract:
Machine Learning algorithms usually make decisions without providing explanations of how those decisions were made. As a result, users struggle to understand and trust such models. Furthermore, the inherent variability and noise present in real-world scenarios frequently manifest as feature uncertainty in data, a common challenge in machine learning.
In sensitive contexts, such as those involving uncertain data, generating robust explanations is crucial. These explanations must account for the variability of uncertainty while still offering flexibility.
Counterfactuals are user-friendly explanations that provide valuable information on what should be changed to modify the outcome of a black-box decision-making model, without revealing the underlying algorithmic details. We propose an optimization problem to obtain counterfactual explanations when a k-Nearest Neighbors classifier is employed for binary classification. The geometric framework of the problem supports the quality of the explanations.
Mixture of Gaussians as a Near-Optimal Mechanism in Approximate Differential Privacy
Speaker 2: Aras Selvi
PhD student, Analytics & Operations, Imperial College Business School, UK
Abstract:
We design a class of additive noise mechanisms that satisfy (ε, δ)-differential privacy (approximate DP) for scalar real-valued query functions whose sensitivities are known. Our mechanisms, named mixture mechanisms, are obtained by taking the mixture of an odd number of (quasi-)Gaussian distributions. They achieve significantly smaller expected noise amplitudes (l1-loss) and variances (l2-loss for zero-mean distributions) than the best known mechanisms. We develop exact (for 3 peaks) and approximate (for more peaks) logarithmic-time algorithms to tune the mixture weights and variances of the Gaussian distributions used in our mechanism, minimizing an expected loss while satisfying a pre-specified level of privacy guarantee. We illustrate via numerical experiments that our approach attains near-optimal expected losses, almost equal to theoretical lower bounds, in various privacy regimes.
Co-authors: Huikang Liu, Aras Selvi, and Wolfram Wiesemann.
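For reference, the (ε, δ)-differential privacy guarantee that such additive noise mechanisms M must satisfy, for all neighbouring datasets D, D′ and all measurable output sets S:

```latex
\[
  \Pr\bigl[ M(D) \in S \bigr] \;\le\;
  e^{\varepsilon}\, \Pr\bigl[ M(D') \in S \bigr] + \delta .
\]
```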
Speaker 3: Federico D’Onofrio
PhD Student, IASI-CNR, Istituto di Analisi dei Sistemi ed Informatica “A. Ruberti”, Italy
Abstract:
In this talk, we focus on embedded feature selection methods for nonlinear Support Vector Machines (SVMs) in classification tasks. Our framework is based on the dual formulation of SVMs with nonlinear kernels, which enables the modeling of complex relationships among variables. While many existing approaches rely on linear models and continuous relaxations, our work addresses the problem in its full combinatorial nature.
We aim for hard feature selection by imposing a cardinality constraint on the selected features, reflecting the realistic scenario in which the user knows in advance the number of relevant features to extract. Modeling this requirement leads to a challenging non-convex mixed-integer nonlinear programming problem. To solve it, we develop decomposition algorithms that exploit the combinatorial properties of the resulting subproblems.
We compare the proposed approach with analogous models based on linear SVMs, and conclude with a discussion of possible extensions and generalizations of the methodology.
Gurobi Machine Learning
Abstract:
Gurobi-machinelearning is an open-source Python package to formulate trained regression models in a gurobipy model that can then be solved with the Gurobi solver. We will present this package with a focus on trained neural networks (NNs). Representing such networks as constraints in a MIP leads to large models. Optimization-based bound tightening (OBBT) can be used to improve performance when solving such models. We will discuss how Gurobi has been extended to perform OBBT tailored to NN models and present performance results.
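A minimal usage sketch, assuming a scikit-learn MLPRegressor as the trained network (one of the predictor types the package supports); the toy data, variable bounds, and objective are illustrative, and a Gurobi license is needed to solve:

```python
import numpy as np
import gurobipy as gp
from gurobipy import GRB
from sklearn.neural_network import MLPRegressor
from gurobi_ml import add_predictor_constr

# Train a small ReLU network on toy data.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = X[:, 0] ** 2 - X[:, 1] * X[:, 2]
nn = MLPRegressor(hidden_layer_sizes=(10, 10), activation="relu",
                  max_iter=2000).fit(X, y)

# Embed the trained network in a gurobipy model and minimize its output.
m = gp.Model()
x = m.addMVar(3, lb=-1.0, ub=1.0, name="x")        # network input
out = m.addMVar(1, lb=-GRB.INFINITY, name="out")   # network output
add_predictor_constr(m, nn, x, out)                # NN as MIP constraints
m.setObjective(out.sum(), GRB.MINIMIZE)
m.optimize()
print(x.X, out.X)
```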
Learning Fair and Robust Support Vector Machine Models
Speaker: Prof Francesca Maggioni
Professor of Operations Research
Department of Management, Information and Production Engineering
University of Bergamo
Italy
Abstract:
In this talk, we present new optimization models for Support Vector Machine (SVM) and Twin Parametric Margin Support Vector Machine (TPMSVM), with the aim of separating data points in two or more classes. To address the problem of data perturbations and protect the model against uncertainty, we construct bounded-by-norm uncertainty sets around each training data point and apply robust and distributionally robust optimization techniques. We derive the robust counterpart extensions of the deterministic approaches, providing computationally tractable reformulations. Closed-form expressions for the bounds of the uncertainty sets in the feature space have been derived for commonly used kernel functions.
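A minimal sketch of a standard robust counterpart for a soft-margin SVM constraint under a norm-bounded perturbation of each training point (notation illustrative; ρ is the radius of the uncertainty set and ‖·‖_* the dual norm; the talk's reformulations also cover the distributionally robust and kernelized settings):

```latex
\[
  y_i\bigl(w^\top (x_i + \delta_i) + b\bigr) \ge 1 - \xi_i
  \;\;\forall\, \|\delta_i\| \le \rho
  \quad \Longleftrightarrow \quad
  y_i\bigl(w^\top x_i + b\bigr) - \rho\,\|w\|_* \ge 1 - \xi_i .
\]
```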
To improve the fairness of traditional SVM approaches, we further propose a new optimization model which incorporates fairness constraints. These constraints aim to reduce bias in the model’s decision-making process, particularly with respect to sensitive attributes. The fairness-aware models are then evaluated against traditional SVM formulations and other fairness-enhanced approaches from the literature.
Extensive numerical results on real-world datasets show the benefits of the robust and fair approaches over conventional SVM alternatives.
Sum of squares submodularity
Abstract:
Submodularity is a key property of set functions (equivalently, pseudo-Boolean functions), appearing in many different areas such as game theory, machine learning, and optimization. It is known that any set function on n elements can be represented as a polynomial of degree at most n, and it is also known that testing whether a set function of degree greater than or equal to 4 is submodular is NP-hard. In this talk, we therefore introduce a sufficient condition for submodularity based on sum of squares polynomials, which we refer to as sum of squares (sos) submodularity. We show that this condition can be efficiently checked via semidefinite programming and investigate the gap between submodularity and sos submodularity. We also propose applications of this new concept in three areas: (i) learning a submodular function from data, (ii) difference of submodular functions optimization, and (iii) approximately submodular functions.
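For reference, the submodularity (diminishing returns) condition that sos submodularity certifies, for a set function f on ground set N:

```latex
\[
  f(S \cup \{i\}) - f(S) \;\ge\; f(T \cup \{i\}) - f(T)
  \qquad \text{for all } S \subseteq T \subseteq N,\; i \in N \setminus T .
\]
```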
Meet the President of EURO, Prof Anita Schöbel
Meet the EURO WISDOM Forum with Prof Paula Carroll and Prof Dilek Gunnec
Meet the EURO Working Group on Continuous Optimization with Prof Giancarlo Bigi and Prof Sonia Cafieri
Meet the Editor of EURO Journal on Computational Optimization, Prof Immanuel Bomze
Celebrating the 50th birthday of EURO with the President of EURO, Prof Anita Schöbel
Meet the Editor-in-Chief of European Journal of Operational Research, Prof Roman Słowiński
Meet the Coordinator of the EURO Working Group Data Science Meets Optimization (DSO), Prof Patrick De Causmaecker
Meet the Editor of INFORMS Journal on Data Science, Prof Yu Ding
Roundtable on the Past, the Present and the Future of OR and Machine Learning with Laureano Escudero, Vanesa Guerrero, Laura Palagi and Roland Wunderling
Meet the Editor-in-Chief of TOP-Transactions in Operations Research, Prof Antonio Alonso Ayuso