DSO – EURO Working Group on Data Science meets Optimization – Meeting to strengthen both data science and optimization

The EURO working group Data Science meets Optimization is a joint initiative of the CODeS group at KU Leuven and the ASAP group at the University of Nottingham.

This group aims to promote the interaction of data science (DS) and optimization (O), and better exploitation of the areas in which they overlap. We interpret ‘optimization techniques’ broadly, covering the full range from exact methods (branch and bound, mathematical programming, etc.) to heuristics, metaheuristics and others. Of particular interest are the two natural directions: ‘Usage of DS for O’ and ‘Usage of O for DS’.

BACKGROUND

Optimization and data science, as disciplines within and supporting OR, the science of better, have developed techniques capable of grasping the essence of a great many OR problems. Most of their models incorporate a concept of ‘feasible’, ‘better’ or ‘best’. Feasible schemes in a model are acceptable, better schemes are preferable, and best schemes are expected to bring the highest profit. With the models come search techniques that operate on them.

Memory and processing power were scarce resources in early computer systems. Hence, from the beginnings of OR in the mid-20th century, the drive was for simpler models and compact problem formulations; algorithm development, as well as deeper understanding of problem structure, profited from transparent and concise model descriptions. The approach was highly successful: the complexity and size of the problems the community can handle have increased dramatically, by factors exceeding the increase in computing power.

MOTIVATIONS AND AIMS

With the growth of computer systems, however, possibilities arise for processing complex representations and making use of large amounts of data. The rise of ‘Data Science’ or ‘analytics’ would not have been possible without these increased possibilities. Indeed, in a sense, the main objectives of ‘analytics’ are in line with the goal set out for OR as a whole: to understand the complexities of real-world activities and their context, and to improve the benefits that can be obtained from them. For example, analytics is used to understand the behavior of crowds, the dynamics of the climate, the intricacies of the financial world, hidden aspects of social media and so on. Better understanding can lead to better ‘operations’, although this term is less frequently used in these contexts. These may be regarded as domains targeted by operational research, but they are being handled by techniques coming from a different perspective.

The operations of an OR professional, and of OR software, are themselves often not too different from any other professional activity. Working in OR implies confronting complex situations to which acceptable models and search techniques must be applied. The extra complexity of real-world situations will, in part, be modelled using established OR techniques; algorithmic approaches will also be applied to realize improvement. The techniques themselves, however, have become more and more complex. Increased computing power, which allows more complex models and larger datasets to be handled than ever before, is now being used to help manage this complexity.

As well as permitting increasingly complex representations and inputs, the rise of ‘data science’ also offers great promise for more powerful techniques in the implementation and control of standard search methods. In essence, many search techniques have acquired properties that make them resemble optimization problems themselves. For example, powerful algorithms often include large numbers of parameters, turning the algorithm into a high-dimensional spatial object. Algorithm development techniques typically offer a number of options from which the developer has to select the best. The operations of an OR professional have become an OR problem, and one with at least as many ‘Data Science’ attributes as classical OR ones.

This has, of course, not gone unnoticed by scientists. The last decade has brought us a number of techniques for parameter tuning, automated algorithm construction, adaptive methods such as (hyper-)heuristics to generate or select heuristics, and so on. The word ‘intelligent’ in ‘intelligent optimization’ often refers to this observation; in any case, intelligent optimization strives to provide techniques that can automate the work of algorithm and model developers. ‘Programming by optimization’ is yet another term indicating that the work of an OR programmer is in fact an optimization problem in its own right.

Hence, applications of data science and optimization have great potential to get the most out of existing and novel search techniques. An important consequence of this ‘optimization of the optimizer’ is that it enables fairer, more scientifically valid comparisons between different techniques: if search techniques are not themselves tuned and exploited in a fair manner, then the scientific value of comparisons is much reduced.

Such applications of data science and optimization require inherently different skills from those needed to develop search methods for specific domains. The application of data science (machine learning, statistics, etc.) supports the automated generation of optimization models and better search control. Optimization, on the other hand, helps to produce faster and stronger analytics. The group brings together scientists from areas outside of traditional OR.

The aim of this working group is to bring together scientists involved in this line of thinking within the EURO context.