EURO-Online login
- New to EURO? Create an account
- I forgot my username and/or my password.
- Help with cookies
(important for IE8 users)
2302. Hierarchical clustering with Mixed Integer Linear Optimization Models
Invited abstract in session WD-27: Machine Learning for and with Mathematical Optimization, stream Mathematical Optimization for XAI.
Wednesday, 14:30-16:00Room: 047 (building: 208)
Authors (first author is the speaker)
1. | Lorena Nácher
|
University Miguel Hernández of Elche | |
2. | Mercedes Landete
|
Departamento de Estadística y Matemática Aplicada, University Miguel Hernández of Elche | |
3. | Marina Leal Palazón
|
Statistics, Mathematics and Informatics, Universidad Miguel Hernandez | |
4. | Vanesa Guerrero
|
Department of Statistics, Universidad Carlos III de Madrid | |
5. | Martine Labbé
|
computer Science, Université Libre de Bruxelles |
Abstract
Dendrograms are graphical representations of hierarchical clustering that illustrate how individual elements can be grouped into clusters. Depending on the chosen distance measure, different dendrograms can be obtained. The purity measure is used to evaluate the quality of the dendrograms. Purity is defined in terms of the homogeneity in the distribution of observations in the dendrogram with respect to their classes. The first part of this work focuses on single-linkage dendrograms and on purity measure. We introduce the first mathematical optimization model to calculate the purity of single-linkage dendrograms in hierarchical clustering with labeled data. This mixed integer linear model is based on the ordered median problem. The second part addresses complete-linkage dendrograms. We propose a system of linear inequalities whose solutions are complete-linkage dendrograms. We analyze the impact of ties in the cost matrix in the system and prove that when there are no ties the solution of the system is unique and is the complete-linkage dendrogram while when there are ties the system has as many solutions as alternatives complete-linkage dendrograms the matrix has. This study contributes to understanding the construction and evaluation of dendrograms in hierarchical clustering.
Keywords
- Combinatorial Optimization
- Programming, Mixed-Integer
Status: accepted
Back to the list of papers