2301. A comparative study of large-margin strategies for Active Learning
Invited abstract in session TC-6: Advancements of OR-analytics in statistics, machine learning and data science 14, stream Advancements of OR-analytics in statistics, machine learning and data science.
Tuesday, 12:30-14:00, Room: 1013 (building: 202)
Authors (first author is the speaker)
1. Annabella Astorino, DIMES, University of Calabria
2. Antonio Fuduli, Department of Mathematics and Computer Science, University of Calabria
Abstract
Training a classifier is a challenging task, as it typically requires a significant amount of labeled data, while most available samples are unlabeled. For instance, in the field of diagnostic imaging, labeling medical images requires the expertise of a specialist.
In this setting, Active Learning (AL), a machine learning approach that aims at minimizing the number of samples used in the learning phase, can assist the decision maker. An effective AL algorithm selects the most informative data points according to a defined metric and passes them to a human labeler, who progressively adds the labeled data to the training set. The motivation for adopting such a strategy is the significant effort, in terms of cost and man-hours, required to label the data.
The concept of active learning rests on the observation that not all data points are equally important for training a classifier. AL algorithms therefore select only the most valuable data instances; the key question is how this selection is made. This work focuses on sampling strategies based on large-margin methods of the Support Vector Machine (SVM) type. The distance to the separating hyperplane can be considered a reliable metric for measuring the model's confidence on unlabeled data samples. In this study, the role of samples close to the classifier boundary is investigated by comparing supervised and semi-supervised approaches. Preliminary numerical experiments are reported.
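The margin-based selection criterion described above can be sketched as follows. This is a minimal numpy illustration with a hypothetical linear classifier (weights `w`, bias `b`), not the authors' implementation: unlabeled points are ranked by their distance to the hyperplane, and the closest ones are queried for labeling.

```python
import numpy as np

def margin_query(w, b, X_unlabeled, k=2):
    """Select the k unlabeled points closest to the hyperplane w.x + b = 0.

    The distance |w.x + b| / ||w|| serves as a confidence measure:
    points near the boundary are those the classifier is least certain
    about, hence the most informative to send to the human labeler.
    """
    distances = np.abs(X_unlabeled @ w + b) / np.linalg.norm(w)
    return np.argsort(distances)[:k]

# Toy pool with a hyperplane x1 + x2 = 0 (w = [1, 1], b = 0)
w = np.array([1.0, 1.0])
b = 0.0
X_pool = np.array([
    [3.0, 3.0],    # far from the boundary
    [0.1, -0.05],  # very close to the boundary
    [-2.0, -2.0],  # far from the boundary
    [0.5, -0.4],   # close to the boundary
])
print(margin_query(w, b, X_pool, k=2))  # -> [1 3]
```

In a full AL loop, the queried points would be labeled, added to the training set, and the SVM retrained before the next selection round.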
Keywords
- Machine Learning
- Non-smooth Optimization
- Optimization Modeling
Status: accepted