Treat abstract

42. A cluster impurity‑based hybrid resampling for imbalanced classification problems

Invited abstract in session MD-34: Advancements of OR-analytics in statistics, machine learning and data science 1, stream Advancements of OR-analytics in statistics, machine learning and data science.

Monday, 14:30-16:00
Room: Michael Sadler LG10

Authors (first author is the speaker)

1.	You-Jin Park
	National Taipei University of Technology

Abstract

Generally, when a class imbalance problem exists, the classifier tends to become biased towards the majority class, and thus minority class instances are often misclassified as the majority class. Class overlap in imbalanced data is also known to be one of the key factors that make the learning task difficult or degrade learning performance. In this research, we therefore develop a cluster impurity-based hybrid resampling technique to improve classification performance on class-imbalanced data by considering both intra-cluster class imbalance and inter-cluster overlap. In particular, various clustering methods are employed to identify clusters of instances, and the cluster impurity of each instance is computed to measure its degree of cluster overlap. Synthetic instances are then created and eliminated recursively based on the cluster impurity. To validate the effectiveness of the developed technique, comprehensive experiments were conducted on forty imbalanced datasets, and non-parametric hypothesis tests were performed to test for statistically significant differences in classification performance between the developed technique and other resampling techniques.
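
The abstract does not give a formula for cluster impurity, so the following is only an illustrative sketch under stated assumptions: it takes cluster assignments that some clustering method has already produced, and it assumes the impurity of an instance can be approximated by the Gini impurity of the class labels in its cluster. The function name `cluster_impurity` and the toy data are hypothetical and are not taken from the presented work.

```python
from collections import Counter, defaultdict


def cluster_impurity(cluster_ids, class_labels):
    """Return, for each instance, the Gini impurity of the cluster it belongs to.

    Assumption: Gini impurity stands in for the (unspecified) cluster impurity
    measure of the abstract. A value of 0.0 means the cluster holds one class
    only; larger values mean more class mixing, i.e. more overlap.
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have the same length")

    # Count how often each class label occurs inside each cluster.
    counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        counts[cluster][label] += 1

    # Gini impurity per cluster: 1 - sum_k p_k^2, with p_k the class share.
    gini = {}
    for cluster, counter in counts.items():
        total = sum(counter.values())
        gini[cluster] = 1.0 - sum((m / total) ** 2 for m in counter.values())

    # Every instance inherits the impurity of its own cluster.
    return [gini[cluster] for cluster in cluster_ids]


# Hypothetical toy data: cluster 0 is pure, cluster 1 mixes both classes.
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
labels = ["maj", "maj", "maj", "maj", "maj", "maj", "min", "min"]
impurity = cluster_impurity(clusters, labels)
```

In a resampling scheme of the kind described, such a score could guide where synthetic minority instances are generated and which instances are removed, although the actual creation and elimination rules are not specified in the abstract.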

Keywords

Analytics and Data Science
Artificial Intelligence
Machine Learning

Status: accepted
