EURO 2024 Copenhagen
Abstract Submission


402. A hybrid resampling method for imbalanced classification problem

Invited abstract in session MD-28: Advancements of OR-analytics in statistics, machine learning and data science 4, stream Advancements of OR-analytics in statistics, machine learning and data science.

Monday, 14:30-16:00
Room: 065 (building: 208)

Authors (first author is the speaker)

1. You-Jin Park
National Taipei University of Technology

Abstract

Classification is widely used to categorize existing instances (i.e., data points) and to predict the classes of new instances in machine learning applications such as fraud detection in the financial sector, fault and defect detection in manufacturing, and medical diagnosis. However, most classification algorithms have been developed under the assumption that the class distribution is balanced, even though unequal class distributions are quite common in practice. The class imbalance problem can therefore degrade the performance of many machine learning tasks. Furthermore, the class overlap problem, which occurs when instances of different classes are located in a common region of the data space, also has a significant impact on performance in imbalanced classification. In this study, we propose a new hybrid resampling algorithm that improves performance on imbalanced classification problems by addressing class imbalance and class overlap simultaneously. In particular, we propose a class impurity measure based on the k-nearest neighbor (k-NN) algorithm for adaptive oversampling and undersampling. To demonstrate the effectiveness of the proposed method, comprehensive experiments are conducted on forty imbalanced datasets using four classifiers and three classification performance measures.
[This work has been supported by the General Research Program funded by NSTC, Taiwan (Grant No. NSTC 110-2221-E-027-106-MY3)]
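
The abstract does not define the class impurity measure, so the following is only an illustrative sketch of the general idea, not the author's method. It assumes that the impurity of an instance is the fraction of its k nearest neighbors that carry a different class label, and that majority instances above a threshold are removed as a simple stand-in for the undersampling step. The function names, the threshold, and the toy data are all hypothetical.

```python
import math

def knn_impurity(X, y, k=3):
    """Assumed definition: share of each point's k nearest neighbours
    (Euclidean distance, the point itself excluded) with a different label."""
    scores = []
    for i, xi in enumerate(X):
        # Sort all other points by distance to point i.
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        differing = sum(1 for j in neighbours if y[j] != y[i])
        scores.append(differing / len(neighbours))
    return scores

def undersample_majority(X, y, scores, majority_label, threshold=0.5):
    """Hypothetical rule: drop majority instances whose impurity exceeds
    the threshold, i.e. those sitting inside the overlap region."""
    keep = [
        i for i in range(len(X))
        if not (y[i] == majority_label and scores[i] > threshold)
    ]
    return [X[i] for i in keep], [y[i] for i in keep]

# Toy data: label 0 is the majority class, label 1 the minority class.
# The majority point (5, 5) lies among the minority points.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
y = [0, 0, 0, 0, 0, 1, 1, 1]

scores = knn_impurity(X, y, k=3)
X_res, y_res = undersample_majority(X, y, scores, majority_label=0)

for point, label, score in zip(X, y, scores):
    print(point, label, round(score, 2))
print("kept", len(X_res), "of", len(X), "instances")
```

In this toy setting the majority point surrounded by minority points receives the highest impurity and is the only instance removed. The same scores could in principle guide where minority instances are oversampled, but how the proposed algorithm combines the two steps is not stated in the abstract.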


Status: accepted

