EURO 2025 Leeds
Abstract Submission

2617. Algorithmic Fairness on Imbalanced Datasets

Invited abstract in session TA-49: Fair and Interpretable Machine Learning, stream Analytics.

Tuesday, 8:30-10:00
Room: Parkinson B10

Authors (first author is the speaker)

1. Yujia Chen
University of Edinburgh Business School
2. Raffaella Calabrese
Business School, University of Edinburgh

Abstract

Class imbalance is a well-known challenge in credit scoring. While extensive research has focused on the impact of imbalance on model accuracy and explainability, its effect on algorithmic fairness remains largely unexplored. In this study, we analyse fairness metrics as the default rate varies from 1% to 50% across three open credit scoring datasets and three machine learning models. We focus on four threshold-based metrics and propose computing the mean of each fairness metric over a range of decision thresholds. Additionally, we compare these metrics with distribution-based metrics that evaluate fairness over entire score distributions. Our findings reveal that threshold-based metrics tend to show smaller fairness gaps in extremely imbalanced settings due to the scarcity of positive cases, while more balanced datasets expose larger disparities. The distribution-based metrics behave differently: LEO indicates improved fairness with increasing balance, whereas ABROCA remains relatively stable across default rates. Furthermore, we propose a weighting scheme based on the inverse default rate for the threshold-based metrics, which reverses the observed patterns by emphasising disparities when positive events are rare. These findings provide important insights for selecting and interpreting fairness metrics in credit scoring applications, particularly under varying degrees of class imbalance.
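The two proposals in the abstract — averaging a threshold-based fairness metric over a range of decision thresholds, and re-weighting it by the inverse default rate — can be sketched as follows. This is an illustrative sketch only: the choice of the equal-opportunity (TPR) gap as the threshold-based metric, the binary `group` encoding, and the exact form of the weighting are assumptions, not the authors' specification.

```python
import numpy as np

def tpr_gap(scores, y, group, thr):
    """Equal-opportunity gap at one threshold: |TPR(group 0) - TPR(group 1)|.
    Assumes y=1 marks a default (the positive event) and group is binary."""
    tprs = []
    for g in (0, 1):
        positives = (group == g) & (y == 1)
        tprs.append(np.mean(scores[positives] >= thr))
    return abs(tprs[0] - tprs[1])

def mean_gap(scores, y, group, thresholds):
    """Unweighted mean of the fairness gap over a range of decision thresholds."""
    return np.mean([tpr_gap(scores, y, group, t) for t in thresholds])

def weighted_mean_gap(scores, y, group, thresholds):
    """Hypothetical inverse-default-rate weighting: the same mean gap, scaled
    so that disparities count more when positive (default) events are rare."""
    w = 1.0 / max(np.mean(y), 1e-12)  # inverse of the observed default rate
    return w * mean_gap(scores, y, group, thresholds)

# Synthetic example: 10% default rate, scores mildly informative of default.
rng = np.random.default_rng(0)
n = 1000
y = (rng.random(n) < 0.1).astype(int)
group = rng.integers(0, 2, n)
scores = rng.random(n) + 0.3 * y
thresholds = np.linspace(0.1, 0.9, 9)
print(mean_gap(scores, y, group, thresholds))
print(weighted_mean_gap(scores, y, group, thresholds))
```

Because the weight is the reciprocal of the default rate, the weighted gap grows as defaults become rarer, which mirrors the reversal described in the abstract: rare-event settings that look "fair" under the raw gap are penalised once the weighting is applied.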

Status: accepted
