EURO 2025 Leeds
Abstract Submission

1016. Privacy-Preserving Robust Counterfactuals for Secure Model Interpretability

Invited abstract in session WD-38: Privacy-Aware and Optimization-Driven AI Systems, stream Data Science meets Optimization.

Wednesday, 14:30-16:00
Room: Michael Sadler LG19

Authors (first author is the speaker)

1. Sureyya Ozogur-Akyuz
Department of Mathematics Engineering, Bahcesehir University
2. Volkan Bakır
Graduate School Department of Artificial Intelligence, Bahcesehir University
3. Polat Goktas
School of Computer Science, University College Dublin
4. Fatih Kahraman
Artificial Intelligence, Bahçeşehir University

Abstract

Counterfactual explanations provide model-agnostic interpretability by generating synthetic data points that alter a model's decision toward a desired outcome. We previously introduced DiCE-Extended, which enhances counterfactual robustness by shifting generated instances further from decision boundaries. However, while this improves stability, it does not inherently protect against adversarial attacks or ensure data privacy. To address this, we integrate Differentially Private Counterfactuals (DPC) via the Functional Mechanism, extending DiCE-Extended with privacy-preserving capabilities. In our approach, we first map the dataset's classes into a latent space using an autoencoder, which captures intrinsic data structure while reducing dimensionality. We then introduce differential privacy by adding calibrated noise to the prototype class representations in this latent space. These noisy class prototypes serve as references for generating DiCE-Extended counterfactuals, ensuring that adversarial actors cannot easily exploit the counterfactuals for inference or data extraction. Experimental results confirm that robust-DPC counterfactuals enhance both model security and privacy without compromising interpretability.
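
The abstract describes adding calibrated noise to class prototypes in an autoencoder latent space before counterfactual generation. The following is a minimal illustrative sketch of that noisy-prototype step only, not the authors' implementation: it uses a plain Laplace mechanism instead of the Functional Mechanism named above, and the function name privatize_prototypes, the epsilon and sensitivity parameters, and the toy latent codes are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def privatize_prototypes(latent_codes, labels, epsilon=1.0, sensitivity=1.0):
    """Illustrative sketch: compute per-class prototype vectors in a latent
    space and perturb them with Laplace noise of scale sensitivity / epsilon
    (the standard epsilon-DP Laplace mechanism, used here as a stand-in for
    the Functional Mechanism described in the abstract)."""
    prototypes = {}
    for c in np.unique(labels):
        proto = latent_codes[labels == c].mean(axis=0)        # class prototype in latent space
        noise = rng.laplace(0.0, sensitivity / epsilon, proto.shape)
        prototypes[int(c)] = proto + noise                    # noisy prototype used as a counterfactual reference
    return prototypes

# Toy usage with synthetic 2-D latent codes for two classes (hypothetical data).
Z = rng.normal(size=(100, 2))
y = (Z[:, 0] > 0).astype(int)
noisy_protos = privatize_prototypes(Z, y, epsilon=0.5, sensitivity=1.0)
print(noisy_protos)
```

In the pipeline described above, such noisy prototypes would then serve as the reference points that seed the DiCE-Extended counterfactual search, so the released counterfactuals depend on the private noise rather than on raw class representatives.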

Keywords

Status: accepted
