1016. Privacy-Preserving Robust Counterfactuals for Secure Model Interpretability
Invited abstract in session WD-38: Privacy-Aware and Optimization-Driven AI Systems, stream Data Science meets Optimization.
Wednesday, 14:30-16:00, Room: Michael Sadler LG19
Authors (first author is the speaker)
| | Author | Affiliation |
| --- | --- | --- |
| 1. | Sureyya Ozogur-Akyuz | Department of Mathematics Engineering, Bahcesehir University |
| 2. | Volkan Bakır | Graduate School Department of Artificial Intelligence, Bahcesehir University |
| 3. | Polat Goktas | School of Computer Science, University College Dublin |
| 4. | Fatih Kahraman | Artificial Intelligence, Bahçeşehir University |
Abstract
Counterfactual explanations provide model-agnostic interpretability by generating synthetic data points that alter a model’s decision toward a desired outcome. We previously introduced DiCE-Extended, which enhances counterfactual robustness by shifting generated instances further from decision boundaries. While this improves stability, it does not inherently protect against adversarial attacks or ensure data privacy. To address this, we integrate Differentially Private Counterfactuals (DPC) via the Functional Mechanism, extending DiCE-Extended with privacy-preserving capabilities. In our approach, we first map dataset classes into a latent space using an autoencoder, which captures intrinsic data structures while reducing dimensionality. We then introduce differential privacy by adding calibrated noise to prototype class representations in this latent space. These noisy prototypes serve as references for generating DiCE-Extended counterfactuals, ensuring that adversarial actors cannot easily exploit the counterfactuals for membership inference or data extraction. Experimental results confirm that robust-DPC counterfactuals enhance both model security and privacy without compromising interpretability.
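The privatization step described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes latent codes have already been produced by some autoencoder, and it uses the standard Laplace mechanism (rather than the Functional Mechanism named in the abstract) to release per-class latent prototypes with ε-differential privacy. The function name `dp_prototypes` and the L1 clipping bound `clip` are hypothetical illustration choices.

```python
import numpy as np

def dp_prototypes(latents, labels, epsilon, clip=1.0, rng=None):
    """Release per-class latent prototypes under epsilon-DP (Laplace mechanism).

    latents -- (n, d) array of latent codes from a pre-trained autoencoder
    labels  -- (n,) array of class labels
    epsilon -- privacy budget per released prototype
    clip    -- L1 bound on each latent code; fixes the sensitivity of
               the class mean at 2 * clip / n_c for a class of size n_c
    """
    rng = np.random.default_rng(rng)
    # Clip each latent code to L1 norm <= clip, bounding any single
    # record's influence on the class mean.
    norms = np.maximum(np.abs(latents).sum(axis=1, keepdims=True) / clip, 1.0)
    clipped = latents / norms
    noisy = {}
    for c in np.unique(labels):
        group = clipped[labels == c]
        # Replacing one record shifts the mean by at most 2*clip/len(group),
        # so Laplace noise with scale = sensitivity / epsilon gives epsilon-DP.
        scale = 2.0 * clip / (len(group) * epsilon)
        noisy[c] = group.mean(axis=0) + rng.laplace(0.0, scale, size=group.shape[1])
    return noisy
```

With a large ε the noise vanishes and the released prototype approaches the ordinary class mean; shrinking ε trades prototype fidelity (and hence counterfactual quality) for stronger privacy.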
Keywords
- Artificial Intelligence
- Machine Learning
- Optimization Modeling
Status: accepted