Manifold-Aligned Counterfactual Explanations for Neural Networks

Georgia Perakis; Wei Sun; Asterios Tsiourvas

Publication

AISTATS 2024

Conference paper

Abstract

We study the problem of finding optimal manifold-aligned counterfactual explanations for neural networks. Existing approaches that involve solving a complex mixed-integer optimization (MIP) problem frequently suffer from scalability issues, limiting their practical usefulness. Furthermore, the solutions are not guaranteed to follow the data manifold, resulting in unrealistic counterfactual explanations. To address these challenges, we first present an MIP formulation where we explicitly enforce manifold alignment by re-formulating the highly nonlinear Local Outlier Factor (LOF) metric as mixed-integer constraints. To address the computational challenge, we leverage the geometry of a trained neural network and propose an efficient decomposition scheme that reduces the initial large, hard-to-solve optimization problem into a series of significantly smaller, easier-to-solve problems by constraining the search space to “live” polytopes, i.e., regions that contain at least one actual data point. Experiments on real-world datasets demonstrate the efficacy of our approach in producing both optimal and realistic counterfactual explanations, as well as computational tractability.
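The notion of a "live" polytope can be illustrated with a minimal sketch. It assumes a ReLU network, where each pattern of active and inactive hidden units corresponds to one polytope of the input space, and it treats a polytope as live when at least one data point produces that pattern. The weights `W`, biases `b`, the sample `data`, and the helper names `activation_pattern` and `live_polytopes` are hypothetical and chosen only for illustration; they are not taken from the paper, and this is not the authors' decomposition scheme itself.

```python
from collections import defaultdict

# Hypothetical one-hidden-layer ReLU network: 2 inputs, 3 hidden units.
# These weights and biases are illustrative only, not from the paper.
W = [[1.0, -1.0], [0.5, 1.0], [-1.0, 0.25]]
b = [0.0, -1.0, 0.5]


def activation_pattern(x):
    """Return one 0/1 flag per hidden unit (1 if its pre-activation is > 0)."""
    return tuple(
        1 if sum(w_i * x_i for w_i, x_i in zip(row, x)) + bias > 0 else 0
        for row, bias in zip(W, b)
    )


def live_polytopes(points):
    """Group data points by activation pattern; each key is one live polytope."""
    regions = defaultdict(list)
    for x in points:
        regions[activation_pattern(x)].append(x)
    return dict(regions)


# Hypothetical data points (assumption: stand-ins for a real training set).
data = [(2.0, 0.5), (2.5, 0.5), (-1.0, 2.0), (0.0, 3.0), (-2.0, -1.0)]
regions = live_polytopes(data)

# Only patterns that some data point realizes are listed, so a search
# restricted to these regions skips every polytope that holds no data.
for pattern, members in sorted(regions.items()):
    print(pattern, members)
```

In this toy setting the three hidden units allow at most 2^3 = 8 activation patterns, but the five points occupy only three of them. Restricting a counterfactual search to those occupied regions is the intuition behind solving several small problems instead of one large one.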

Date

02 May 2024

Authors

IBM-affiliated at time of publication
