PertCF: A Perturbation-Based Counterfactual Generation Approach
Original version
10.1007/978-3-031-47994-6_13
Abstract
Post-hoc explanation systems offer valuable insights to increase understanding of the predictions made by black-box models. Counterfactual explanations, an instance-based post-hoc explanation method, aim to demonstrate how a model’s prediction can be changed with minimal effort by presenting a hypothetical example. In addition to counterfactual explanation methods, feature attribution techniques such as SHAP (SHapley Additive exPlanations) have also been shown to be effective in providing insights into black-box models. In this paper, we propose PertCF, a perturbation-based counterfactual generation method that benefits from feature attributions. Our approach combines the strengths of perturbation-based counterfactual generation and feature attribution to generate high-quality, stable, and interpretable counterfactuals. We evaluate PertCF on two open datasets and show that it achieves promising results compared with state-of-the-art methods on evaluation metrics such as stability, proximity, and dissimilarity.