In this paper, the authors address two predominant challenges associated with IG (Integrated Gradients): (1) the generation of noisy feature visualizations, and (2) the vulnerability to adversarial attributional attacks.

The basic idea is to compute feature attributions along geodesic paths on the manifold. The attribution of the $j$-th feature is defined as follows:

$$ \operatorname{MIG}_j\left(x, \gamma^*\right):=\int_0^1 \frac{\partial F\left(g\left(\gamma^*(t)\right)\right)}{\partial g_j\left(\gamma^*(t)\right)} \frac{\partial g_j\left(\gamma^*(t)\right)}{\partial t} d t, $$

where $\gamma^*$ is the geodesic path on the manifold (in the latent space) and $g$ maps latent points back to the data space.
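In practice the integral is approximated by a Riemann sum over $m$ segments of the path ($m$ is my notation for the step count, not the paper's):

$$ \operatorname{MIG}_j\left(x, \gamma^*\right) \approx \sum_{k=1}^{m} \frac{\partial F\left(g\left(\gamma^*(t_{k-1})\right)\right)}{\partial g_j\left(\gamma^*(t_{k-1})\right)}\left[g_j\left(\gamma^*(t_k)\right)-g_j\left(\gamma^*(t_{k-1})\right)\right], \quad t_k = k / m. $$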

In my view, the main contribution of this method is the algorithm for finding the geodesic path.
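I won't reproduce the paper's path-finding algorithm here, but as a rough sketch of the general idea: one common way to approximate a geodesic on a decoder-induced manifold is to parameterize the path by intermediate latent points and minimize the discrete curve energy of the decoded path. Everything below (`decoder`, `z_start`, `z_end`, the optimizer settings) is my hypothetical reconstruction, not the authors' implementation:

```python
import torch

def approximate_geodesic(decoder, z_start, z_end, n_points=20, n_iters=500, lr=1e-2):
    """Approximate a geodesic between two latent codes by minimizing the
    discrete curve energy of the decoded path. Hypothetical sketch only;
    the paper's actual algorithm may differ."""
    for p in decoder.parameters():          # the decoder itself stays fixed
        p.requires_grad_(False)

    # Initialize with the straight line in latent space.
    ts = torch.linspace(0, 1, n_points).unsqueeze(1)
    path = z_start + ts * (z_end - z_start)
    interior = path[1:-1].clone().requires_grad_(True)  # endpoints stay fixed
    opt = torch.optim.Adam([interior], lr=lr)

    for _ in range(n_iters):
        opt.zero_grad()
        full = torch.cat([z_start.unsqueeze(0), interior, z_end.unsqueeze(0)])
        decoded = decoder(full).flatten(start_dim=1)    # path mapped to data space
        # Discrete curve energy: sum of squared segment lengths in data space.
        energy = ((decoded[1:] - decoded[:-1]) ** 2).sum()
        energy.backward()
        opt.step()

    return torch.cat([z_start.unsqueeze(0), interior.detach(), z_end.unsqueeze(0)])
```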

Code

In their code, the main process is as follows:

Step 1: They first train a VAE to map images into a latent space that satisfies the properties of a manifold.
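A minimal sketch of this step (the loss weighting, hyperparameters, and the assumption that `vae` returns `(reconstruction, mu, logvar)` are mine; the repository's actual VAE will differ):

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar, beta=1.0):
    """Standard VAE objective: reconstruction term plus KL divergence
    to the unit-Gaussian prior."""
    recon = F.mse_loss(recon_x, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl

def train_vae(vae, loader, epochs=50, lr=1e-3, device="cuda"):
    opt = torch.optim.Adam(vae.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in loader:
            x = x.to(device)
            recon, mu, logvar = vae(x)
            loss = vae_loss(recon, x, mu, logvar)
            opt.zero_grad()
            loss.backward()
            opt.step()
```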

Step 2: The classifier, which is the target of the explanation method, consists of a backbone, such as VGG-16, and a prediction layer. The backbone's parameters are frozen for the first 10 epochs; then the whole model is fine-tuned for another 7 epochs. All of the above training is performed on the reconstructed data.
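In code, this two-phase schedule looks roughly as follows (a sketch under my assumptions; `backbone`, `head`, the learning rates, and the reconstructed-data loader are placeholders, not the repository's names):

```python
import torch

def train_classifier(backbone, head, recon_loader, device="cuda"):
    """Phase 1: train only the prediction head on reconstructed data with the
    backbone frozen. Phase 2: unfreeze and fine-tune the whole model."""
    criterion = torch.nn.CrossEntropyLoss()

    # Phase 1: freeze the backbone for the first 10 epochs.
    for p in backbone.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(head.parameters(), lr=1e-3)
    run_epochs(backbone, head, recon_loader, criterion, opt, epochs=10, device=device)

    # Phase 2: unfreeze and fine-tune everything for another 7 epochs.
    for p in backbone.parameters():
        p.requires_grad_(True)
    opt = torch.optim.Adam(list(backbone.parameters()) + list(head.parameters()), lr=1e-4)
    run_epochs(backbone, head, recon_loader, criterion, opt, epochs=7, device=device)

def run_epochs(backbone, head, loader, criterion, opt, epochs, device):
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            loss = criterion(head(backbone(x)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
```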

Step 3: Explaining an image:

  • Map the image to a latent embedding.
  • Generate an integral path in the latent space.
  • Map the latent embeddings along the integral path back to the data space.
  • Using Riemann summation, compute the integrated gradients along the integral path with the classifier trained on the reconstructed data (see the sketch after this list).
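A minimal sketch of this last step, assuming a PyTorch `classifier` and `decoder` and a precomputed `latent_path` (e.g., the output of a path-finding routine like the geodesic sketch above); all names are mine, not the repository's:

```python
import torch

def manifold_ig(classifier, decoder, latent_path, target_class):
    """Left Riemann-sum approximation of MIG along a latent path.
    latent_path: tensor of shape (m + 1, latent_dim), baseline -> input.
    Sketch only; the repository's implementation may differ."""
    decoded, grads = [], []
    for z in latent_path:
        x = decoder(z.unsqueeze(0)).detach().requires_grad_(True)
        score = classifier(x)[0, target_class]   # class score F(g(gamma(t)))
        score.backward()
        decoded.append(x.detach().squeeze(0))
        grads.append(x.grad.squeeze(0))

    decoded = torch.stack(decoded)               # (m + 1, *data_shape)
    grads = torch.stack(grads)
    deltas = decoded[1:] - decoded[:-1]          # steps g_j(t_k) - g_j(t_{k-1})
    return (grads[:-1] * deltas).sum(dim=0)      # attribution per input feature
```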

However, I did not find code that evaluates the explainer on the INFD and $\mathrm{SENS}_{\text{max}}$ metrics.
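For reference, I believe these are the infidelity and max-sensitivity measures of Yeh et al. (NeurIPS 2019), "On the (In)fidelity and Sensitivity of Explanations". Max-sensitivity measures the worst-case change of the attribution $\Phi$ under small input perturbations,

$$ \operatorname{SENS}_{\max}(\Phi, F, x, r)=\max_{\|y-x\| \leq r}\|\Phi(F, y)-\Phi(F, x)\|, $$

while infidelity measures the expected squared difference between the inner product of the attribution with a random perturbation and the change in the model output under that perturbation.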

I have a few points of confusion:

  1. Based on the description of MIG, if I want to explain a black-box model, I must first train that black-box model on the reconstructed dataset.

  2. The baselines, such as Integrated Gradients, are also evaluated on the reconstructed images.

  3. If I use MIG in a real-world scenario, the first step would be to reconstruct the image.

Reference

Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution. ICML 2024.