In this paper, the authors address two predominant challenges associated with IG (Integrated Gradients): (1) the generation of noisy feature visualizations, and (2) the vulnerability to adversarial attributional attacks.

The basic idea is to compute feature attributions along geodesic paths on the manifold. The attribution of the $j$-th feature is defined as follows:

$$ \operatorname{MIG}_j\left(x, \gamma^*\right):=\int_0^1 \frac{\partial F\left(g\left(\gamma^*(t)\right)\right)}{\partial g_j\left(\gamma^*(t)\right)} \frac{\partial g_j\left(\gamma^*(t)\right)}{\partial t} d t, $$

where $\gamma^*$ is the geodesic path on the manifold (in the latent space) and $g$ maps latent points back to the data space.
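In practice the integral is approximated by a Riemann sum over $m$ segments of the path ($m$ is my notation for the step count, not the paper's):

$$ \operatorname{MIG}_j\left(x, \gamma^*\right) \approx \sum_{k=1}^{m} \frac{\partial F\left(g\left(\gamma^*(t_{k-1})\right)\right)}{\partial g_j\left(\gamma^*(t_{k-1})\right)}\left[g_j\left(\gamma^*(t_k)\right)-g_j\left(\gamma^*(t_{k-1})\right)\right], \quad t_k = k / m. $$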

In my view, the main contribution of this method is the algorithm for finding the geodesic path.
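I won't reproduce the paper's path-finding algorithm here, but as a rough sketch of the general idea: one common way to approximate a geodesic on a decoder-induced manifold is to parameterize the path by intermediate latent points and minimize the discrete curve energy of the decoded path. Everything below (`decoder`, `z_start`, `z_end`, the optimizer settings) is my hypothetical reconstruction, not the authors' implementation:

```python
import torch

def approximate_geodesic(decoder, z_start, z_end, n_points=20, n_iters=500, lr=1e-2):
    """Approximate a geodesic between two latent codes by minimizing the
    discrete curve energy of the decoded path. Hypothetical sketch only;
    the paper's actual algorithm may differ."""
    for p in decoder.parameters():          # the decoder itself stays fixed
        p.requires_grad_(False)

    # Initialize with the straight line in latent space.
    ts = torch.linspace(0, 1, n_points).unsqueeze(1)
    path = z_start + ts * (z_end - z_start)
    interior = path[1:-1].clone().requires_grad_(True)  # endpoints stay fixed
    opt = torch.optim.Adam([interior], lr=lr)

    for _ in range(n_iters):
        opt.zero_grad()
        full = torch.cat([z_start.unsqueeze(0), interior, z_end.unsqueeze(0)])
        decoded = decoder(full).flatten(start_dim=1)    # path mapped to data space
        # Discrete curve energy: sum of squared segment lengths in data space.
        energy = ((decoded[1:] - decoded[:-1]) ** 2).sum()
        energy.backward()
        opt.step()

    return torch.cat([z_start.unsqueeze(0), interior.detach(), z_end.unsqueeze(0)])
```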

Code

In their code, the main process is as follows:

Step 1: They first train a VAE to map images into a latent space that satisfies the properties of a manifold.
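A minimal sketch of this step (the loss weighting, hyperparameters, and the assumption that `vae` returns `(reconstruction, mu, logvar)` are mine; the repository's actual VAE will differ):

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar, beta=1.0):
    """Standard VAE objective: reconstruction term plus KL divergence
    to the unit-Gaussian prior."""
    recon = F.mse_loss(recon_x, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl

def train_vae(vae, loader, epochs=50, lr=1e-3, device="cuda"):
    opt = torch.optim.Adam(vae.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in loader:
            x = x.to(device)
            recon, mu, logvar = vae(x)
            loss = vae_loss(recon, x, mu, logvar)
            opt.zero_grad()
            loss.backward()
            opt.step()
```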

Step 2: The classifier, which is the target of the explanation method, consists of a backbone, such as VGG-16, and a prediction layer. The backbone's parameters are frozen for the first 10 epochs; then the whole model is fine-tuned for another 7 epochs. All of the above training is performed on the reconstructed data.
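In code, this two-phase schedule looks roughly as follows (a sketch under my assumptions; `backbone`, `head`, the learning rates, and the reconstructed-data loader are placeholders, not the repository's names):

```python
import torch

def train_classifier(backbone, head, recon_loader, device="cuda"):
    """Phase 1: train only the prediction head on reconstructed data with the
    backbone frozen. Phase 2: unfreeze and fine-tune the whole model."""
    criterion = torch.nn.CrossEntropyLoss()

    # Phase 1: freeze the backbone for the first 10 epochs.
    for p in backbone.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(head.parameters(), lr=1e-3)
    run_epochs(backbone, head, recon_loader, criterion, opt, epochs=10, device=device)

    # Phase 2: unfreeze and fine-tune everything for another 7 epochs.
    for p in backbone.parameters():
        p.requires_grad_(True)
    opt = torch.optim.Adam(list(backbone.parameters()) + list(head.parameters()), lr=1e-4)
    run_epochs(backbone, head, recon_loader, criterion, opt, epochs=7, device=device)

def run_epochs(backbone, head, loader, criterion, opt, epochs, device):
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            loss = criterion(head(backbone(x)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
```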

Step 3: Explaining an image:

  • Map the image to a latent embedding.
  • Generate an integral path in the latent space.
  • Map the latent embeddings along the integral path back to the data space.
  • Using Riemann summation, compute the integrated gradients along the integral path with the classifier trained on the reconstructed data (see the sketch after this list).
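A minimal sketch of this last step, assuming a PyTorch `classifier` and `decoder` and a precomputed `latent_path` (e.g., the output of a path-finding routine like the geodesic sketch above); all names are mine, not the repository's:

```python
import torch

def manifold_ig(classifier, decoder, latent_path, target_class):
    """Left Riemann-sum approximation of MIG along a latent path.
    latent_path: tensor of shape (m + 1, latent_dim), baseline -> input.
    Sketch only; the repository's implementation may differ."""
    decoded, grads = [], []
    for z in latent_path:
        x = decoder(z.unsqueeze(0)).detach().requires_grad_(True)
        score = classifier(x)[0, target_class]   # class score F(g(gamma(t)))
        score.backward()
        decoded.append(x.detach().squeeze(0))
        grads.append(x.grad.squeeze(0))

    decoded = torch.stack(decoded)               # (m + 1, *data_shape)
    grads = torch.stack(grads)
    deltas = decoded[1:] - decoded[:-1]          # steps g_j(t_k) - g_j(t_{k-1})
    return (grads[:-1] * deltas).sum(dim=0)      # attribution per input feature
```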

However, I did not find code that evaluates the explainer on the INFD and $\mathrm{SENS}_{\text{max}}$ metrics.
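For reference, I believe these are the infidelity and max-sensitivity measures of Yeh et al. (NeurIPS 2019), "On the (In)fidelity and Sensitivity of Explanations". Max-sensitivity measures the worst-case change of the attribution $\Phi$ under small input perturbations,

$$ \operatorname{SENS}_{\max}(\Phi, F, x, r)=\max_{\|y-x\| \leq r}\|\Phi(F, y)-\Phi(F, x)\|, $$

while infidelity measures the expected squared difference between the inner product of the attribution with a random perturbation and the change in the model output under that perturbation.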

I have a few points of confusion:

  1. Based on the description of MIG, if I want to explain a black-box model, I must first train that black-box model on the reconstructed dataset.

  2. The baselines, such as Integrated Gradients, are also evaluated on the reconstructed images.

  3. If I use MIG in a real-world scenario, the first step would be to reconstruct the image.

Reference

Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution. ICML 2024.