This paper proposes Randomized Path-Integration (RPI), which generates diverse attribution maps, enabling the selection of the most effective one.

This work makes two key contributions:

  1. It employs a Gaussian diffusion process to generate a set of baselines, from which the most effective attribution map can be identified.

  2. Instead of integrating over the inputs, it integrates over the attention scores.

RPI is computed in three steps.

Step 1: Sampling baselines from a Gaussian diffusion process.

$$ \mathcal{B}_t=\mathcal{N}\left(\sqrt{\bar{\alpha}_t} \mathbf{a}_u^l,\left(1-\bar{\alpha}_t\right) \mathbf{I}\right), \bar{\alpha}_t=\prod_{j=1}^t \alpha_j $$
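
Step 1 can be sketched as follows. This is a minimal NumPy sketch, not the repo's code; the attention vector `a`, the noise schedule `alphas`, and the function name are hypothetical placeholders:

```python
import numpy as np

def sample_baselines(a, alphas, t, num_baselines, seed=0):
    """Sample baselines from the Gaussian diffusion distribution B_t.

    B_t = N(sqrt(alpha_bar_t) * a, (1 - alpha_bar_t) * I),
    where alpha_bar_t is the cumulative product alphas[0] * ... * alphas[t-1].
    """
    rng = np.random.default_rng(seed)
    alpha_bar_t = np.prod(alphas[:t])      # \bar{alpha}_t = prod_{j=1}^t alpha_j
    mean = np.sqrt(alpha_bar_t) * a        # shrunken copy of the attention scores
    std = np.sqrt(1.0 - alpha_bar_t)       # isotropic noise scale
    noise = rng.standard_normal((num_baselines,) + a.shape)
    return mean + std * noise              # shape: (num_baselines, *a.shape)

a = np.array([0.1, 0.6, 0.3])              # toy attention scores
alphas = np.array([0.99, 0.98, 0.97])      # toy noise schedule
baselines = sample_baselines(a, alphas, t=3, num_baselines=5)
```

At `t=0` the cumulative product is empty (so `alpha_bar_t = 1`) and the baselines collapse to `a` itself; larger `t` shrinks the mean and injects more noise, which is what produces the diversity of baselines.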

Step 2: Generating interpolated points.

$$ \mathbf{v}^{lr} = \mathbf{b}^{lr} + a\,(\mathbf{a}^{lr} - \mathbf{b}^{lr}), \quad \mathbf{b}^{lr} \in \mathcal{B}_t $$

where the scalar $a \in [0, 1]$ is the interpolation coefficient.
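
Step 2 can be sketched like so (a NumPy sketch; the uniform schedule of the interpolation coefficient over `n` steps follows standard integrated gradients and is an assumption here):

```python
import numpy as np

def interpolate(a, b, n):
    """Generate n interpolated points v_j = b + (j/n) * (a - b), j = 1..n.

    Sweeping the coefficient from 1/n to 1 traces the straight path from
    the baseline b to the attention scores a.
    """
    coeffs = np.arange(1, n + 1) / n       # shape (n,)
    return b + coeffs[:, None] * (a - b)   # shape (n, dim)

a = np.array([0.1, 0.6, 0.3])
b = np.zeros(3)                            # one sampled baseline
points = interpolate(a, b, n=4)
```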

Step 3: Applying a Riemann sum to approximate the integral.

$$ \mathbf{m}^{lr} = \phi\!\left(\frac{\mathbf{a}^{l} - \mathbf{b}^{lr}}{n} \cdot \sum^{n}_{j=1}\frac{\partial F_{y}}{\partial \mathbf{v}^{lr}_{j}} \cdot \mathbf{v}^{lr}_{j}\right) $$

where $\mathbf{m}^{lr}$ is the explanation.
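
Steps 2 and 3 together can be illustrated with a self-contained Riemann-sum sketch. To keep it runnable, a toy scalar function with an analytic gradient stands in for the transformer's output $F_y$, and $\phi$ is taken to be ReLU; both choices are assumptions, not the paper's exact setup:

```python
import numpy as np

def rpi_attribution(a, b, grad_fn, n=100):
    """Riemann-sum approximation of the path integral from baseline b to a.

    m = phi( (a - b)/n * sum_j grad F_y(v_j) * v_j ),
    with phi taken to be ReLU (an assumption).
    """
    total = np.zeros_like(a)
    for j in range(1, n + 1):
        v = b + (j / n) * (a - b)          # interpolated point v_j (Step 2)
        total += grad_fn(v) * v            # gradient-times-input term
    m = (a - b) / n * total                # Riemann sum (Step 3)
    return np.maximum(m, 0.0)              # phi = ReLU (assumption)

# Toy target: F_y(v) = sum(v**2), so grad F_y(v) = 2v.
grad_fn = lambda v: 2.0 * v
a = np.array([0.1, 0.6, 0.3])
b = np.zeros(3)
m = rpi_attribution(a, b, grad_fn, n=100)
```

For this toy target the sum converges to the exact path integral $\frac{2}{3}a^3$ per coordinate, which is an easy sanity check on the discretization.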

Code

Source code: https://github.com/rpiconf/rpi

In their explanation model, the forward function takes a parameter `rpi_attn_prob` through which external attention scores are injected.
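
The idea of overriding attention probabilities can be illustrated with a minimal attention forward pass. Only the name `rpi_attn_prob` comes from the repo; the rest of the signature is a hypothetical sketch, not the actual code:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_forward(q, k, v, rpi_attn_prob=None):
    """Scaled dot-product attention that can accept external attention scores.

    If rpi_attn_prob is given (e.g. an interpolated point v^{lr} from Step 2),
    it replaces the internally computed softmax probabilities, which is how
    path-interpolated attention could be fed into the model.
    """
    if rpi_attn_prob is None:
        scores = q @ k.T / np.sqrt(q.shape[-1])
        attn = softmax(scores, axis=-1)    # standard attention path
    else:
        attn = rpi_attn_prob               # externally supplied probabilities
    return attn @ v

q = k = v = np.eye(2)
out_internal = attention_forward(q, k, v)
out_external = attention_forward(q, k, v, rpi_attn_prob=np.full((2, 2), 0.5))
```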

Additionally, I examined the code related to the Riemann sum and could not find any snippet computing the $\frac{\mathbf{a}^{l} - \mathbf{b}^{lr}}{n}$ factor.

The code currently fails to run: run_bert.py reports a missing metrics-related parameter.

Reference

Improving LLM Attributions with Randomized Path-Integration. Findings of EMNLP 2024.