This paper improves on existing methods that explain time series predictions using trainable masks. Inspired by methods for static data, existing approaches combine the mask with fixed perturbations; this paper argues that fixed perturbations may not be suitable for time series data. In the proposed method, both the mask and the perturbations are trainable.

Existing methods using fixed perturbations can be expressed as follows:

$$ \Phi(\mathbf{x}, \mathbf{m}) = \mathbf{m} \times \mathbf{x} + (1 - \mathbf{m}) \times g(\mathbf{x}), $$

where $g(\mathbf{x})$ is a fixed function of the input. A possible definition is a trailing moving average over a window of size $W$: $g(\mathbf{x})_t = \frac{1}{W}\sum_{t'=t-W+1}^{t} x_{t'}$.
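For concreteness, a minimal NumPy sketch of this fixed-perturbation scheme is given below; the window size, the causal padding, and the toy data are illustrative assumptions, not choices from the paper:

```python
import numpy as np

def moving_average(x, window=5):
    # Trailing (causal) moving average g(x): each step is the mean of the
    # current value and the previous `window - 1` values.
    padded = np.concatenate([np.repeat(x[:1], window - 1), x])
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

def perturb(x, m, window=5):
    # Phi(x, m) = m * x + (1 - m) * g(x) with a fixed moving-average g.
    return m * x + (1 - m) * moving_average(x, window)

x = np.sin(np.linspace(0, 6, 50))             # toy univariate series
m = (np.random.rand(50) > 0.5).astype(float)  # toy binary mask
print(perturb(x, m).shape)                    # (50,)
```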

They propose replacing these fixed functions with a neural network (NN), trained jointly with the mask. The perturbation is defined as:

$$ \Phi(\mathbf{x}, \mathbf{m}) = \mathbf{m} \times \mathbf{x} + (1 - \mathbf{m}) \times NN(\mathbf{x}) $$
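A minimal PyTorch sketch of such a jointly trainable mask and perturbation network is shown below; the GRU backbone, the layer sizes, and the sigmoid parameterization of the mask are assumptions made for illustration, not the paper's exact implementation:

```python
import torch
import torch.nn as nn

class LearnedPerturbation(nn.Module):
    """Trainable perturbation Phi(x, m) = m * x + (1 - m) * NN(x),
    where both the mask m and the network NN are learned jointly."""
    def __init__(self, seq_len, n_features, hidden=32):
        super().__init__()
        # Mask stored as logits so that m = sigmoid(logits) stays in (0, 1).
        self.mask_logits = nn.Parameter(torch.zeros(seq_len, n_features))
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x):                    # x: (batch, seq_len, n_features)
        m = torch.sigmoid(self.mask_logits)  # broadcasts over the batch
        h, _ = self.rnn(x)
        nn_x = self.head(h)                  # NN(x), same shape as x
        return m * x + (1 - m) * nn_x, m, nn_x
```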

Without further constraints, $NN(\mathbf{x})$ can collapse to an identity mapping of the original data. For example, when $\mathbf{m} = 0$, setting $NN(\mathbf{x}) = \mathbf{x}$ gives $\Phi(\mathbf{x}, \mathbf{m}) = \mathbf{x}$, which drives the loss to $0$ without removing any information. To keep $NN(\mathbf{x})$ uninformative, they add a loss term $\Vert NN(\mathbf{x})\Vert$, using zero as the prior. The objective function is as follows:

$$ \operatorname*{arg\,min}_{\mathbf{m},\, \Theta} \; \lambda_{1} \Vert \mathbf{m}\Vert_{1} + \lambda_{2} \Vert NN(\mathbf{x})\Vert_{1} + \mathcal{L}(f(\mathbf{x}), f(\Phi(\mathbf{x}, \mathbf{m}))), $$

where $\Theta$ denotes the parameters of $NN$.

They decompose the objective function as follows:

  • $\Vert \mathbf{m}\Vert_{1}$ pushes $\mathbf{m}$ toward $\mathbf{0}$, which drives $\Phi(\mathbf{x}, \mathbf{m})$ close to $NN(\mathbf{x})$
  • $\Vert NN(\mathbf{x})\Vert_{1}$ drives $NN(\mathbf{x})$, and with it $\Phi(\mathbf{x}, \mathbf{m})$ wherever the mask is small, close to $\mathbf{0}$ (uninformative)
  • $\mathcal{L}$ drives $f(\Phi(\mathbf{x}, \mathbf{m}))$ close to $f(\mathbf{x})$ (informative)
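Putting these pieces together, one optimization step under this objective could look like the sketch below, reusing the `LearnedPerturbation` module from above; the MSE stand-in for $\mathcal{L}$, the mean-based $\ell_1$ penalties, and the default $\lambda$ values are illustrative assumptions:

```python
import torch.nn.functional as F

def explanation_step(model_f, pert, x, optimizer, lam1=1.0, lam2=1.0):
    # One step of: lam1 * ||m||_1 + lam2 * ||NN(x)||_1 + L(f(x), f(Phi)).
    # model_f is the frozen black box; the optimizer holds only the
    # mask logits and the perturbation network's parameters.
    optimizer.zero_grad()
    phi_x, m, nn_x = pert(x)
    with torch.no_grad():
        target = model_f(x)                # f(x), treated as a constant
    loss = (lam1 * m.abs().mean()          # mask sparsity
            + lam2 * nn_x.abs().mean()     # keep NN(x) uninformative
            + F.mse_loss(model_f(phi_x), target))
    loss.backward()
    optimizer.step()
    return loss.item()
```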

It is worth mentioning that, when adapting the method to the ‘deletion game’, this paper considers the $-\mathcal{L}(f(\mathbf{x}), f(\Phi(\mathbf{x}, \mathbf{m})))$ objective difficult to optimize, because pushing $f(\Phi(\mathbf{x}, \mathbf{m}))$ away from $f(\mathbf{x})$ does not specify where the distant point should be. They therefore minimize $\mathcal{L}(f(\mathbf{0}), f(\Phi(\mathbf{x}, \mathbf{m})))$ instead, pulling the perturbed prediction toward that of an all-zero input.
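A sketch of the swapped fidelity term is given below (reusing the imports above); MSE again stands in for $\mathcal{L}$, and the mask and perturbation regularizers are omitted for brevity:

```python
def deletion_fidelity(model_f, phi_x, x):
    # Deletion game: rather than maximizing L(f(x), f(Phi)), which only
    # says "be far from f(x)" without naming a target, minimize
    # L(f(0), f(Phi)), pulling the prediction toward that of a zero input.
    with torch.no_grad():
        target = model_f(torch.zeros_like(x))  # f(0), a fixed anchor
    return F.mse_loss(model_f(phi_x), target)
```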

References

Enguehard, J. (2023, July). Learning perturbations to explain time series predictions. In International Conference on Machine Learning (pp. 9329-9342). PMLR.