This paper proposes a method called FreqRISE. Current apporaches assume that salient information resides in the time domain (the raw input space), they argue that this assumption is less reasonable and that salient information is more likely to reside in the frequency domain.
Traditional mask-based methods use a mask $\mathbf{M}$ to occlude features in the input space by performing element-wise multiplication: :
$$ \hat{\mathbf{X}} = \mathbf{X} \odot \mathbf{M}. $$Then, they observe the changes in the output of the model $f$.
$$ \hat{\mathbf{y}} = f(\hat{\mathbf{X}}) $$In this work, they apply a mask to a different domain (frequency domain) rather than the time domain. They assume the existence of an invertible mapping to the domain of interest (the frequency domain), $g: \mathbf{X}^T \rightarrow \mathbf{X}^S$.
They use the Discrete Fourier Transformation (DFT) to map the signal into the frequency domain, $\mathbf{X}^{S}$. Then, they mask the inputs use mask defined in the targeted domain.
They use the Discrete Fourier Transformation (DFT) to map the signal to the frequency domain, obtaining $\mathbf{X}^{S}$. They then apply a mask to the input in the target domain.
$$ \hat{\mathbf{X}} = g^{-1}\left(g(\mathbf{X}) \odot \mathbf{M}^S \right) $$References
Brüsch, T., Wickstrøm, K. K., Schmidt, M. N., Alstrøm, T. S., & Jenssen, R. (2024). Explaining time series models using frequency masking. arXiv preprint arXiv:2406.13584.