This paper proposes a method called FreqRISE. Current approaches assume that salient information resides in the time domain (the raw input space). The authors argue that this assumption is less reasonable and that salient information is more likely to reside in the frequency domain.
Masking in the time domain applies a mask $\mathbf{M}$ elementwise to the input $\mathbf{X}$ and feeds the masked input to the model $f$:

$$ \hat{\mathbf{X}} = \mathbf{X} \odot \mathbf{M} $$

$$ \hat{\mathbf{y}} = f(\hat{\mathbf{X}}) $$

In this work, the authors instead apply the mask in a different domain (the frequency domain) rather than the time domain. They assume the existence of an invertible mapping to the domain of interest (the frequency domain), $g: \mathbf{X}^T \rightarrow \mathbf{X}^S$.
They use the Discrete Fourier Transform (DFT) to map the signal into the frequency domain, $\mathbf{X}^{S}$. They then mask the input using a mask $\mathbf{M}^S$ defined in the target domain.
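As a rough illustration of this transform, mask, and invert step, here is a minimal NumPy sketch. It is not the authors' implementation. The function name `mask_in_frequency_domain`, the use of the one-sided real FFT (`np.fft.rfft` / `np.fft.irfft`) as the invertible mapping $g$, the toy two-tone signal, and the random binary mask are all assumptions made for the example; the summary above does not say how masks are sampled.

```python
import numpy as np

def mask_in_frequency_domain(x, mask):
    """Mask a real 1-D signal in the frequency domain and map it back.

    Hypothetical helper: the one-sided real FFT stands in for g,
    and its inverse for g^{-1}.
    """
    spectrum = np.fft.rfft(x)                # g(X): time -> frequency
    masked = spectrum * mask                 # g(X) ⊙ M^S (elementwise)
    return np.fft.irfft(masked, n=len(x))    # g^{-1}: frequency -> time

# Toy signal (assumption): a 5-cycle and a 40-cycle sinusoid over 256 samples.
n = 256
t = np.arange(n) / n
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

n_bins = n // 2 + 1                          # number of one-sided frequency bins

# Random binary mask over frequency bins (assumed sampling scheme).
rng = np.random.default_rng(0)
random_mask = rng.integers(0, 2, size=n_bins).astype(float)
x_hat = mask_in_frequency_domain(x, random_mask)

# An all-ones mask leaves the signal unchanged, since g is invertible.
x_same = mask_in_frequency_domain(x, np.ones(n_bins))

# A mask that keeps only bins below 20 removes the 40-cycle component.
low_pass = (np.arange(n_bins) < 20).astype(float)
x_low = mask_in_frequency_domain(x, low_pass)
```

In a full pipeline, the masked signal `x_hat` would then be passed to the model $f$ in place of the original input, exactly as in the time-domain case.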
$$ \hat{\mathbf{X}} = g^{-1}\left(g(\mathbf{X}) \odot \mathbf{M}^S \right) $$

References
Brüsch, T., Wickstrøm, K. K., Schmidt, M. N., Alstrøm, T. S., & Jenssen, R. (2024). Explaining time series models using frequency masking. arXiv preprint arXiv:2406.13584.