Note on Extremal Perturbation ICCV 2019

This paper improves on meaningful perturbation (2017, "Interpretable explanations of black boxes by meaningful perturbation"). The authors reformulate the optimization problem of meaningful perturbation as follows:

$$ m_{\lambda, \beta} = \argmax_{m} \Phi(m \otimes x) - \lambda \|m\|_{1} - \beta \mathcal{S}(m). $$

The authors argue that the trade-off encoded by this formulation is unclear: different choices of $\lambda$ and $\beta$ yield different masks, with no principled way to compare them.

To remove the balancing issues, they constrain the area of the mask to a fixed value (as a fraction $a|\Omega|$ of the input image area):

$$ m_{a} = \argmax_{m: \|m\|_{1} = a |\Omega|, m \in \mathcal{M}} \Phi(m \otimes x) $$

The resulting mask is then a function of the chosen area $a$ only, with no trade-off weights left to tune.

Consider a lower bound $\Phi_0$ on the model’s output (for example, we may set $\Phi_0 = \tau \Phi(x)$ to be a fraction $\tau$ of the model’s output on the unperturbed image). The authors seek the smallest mask such that the model’s output reaches at least $\Phi_{0}$. This is equivalent to sweeping over the parameter $a$ to find the smallest $a$ that meets the requirement:

$$ a^* = \min\{a: \Phi(m_a \otimes x) \ge \Phi_{0}\} $$
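The search for $a^*$ can be sketched as a sweep over a small set of candidate areas. This is only an illustration: `score_for_area` is a hypothetical callable standing in for "optimize $m_a$, then evaluate $\Phi(m_a \otimes x)$", and `toy_score` is a made-up stand-in for a real model.

```python
from typing import Callable, Iterable, Optional


def smallest_sufficient_area(
    score_for_area: Callable[[float], float],
    areas: Iterable[float],
    phi_0: float,
) -> Optional[float]:
    """Return the smallest candidate area whose score reaches phi_0.

    score_for_area(a) is assumed to return Phi(m_a ⊗ x) for the mask
    optimized at area a. Returns None if no candidate reaches phi_0.
    """
    for a in sorted(areas):
        if score_for_area(a) >= phi_0:
            return a
    return None


def toy_score(a: float) -> float:
    # Hypothetical score curve: grows linearly with area, saturating at 1.
    return min(1.0, 2.0 * a)


tau = 0.5
phi_0 = tau * toy_score(1.0)  # fraction tau of the score on the full image
a_star = smallest_sufficient_area(toy_score, [0.05, 0.1, 0.2, 0.4, 0.8], phi_0)
```

With a discrete set of candidate areas, the result is only the smallest candidate that passes the threshold, so the resolution of $a^*$ depends on how finely the areas are sampled.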

The mask $m_{a^*}$ is extremal because any smaller area $a$ would leave the model's output on the perturbed input below the lower bound $\Phi_0$.

In practice, the area constraint is hard to enforce exactly during optimization. To address this, the authors relax it into a regularization term suited to gradient-based optimization. They define the $vecsort(m)$ operation, which vectorizes $m$ and then sorts its entries in non-decreasing order. If a mask $m$ satisfies the constraint exactly, then the output of $vecsort(m)$ is a vector $r_{a} \in [0, 1]^{|\Omega|}$ consisting of $(1-a)|\Omega|$ zeros followed by $a|\Omega|$ ones. Based on this, they propose the following regularization term:

$$ R_{a}(m) = \|vecsort(m) - r_a\|^{2} $$
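A minimal NumPy sketch of this regularizer follows. The function name `area_regularizer` is chosen here for illustration, and rounding $a|\Omega|$ to the nearest integer is an assumption for the case where it is not a whole number of pixels.

```python
import numpy as np


def area_regularizer(mask: np.ndarray, a: float) -> float:
    """Compute R_a(m) = ||vecsort(m) - r_a||^2 for a mask with values in [0, 1]."""
    sorted_vals = np.sort(mask.ravel())  # vecsort: flatten, then sort ascending
    n = sorted_vals.size
    n_ones = int(round(a * n))           # assumption: round a*|Omega| to an integer
    r_a = np.zeros(n)
    r_a[n - n_ones:] = 1.0               # (1 - a)|Omega| zeros, then a|Omega| ones
    return float(np.sum((sorted_vals - r_a) ** 2))


binary_mask = np.array([[0.0, 1.0], [1.0, 0.0]])  # exactly half the pixels are on
gray_mask = np.full((2, 2), 0.5)                  # same L1 norm, but not binary

penalty_binary = area_regularizer(binary_mask, a=0.5)
penalty_gray = area_regularizer(gray_mask, a=0.5)
```

Both masks have the same $\ell_1$ norm, yet only the binary one gets zero penalty: the uniform gray mask is off by 0.5 in each of its four entries. The term therefore pushes the mask toward binary values as well as toward the target area.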

Combining the above, the final objective is:

$$ m_a = \argmax_{m\in\mathcal{M}} \Phi(m \otimes x) - \lambda R_{a}(m) $$
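To make the objective concrete, here is a toy projected gradient ascent in NumPy. It is a sketch under strong assumptions, not the authors' implementation: the model is replaced by a hypothetical linear score $\Phi(m \otimes x) = \sum_i w_i m_i x_i$, the perturbation is plain element-wise multiplication, the smoothness constraint $m \in \mathcal{M}$ is ignored, and $\lambda$ is ramped up linearly. The ramp is a choice made here so that ties in the initial mask are broken by the score gradient before the regularizer takes over.

```python
import numpy as np


def regularizer_grad(mask: np.ndarray, a: float) -> np.ndarray:
    """Gradient of R_a(m) = ||vecsort(m) - r_a||^2 with respect to m."""
    flat = mask.ravel()
    order = np.argsort(flat)             # permutation applied by vecsort
    n = flat.size
    r_a = np.zeros(n)
    r_a[n - int(round(a * n)):] = 1.0    # target: zeros followed by ones
    grad = np.empty(n)
    grad[order] = 2.0 * (flat[order] - r_a)  # scatter back to original positions
    return grad.reshape(mask.shape)


def optimize_mask(x, w, a, lam=5.0, lr=0.05, steps=200):
    """Toy gradient ascent on Phi(m * x) - lambda * R_a(m), clipping m to [0, 1]."""
    mask = np.full(x.shape, 0.5)
    grad_phi = w * x                     # gradient of the toy linear score w.r.t. m
    for t in range(steps):
        lam_t = lam * t / steps          # assumption: linear ramp of lambda
        mask = mask + lr * (grad_phi - lam_t * regularizer_grad(mask, a))
        mask = np.clip(mask, 0.0, 1.0)   # keep the mask in [0, 1]
    return mask


x = np.ones(4)                           # toy "image" with four pixels
w = np.array([0.1, 0.9, 0.2, 0.8])       # hypothetical per-pixel importance
mask = optimize_mask(x, w, a=0.5)
```

With half the area allowed, the mask in this toy setup should end up close to 1 on the two pixels with the largest weights and close to 0 on the others. For a real network, the score gradient would come from automatic differentiation rather than a closed form.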