In this paper, Koh et al. systematically study different ways of training concept bottleneck models. Let $L_{C_j}$ be a loss function that measures the discrepancy between the predicted and true $j$-th concept, and let $L_Y$ measure the discrepancy between predicted and true targets. They consider the following ways to learn a concept bottleneck model $(\hat{f}, \hat{g})$:

  1. The independent bottleneck learns $\hat{f}$ and $\hat{g}$ independently: $\hat{f} = \argmin_{f}\sum_{i}L_{Y}(f(c^{i}); y^i)$, and $\hat{g} = \argmin_{g}\sum_{i,j}L_{C_j}(g_j(x^i);c_j^i)$. While $\hat{f}$ is trained using the true $c$, at test time it still takes $\hat{g}(x)$ as input.

  2. The sequential bottleneck first learns $\hat{g}$ in the same way as above, but then uses the concept predictions $\hat{g}(x^i)$ rather than the true concepts to learn $\hat{f}=\argmin_{f}\sum_{i}L_Y(f(\hat{g}(x^i)); y^i)$.

  3. The joint bottleneck minimizes the weighted sum $\hat{f}, \hat{g} = \argmin_{f,g}\sum_{i}\left[L_Y(f(g(x^i)); y^i) + \lambda \sum_{j} L_{C_j}(g_j(x^i); c_j^i)\right]$.

  4. The standard model ignores concepts and directly minimizes $\hat{f}, \hat{g} = \argmin_{f,g}\sum_{i}L_{Y}(f(g(x^i)); y^i)$.
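Under squared losses with linear $f$ and $g$, the independent and sequential schemes admit a simple closed-form sketch. The snippet below is illustrative only (synthetic data, not from the paper); the sole difference between the two schemes is which concepts $f$ is regressed on.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 5, 3                          # samples, input dim, number of concepts
X = rng.normal(size=(n, d))                  # inputs x^i
C = X @ rng.normal(size=(d, k))              # true concepts c^i (linear in x, for illustration)
y = C @ rng.normal(size=k)                   # true targets y^i (linear in c)

# Concept predictor g: x -> c, fit by least squares; both schemes learn g this way.
W_g, *_ = np.linalg.lstsq(X, C, rcond=None)
C_hat = X @ W_g                              # concept predictions g(x^i)

# Independent bottleneck: f is fit on the *true* concepts c^i.
w_ind, *_ = np.linalg.lstsq(C, y, rcond=None)

# Sequential bottleneck: f is fit on the *predicted* concepts g(x^i).
w_seq, *_ = np.linalg.lstsq(C_hat, y, rcond=None)

# At test time, both compose f(g(x)).
y_ind = C_hat @ w_ind
y_seq = C_hat @ w_seq
```

Because the synthetic data is exactly linear, the two schemes coincide here; with an imperfect $\hat{g}$, the sequential $f$ can partially compensate for concept-prediction errors, while the independent $f$ cannot.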

The hyperparameter $\lambda$ in the joint bottleneck controls the tradeoff between the concept and task losses. The standard model corresponds to taking $\lambda \rightarrow 0$, while the sequential bottleneck can be viewed as taking $\lambda \rightarrow \infty$.
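To make the joint objective concrete, here is a minimal sketch with linear $f$ and $g$, squared losses, and plain gradient descent on the weighted sum; the data and hyperparameter choices are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 200, 5, 3                        # samples, input dim, number of concepts
X = rng.normal(size=(n, d))                # inputs x^i
C = X @ rng.normal(size=(d, k))            # true concepts c^i (linear in x, for illustration)
y = C @ rng.normal(size=k)                 # true targets y^i (linear in c)

G = np.zeros((d, k))                       # parameters of g: x -> concept predictions
f = np.zeros(k)                            # parameters of f: concepts -> target
lam, lr, steps = 1.0, 0.02, 2000           # tradeoff weight lambda, step size, iterations

def joint_loss(G, f):
    """Weighted sum L_Y + lambda * sum_j L_{C_j}, with squared losses throughout."""
    C_hat = X @ G
    task = np.mean((C_hat @ f - y) ** 2)   # L_Y term
    concept = np.mean((C_hat - C) ** 2)    # concept terms, averaged over i and j
    return task + lam * concept            # lam -> 0 recovers the standard objective

loss_before = joint_loss(G, f)
for _ in range(steps):
    C_hat = X @ G
    r_y = C_hat @ f - y                    # task residuals
    r_c = C_hat - C                        # concept residuals
    grad_f = 2 * C_hat.T @ r_y / n
    grad_G = 2 * X.T @ np.outer(r_y, f) / n + lam * 2 * X.T @ r_c / (n * k)
    f -= lr * grad_f
    G -= lr * grad_G
loss_after = joint_loss(G, f)
```

Setting `lam = 0.0` drops the concept term, so the loop then optimizes exactly the standard model's objective; very large `lam` makes the concept term dominate, pushing the solution toward the sequential bottleneck's.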

References

Koh, P. W., Nguyen, T., Tang, Y. S., Mussmann, S., Pierson, E., Kim, B., & Liang, P. (2020). Concept bottleneck models. In Proceedings of the 37th International Conference on Machine Learning (pp. 5338–5348). PMLR.