Saliency methods attempt to explain deep neural networks by highlighting the
most salient features of a sample. Some widely used methods are based on a
theoretical framework called Deep Taylor Decomposition (DTD), which formalizes
the recursive application of the Taylor Theorem to the network’s layers.
However, recent work has found these methods to be independent of the network's
deeper layers and to respond only to lower-level image structure. Here,
we investigate the DTD theory to better understand this perplexing behavior and
find that the Deep Taylor Decomposition is equivalent to the basic
gradient$\times$input method when the Taylor root points (an important
parameter of the algorithm chosen by the user) are locally constant. If the
root points are locally input-dependent, then any explanation can be justified
and the theory is under-constrained. In an empirical evaluation, we
find that DTD roots do not lie in the same linear regions as the input –
contrary to a fundamental assumption of the Taylor theorem. The theoretical
foundations of DTD have been cited as evidence for the reliability of its
explanations; our findings urge caution in making such claims.
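The gradient$\times$input baseline that DTD reduces to can be illustrated concretely. The following is a minimal sketch (not the paper's implementation) on a hypothetical two-layer bias-free ReLU network, where the attribution for each input feature is the feature value multiplied by the partial derivative of the output with respect to that feature:

```python
import numpy as np

# Toy example (assumed for illustration): f(x) = w2 . relu(W1 @ x),
# a bias-free two-layer ReLU network.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))
w2 = rng.normal(size=3)

def f(x):
    return w2 @ np.maximum(W1 @ x, 0.0)

def grad_times_input(x):
    # Within a fixed linear region of the ReLU network,
    # df/dx = W1^T @ (w2 * mask), where mask marks active units.
    mask = (W1 @ x > 0.0).astype(float)
    grad = W1.T @ (w2 * mask)
    # gradient x input attribution: elementwise product with the input.
    return x * grad

x = rng.normal(size=4)
attr = grad_times_input(x)
# Sanity check: without biases the network is positively homogeneous,
# so by Euler's theorem the attributions sum exactly to f(x).
assert np.isclose(attr.sum(), f(x))
```

For such bias-free networks, gradient$\times$input attributions sum to the output value exactly, which is one reason the method is often used as a simple reference point for decomposition-based explanations.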