Quantified Semantic Comparison of Convolutional Neural Networks. (arXiv:2305.07663v1 [cs.CV])
State-of-the-art convolutional neural networks (CNNs) for computer
vision excel in performance while remaining opaque. However, safety
regulations for safety-critical applications, such as perception in automated
driving, demand that the choice of model also take into account how candidate
models represent semantic information, for the sake of model transparency.
To tackle this as-yet unsolved problem, our work proposes two methods for quantifying the
similarity between semantic information in CNN latent spaces. These allow
insights into both the flow and the similarity of semantic information within the
layers of a CNN, and into the degree of similarity between different networks. As
a basis, we use renowned techniques from the field of explainable artificial
intelligence (XAI), which are used to obtain global vector representations of
semantic concepts in each latent space. These are compared with respect to
their activation on test inputs. When applied to three diverse object detectors
and two datasets, our methods reveal that (1) similar semantic
concepts are learned regardless of the CNN architecture, and (2) similar
concepts emerge at similar relative layer depths, independent of the
total number of layers. Finally, our approach constitutes a promising step towards
informed model selection and comprehension of how CNNs process semantic
information.
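
To make the comparison idea more concrete, below is a minimal Python sketch of how two latent spaces could be compared by how their global concept vectors respond to the same test inputs. This is not the authors' released implementation; the probing step, the cosine-similarity score, and all names (concept_responses, semantic_similarity, acts_a, cavs_a, ...) are illustrative assumptions.

```python
import numpy as np

def concept_responses(activations, concept_vectors):
    # activations:     (n_samples, n_channels) pooled feature maps of one layer
    # concept_vectors: (n_concepts, n_channels) one global vector per concept
    # returns:         (n_samples, n_concepts) per-concept response scores
    return activations @ concept_vectors.T

def semantic_similarity(acts_a, cavs_a, acts_b, cavs_b):
    # Compare two latent spaces (layers or networks) via the cosine similarity
    # of their per-concept response patterns on the same test inputs.
    resp_a = concept_responses(acts_a, cavs_a)
    resp_b = concept_responses(acts_b, cavs_b)
    # Standardize each concept's responses across the test set.
    resp_a = (resp_a - resp_a.mean(axis=0)) / (resp_a.std(axis=0) + 1e-8)
    resp_b = (resp_b - resp_b.mean(axis=0)) / (resp_b.std(axis=0) + 1e-8)
    num = (resp_a * resp_b).sum(axis=0)
    den = np.linalg.norm(resp_a, axis=0) * np.linalg.norm(resp_b, axis=0) + 1e-8
    return num / den  # one similarity score per concept

# Toy usage with random placeholders; in practice the activations would come
# from forward hooks on the compared CNN layers, and the concept vectors from
# an XAI probing method such as concept-activation-vector (CAV) style probes.
rng = np.random.default_rng(0)
acts_a = rng.normal(size=(200, 256))   # test-set activations, network A layer
acts_b = rng.normal(size=(200, 512))   # test-set activations, network B layer
cavs_a = rng.normal(size=(5, 256))     # 5 concept vectors in A's latent space
cavs_b = rng.normal(size=(5, 512))     # the same 5 concepts in B's latent space
print(semantic_similarity(acts_a, cavs_a, acts_b, cavs_b))
```

Under this reading, a high per-concept score would indicate that both latent spaces encode that concept in a way that responds similarly across the shared test set, which is the kind of layer-to-layer and network-to-network comparison the abstract describes.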
Source: https://arxiv.org/abs/2305.07663