CNN with large memory layers. (arXiv:2101.11685v1 [cs.CV])

This work is centred around the recently proposed product key memory
structure \cite{large_memory}, implemented for a number of computer vision
applications. The memory structure can be regarded as a simple computation
primitive that can be added to nearly any neural network architecture.
The memory block provides sparse access to memory at a cost that scales with
the square root of the memory capacity. This scaling is possible because the
key space used for the nearest neighbour search is decomposed as a Cartesian
product of smaller sub-key spaces. We have tested the memory layer
on the classification, image reconstruction and relocalization problems and
found that for some of these the memory layers provide a significant
speed/accuracy improvement with high utilization of the key-value elements,
while others require more careful fine-tuning and suffer from dying keys. To
tackle the latter problem we introduce a simple memory re-initialization
technique that removes unused key-value pairs from the memory and engages
them in training again. In our experiments we obtained improvements in speed
and accuracy for classification and PoseNet relocalization models.
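
The square root scaling described above can be illustrated with a minimal sketch of a product key lookup. This is not the authors' implementation: it is plain Python with hypothetical names (`product_key_lookup`, `sub_keys_1`, `sub_keys_2`), toy sizes, and random data, and it assumes dot-product scores with a softmax over the selected slots. The query is split into two halves, each half is scored against its own small set of sub-keys, and only the k * k combinations of the best sub-keys are examined, so a memory of n * n slots is searched with roughly 2 * n score computations. A brute-force search over all slots is included only as a check.

```python
import heapq
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(scores, k):
    """Return (index, score) pairs of the k highest scores, best first."""
    return heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])


def product_key_lookup(query, sub_keys_1, sub_keys_2, values, k):
    """Sparse read from a memory of len(sub_keys_1) * len(sub_keys_2) slots."""
    half = len(query) // 2
    q1, q2 = query[:half], query[half:]

    # Score each half of the query against its own small sub-key set.
    best_1 = top_k([dot(q1, c) for c in sub_keys_1], k)
    best_2 = top_k([dot(q2, c) for c in sub_keys_2], k)

    # The score of full key (i, j) is the sum of its two half scores, so the
    # overall top-k must lie among these k * k candidate pairs.
    n2 = len(sub_keys_2)
    candidates = [(i * n2 + j, s1 + s2) for i, s1 in best_1 for j, s2 in best_2]
    selected = heapq.nlargest(k, candidates, key=lambda pair: pair[1])

    # Softmax over the selected scores, then a weighted sum of their values.
    top = max(score for _, score in selected)
    exps = [math.exp(score - top) for _, score in selected]
    total = sum(exps)
    output = [0.0] * len(values[0])
    for (slot, _), e in zip(selected, exps):
        for d, v in enumerate(values[slot]):
            output[d] += (e / total) * v
    return [slot for slot, _ in selected], output


# Toy setup (all sizes are illustrative assumptions): 16 * 16 = 256 slots.
rng = random.Random(0)
n_sub, half_dim, value_dim, k = 16, 4, 3, 4


def random_rows(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


sub_keys_1 = random_rows(n_sub, half_dim)
sub_keys_2 = random_rows(n_sub, half_dim)
values = random_rows(n_sub * n_sub, value_dim)
query = [rng.gauss(0.0, 1.0) for _ in range(2 * half_dim)]

slots, output = product_key_lookup(query, sub_keys_1, sub_keys_2, values, k)

# Brute-force check: score the query against every concatenated full key.
full_scores = [dot(query, c1 + c2) for c1 in sub_keys_1 for c2 in sub_keys_2]
brute_force = [index for index, _ in top_k(full_scores, k)]
print(slots, brute_force)
```

In this sketch the sparse lookup scores 2 * 16 sub-keys and 4 * 4 candidate pairs, whereas the brute-force check scores all 256 full keys; the gap widens as the memory grows.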

We showed that re-initialization has a large impact on a toy example of
randomly labeled data and observed some performance gains on the image
classification task. We have also demonstrated that the large memory layers
preserve their generalization property on the relocalization problem, while
observing spatial correlations between the images and the selected memory


