Capsule Networks, an extension to Neural Networks utilizing vector or matrix
representations instead of scalars, were initially developed to create a
dynamic parse tree where visual concepts evolve from parts to complete objects.
Early implementations of Capsule Networks achieved and maintain
state-of-the-art results on various datasets. However, recent studies have
revealed shortcomings in the original Capsule Network architecture, notably its
failure to construct a parse tree and its susceptibility to vanishing gradients
when deployed in deeper networks. This paper extends the investigation to a
range of leading Capsule Network architectures, demonstrating that these issues
are not confined to the original design. We argue that the majority of Capsule
Network research has produced architectures that, while modestly divergent from
the original Capsule Network, still retain a fundamentally similar structure.
We posit that this inherent design similarity might be impeding the scalability
of Capsule Networks. Our study contributes to the broader discussion on
improving the robustness and scalability of Capsule Networks.