Self-supervised Pretraining of Visual Features in the Wild. (arXiv:2103.01988v1 [cs.CV])

Recently, self-supervised learning methods such as MoCo, SimCLR, BYOL, and SwAV
have narrowed the gap with supervised methods. These results were achieved in a
controlled environment, namely the highly curated ImageNet dataset. However,
the premise of self-supervised learning is that it can learn from any random
image and from any unbounded dataset. In this work, we explore whether
self-supervision lives up to this expectation by training large models on random,
uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a
RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs, achieves
84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by
1% and confirming that self-supervised learning works in a real-world setting.
Interestingly, we also observe that self-supervised models are good few-shot
learners, achieving 77.9% top-1 with access to only 10% of ImageNet. Code: