Positive Unlabeled Contrastive Learning. (arXiv:2206.01206v1 [cs.LG])

Self-supervised pretraining on unlabeled data followed by supervised
finetuning on labeled data is a popular paradigm for learning from limited
labeled examples. In this paper, we investigate and extend this paradigm to the
classical positive unlabeled (PU) setting – the weakly supervised task of
learning a binary classifier using only a few labeled positive examples and a
set of unlabeled samples. We propose a novel PU learning objective, positive
unlabeled Noise Contrastive Estimation (puNCE), that leverages the available
explicit (from labeled samples) and implicit (from unlabeled samples)
supervision to learn useful representations from positive unlabeled input data.
The underlying idea is to assign each training sample an individual weight:
labeled positives are given unit weight, while each unlabeled sample is
duplicated, with one copy labeled positive and the other labeled negative,
carrying weights $\pi$ and $(1-\pi)$ respectively, where $\pi$ denotes the
class prior. Extensive experiments across
vision and natural language tasks reveal that puNCE consistently improves over
existing unsupervised and supervised contrastive baselines under limited
supervision.
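
The per-sample weighting described above can be illustrated with a minimal sketch. It covers only the duplication-and-weighting step, not the contrastive objective itself, and is not the authors' implementation. The function name `build_weighted_samples`, the triple layout, and the toy inputs are hypothetical, and the class prior is assumed to be known in advance.

```python
from typing import Any, List, Sequence, Tuple

# Hypothetical triple layout: (sample, pseudo_label, weight).
WeightedSample = Tuple[Any, int, float]


def build_weighted_samples(
    labeled_pos: Sequence[Any],
    unlabeled: Sequence[Any],
    class_prior: float,
) -> List[WeightedSample]:
    """Attach a pseudo-label and weight to each sample.

    Assumes class_prior (pi) is known; estimating it is out of scope here.
    """
    if not 0.0 <= class_prior <= 1.0:
        raise ValueError("class_prior must lie in [0, 1]")

    # Labeled positives keep unit weight.
    weighted: List[WeightedSample] = [(x, 1, 1.0) for x in labeled_pos]

    # Each unlabeled sample appears twice: once as a positive with
    # weight pi, once as a negative with weight (1 - pi).
    for x in unlabeled:
        weighted.append((x, 1, class_prior))
        weighted.append((x, 0, 1.0 - class_prior))
    return weighted


# Toy usage: 2 labeled positives, 3 unlabeled samples, pi = 0.3.
samples = build_weighted_samples(["p1", "p2"], ["u1", "u2", "u3"], class_prior=0.3)
total_weight = sum(w for _, _, w in samples)
```

Because the two copies of each unlabeled sample have weights that sum to one, every original sample contributes a total weight of one regardless of the value chosen for the class prior.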