Summary
Masked Contrastive Representation Learning (MACRL) is a self-supervised pre-training approach that unifies masked image modelling and contrastive learning. The framework uses an asymmetric siamese network: one branch applies a high mask ratio and strong data augmentations, while the other receives weaker corruption and therefore yields lower-variance targets. MACRL jointly optimizes a corruption-reconstruction loss and a contrastive loss, so the model captures both pixel-level details and high-level semantics. Its key components are an encoder, a decoder, a projector, and a memory bank. The framework is simple, trained end-to-end, and effective at learning meaningful representations from unlabelled data.
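The two-branch loss structure can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the toy linear encoder/decoder, the projection matrix, the fixed memory bank, and all dimensions below are illustrative assumptions; only the overall shape (masked reconstruction on the strongly corrupted branch plus an InfoNCE-style contrastive term against memory-bank negatives) follows the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_corrupt(x, mask_ratio=0.75):
    """Strong-corruption branch: zero out a random subset of inputs."""
    keep = rng.random(x.shape) > mask_ratio
    return x * keep

def encode(x, W_enc):
    """Toy linear + ReLU encoder (stand-in for the real backbone)."""
    return np.maximum(x @ W_enc, 0.0)

def reconstruction_loss(z, W_dec, target):
    """Pixel-level MSE between the decoded corrupted view and the original."""
    return np.mean((z @ W_dec - target) ** 2)

def contrastive_loss(q, k, memory_bank, temperature=0.1, eps=1e-8):
    """InfoNCE-style loss: pull the two branches together,
    push the query away from memory-bank negatives."""
    q = q / (np.linalg.norm(q, axis=1, keepdims=True) + eps)
    k = k / (np.linalg.norm(k, axis=1, keepdims=True) + eps)
    pos = np.sum(q * k, axis=1, keepdims=True)          # (B, 1)
    neg = q @ memory_bank.T                              # (B, M)
    logits = np.concatenate([pos, neg], axis=1) / temperature
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(log_probs[:, 0])                     # positive is index 0

# Toy setup; all sizes are illustrative assumptions.
B, D_in, D_enc, D_proj, M = 4, 64, 32, 16, 128
x = rng.standard_normal((B, D_in))
W_enc = 0.1 * rng.standard_normal((D_in, D_enc))
W_dec = 0.1 * rng.standard_normal((D_enc, D_in))
W_proj = 0.1 * rng.standard_normal((D_enc, D_proj))
bank = rng.standard_normal((M, D_proj))
bank /= np.linalg.norm(bank, axis=1, keepdims=True)

z_strong = encode(mask_corrupt(x), W_enc)   # heavily masked branch
z_weak = encode(x, W_enc)                   # lower-variance branch
loss = reconstruction_loss(z_strong, W_dec, x) + \
       contrastive_loss(z_strong @ W_proj, z_weak @ W_proj, bank)
```

In this sketch the summed scalar `loss` plays the role of MACRL's joint objective: the reconstruction term drives pixel-level detail, while the contrastive term aligns the two branches semantically.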