Cross and Learn: Cross-Modal Self-Supervision

By N. Sayed et al.
Published on April 29, 2019

Table of Contents

1 Introduction
2 Related Work
3 Approach
4 Experiments

Summary

In this paper, the authors present a self-supervised method for representation learning that draws on two different modalities of video data, RGB frames and optical flow. The method exploits the information shared across these modalities to train powerful feature representations without manual labels. Using a two-stream architecture with trainable CNNs, trained under a cross-modal loss (which pulls the two modalities of the same clip together) and a diversity loss (which pushes the features of different clips apart), the authors achieve state-of-the-art performance on action recognition datasets. Extensive experiments and ablation studies validate the effectiveness of the proposed method.
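To make the two loss contributions concrete, below is a minimal PyTorch sketch of one training step. It assumes cosine distance for both terms, a single negative clip j per step standing in for per-batch negatives, and tiny placeholder backbones; the network definitions, the function names, and the weighting lam are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder two-stream CNNs: rgb_net takes 3-channel frames,
# flow_net takes 2-channel optical flow. Real backbones are deeper.
rgb_net = nn.Sequential(nn.Conv2d(3, 64, 7, stride=2), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
flow_net = nn.Sequential(nn.Conv2d(2, 64, 7, stride=2), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())

def cosine_distance(a, b):
    # 0 when feature vectors agree, 2 when they point in opposite directions
    return 1.0 - F.cosine_similarity(a, b, dim=1)

def training_step(rgb_i, flow_i, rgb_j, flow_j, lam=0.5):
    # Features for two different clips i != j, in both modalities
    fi_rgb, fi_flow = rgb_net(rgb_i), flow_net(flow_i)
    fj_rgb, fj_flow = rgb_net(rgb_j), flow_net(flow_j)

    # Cross-modal term: the two modalities of the SAME clip should agree
    l_cross = (cosine_distance(fi_rgb, fi_flow).mean()
               + cosine_distance(fj_rgb, fj_flow).mean())

    # Diversity term: DIFFERENT clips should disagree within each modality,
    # so we minimize the negative distance between them
    l_div = -(cosine_distance(fi_rgb, fj_rgb).mean()
              + cosine_distance(fi_flow, fj_flow).mean())

    return l_cross + lam * l_div

The diversity term is what keeps the cross-modal objective from collapsing to a trivial constant representation: agreement across modalities is only informative if different clips still map to different features.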