Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models

By Zhiqiu Lin et al.
Published on Aug. 3, 2023

Table of Contents

1. Introduction
2. Related Works
3. Cross-Modal Adaptation
4. Vision-Language Adaptation

Summary

The paper argues that few-shot classifiers improve when they leverage information from multiple modalities. It proposes a cross-modal adaptation method that treats data from other modalities as additional few-shot training examples: in particular, textual class labels are repurposed as extra training samples. This simple approach achieves state-of-the-art results, can be combined with existing few-shot methods to strengthen them, and extends naturally to the audio modality. The paper also reviews related work, gives a mathematical formalization of cross-modal learning, and studies vision-language adaptation in a multimodal setting.
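To make the core idea concrete, below is a minimal sketch (not the authors' released code) of cross-modal adaptation with a CLIP-style model: each class name is encoded with the text encoder and treated as one extra training sample for that class, and a single linear classifier is trained over the pooled image and text embeddings. The `few_shot_images` list and the prompt template are hypothetical placeholders.

```python
# Minimal sketch of cross-modal adaptation: text labels act as additional
# one-shot training samples alongside few-shot image embeddings.
import torch
import torch.nn.functional as F
import clip  # https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

class_names = ["cat", "dog"]  # hypothetical few-shot task
# few_shot_images: list of (PIL.Image, class_index) pairs -- assumed given.
with torch.no_grad():
    text_tokens = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)
    # L2-normalized text embeddings, one per class.
    text_feats = F.normalize(model.encode_text(text_tokens).float(), dim=-1)
    # L2-normalized image embeddings for the few-shot support set.
    image_feats = F.normalize(
        torch.stack([
            model.encode_image(preprocess(img).unsqueeze(0).to(device)).squeeze(0)
            for img, _ in few_shot_images
        ]).float(), dim=-1)

# Cross-modal training set: image embeddings plus one text embedding per class,
# all sharing the same label space.
X = torch.cat([image_feats, text_feats])
y = torch.cat([
    torch.tensor([label for _, label in few_shot_images], device=device),
    torch.arange(len(class_names), device=device),
])

# One shared linear classifier is trained over both modalities.
classifier = torch.nn.Linear(X.shape[1], len(class_names)).to(device)
optimizer = torch.optim.AdamW(classifier.parameters(), lr=1e-3)
for _ in range(100):
    optimizer.zero_grad()
    F.cross_entropy(classifier(X), y).backward()
    optimizer.step()
```

Because both encoders map into a shared embedding space, the text embedding of a class name behaves like one more labeled example, which is why the method helps most in the lowest-shot regimes.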